JP2009111733A

JP2009111733A - Method, device and program for encoding image

Info

Publication number: JP2009111733A
Application number: JP2007282077A
Authority: JP
Inventors: Yukihiro Bando; 幸浩坂東; Kazuya Hayase; 和也早瀬; Masayuki Takamura; 誠之高村; Kazuto Kamikura; 一人上倉; Yoshiyuki Yajima; 由幸八島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2007-10-30
Filing date: 2007-10-30
Publication date: 2009-05-21
Anticipated expiration: 2027-10-30
Also published as: JP4820800B2

Abstract

PROBLEM TO BE SOLVED: To prevent block distortion, and the resulting degradation of the decoded image, in a block-based encoding scheme.
SOLUTION: A code amount calculation unit 303 calculates a code amount based on an encoding target frame signal, a reference frame signal, and a quantization parameter. A weighted distortion amount calculation unit 305 calculates a weighted distortion amount from the lower-limit value of the displacement amount according to the mode stored in a mode storage unit 302. An undetermined multiplier calculation unit 307 calculates an undetermined multiplier based on the input encoding target frame signal, reference frame signal, and quantization parameter. A cost calculation unit 309 calculates a cost based on the code amount, the weighted distortion amount, and the undetermined multiplier. A minimum cost determination unit 311 determines whether the cost is the minimum. When the cost is determined to be the minimum, an optimum mode output unit 316 outputs the mode used at that point as the optimum mode.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to an image encoding method, an image encoding device, and an image encoding program.

[Encoding scheme using a squared-error cost function]
With the introduction of intra prediction and variable block shape motion compensation, H.264 offers more prediction modes than earlier standards. To reduce the code amount while maintaining a given subjective image quality, an appropriate prediction mode must therefore be selected. The H.264 reference software JM selects the prediction mode that minimizes the R-D cost given by Equation (1) (see Non-Patent Document 1).

Here, S is the original signal, q is the quantization parameter, m is the number identifying the prediction mode, and Ŝ_{m,q} is the decoded signal obtained when S is predicted with mode m and quantized with q. λ is the Lagrange undetermined multiplier used for mode selection, and D(S, Ŝ_{m,q}) is the sum of squared errors given by Equation (2).

Here, S^Y, S^U, and S^V are the Y, U, and V components of the original signal, and Ŝ^Y_{m,q}, Ŝ^U_{m,q}, and Ŝ^V_{m,q} are the Y, U, and V components of the decoded signal.
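The cost computation described above can be sketched as follows. This is a minimal illustration, assuming Equation (1) has the usual Lagrangian form J = D + λR and Equation (2) sums the squared errors of the Y, U, and V components; the function names and the dictionary layout of the signals are hypothetical and not taken from the JM software.

```python
def sse(orig, dec):
    """Sum of squared errors between two equally sized 2-D sample arrays."""
    return sum((o - d) ** 2
               for row_o, row_d in zip(orig, dec)
               for o, d in zip(row_o, row_d))


def distortion_yuv(orig, dec):
    """Squared-error distortion summed over the Y, U and V components.

    `orig` and `dec` are dicts keyed by "Y", "U", "V" (a hypothetical layout).
    """
    return sum(sse(orig[c], dec[c]) for c in ("Y", "U", "V"))


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R (assumed form of Equation (1))."""
    return distortion + lam * rate_bits


# Toy signals: a 2x2 luma block and 1x1 chroma blocks.
original = {"Y": [[10, 12], [14, 16]], "U": [[5]], "V": [[7]]}
decoded = {"Y": [[11, 12], [14, 14]], "U": [[5]], "V": [[6]]}

d = distortion_yuv(original, decoded)
j = rd_cost(d, rate_bits=20, lam=0.5)
```

In an encoder, this cost would be evaluated once per candidate prediction mode, and the mode with the smallest J would be kept.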

The procedure for computing the decoded signal in H.264 is described below. The symbols used in the description are summarized in FIG. 10. In the H.264 encoding process, the prediction error signal R (= S − P_m) obtained with prediction mode number m is orthogonally transformed with the transform matrix Φ, as in Equation (3).

Φ^t denotes the transpose of the transform matrix Φ. Φ is the orthogonal matrix with integer elements given by Equation (4).

Next, because Φ is not normalized, processing equivalent to normalizing the matrix is performed according to Equation (5).

This is equivalent to using φ of Equation (6) in place of Φ in Equation (3).

C is then quantized with the quantization parameter q, as in Equation (7). In the reference software JM, the normalization is folded into the quantization.

In the H.264 decoding process, on the other hand, V is inverse-quantized as in Equation (8) to obtain the decoded values of the transform coefficients.

Next, Ĉ_q is inverse-transformed as in Equation (9) to obtain the decoded prediction error signal.

Finally, the decoded signal of the encoding target image is obtained by Equation (10).
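The chain from prediction error to decoded signal can be sketched as follows. This is a simplified illustration under stated assumptions: the matrix is the well-known 4×4 integer core transform of H.264 written with basis vectors as rows, the forward transform is taken as C = Φ R Φ^t, and quantization is modelled as uniform rounding with a single step size `step` rather than the integer scaling tables that JM derives from q. The helper names are hypothetical.

```python
import math

# 4x4 integer core transform of H.264 (rows are basis vectors; assumed layout).
PHI = [[1, 1, 1, 1],
       [2, 1, -1, -2],
       [1, -1, -1, 1],
       [1, -2, 2, -1]]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def normalized(phi):
    """Scale every row to unit length (the effect of the normalization step)."""
    return [[v / math.sqrt(sum(x * x for x in row)) for v in row] for row in phi]


def reconstruct(residual, prediction, step):
    """Transform, quantize, dequantize, inverse-transform, add the prediction."""
    phi = normalized(PHI)
    coeff = matmul(matmul(phi, residual), transpose(phi))       # forward transform
    levels = [[round(c / step) for c in row] for row in coeff]  # quantization
    coeff_hat = [[lv * step for lv in row] for row in levels]   # inverse quantization
    res_hat = matmul(matmul(transpose(phi), coeff_hat), phi)    # inverse transform
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, res_hat)]


residual = [[5, -3, 0, 2], [1, 4, -2, 0], [0, 0, 3, -1], [2, -2, 1, 1]]
prediction = [[100] * 4 for _ in range(4)]
fine = reconstruct(residual, prediction, step=0.001)    # close to prediction + residual
coarse = reconstruct(residual, prediction, step=1000)   # residual is quantized away
```

With a very small step the reconstruction approaches the original signal, and with a very large step every coefficient quantizes to zero so only the prediction survives. The distortion term of the R-D cost measures this gap.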

[Weighting the distortion amount in consideration of subjective image quality]
As described above, the image quality measure used in the reference software JM is the squared error. However, the squared error does not necessarily reflect subjective image quality degradation. For example, changes in high-frequency components are harder to perceive visually than changes in low-frequency components. An encoder that does not exploit such visual characteristics (for example, the reference software JM) therefore leaves room for improvement in reducing the code amount efficiently.

Accordingly, methods that exploit the differences in visual sensitivity across spatio-temporal frequency components have been studied. A distortion amount corresponding to subjective image quality is defined by weighting the distortion of the orthogonal transform coefficients for each spatial frequency component according to visual sensitivity. To account for visual sensitivity in the temporal direction as well, this weighted distortion amount is further weighted according to the displacement amount. The distortion amount weighted by spatio-temporal visual sensitivity is then used in the cost function for selecting encoding parameters.

Next, weighting of the quantization error signal based on visual sensitivity is described. One example of such weighting is the R-D cost given by Equation (11).

Here, C_m is the transform coefficient of the prediction residual signal R_m obtained with mode number m, and C_{m,q} is the decoded value of the coefficient obtained by quantizing and inverse-quantizing C_m with the quantization parameter q. The weighted distortion amount given by Equation (12) is used as the distortion amount in this R-D cost.

Here, C^{γ(i)}_m[k, l] (γ = Y, U, V) is an element of C_m: a transform coefficient in the sub-block (N×N pixels) scanned i-th in raster order within the macroblock (16×16 pixels for the Y component, 8×8 pixels for the U and V components). Likewise, Ĉ^{γ(i)}_m[k, l] (γ = Y, U, V) is an element of C_{m,q}: the decoded transform coefficient in the sub-block (N×N pixels) scanned i-th in raster order within the same macroblock.

Further, W^γ_{k,l} (γ = Y, U, V) is a weighting coefficient set to 1 or less, referred to below as the sensitivity coefficient. Its calculation is detailed in [Calculation of sensitivity coefficient]. In Equation (12), setting W^γ_{k,l} to a small value amounts to estimating the quantization distortion D(C_m, C_{m,q}) as smaller.
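A weighted distortion of this kind can be sketched as follows for a single color component. The sketch assumes Equation (12) is a sum, over sub-blocks i and frequency indices (k, l), of the squared coefficient error scaled by the sensitivity coefficient; the function name, the argument layout, and the sample weights are hypothetical.

```python
def weighted_distortion(coeffs, decoded, weights):
    """Sum over sub-blocks and (k, l) of W[k][l] * (C[k][l] - C_hat[k][l])**2.

    coeffs, decoded: lists of sub-blocks, each a 2-D list of coefficients
    weights:         2-D list of sensitivity coefficients, each at most 1
    """
    total = 0.0
    for block, block_hat in zip(coeffs, decoded):
        for k, (row, row_hat) in enumerate(zip(block, block_hat)):
            for l, (c, c_hat) in enumerate(zip(row, row_hat)):
                total += weights[k][l] * (c - c_hat) ** 2
    return total


coeffs = [[[10, 4], [2, 1]]]     # one 2x2 sub-block of transform coefficients
decoded = [[[8, 4], [0, 0]]]     # the same coefficients after quantization

flat = [[1.0, 1.0], [1.0, 1.0]]          # every frequency counts fully
perceptual = [[1.0, 0.5], [0.5, 0.25]]   # made-up weights, smaller at high frequency

d_flat = weighted_distortion(coeffs, decoded, flat)
d_perceptual = weighted_distortion(coeffs, decoded, perceptual)
```

With all weights equal to 1 the result is the plain sum of squared coefficient errors, and lowering the weights of high-frequency coefficients lowers the distortion charged for errors there.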

Because the orthogonal transform is normalized, setting W^γ_{k,l} = 1 (∀ k, l; γ = Y, U, V) makes the weighted distortion amount described above equivalent to the sum of squared errors. W^γ_{k,l} (γ = Y, U, V) takes smaller values as the spatial and temporal frequencies become higher.
Non-Patent Document 1: K. P. Lim, G. Sullivan, and T. Wiegand, "Text Description of Joint Model Reference Encoding Methods and Decoding Concealment Methods," Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-R95, Jan. 2006. http://ftp3.itu.ch/av-arch/jvt-site/2006-01-Bangkok/JVT-R095.zip

In the method described above, which weights the distortion amount with a contrast sensitivity function, the weighting is applied macroblock by macroblock, so discontinuities at block boundaries (block distortion) are not reflected in the weighted distortion amount. In block-based coding schemes that combine motion-compensated inter-frame prediction with an orthogonal transform (for example, H.264), block distortion is a characteristic coding artifact. When block distortion is not taken into account, the resulting weighted distortion amount can fail to reflect subjective image quality correctly.

The reason block distortion was not reflected in the coding distortion measure is as follows. In the prior art, the DCT coefficients of each macroblock were weighted based on the contrast sensitivity function. The contrast sensitivity to the waveform within each macroblock was therefore reflected, but the discontinuity between adjacent blocks was not considered. Taking the one-dimensional signal in FIG. 11 as an example, processing that is closed within each block, that is, weighting the DCT coefficients of block k−1, block k, and block k+1 separately, cannot detect the discontinuity between blocks (between block k−1 and block k, or between block k and block k+1).
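The limitation can be illustrated with a small one-dimensional example. This is only a demonstration of the problem, not the analysis method of the invention: two flat blocks at different levels each have no AC energy under a per-block DCT, yet the step between them is clearly visible once the two blocks are analysed together. The sample values and function names are made up for illustration.

```python
import math


def dct(block):
    """Orthonormal DCT-II of a 1-D block."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out


def ac_energy(block):
    """Energy of every DCT coefficient except the DC term."""
    return sum(c * c for c in dct(block)[1:])


block_a = [10.0] * 4   # flat block at level 10
block_b = [20.0] * 4   # flat neighbouring block at level 20

per_block = (ac_energy(block_a), ac_energy(block_b))   # both numerically zero
joint = ac_energy(block_a + block_b)                    # the step shows up here
```

Any weighting that only sees the per-block coefficients assigns no frequency content to the step at the boundary, which is why an analysis that spans neighbouring blocks is needed.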

In other words, the prior art described above considered image quality only on a per-pixel basis. Only per-pixel subjective image quality could be guaranteed, so block distortion occurred and greatly degraded the quality of the decoded image.

The present invention has been made in view of these circumstances. Its object is to provide an image encoding method, an image encoding device, and an image encoding program that, in a block-based encoding scheme combining motion-compensated inter-frame prediction with an orthogonal transform, can prevent degradation of the decoded image caused by block distortion while also reducing the code amount.

To solve the above problem, the present invention is an image encoding method that compresses information by transform coding and quantization of an image signal, or of a prediction error signal obtained by intra-frame prediction and inter-frame prediction, wherein, when encoding parameters such as the motion compensation block size, inter prediction mode, and quantization parameter in moving image encoding, or the quantization parameter and intra prediction mode in still image encoding, are determined based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier, the method includes: a step of measuring spatial frequency components within a block and spatial frequency components relating to the discontinuity between adjacent blocks; a step of estimating temporal frequency components within a block and temporal frequency components relating to the discontinuity between adjacent blocks; a step of calculating an importance for each of the spatial frequency components and the temporal frequency components based on a visual sensitivity function; a step of setting a cost function as a weighted sum of the code amount and a distortion amount obtained as a squared error weighted for each frequency based on the importance; and a step of selecting the mode that minimizes the cost function.

In the above invention, the present invention further includes, for measuring the spatial frequency components within a block: a step of generating a modified basis image by adding zero values to a basis image of the orthogonal transform used for encoding so that the number of samples in each of the horizontal and vertical directions is doubled; and a step of calculating the spatial frequency components within the block and the spatial frequency components relating to the discontinuity between adjacent blocks by computing discrete Fourier transform coefficients of the modified basis image.

Also to solve the above problem, the present invention is an image encoding device that compresses information by transform coding and quantization of an image signal, or of a prediction error signal obtained by intra-frame prediction and inter-frame prediction, wherein, for determining encoding parameters such as the motion compensation block size, inter prediction mode, and quantization parameter in moving image encoding, or the quantization parameter and intra prediction mode in still image encoding, based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier, the device includes: spatial frequency component measurement means for measuring spatial frequency components within a block and spatial frequency components relating to the discontinuity between adjacent blocks; temporal frequency component estimation means for estimating temporal frequency components within a block and temporal frequency components relating to the discontinuity between adjacent blocks; importance calculation means for calculating an importance for each of the spatial frequency components and the temporal frequency components based on a visual sensitivity function; cost function setting means for setting a cost function as a weighted sum of the code amount and a distortion amount obtained as a squared error weighted for each frequency based on the importance; and mode selection means for selecting the mode that minimizes the cost function.

In the above invention, the spatial frequency component measurement means includes, for measuring the spatial frequency components within a block: modified basis image generation means for generating a modified basis image by adding zero values to a basis image of the orthogonal transform used for encoding so that the number of samples in each of the horizontal and vertical directions is doubled; and spatial frequency component calculation means for calculating the spatial frequency components within the block and the spatial frequency components relating to the discontinuity between adjacent blocks by computing discrete Fourier transform coefficients of the modified basis image.

Also to solve the above problem, the present invention is an image encoding program that causes a computer controlling an image encoding device, which compresses information by transform coding and quantization of an image signal or of a prediction error signal obtained by intra-frame prediction and inter-frame prediction, to execute the following when encoding parameters such as the motion compensation block size, inter prediction mode, and quantization parameter in moving image encoding, or the quantization parameter and intra prediction mode in still image encoding, are determined based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier: a step of measuring spatial frequency components within a block and spatial frequency components relating to the discontinuity between adjacent blocks; a step of estimating temporal frequency components within a block and temporal frequency components relating to the discontinuity between adjacent blocks; a step of calculating an importance for each of the spatial frequency components and the temporal frequency components based on a visual sensitivity function; a step of setting a cost function as a weighted sum of the code amount and a distortion amount obtained as a squared error weighted for each frequency based on the importance; and a step of selecting the mode that minimizes the cost function.

According to the present invention, spatial frequency components within a block and relating to the discontinuity between adjacent blocks are measured, and temporal frequency components within a block and relating to the discontinuity between adjacent blocks are estimated. An importance is calculated for each spatial and temporal frequency component based on a visual sensitivity function, a cost function is set as a weighted sum of the code amount and a distortion amount obtained as a squared error weighted for each frequency based on that importance, and the mode that minimizes the cost function is selected. As a result, in a block-based encoding scheme combining motion-compensated inter-frame prediction with an orthogonal transform, degradation of the decoded image caused by block distortion can be prevented and the code amount can be reduced.

Further, according to the present invention, when the spatial frequency components within a block are measured, a modified basis image is generated by adding zero values to a basis image of the orthogonal transform used for encoding so that the number of samples in each of the horizontal and vertical directions is doubled, and the spatial frequency components within the block and relating to the discontinuity between adjacent blocks are calculated by computing discrete Fourier transform coefficients of the modified basis image. As a result, in a block-based encoding scheme combining motion-compensated inter-frame prediction with an orthogonal transform, degradation of the decoded image caused by block distortion can be prevented and the code amount can be reduced.

An embodiment of the present invention is described below with reference to the drawings.

A. Principle of the present invention
The principle of the present invention is described first. The present invention introduces a distortion measure that takes block distortion into account and uses it in the cost function for setting encoding parameters, so that the parameters are chosen with block distortion in mind. Specifically, when the distortion measure is computed, frequency analysis is also applied to the discontinuity with adjacent blocks, so that this discontinuity is reflected in the measure. To that end, frequency analysis that considers the discontinuity between adjacent blocks together with the waveform within the block is performed, and the distortion amount is weighted based on the contrast sensitivity function.

[Calculation of sensitivity coefficient]
The inputs to the calculation of the weighting coefficients described above are the transform matrix and the displacement amount. In the following, an image of height H is assumed to be viewed at a viewing distance rH; r is called the viewing distance parameter.

Let φ_k be the k-th column vector (an N-dimensional vector) of the transform matrix Φ (an N×N matrix). The basis images for this matrix are then given by the following equation. In H.264, N is either 4 or 8.
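A basis image built from two column vectors of the transform matrix can be sketched as an outer product. The sketch assumes the basis image is φ_k φ_l^t and uses the 4×4 H.264 integer transform arranged so that its basis vectors are columns; the result is indexed as [row][column], and the mapping of that indexing onto (x, y) is an assumption. The names are hypothetical.

```python
# 4x4 H.264 integer transform arranged with basis vectors as columns
# (an assumed layout, chosen to match the "k-th column vector" wording).
PHI_COLUMNS = [[1, 2, 1, 1],
               [1, 1, -1, -2],
               [1, -1, -1, 2],
               [1, -2, 1, -1]]


def column(matrix, k):
    """Return the k-th column of a matrix given as a list of rows."""
    return [row[k] for row in matrix]


def basis_image(phi, k, l):
    """Outer product of columns k and l of phi, as an N x N list of rows."""
    col_k, col_l = column(phi, k), column(phi, l)
    return [[a * b for b in col_l] for a in col_k]


dc_image = basis_image(PHI_COLUMNS, 0, 0)      # constant image
vertical = basis_image(PHI_COLUMNS, 1, 0)      # varies down the rows only
```

Each of the N×N index pairs (k, l) yields one basis image, so a 4×4 transform has 16 of them and an 8×8 transform has 64.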

Here, φ^t_l is the transpose of φ_l. Each basis image f_{k,l}(x, y) (0 ≤ x, y ≤ N−1) is enlarged to twice its size by zero padding. Zero padding can be performed, for example, as in Equations (14) and (15).
Zero padding method 1:

Zero padding method 2:

The f̃_{k,l}(x, y) obtained by zero padding is called the modified basis image. For example, when N = 4, as shown in FIGS. 9(a) and 9(b), the basis image is placed in the 4×4 pixels of the shaded region and zero values are padded at all other positions. FIG. 9(a) corresponds to zero padding method 1 and FIG. 9(b) to zero padding method 2. The discrete Fourier transform of Equation (16) is applied to the modified basis image to obtain Fourier coefficients. In the following, N = 2^m.
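The zero padding and Fourier analysis can be sketched as follows. The placement of the basis image inside the enlarged array is an assumption: offset 0 puts it in the top-left corner and offset N/2 centres it, which may or may not match zero padding methods 1 and 2. The DFT is written without a scale factor, so the normalization of Equation (16) is likewise assumed. The function names are hypothetical.

```python
import cmath


def zero_pad(image, offset=0):
    """Embed an N x N image in a 2N x 2N array of zeros.

    The top-left corner of the image is placed at (offset, offset);
    which offsets the patent's two padding methods use is an assumption.
    """
    n = len(image)
    out = [[0.0] * (2 * n) for _ in range(2 * n)]
    for y in range(n):
        for x in range(n):
            out[y + offset][x + offset] = image[y][x]
    return out


def dft2(image):
    """Unnormalized 2-D DFT: F[u][v] = sum f[y][x] * exp(-2*pi*j*(u*x + v*y)/M)."""
    m = len(image)
    return [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / m)
                 for y in range(m) for x in range(m))
             for v in range(m)] for u in range(m)]


flat = [[1.0, 1.0], [1.0, 1.0]]               # a 2 x 2 DC basis image (N = 2)
corner = dft2(zero_pad(flat, offset=0))       # image in the top-left corner
centred = dft2(zero_pad(flat, offset=1))      # image in the centre (offset N/2)
```

Without padding, a constant basis image has only a DC Fourier coefficient. After padding, the edge between the image and the surrounding zeros contributes energy at non-zero frequencies, which is how the block boundary enters the analysis. Shifting the image inside the padded array changes only the phase of the coefficients, not their magnitude.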

Here, j is the imaginary unit.
The Fourier coefficients obtained from Equation (16) are shown in Equation (17).

Further, the weighting of Equation (18) is applied to Equation (17).

The first term on the right-hand side of Equation (18), F̂_{k,l}(u, v), is described below. Here, g(η, w) is a function known as the visual sensitivity function and has the functional form of Equation (19).

Here, a, b, c, and d are parameters that determine the functional form of the visual sensitivity function (hereinafter called model parameters); for example, the values in Equation (20) are used.

In Equation (19), η takes the value given by Equation (21).

θ(r, H) is the visual angle per pixel when an image of height H is viewed at a viewing distance rH, and is given by Equation (22).

ω is the change in angle per unit time [degrees/sec] and is set as follows. When the displacement of the macroblock is estimated to be (d_x, d_y) and an image of height H is viewed at a viewing distance rH, the change in angle per unit time is given by Equation (23).

Here, f_r is the frame rate. The sensitivity coefficient for the (k, l) basis is defined as the power ratio given by Equation (24).

W^U_{k,l}(ω) and W^V_{k,l}(ω) can be obtained in the same way. The model parameters may be set differently for the luminance and chrominance components.
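A power-ratio sensitivity coefficient can be sketched as follows. Since the equations themselves are not reproduced here, the sketch assumes that the coefficient is the power of the sensitivity-weighted Fourier coefficients divided by the power of the unweighted ones. The `gain` callable is a hypothetical stand-in for the visual sensitivity function evaluated at frequency (u, v); the toy gain used in the example is made up and is not the patent's model.

```python
def sensitivity_coefficient(fourier, gain):
    """Ratio of weighted to unweighted power over all frequencies (u, v).

    fourier: 2-D list of (possibly complex) Fourier coefficients F[u][v]
    gain:    callable (u, v) -> non-negative weight; a placeholder for the
             visual sensitivity function, NOT the patent's Equation (19)
    """
    weighted = sum(abs(gain(u, v) * f) ** 2
                   for u, row in enumerate(fourier)
                   for v, f in enumerate(row))
    total = sum(abs(f) ** 2 for row in fourier for f in row)
    return weighted / total if total else 0.0


def toy_gain(u, v):
    """Made-up low-pass weight that decreases with frequency index."""
    return 1.0 / (1 + u + v)


spectrum = [[2, 1], [1, 0]]
w_toy = sensitivity_coefficient(spectrum, toy_gain)
w_flat = sensitivity_coefficient(spectrum, lambda u, v: 1.0)
```

If the gain never exceeds 1, the ratio stays at or below 1, which matches the description of the sensitivity coefficient as a weight of 1 or less. A basis whose energy sits at frequencies the gain attenuates receives a smaller coefficient.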

B. First embodiment
FIG. 1 is a block diagram showing the configuration of a mode selection device according to the first embodiment of the present invention. In the figure, a displacement amount storage unit 300 stores the estimated displacement amount. An initial mode setting unit 301 sets the initial mode. A mode storage unit 302 stores the initially set mode and, as appropriate, stores (updates) the mode set by a mode setting unit 316 described later.

A code amount calculation unit 303 calculates the code amount based on the input encoding target frame signal, reference frame signal, and quantization parameter. A code amount storage unit 304 stores the calculated code amount. A weighted distortion amount calculation unit 305 calculates the weighted distortion amount from the displacement amount stored in the displacement amount storage unit 300, according to the mode stored in the mode storage unit 302. A weighted distortion amount storage unit 306 stores the calculated weighted distortion amount. An undetermined multiplier calculation unit 307 calculates the undetermined multiplier based on the input encoding target frame signal, reference frame signal, and quantization parameter. An undetermined multiplier storage unit 308 stores the calculated undetermined multiplier.

A cost calculation unit (cost function setting means) 309 calculates the cost based on the code amount stored in the code amount storage unit 304, the weighted distortion amount stored in the weighted distortion amount storage unit 306, and the undetermined multiplier stored in the undetermined multiplier storage unit 308. A cost storage unit 310 stores the calculated cost. A minimum cost determination unit (mode selection means) 311 determines the minimum cost from the costs stored in the cost storage unit 310. A minimum cost storage unit 312 stores the minimum cost determined by the minimum cost determination unit 311.

When the minimum cost determination unit 311 determines that the calculated cost is smaller than the minimum cost held in the register, an optimum mode update unit 313 stores the mode held in the mode storage unit 302 at that point in an optimum mode storage unit 314. The optimum mode storage unit 314 stores, as the optimum mode, the mode in effect when the calculated cost was determined to be smaller than the minimum cost held in the register.

The final mode determination unit 315 determines whether processing has been completed for all modes, that is, whether the current mode is the final mode. If it is not the final mode, the unit supplies the determination result of the minimum cost determination unit 311 to the mode setting unit (mode selection means) 316; if it is the final mode, it supplies that determination result to the optimum mode output unit 316. When the optimum mode output unit 316 receives the determination result of the minimum cost determination unit 311 from the final mode determination unit 315, that is, when processing has been completed for all modes, it outputs the optimum mode stored in the optimum mode storage unit 314.

FIG. 2 is a block diagram showing the configuration of the weighted distortion amount calculation unit 305 described above. In the figure, the transform coefficient normalization unit 401 normalizes the input transform coefficients. The normalized transform coefficient storage unit 402 stores the normalized transform coefficients. The transform coefficient decoding unit 403 decodes the input transform coefficients. The decoded transform coefficient storage unit 404 stores the decoded transform coefficients. The transition amount storage unit 408 stores the input transition amount.

The sensitivity coefficient calculation unit (spatial frequency component measurement means, temporal frequency component estimation means, importance calculation means, spatial frequency component calculation means) 409 calculates the sensitivity coefficients from the transition amount stored in the transition amount storage unit 408. The sensitivity coefficient storage unit 410 stores the calculated sensitivity coefficients. The sensitivity coefficient multiplication unit 407 reads the distortion amount from the distortion amount storage unit 406 and multiplies it by the sensitivity coefficient read from the sensitivity coefficient storage unit 410. The distortion amount storage unit 411 stores the distortion amount multiplied by the sensitivity coefficient. The distortion amount sum calculation unit 412 sums the distortion amounts read from the distortion amount storage unit 411 and outputs the sum as the weighted distortion amount.

FIG. 3 is a block diagram showing the configuration of the sensitivity coefficient calculation unit 409 described above. In the figure, the transform coefficient storage unit 601 stores the input transform coefficients. The transform matrix size storage unit 602 stores the input transform matrix size. The base image calculation unit 603 calculates a base image according to the transform coefficients read from the transform coefficient storage unit 601 and the transform matrix size read from the transform matrix size storage unit 602. The base image storage unit 604 stores the calculated base image.

The corrected base image calculation unit (corrected base image generation means) 605 calculates a corrected base image from the base image read from the base image storage unit 604, according to the transform matrix size read from the transform matrix size storage unit 602. The corrected base image storage unit 606 stores the corrected base image. The contrast sensitivity function storage unit 607 stores the input contrast sensitivity function. The multiplication unit 608 multiplies the corrected base image read from the corrected base image storage unit 606 by the contrast sensitivity function read from the contrast sensitivity function storage unit 607. The addition unit 609 sums the resulting products and outputs the sum as the sensitivity coefficient.

FIG. 4 is a flowchart for explaining the operation of the encoding mode selection process according to this embodiment. First, the initial mode setting unit 301 sets the initial value of the mode (step S101). Specifically, the registers that store the minimum cost and the optimum mode are initialized. Next, the transition amount is stored in the transition amount storage unit 300 (step S102). Next, the code amount calculation unit 303 calculates the code amount (step S103), and the weighted distortion amount calculation unit 305 calculates the weighted distortion amount (step S104). The calculation of the weighted distortion amount is described in detail later.

Next, the undetermined multiplier calculation unit 307 calculates the undetermined multiplier (step S105), and the cost calculation unit 309 calculates the cost from the code amount, the weighted distortion amount, and the undetermined multiplier (step S106). Next, the minimum cost determination unit 311 determines whether the calculated cost is smaller than the minimum cost stored in the minimum cost storage unit 312 (register) (step S107). If it is not smaller than the minimum, the mode is changed (step S111), the process returns to step S103, and the code amount, weighted distortion amount, undetermined multiplier, and cost are calculated for the changed mode.

On the other hand, if the calculated cost is smaller than the minimum cost stored in the register, the calculated cost is stored (step S108), and the mode at that time is stored (step S109). Next, it is determined whether processing has been completed for all modes (step S110). If it has not, the mode is changed (step S111), the process returns to step S103, and the processing described above is repeated for the changed mode. If processing has been completed for all modes, the process ends.
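As an illustration only, the loop of steps S101 to S111 might be sketched in Python as follows. The function name `select_mode`, the three callables, the mode labels, and the cost form (weighted distortion plus multiplier times code amount) are assumptions introduced for this sketch and are not taken from the specification.

```python
import math
from typing import Callable, Iterable, Optional, Tuple


def select_mode(
    modes: Iterable[str],
    code_amount: Callable[[str], float],
    weighted_distortion: Callable[[str], float],
    multiplier: Callable[[str], float],
) -> Tuple[Optional[str], float]:
    """Return (optimum mode, minimum cost) for a finite list of candidate modes."""
    min_cost = math.inf      # S101: initialize the minimum-cost register
    best_mode = None         # S101: initialize the optimum-mode register
    for mode in modes:       # S111: move on to the next candidate mode
        r = code_amount(mode)            # S103
        d = weighted_distortion(mode)    # S104
        lam = multiplier(mode)           # S105
        cost = d + lam * r               # S106 (assumed Lagrangian form)
        if cost < min_cost:              # S107: strictly smaller than the register?
            min_cost = cost              # S108
            best_mode = mode             # S109
    return best_mode, min_cost           # S110: all modes processed


# Toy numbers, chosen only to exercise the loop.
rates = {"intra16x16": 120.0, "intra4x4": 200.0, "inter16x16": 80.0}
dists = {"intra16x16": 50.0, "intra4x4": 20.0, "inter16x16": 60.0}
result = select_mode(rates, rates.__getitem__, dists.__getitem__, lambda m: 0.5)
print(result)  # ('inter16x16', 100.0)
```

Because the comparison in step S107 is strict, a later mode that merely ties the current minimum does not replace the stored mode.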

FIG. 5 is a flowchart showing the procedure for calculating the weighted distortion amount in step S104. First, the transform coefficient normalization unit 401 normalizes the input transform coefficients (step S201), and the transform coefficient decoding unit 403 decodes the input transform coefficients (step S202). Next, the transition amount is read (step S203), and the sensitivity coefficient calculation unit 409 calculates and sets the sensitivity coefficients (step S204). Next, the counter i and the register S (distortion amount storage unit 411) are initialized to 0 (step S205), and the distortion amount calculation unit 405 calculates the coding distortion of the i-th component of the transform coefficients (step S206).

Next, the sensitivity coefficient multiplication unit 407 multiplies the coding distortion of the i-th component of the transform coefficients by the sensitivity coefficient (step S207), and the distortion amount sum calculation unit 412 adds the coding distortion multiplied by the sensitivity coefficient (the distortion amount) to the register S (distortion amount storage unit 411) (step S208). Next, it is determined whether processing has been completed for all components of the transform coefficients (step S209). If it has not, the counter i is incremented by 1 (step S210), the process returns to step S206, and the processing described above is repeated for the next component. The counter i is incremented and the processing repeated in this way until all components have been handled. When processing has been completed for all components of the transform coefficients, the process ends.
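The accumulation in steps S205 to S210 can be illustrated with a short Python sketch. The function name `weighted_distortion` and its argument names are hypothetical; the sketch assumes that the normalized and decoded transform coefficients (steps S201 and S202) and the sensitivity coefficients (step S204) are already available, and that the per-component coding distortion is a squared error.

```python
from typing import Sequence


def weighted_distortion(
    normalized: Sequence[float],
    decoded: Sequence[float],
    sensitivity: Sequence[float],
) -> float:
    """Sum of per-component squared errors, each scaled by its sensitivity coefficient."""
    if not (len(normalized) == len(decoded) == len(sensitivity)):
        raise ValueError("all inputs must have one entry per transform coefficient")
    s = 0.0                                       # S205: register S = 0, counter i = 0
    for i in range(len(normalized)):              # S209 / S210: loop over all components
        err = (normalized[i] - decoded[i]) ** 2   # S206: distortion of the i-th component
        s += sensitivity[i] * err                 # S207: weight it, S208: add to register S
    return s


# Toy values: errors are 1, 0 and -2, so S = 1.0*1 + 0.5*0 + 0.25*4.
total = weighted_distortion([4.0, 2.0, 1.0], [3.0, 2.0, 3.0], [1.0, 0.5, 0.25])
print(total)  # 2.0
```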

FIG. 6 is a flowchart showing the procedure for setting the sensitivity coefficients in step S204. First, the size of the transform matrix (DCT), taken to be N × N, is read (step S501), and the counter i is initialized to 0 (step S502). Next, the base image calculation unit 603 calculates the i-th base image of the transform matrix (DCT) (step S503), and the corrected base image calculation unit 605 generates, by zero padding according to equation (21) described above, an image twice the size of that base image (a 2N × 2N image: the corrected base image) (step S504).

Next, frequency analysis (DFT) is performed on the corrected base image to calculate the distribution of frequency components within the corrected base image (step S505). Next, the contrast sensitivity function (CSF) is read (step S506), and the counter k and the register W[i] are initialized to 0 (step S507). Next, the multiplication unit 608 multiplies the k-th frequency component of the corrected base image by the corresponding value of the contrast sensitivity function according to equation (22) described above (step S508), and the addition unit 609 adds the resulting product to the register W[i] (step S509).

Next, it is determined whether processing has been completed for all frequency components of the corrected base image (step S510). If it has not, the counter k is incremented by 1 (step S511), the process returns to step S508, and the processing described above is repeated. If processing has been completed for all frequency components, it is determined whether processing has been completed for all base images (step S512). If it has not, the counter i is incremented by 1 (step S513), the process returns to step S503, and the processing described above is repeated. When processing has been completed for all base images, the process ends.
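The procedure of steps S501 to S513 might look roughly like the following Python sketch, which uses only the standard library. Several points are assumptions, since equations (21) and (22) are not reproduced in this passage: the base images are taken to be those of the orthonormal 2-D DCT-II, the zero padding places the N × N base image in the top-left corner of the 2N × 2N image, the magnitude of each DFT coefficient is used as the frequency component, and the contrast sensitivity function is passed in as a hypothetical callable `csf(j, k, m)`. The function `toy_csf` is a placeholder and not a measured contrast sensitivity function.

```python
import cmath
import math
from typing import Callable, List

Image = List[List[float]]


def dct_basis_image(u: int, v: int, n: int) -> Image:
    """N x N base image of the orthonormal 2-D DCT-II for index (u, v)."""
    def alpha(k: int) -> float:
        return math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [[alpha(u) * alpha(v)
             * math.cos((2 * x + 1) * u * math.pi / (2 * n))
             * math.cos((2 * y + 1) * v * math.pi / (2 * n))
             for y in range(n)] for x in range(n)]


def zero_pad_double(img: Image) -> Image:
    """Place an N x N image in the top-left corner of a 2N x 2N zero image (assumed layout)."""
    n = len(img)
    padded = [[0.0] * (2 * n) for _ in range(2 * n)]
    for x in range(n):
        padded[x][:n] = img[x]
    return padded


def dft2_magnitudes(img: Image) -> Image:
    """Magnitudes of the 2-D DFT of a square image, computed row-wise then column-wise."""
    m = len(img)

    def dft1(seq):
        return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / m) for t in range(m))
                for k in range(m)]

    rows = [dft1(row) for row in img]
    cols = [dft1([rows[x][k] for x in range(m)]) for k in range(m)]
    return [[abs(cols[k][j]) for k in range(m)] for j in range(m)]


def sensitivity_coefficients(n: int, csf: Callable[[int, int, int], float]) -> List[float]:
    """One coefficient W[i] per base image, with i = u * n + v."""
    m = 2 * n
    weights = []
    for u in range(n):                  # S502, S512, S513: loop over base images
        for v in range(n):
            base = dct_basis_image(u, v, n)            # S503
            mags = dft2_magnitudes(zero_pad_double(base))  # S504, S505
            w = 0.0                                    # S507: register W[i] = 0
            for j in range(m):                         # S510, S511: loop over frequencies
                for k in range(m):
                    w += mags[j][k] * csf(j, k, m)     # S508, S509
            weights.append(w)
    return weights


def toy_csf(j: int, k: int, m: int) -> float:
    """Placeholder weighting that decays with the folded frequency index."""
    return 1.0 / (1.0 + math.hypot(min(j, m - j), min(k, m - k)))


w = sensitivity_coefficients(4, toy_csf)  # 16 coefficients for a 4 x 4 transform
```

Doubling the image size by zero padding gives the DFT a finer frequency grid than the N × N block alone, which is how the padded analysis can register the step between the block contents and the zero region around them.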

C. Second Embodiment
Next, a second embodiment of the present invention will be described. Whereas the first embodiment described above selects the mode, the second embodiment selects the quantization parameter.

FIG. 7 is a block diagram showing the configuration of the quantization parameter selection device according to the second embodiment. In the figure, the transition amount storage unit 800 stores the estimated transition amount. The initial quantization parameter setting unit 801 sets the initial quantization parameter. The quantization parameter storage unit 802 stores the initially set quantization parameter and, as appropriate, stores (updates) the quantization parameter set by the quantization parameter setting unit 816 described later.

The code amount calculation unit 803 calculates the code amount based on the input encoding target frame signal, the reference frame signal, and the quantization parameter. The code amount storage unit 804 stores the calculated code amount. The weighted distortion amount calculation unit 805 calculates the weighted distortion amount from the lower limit value of the transition amount stored in the transition amount storage unit 800, according to the quantization parameter stored in the quantization parameter storage unit 802. The weighted distortion amount storage unit 806 stores the calculated weighted distortion amount. The undetermined multiplier calculation unit 807 calculates the undetermined multiplier based on the input encoding target frame signal, the reference frame signal, and the quantization parameter. The undetermined multiplier storage unit 808 stores the calculated undetermined multiplier.

The cost calculation unit 809 calculates the cost based on the code amount stored in the code amount storage unit 804, the weighted distortion amount stored in the weighted distortion amount storage unit 806, and the undetermined multiplier stored in the undetermined multiplier storage unit 808. The cost storage unit 810 stores the calculated cost. The minimum cost determination unit 811 determines the minimum cost from the costs stored in the cost storage unit 810. The minimum cost storage unit 812 stores the minimum cost determined by the minimum cost determination unit 811.

When the minimum cost determination unit 811 determines that the calculated cost is smaller than the minimum cost held in the register, the optimum quantization parameter update unit 813 stores the quantization parameter held in the quantization parameter storage unit 802 at that time in the optimum quantization parameter storage unit 814. The optimum quantization parameter storage unit 814 stores, as the optimum quantization parameter, the quantization parameter at the time the calculated cost was determined to be smaller than the minimum cost held in the register.

The final quantization parameter determination unit 815 determines whether processing has been completed for all quantization parameters, that is, whether the current quantization parameter is the final one. If it is not the final quantization parameter, the unit supplies the determination result of the minimum cost determination unit 811 to the quantization parameter setting unit 816; if it is the final quantization parameter, it supplies that determination result to the optimum quantization parameter output unit 816. When the optimum quantization parameter output unit 816 receives the determination result of the minimum cost determination unit 811 from the final quantization parameter determination unit 815, that is, when processing has been completed for all quantization parameters, it outputs the optimum quantization parameter stored in the optimum quantization parameter storage unit 814.

FIG. 8 is a flowchart for explaining the operation of the quantization parameter selection process according to the second embodiment. First, the initial quantization parameter setting unit 801 sets the initial value of the quantization parameter (step S700). Specifically, the registers that store the minimum cost and the optimum quantization parameter are initialized. Next, the transition amount is stored in the transition amount storage unit 800 (step S701). Next, the code amount calculation unit 803 calculates the code amount (step S703), and the weighted distortion amount calculation unit 805 calculates the weighted distortion amount (step S704). The weighted distortion amount is calculated in the same way as in the first embodiment described above.

Next, the undetermined multiplier calculation unit 807 calculates the undetermined multiplier (step S705), and the cost calculation unit 809 calculates the cost from the code amount, the weighted distortion amount, and the undetermined multiplier (step S706). Next, the minimum cost determination unit 811 determines whether the calculated cost is smaller than the minimum cost stored in the minimum cost storage unit 812 (register) (step S707). If it is not smaller than the minimum, the quantization parameter is changed (step S711), the process returns to step S703, and the code amount, weighted distortion amount, undetermined multiplier, and cost are calculated for the changed quantization parameter.

On the other hand, if the calculated cost is smaller than the minimum cost stored in the register, the calculated cost is stored in the minimum cost storage unit 812 (step S708), and the quantization parameter at that time is stored (step S709). Next, it is determined whether processing has been completed for all quantization parameters (step S710). If it has not, the quantization parameter is changed (step S711), the process returns to step S703, and the processing described above is repeated for the changed quantization parameter. If processing has been completed for all quantization parameters, the process ends.
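Viewed as a whole, the register-based loop of FIG. 8 amounts to taking the minimum-cost candidate from a finite set of quantization parameters. The Python sketch below expresses that reading; the function name `select_qp`, the callable `cost_of`, and the cost table are hypothetical and serve only to illustrate the selection rule.

```python
from typing import Callable, Iterable, Tuple


def select_qp(qps: Iterable[int], cost_of: Callable[[int], float]) -> Tuple[int, float]:
    """Return (optimum quantization parameter, minimum cost) over the candidates."""
    costs = [(cost_of(qp), qp) for qp in qps]   # cost for each candidate (S703 to S706)
    if not costs:
        raise ValueError("at least one quantization parameter is required")
    # min() returns the first minimal entry, which matches the strict
    # "smaller than" test of step S707: a later tie does not replace it.
    best_cost, best_qp = min(costs, key=lambda pair: pair[0])
    return best_qp, best_cost


# Toy cost table with a tie between QP 24 and QP 26.
toy_cost = {22: 310.0, 24: 285.0, 26: 285.0, 28: 300.0}
chosen = select_qp([22, 24, 26, 28], toy_cost.__getitem__)
print(chosen)  # (24, 285.0)
```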

According to the first and second embodiments described above, a distortion measure that takes block distortion into account is introduced into the cost function used to set the encoding parameters. When this distortion measure is calculated, frequency analysis is also applied to the discontinuity with adjacent blocks, and the distortion amount is weighted based on the contrast sensitivity function so that this discontinuity is reflected in the measure. This provides the advantage that the occurrence of block distortion can be prevented and degradation of the decoded image can be prevented.

A block diagram showing the configuration of the mode selection device according to the first embodiment of the present invention.
A block diagram showing the configuration of the weighted distortion amount calculation unit 305 according to the first embodiment.
A block diagram showing the configuration of the sensitivity coefficient calculation unit 409 according to the first embodiment.
A flowchart for explaining the operation of the encoding mode selection process according to this embodiment.
A flowchart showing the procedure for calculating the weighted distortion amount according to the first embodiment.
A flowchart showing the procedure for setting the sensitivity coefficients according to the first embodiment.
A block diagram showing the configuration of the quantization parameter selection device according to the second embodiment of the present invention.
A flowchart for explaining the operation of the quantization parameter selection process according to the second embodiment.
A conceptual diagram for explaining the zero padding method.
A table showing examples of the symbols used in the description in this specification.
A conceptual diagram for explaining the problem of the prior art.

Explanation of symbols

300 Transition amount storage unit
301 Initial mode setting unit
302 Mode storage unit
303 Code amount calculation unit
304 Code amount storage unit
305 Weighted distortion amount calculation unit
306 Weighted distortion amount storage unit
307 Undetermined multiplier calculation unit
308 Undetermined multiplier storage unit
309 Cost calculation unit
310 Cost storage unit
311 Minimum cost determination unit
312 Minimum cost storage unit
313 Optimum mode update unit
314 Optimum mode storage unit
315 Final mode determination unit
316 Optimum mode output unit
401 Transform coefficient normalization unit
402 Normalized transform coefficient storage unit
403 Transform coefficient decoding unit
404 Decoded transform coefficient storage unit
407 Sensitivity coefficient multiplication unit
408 Transition amount storage unit
409 Sensitivity coefficient calculation unit
410 Sensitivity coefficient storage unit
411 Distortion amount storage unit
412 Distortion amount sum calculation unit
601 Transform coefficient storage unit
602 Transform matrix size storage unit
603 Base image calculation unit
604 Base image storage unit
605 Corrected base image calculation unit
606 Corrected base image storage unit
607 Contrast sensitivity function storage unit
608 Multiplication unit
609 Addition unit

Claims

An image encoding method that performs information compression by transform coding and quantization on an image signal or on a prediction error signal obtained by intra-frame prediction and inter-frame prediction, the method comprising:
a step of measuring the frequency components within a block and the spatial frequency components related to discontinuity between adjacent blocks when encoding parameters, such as the motion compensation block size, inter prediction mode, and quantization parameter in video coding, or the quantization parameter and intra prediction mode in still image coding, are determined based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier;
a step of further estimating the frequency components within a block and the temporal frequency components related to discontinuity between adjacent blocks;
a step of calculating an importance for each of the spatial frequency components and the temporal frequency components based on a visual sensitivity function;
a step of setting a cost function as a weighted sum of a code amount and a distortion amount obtained as a squared error weighted for each frequency based on the importance; and
a step of selecting the mode that minimizes the cost function.

The image encoding method according to claim 1, further comprising:
a step of generating, when the spatial frequency components within a block are measured, a corrected base image by adding zero values to a base image of the orthogonal transform used for encoding so that the number of samples in each of the horizontal and vertical directions is doubled; and
a step of calculating the spatial frequency components within a block and the spatial frequency components related to discontinuity between adjacent blocks by calculating discrete Fourier transform coefficients of the corrected base image.

An image encoding device that performs information compression by transform coding and quantization on an image signal or on a prediction error signal obtained by intra-frame prediction and inter-frame prediction, the device comprising:
spatial frequency component measurement means for measuring the frequency components within a block and the spatial frequency components related to discontinuity between adjacent blocks when encoding parameters, such as the motion compensation block size, inter prediction mode, and quantization parameter in video coding, or the quantization parameter and intra prediction mode in still image coding, are determined based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier;
temporal frequency component estimation means for further estimating the frequency components within a block and the temporal frequency components related to discontinuity between adjacent blocks;
importance calculation means for calculating an importance for each of the spatial frequency components and the temporal frequency components based on a visual sensitivity function;
cost function setting means for setting a cost function as a weighted sum of a code amount and a distortion amount obtained as a squared error weighted for each frequency based on the importance; and
mode selection means for selecting the mode that minimizes the cost function.

The image encoding device according to claim 3, wherein the spatial frequency component measurement means includes:
corrected base image generation means for generating, when the spatial frequency components within a block are measured, a corrected base image by adding zero values to a base image of the orthogonal transform used for encoding so that the number of samples in each of the horizontal and vertical directions is doubled; and
spatial frequency component calculation means for calculating the spatial frequency components within a block and the spatial frequency components related to discontinuity between adjacent blocks by calculating discrete Fourier transform coefficients of the corrected base image.

An image encoding program that causes a computer controlling an image encoding device, which performs information compression by transform coding and quantization on an image signal or on a prediction error signal obtained by intra-frame prediction and inter-frame prediction, to execute:
a step of measuring the frequency components within a block and the spatial frequency components related to discontinuity between adjacent blocks when encoding parameters, such as the motion compensation block size, inter prediction mode, and quantization parameter in video coding, or the quantization parameter and intra prediction mode in still image coding, are determined based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier;
a step of further estimating the frequency components within a block and the temporal frequency components related to discontinuity between adjacent blocks;
a step of calculating an importance for each of the spatial frequency components and the temporal frequency components based on a visual sensitivity function;
a step of setting a cost function as a weighted sum of a code amount and a distortion amount obtained as a squared error weighted for each frequency based on the importance; and
a step of selecting the mode that minimizes the cost function.