JP5455229B2 - Image encoding apparatus and image decoding apparatus

Info

Publication number: JP5455229B2
Application number: JP2010100958A
Authority: JP
Inventors: 晴久加藤; 整内藤
Original assignee: KDDI R&D Laboratories Inc
Current assignee: KDDI R&D Laboratories Inc
Priority date: 2010-04-26
Filing date: 2010-04-26
Publication date: 2014-03-26
Anticipated expiration: 2030-04-26
Also published as: JP2011234020A

Description

The present invention relates to an image encoding device and an image decoding device, and more particularly to an image encoding device and an image decoding device that improve coding efficiency by adaptively changing the basis functions of the orthogonal transform.

In conventional image coding, a frequency transform has been applied to pixels processed in block units, exploiting their high local cross-correlation. The frequency transform itself does not compress anything. However, the transform tends to concentrate pixel information in the low-frequency coefficients, and human vision is relatively insensitive to errors in the high-frequency coefficients, so quantization can readily discard high-frequency information before the coefficients are efficiently encoded. Because the performance of the frequency transform affects image quality, various transform basis functions that concentrate energy well in the low-frequency coefficients have been studied. Simple transform basis functions have also been studied from the viewpoint of the computation required for the transform.

The image coding scheme described in Non-Patent Document 1 below uses the DCT as its orthogonal transform. H.264, described in Non-Patent Document 2 below, adopts basis functions that approximate the DCT with integers in order to reduce computation and avoid floating-point mismatch. It also provides transform basis functions of several orders to support multiple block sizes.

Non-Patent Document 3 below provides different transform basis functions according to the H.264 Intra prediction mode. When an Intra prediction mode is selected for a block, the Intra prediction residual is transformed into the frequency domain by the transform basis function that corresponds one-to-one to that mode.

Non-Patent Document 1: 上符 et al., "Latest MPEG Textbook: Illustrated Key Points" (最新ＭＰＥＧ教科書ポイント図解式), ASCII, 1994.
Non-Patent Document 2: 角野 et al., "H.264/AVC Textbook, Impress Standard Textbook Series" (Ｈ．２６４／ＡＶＣ教科書インプレス標準教科書シリーズ), Impress Net Business Company, 2004.
Non-Patent Document 3: M. Budagavi, et al., "Orthogonal MDDT and Mode Dependent DCT," ITU-T SG16 Q.6 VCEG, 2009.

Non-Patent Documents 1 and 2 provide only one set of transform basis functions for a given encoding target block, so the transform basis functions are not necessarily optimal for blocks that can take a wide variety of pixel distributions. They also cannot cope with cases where the error distribution changes with the prediction mode. For example, H.264 Intra prediction generates the predicted values of the target block from neighboring pixels that have already been encoded, and it has been pointed out that the prediction error grows with distance from the reference pixels. Because Intra prediction comprises nine prediction methods, the locations where large prediction errors tend to occur differ from method to method. Non-Patent Documents 1 and 2 therefore cannot raise coding efficiency sufficiently.

Non-Patent Document 3, on the other hand, switches among several different transform basis functions according to the Intra prediction mode and thus partly solves the problems of Non-Patent Documents 1 and 2. However, it assigns a specific transform basis function to each Intra prediction mode, so the assignment is still fixed. Because the pixels or prediction residuals of a block can take a very wide range of combinations, the problem remains that the transform basis function assigned to a mode is not necessarily optimal.

An object of the present invention is to solve the above problems of the prior art and to provide an image encoding device with high coding efficiency by improving the transform basis functions.

To solve the above problems of the prior art, the first feature of the present invention is an image encoding device that encodes each unit block by sequentially applying an orthogonal transform, quantization, and encoding to the pixels of a unit block composed of a plurality of pixels, the device comprising basis function construction means that receives information on encoded pixels, constructs a transform basis function for the encoding target block, and sends it to the orthogonal transform means, wherein the basis function construction means searches the region of encoded pixels for a region similar to the encoding target block and constructs the transform basis function using that similar region as input.

The second feature of the present invention is an image encoding device that, for each pixel of a unit block composed of a plurality of pixels, takes the difference from the corresponding pixel predicted from encoded pixels and encodes each unit block by sequentially applying an orthogonal transform, quantization, and encoding to the resulting prediction residual, the device comprising basis function construction means that receives the information on the encoded pixels and the prediction information used to predict each pixel, constructs a transform basis function for the encoding target block, and sends it to the orthogonal transform means, wherein the basis function construction means searches the region of encoded pixels for a region similar to the encoding target block and constructs the transform basis function using, as input, the prediction residual obtained by applying the prediction information to that similar region.

The third feature is that, when searching the region of encoded pixels for a region similar to the encoding target block, the basis function construction means searches in units of the encoded unit blocks contained in the region of encoded pixels.

The fourth feature is that the image encoding device encodes the encoding target block for a first color space coordinate of its pixels to obtain an encoded first-color block, and that, when searching the region of encoded pixels for a region similar to the encoding target block for at least one of a second color space coordinate and a third color space coordinate of the pixels, the basis function construction means adopts the region of the encoded first-color block in the first color space coordinate as the search result.

The fifth feature is that, when searching the region of encoded pixels for a region similar to the encoding target block, the basis function construction means uses the inter-frame motion information included in the prediction information and takes the region referenced by that motion information for the encoding target block as the searched similar region.

The sixth feature is that, when searching the region of encoded pixels for a region similar to the encoding target block, the basis function construction means takes as a search neighborhood region a predetermined region that lies near the encoding target block in the same frame and consists of encoded pixels, searches the region of encoded pixels for a region similar to the search neighborhood region and takes it as the neighborhood-similar region, and takes as the similar region for the encoding target block the region obtained by applying to the encoding target block the same translation that maps the search neighborhood region onto the neighborhood-similar region.
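The sixth feature can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the function names `template_coords` and `find_similar_region`, the L-shaped template of width `border` above and to the left of the block, the boolean `available` mask marking already-encoded pixels, the square search window `radius`, and the use of a plain sum of squared differences as the matching cost.

```python
import numpy as np

def template_coords(top, left, size, border):
    """Coordinates of the L-shaped strip above and to the left of a block."""
    rows, cols = [], []
    for y in range(top - border, top + size):
        for x in range(left - border, left + size):
            if y < top or x < left:          # keep only pixels outside the block
                rows.append(y)
                cols.append(x)
    return np.array(rows), np.array(cols)

def find_similar_region(decoded, available, top, left, size, border=1, radius=8):
    """Return (top, left) of the region that the block maps to under the
    translation that best matches its neighborhood, or None if none is valid."""
    rows, cols = template_coords(top, left, size, border)
    template = decoded[rows, cols].astype(np.float64)
    h, w = decoded.shape
    best, best_cost = None, np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            t, l = top + dy, left + dx
            if t - border < 0 or l - border < 0 or t + size > h or l + size > w:
                continue
            # The shifted template and the shifted block must both be encoded.
            if not available[t - border:t + size, l - border:l + size].all():
                continue
            cand = decoded[rows + dy, cols + dx].astype(np.float64)
            cost = np.sum((template - cand) ** 2)
            if cost < best_cost:
                best, best_cost = (t, l), cost
    return best

# Hypothetical test image: the neighborhood of the block at (10, 10) is a copy
# of the neighborhood of the block at (3, 3).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(16, 16))
img[9:14, 9:14] = img[2:7, 2:7]
available = np.zeros((16, 16), dtype=bool)
available[:10, :] = True
available[10:14, :10] = True
match = find_similar_region(img, available, 10, 10, 4)
```

Because only already-encoded pixels enter the search, a decoder holding the same decoded pixels can repeat it and arrive at the same region without any side information.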

The seventh feature is that the search neighborhood region consists only of the adjacent region bordering the encoding target block.

The eighth feature is that, when searching the region of encoded pixels for a region similar to the encoding target block, the basis function construction means takes as a search neighborhood region a predetermined region that lies near the encoding target block in the same frame and consists of encoded pixels, searches the region of encoded pixels for a region similar to the search neighborhood region and takes it as the neighborhood-similar region, and takes as the similar region for the encoding target block the region obtained by applying to the encoding target block the same translation that maps the search neighborhood region onto the neighborhood-similar region, wherein the search neighborhood region consists of the reference pixels of the intra-frame prediction included in the prediction information.

The ninth feature is that the range searched for the neighborhood-similar region is restricted so that the similar region found for the encoding target block coincides, on a block basis, with a region whose intra-frame prediction type, as included in the prediction information, matches that of the encoding target block.

The tenth feature is that, when searching for the neighborhood-similar region, the basis function construction means adopts the neighborhood-similar region found for a first color space coordinate of the pixels as the neighborhood-similar region for at least one of a second color space coordinate and a third color space coordinate of the pixels.

The eleventh feature is that, when searching for the neighborhood-similar region, the basis function construction means searches on the basis of the sum of squared differences between the pixels of the regions.

The twelfth feature is that, when searching for the neighborhood-similar region, the basis function construction means searches on the basis of either the sum of squared differences between the pixels of the search source region and the search destination region (the region onto which the search source region is moved for matching), computed after subtracting from each region the mean of its own pixels, or a correlation value between the pixels of the regions.
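The matching costs named in the eleventh and twelfth features can be sketched as follows. The function names are illustrative, and the normalized (Pearson-style) form of the correlation is an assumption, since the text only says "correlation value".

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two regions (eleventh feature)."""
    d = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.sum(d * d))

def mean_removed_ssd(a, b):
    """SSD after subtracting each region's own mean (twelfth feature)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return ssd(a - a.mean(), b - b.mean())

def correlation(a, b):
    """Normalized correlation of the mean-removed regions (assumed form)."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Two regions with the same texture but a brightness offset of 10.
region_a = np.arange(16).reshape(4, 4)
region_b = region_a + 10
plain = ssd(region_a, region_b)                    # penalizes the offset
mean_free = mean_removed_ssd(region_a, region_b)   # ignores the offset
corr = correlation(region_a, region_b)
```

In this example the plain SSD is large while the mean-removed SSD is zero, which shows how removing the mean lets regions that differ only in average brightness qualify as matches.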

The thirteenth feature is that the basis function construction means computes orthogonal matrices by singular value decomposition of the input and constructs the transform basis function from those orthogonal matrices.

According to the first and second features, the transform basis function for the encoding target unit block can be constructed adaptively. Moreover, because the transform basis function is constructed from already-encoded pixel information rather than from the encoding target unit block itself, no code amount is needed for the transform basis function, and efficient encoding is possible.

According to the third feature, the similar region is searched for in block units rather than pixel units, so the search can be performed at high speed.

According to the fourth feature, the region of pixel values of the encoded first-color block in the first color space coordinate is adopted as is as the search result for at least one of the second and third color space coordinates, so the computation required for the search can be reduced.

According to the fifth feature, the region referenced by the motion information is taken as is as the search result, so the computation required for the search can be reduced.

According to the sixth feature, the search uses pixels near the encoding target block, which are likely to be highly correlated with the pixels of the encoding target block, so the accuracy of the search for a region similar to the encoding target block improves.

According to the seventh feature, the search uses pixels adjacent to the encoding target block, which are likely to be highly correlated with the pixels of the encoding target block, so the accuracy of the search for a region similar to the encoding target block improves.

According to the eighth feature, the search uses the reference pixels of the intra-frame prediction for the pixels of the encoding target block, which are likely to be highly correlated with the pixels of the encoding target block, so the accuracy of the search for a region similar to the encoding target block improves.

According to the ninth feature in addition to the eighth, a block whose intra-frame prediction type matches that of the encoding target block is selected as the similar region, so the calculation of the prediction residual for that similar region can be omitted.

According to the tenth feature, the search result for the first color space coordinate is adopted as the search result for at least one of the second and third color space coordinates, so the search computation can be omitted.

According to the eleventh feature, searching on the basis of the sum of squared differences between pixels improves the accuracy of the search for a region similar to the encoding target block.

According to the twelfth feature, searching on the basis of the mean-removed sum of squared differences between pixels, or of a correlation value, means that more candidate regions are considered than under the eleventh feature, which improves the accuracy of the search for a region similar to the encoding target block. Furthermore, according to the twelfth and thirteenth features together, an appropriate transform basis function can be constructed even when the search is based on the mean-removed sum of squared differences or on a correlation value.

According to the thirteenth feature, using singular value decomposition makes it possible to construct an adaptive transform basis function for any encoding target block.
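As a rough illustration of the thirteenth feature, the sketch below derives a pair of orthogonal matrices from the singular value decomposition of a similar region and applies them as a separable transform to a target block. How the patent actually applies the matrices to the target block is not stated in this passage, so the separable form `U^T X V`, the function names, and the synthetic data are all assumptions.

```python
import numpy as np

def build_basis(similar_block):
    """SVD of the similar region: similar_block = U @ diag(s) @ V.T."""
    u, s, vt = np.linalg.svd(np.asarray(similar_block, dtype=np.float64))
    return u, vt.T

def forward_transform(block, u, v):
    """Project a block onto the basis derived from the similar region."""
    return u.T @ block @ v

def inverse_transform(coeff, u, v):
    """Invert the projection; exact because U and V are orthogonal."""
    return u @ coeff @ v.T

# Hypothetical data: the target block resembles the similar region.
rng = np.random.default_rng(1)
similar = rng.normal(size=(4, 4))
target = similar + 0.05 * rng.normal(size=(4, 4))

u, v = build_basis(similar)
coeff = forward_transform(target, u, v)
restored = inverse_transform(coeff, u, v)
similar_coeff = forward_transform(similar, u, v)   # diagonal by construction
```

The similar region itself transforms to a diagonal matrix of singular values, so a target block that resembles it would be expected to have most of its energy near the diagonal. That is the intuition behind building the basis from a similar, already-decoded region.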

A block diagram of an image encoding device according to an embodiment of the present invention.
A block diagram of an embodiment of an image decoding device corresponding to the image encoding device of FIG. 1.
A flowchart of the calculation of the transform basis function in the basis function construction means.
A schematic diagram of the search, within the same frame as the encoding target block, for a region similar to the encoding target block.
A diagram schematically showing an embodiment that searches for a similar region using the color space.
A diagram schematically showing an embodiment that searches for a similar region using the color space.
A diagram schematically showing the calculation of a prediction residual by applying to the similar region the same intra-frame prediction as for the encoding target block.
A diagram schematically showing the calculation of a prediction residual by applying to the similar region the same motion prediction as for the encoding target block.

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 and FIG. 2 show block diagrams of an image encoding device according to an embodiment of the present invention and of the corresponding image decoding device, respectively.

The image encoding device of the embodiment shown in FIG. 1 is based on an image encoding device that, for each pixel of a unit block composed of a plurality of pixels, takes the difference from the corresponding pixel predicted from encoded pixels and encodes each unit block by applying an orthogonal transform, quantization, and encoding to the resulting prediction residual. It is characterized by the addition of a function that adaptively constructs basis functions capable of efficiently orthogonally transforming the unit block.

That is, as shown in FIG. 1, the image encoding device of the present invention comprises: prediction means 9 that calculates prediction information for predicting the input pixels from encoded pixels; compensation means 10 that generates predicted pixels from the prediction information and the encoded pixels; subtractor 1 that subtracts the predicted pixels from the pixels of a unit block of the input image (the input pixels) and outputs the prediction residual; transform means 2 that transforms the prediction residual into frequency-domain transform coefficients using the transform basis function constructed by basis function construction means 8, described later; quantization means 3 that quantizes the transform coefficients; encoding means 4 that variable-length encodes the quantized transform coefficients (quantized values); inverse quantization means 5 that inversely quantizes the quantized values; inverse transform means 6 that inversely transforms the inversely quantized transform coefficients; adder 7 that adds the predicted pixels and the inversely transformed prediction residual to reconstruct the decoded pixels; and basis function construction means 8 that adaptively constructs the transform basis function and the inverse transform basis function according to the pixels or the prediction residual.

As shown in FIG. 2, the image decoding device of an embodiment of the present invention comprises: decoding means 21 that recovers the quantized values and other data by variable-length decoding; inverse quantization means 22 that inversely quantizes the quantized transform coefficients; inverse transform means 23 that inversely transforms the inversely quantized transform coefficients; compensation means 24 that generates predicted pixels from the prediction information and decoded pixels; adder 25 that adds the predicted pixels and the reconstructed prediction residual to reconstruct the decoded pixels; and basis function construction means 26 that adaptively constructs the inverse transform basis function according to the pixels or the prediction residual. The operation of the image decoding device is described later.

The operation of the image encoding device of FIG. 1 is described below.

The input pixels of the encoding target region of input image a are fed to subtractor 1 and prediction means 9. Subtractor 1 calculates, as the prediction residual, the difference between the input pixels and the predicted pixels that compensation means 10 supplies, which are predicted from encoded pixels (more precisely, from the corresponding decoded pixels). The calculated prediction residual is sent to transform means 2 and basis function construction means 8.

Transform means 2 transforms the prediction residual sent from subtractor 1 into the frequency domain using either the transform basis function derived by basis function construction means 8 or a predetermined orthogonal transform, and outputs the resulting transform coefficients to quantization means 3.

When transform means 2 uses a predetermined orthogonal transform, the DCT (discrete cosine transform), an approximation of the DCT, the DWT (discrete wavelet transform), or the like can be used.
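For reference, one of the predetermined transforms mentioned above, the orthonormal DCT-II, can be written as a matrix and applied separably to a block. This is the textbook floating-point definition, not the integer approximation used by H.264, and the helper name `dct_matrix` is illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m

d = dct_matrix(4)
flat_block = np.full((4, 4), 5.0)    # a block with no texture
coeff = d @ flat_block @ d.T         # separable 2-D transform
restored = d.T @ coeff @ d           # inverse is the transpose
```

A flat block maps to a single nonzero DC coefficient, which is the energy-compaction behavior the background section attributes to frequency transforms.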

Each picture (frame) of input pixels is divided into unit blocks of a predefined number of pixels (for example 32×32, 16×16, 8×8, or 4×4 pixels, or a combination of these). Each unit block is transformed either by the orthogonal transform basis function derived by basis function construction means 8 or by the predetermined orthogonal transform.

Whether transform means 2 uses the adaptive transform basis function derived by basis function construction means 8 or the predetermined orthogonal transform is determined by the frame type (Intra prediction or motion prediction) and by the unit block within the frame. This switching information is supplied as flag information by basis function construction means 8 and sent from transform means 2 to encoding means 4.

Quantization means 3 quantizes the transform coefficients sent from transform means 2. The quantized values are output to encoding means 4 and inverse quantization means 5. The quantization parameters used in quantization can be set as a combination of constant values. Alternatively, they can be controlled according to the amount of information in the transform coefficients so as to keep the output bit rate constant.

Encoding means 4 encodes the switching information sent from transform means 2, the quantized values sent from quantization means 3, and the prediction information sent from prediction means 9, and outputs the result as code information b. The encoding can use a variable-length code, an arithmetic code, or the like that removes redundancy between symbols.

Inverse quantization means 5 reverses the quantization performed by quantization means 3, inversely quantizing the quantized values sent from quantization means 3 into transform coefficients. The transform coefficients obtained by inverse quantization are sent to inverse transform means 6.
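The relationship between quantization means 3 and inverse quantization means 5 can be sketched with a uniform scalar quantizer. The text does not specify the quantizer, so the single constant `step`, the rounding rule, and the function names are assumptions.

```python
import numpy as np

def quantize(coeff, step):
    """Map transform coefficients to integer levels (assumed uniform quantizer)."""
    return np.round(np.asarray(coeff, dtype=np.float64) / step).astype(np.int64)

def dequantize(levels, step):
    """Reverse the scaling; the rounding loss is not recoverable."""
    return levels.astype(np.float64) * step

step = 2.0
coeff = np.array([10.0, -3.2, 0.4])
levels = quantize(coeff, step)
approx = dequantize(levels, step)
max_error = float(np.max(np.abs(approx - coeff)))
```

The reconstruction error of each coefficient is at most half a step, so small high-frequency coefficients collapse to zero while large ones survive with bounded error.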

Among the pixels of the input image, those that have passed through subtractor 1, transform means 2, quantization means 3, and encoding means 4 are encoded pixels and are distinguished from pixels that have not yet been processed. Encoded pixels are fed from quantization means 3 to encoding means 4 and, at the same time, to inverse quantization means 5 and the subsequent means, so in those means too they are called encoded pixels to distinguish them from pixels not yet processed.

Inverse transform means 6 reverses the procedure of transform means 2, applying an inverse orthogonal transform to the transform coefficients sent from inverse quantization means 5. The inverse orthogonal transform uses the inverse of whichever transform was used in transform means 2: the predetermined orthogonal transform or the basis function derived by basis function construction means 8. The prediction residual obtained by the inverse transform is sent to adder 7.

Adder 7 receives the prediction residual sent from inverse transform means 6 and the predicted pixels, predicted from encoded pixels, sent from compensation means 10. Adder 7 calculates the sum of the prediction residual and the predicted pixels as the decoded pixels. The decoded pixels are sent to basis function construction means 8, prediction means 9, and compensation means 10.
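The local decoding loop described up to this point (the subtractor 1, transform means 2, quantization means 3, inverse quantization means 5, inverse transform means 6, and adder 7) can be sketched for a single block as follows. The orthogonal matrices `u` and `v` stand in for whichever basis is in use, and the uniform quantizer, the flat predictor, and all function names are illustrative assumptions.

```python
import numpy as np

def reconstruct(levels, pred, u, v, step):
    """Inverse quantization, inverse transform, and addition of the prediction.
    The encoder's local decoder and the decoder can share this path."""
    coeff = levels.astype(np.float64) * step
    residual = u @ coeff @ v.T
    return pred + residual

def encode_block(block, pred, u, v, step):
    """Residual, forward transform, quantization, then local reconstruction."""
    residual = block - pred                              # subtractor
    coeff = u.T @ residual @ v                           # transform
    levels = np.round(coeff / step).astype(np.int64)     # quantization
    return levels, reconstruct(levels, pred, u, v, step)

rng = np.random.default_rng(2)
block = rng.integers(0, 256, size=(4, 4)).astype(np.float64)
pred = np.full((4, 4), block.mean())                     # hypothetical predictor
u, _ = np.linalg.qr(rng.normal(size=(4, 4)))             # arbitrary orthogonal bases
v, _ = np.linalg.qr(rng.normal(size=(4, 4)))
step = 4.0

levels, encoder_recon = encode_block(block, pred, u, v, step)
decoder_recon = reconstruct(levels, pred, u, v, step)
error = float(np.linalg.norm(encoder_recon - block))
```

Because the encoder reconstructs its pixels from the quantized levels exactly as a decoder would, both sides hold identical decoded pixels, which is what allows later blocks to derive predictions and basis functions from them.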

Basis function construction means 8 receives the prediction residual sent from subtractor 1 and the decoded pixels sent from adder 7. Following the procedure described later, it searches the decoded pixels for a region similar to the input pixels of the block region of the prediction residual (that is, the input pixels before they pass through subtractor 1 and become the prediction residual), obtains for that similar region the quantity corresponding to a prediction residual, and calculates a transform basis suited to the prediction residual sent from subtractor 1. It sends the transform basis to transform means 2 and the inverse transform basis to inverse transform means 6.

Prediction means 9 determines the prediction information that reduces the redundancy of the image. Based on the decoded pixels sent from adder 7, it determines, as the prediction information, the method for approximating each pixel of input image a. The determined prediction information is sent to encoding means 4 and compensation means 10.

予測方法については、従来技術の各種方法が利用できる。例えば、Ｈ．２６４ではIntra予測（画面内予測）や動き予測が利用されている。このとき、予測情報の決定については、各予測モードで個別に符号化し、符号量と歪量から算出される符号化コストを最小化する予測モードを選択する。符号化コストを最小化する方法の詳細については、非特許文献１に記載されているので説明を省略する。 Various conventional methods can be used for prediction. For example, H.264 uses Intra prediction (intra-frame prediction) and motion prediction. To determine the prediction information, the block is encoded separately in each prediction mode, and the prediction mode that minimizes the encoding cost calculated from the code amount and the distortion is selected. Details of the cost minimization method are described in Non-Patent Document 1 and are omitted here.
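The mode decision described above can be sketched as follows. This is only an illustration: the Lagrangian form of the cost (distortion plus a multiplier times the code amount), the multiplier `lam`, and the mode names and numbers are assumptions, since the text defers the actual cost minimization method to Non-Patent Document 1.

```python
def select_mode(candidates, lam):
    """Return the prediction mode with the smallest encoding cost.

    candidates: dict mapping a mode name to (distortion, rate_bits), both
                measured by actually encoding the block in that mode.
    lam:        assumed multiplier trading code amount against distortion.
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical measurements for one block.
modes = {"intra_vertical": (120.0, 30), "intra_dc": (150.0, 18), "inter": (90.0, 55)}
best = select_mode(modes, lam=2.0)
```

With a larger `lam` the selection favors modes with a small code amount; with `lam = 0` it reduces to picking the lowest distortion.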

さらに、補償手段１０は、加算器７から送られた復号画素及び予測手段９で決定された予測情報に基づいて、入力画素の近似値としての予測画素を生成する。生成された予測画素は、差分器１および加算器７へ送られる。 Further, the compensation unit 10 generates a prediction pixel as an approximate value of the input pixel based on the decoded pixel sent from the adder 7 and the prediction information determined by the prediction unit 9. The generated prediction pixel is sent to the differentiator 1 and the adder 7.

なお、図１において入力がIntra予測フレーム内の最初のブロックなどであって予測画素、復号画素及び予測情報が利用できないような場合、差分器１において入力画素から減算される量はゼロであり、予測残差ではなく入力画素が変換手段２に入力され、以降は予測残差に対して行われたのと同様の処理である。よってこのような場合の入力画素も予測残差の特別な場合に含まれるものとしてよい。また変換手段２などではブロック単位で処理を行っているが、加算器７や予測手段９や補償手段１０などでは適宜符号化済みブロックの蓄積情報を利用する。該情報は図１に不図示のフレームメモリなどに格納して適宜利用することは明らかである。 Note that when, in FIG. 1, the input is, for example, the first block in an Intra-predicted frame, so that no prediction pixel, decoded pixel, or prediction information is available, the amount subtracted from the input pixels in the subtractor 1 is zero, and the input pixels themselves rather than a prediction residual are input to the transform unit 2; the subsequent processing is the same as for a prediction residual. The input pixels in such a case may therefore be treated as a special case of the prediction residual. Also, although the transform unit 2 and similar units operate block by block, the adder 7, the prediction unit 9, the compensation unit 10, and so on use accumulated information on already-encoded blocks as needed. This information is evidently stored in a frame memory or the like, not shown in FIG. 1, and used as appropriate.

なお、図１の構成はＨ．２６４の符号化装置を想定しているが、本発明はこれに限定されず、MPEG、JPEGなどの符号化装置にも適用できることは明らかである。MPEGの符号化装置は図１において予測手段９として動き予測が適用される構成である。JPEGの符号化装置は、図１の予測手段９、補償手段１０が省略された構成であり、この構成において差分器１、加算器７は減算、加算を行う必要がない。 Although the configuration of FIG. 1 assumes an H.264 encoding apparatus, the present invention is not limited to this and can evidently also be applied to encoding apparatuses such as MPEG and JPEG. An MPEG encoding apparatus corresponds to the configuration of FIG. 1 with motion prediction applied as the prediction unit 9. A JPEG encoding apparatus corresponds to the configuration with the prediction unit 9 and the compensation unit 10 of FIG. 1 omitted; in that configuration the subtractor 1 and the adder 7 need not perform subtraction or addition.

JPEGの符号化装置の構成では予測残差ではなく入力画素がそのまま変換手段２で変換係数となり、量子化手段３で量子化され符号化手段４で符号化される。また量子化値は逆量子化手段５、逆変換手段６を経て復号画素となり、基底関数構成手段８に入力される。基底関数構成手段８は予測残差ではなく入力画素に適した基底関数を算出して変換手段２と逆変換手段６に送る。 In the configuration of the JPEG encoding apparatus, the input pixel, not the prediction residual, is directly converted into a conversion coefficient by the conversion means 2, quantized by the quantization means 3, and encoded by the encoding means 4. The quantized value becomes a decoded pixel through the inverse quantization means 5 and the inverse transform means 6 and is input to the basis function construction means 8. The basis function construction unit 8 calculates a basis function suitable for the input pixel instead of the prediction residual and sends it to the conversion unit 2 and the inverse conversion unit 6.

本発明の画像符号化装置の特徴的構成である基底関数構成手段８で変換基底関数を算出する手順について説明する。 A procedure for calculating a transform basis function by the basis function construction means 8 which is a characteristic configuration of the image coding apparatus of the present invention will be described.

基底関数構成手段８の目的は、後段の変換手段及び逆変換手段において符号化対象ブロックを周波数領域へ変換・逆変換するに際して、当該符号化対象ブロックに最適な変換基底関数及び逆変換基底関数を導出して変換手段２及び逆変換手段６に送ることにある。ただし、当該符号化対象ブロックから最適な変換基底関数を導出すると、基底関数そのものを符号化する必要が生じてしまうため、符号化効率を改善できないという問題がある。この問題を解決するため、準最適解として符号化済領域から当該符号化対象ブロックに類似した領域を探索し、探索結果の最適な変換基底関数を当該符号化対象ブロックの変換基底関数として代用するという手順を取る。 The purpose of the basis function construction unit 8 is to derive the transform basis function and inverse transform basis function best suited to the encoding target block, and to send them to the transform unit 2 and the inverse transform unit 6, which transform the block to the frequency domain and back. However, if the optimal transform basis function were derived from the encoding target block itself, the basis function would also have to be encoded, so encoding efficiency could not be improved. To solve this problem, a suboptimal approach is taken: a region similar to the encoding target block is searched for in the already-encoded region, and the optimal transform basis function for that search result is used in place of the one for the encoding target block.

図３に、基底関数構成手段８における変換基底関数算出の手順のフロー図を示す。 FIG. 3 shows a flow chart of the procedure for calculating the conversion basis function in the basis function construction means 8.

なお、図３の手順を開始するにあたっては所定量の復号画素領域（類似領域探索対象）が既に存在していることが前提である。よって例えばIntra予測のフレームにおいてはフレーム内の最初の所定量の符号化対象ブロックに関しては、基底関数構成手段８にて個別に算出する基底関数ではなくDCT等の所定の固定基底関数を用いることとし、復号画素領域を確保しておくものとする。該固定基底関数の利用は基底関数構成手段８から変換手段２に対して、前述の切替情報として指定することができる。 Note that the procedure of FIG. 3 presupposes that a predetermined amount of decoded pixel region (the target of the similar-region search) already exists. Therefore, for example, in an Intra-predicted frame, the first predetermined number of encoding target blocks in the frame use a predetermined fixed basis function such as the DCT, rather than a basis function calculated individually by the basis function construction unit 8, so that a decoded pixel region is secured. The use of the fixed basis function can be signaled from the basis function construction unit 8 to the transform unit 2 as the switching information described above.

なおまた当該切替情報は切替フラグであるので、その符号量は、前述の符号化対象ブロックから最適関数を算出して基底関数そのものを符号化したと仮定した場合に要する符号量と比べてわずかであり、符号化効率への影響はほとんどない。また例えば前述のようなIntra予測フレーム内の最初の所定数のブロックのみ固定基底関数を利用するとあらかじめ定めておけば、切替情報を省略してフレームの種類とブロック位置の情報のみから切替判断を行うこともできる。 Because the switching information is a switching flag, its code amount is tiny compared with the code amount that would be required if the optimal function were calculated from the encoding target block and the basis function itself were encoded, so it has almost no effect on encoding efficiency. Moreover, if it is agreed in advance that, for example, only the first predetermined number of blocks in an Intra-predicted frame use the fixed basis function, the switching information can be omitted and the switching decision made from the frame type and block position alone.
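The flag-free switching rule mentioned above can be illustrated with a minimal sketch. The frame-type labels, the raster-scan block index, and the parameter `warmup_blocks` are hypothetical names introduced only for this example; the text does not specify how many leading blocks use the fixed basis.

```python
def use_fixed_basis(frame_type, block_index, warmup_blocks):
    """Return True when the fixed basis (e.g. the DCT) should be used.

    frame_type:    "intra" or "inter" (hypothetical labels).
    block_index:   raster-scan index of the block within the frame.
    warmup_blocks: assumed number of leading blocks coded with the fixed
                   basis so that a decoded pixel region becomes available.
    """
    return frame_type == "intra" and block_index < warmup_blocks
```

Because encoder and decoder can both evaluate this rule from the frame type and block position, no switching flag needs to be transmitted in this variant.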

まずステップＳ１では、基底関数構成手段８は、加算器７から送られる復号画素と差分器１から送られる予測残差とを入力する。予測残差からは、符号化対象ブロックの領域（どのフレームのどの位置であるか）の情報を把握する。入力された復号画素は、異なる時刻のフレームに限らず、同一時刻のフレームであっても良い。 First, in step S1, the basis function construction unit 8 receives the decoded pixels sent from the adder 7 and the prediction residual sent from the subtractor 1. From the prediction residual it obtains the region information of the encoding target block (which position in which frame). The input decoded pixels are not limited to frames at a different time; they may belong to the frame at the same time.

次にステップＳ２では、復号画素からは符号化対象ブロックと最も類似していると判定される領域を探索する。類似領域を探索する各種の実施形態を以下に説明するが、各実施形態では画素信号を記述する時間（フレーム時刻）、空間（フレーム画面内での位置）および信号（画素信号を構成する色空間信号）のいずれか（又は複数）に注目して探索を行う。まず空間に注目して探索する実施形態として、符号化対象ブロックと同一のフレームから類似領域を探索する一実施形態を説明する。 Next, in step S2, the decoded pixels are searched for the region judged most similar to the encoding target block. Various embodiments of the similar-region search are described below; each focuses on one or more of the time (frame time), space (position within the frame), and signal (the color-space signals making up the pixel signal) that describe a pixel signal. First, as an embodiment that focuses on space, an embodiment that searches for the similar region within the same frame as the encoding target block is described.

該実施形態を図４に模式的に示す。図４は符号化対象ブロックＢが存在するある時刻のフレーム全体を示している。符号化対象ブロックＢは例として４×４画素の場合を示している。フレームにおいてラスタースキャン順（スキャンは画素ではなくブロック単位）に符号化が行われ、符号化対象ブロックＢの手前まで符号化が終わったとすると、該符号化対象ブロックＢの手前までの領域が符号化済画素からなる復号画素領域である。該復号画素領域がステップＳ１にて入力された復号画素であり、類似領域を探索する領域全体となる。また符号化対象ブロックＢ以降のラスタースキャン順の領域（符号化対象ブロックＢ自体を含む）はまだ符号化されていない符号化前領域である。 The embodiment is schematically shown in FIG. FIG. 4 shows the entire frame at a certain time when the encoding target block B exists. For example, the encoding target block B is 4 × 4 pixels. If the frame is encoded in the raster scan order (the scan is not a pixel but a block unit) and encoding is completed up to the encoding target block B, the area up to the encoding target block B is encoded. This is a decoded pixel area composed of completed pixels. The decoded pixel area is the decoded pixel input in step S1, and is the entire area for searching for a similar area. A region in the raster scan order after the encoding target block B (including the encoding target block B itself) is a pre-encoding region that has not been encoded yet.

符号化対象ブロックＢに隣接して且つ符号化済画素領域である探索用隣接領域Ｐを利用して、隣接領域Ｐを復号画素領域内で平行移動しながら類似した領域を探索する。隣接領域Ｐと最も類似した領域からは、符号化対象ブロックＢと隣接領域Ｐとの相対位置関係かつ大きさが同等となる領域を、符号化対象ブロックＢと類似した領域として抽出する。 Using the search adjacent area P that is adjacent to the encoding target block B and is an encoded pixel area, a similar area is searched while the adjacent area P is translated in the decoded pixel area. From the region most similar to the adjacent region P, a region having the same relative positional relationship and size between the encoding target block B and the adjacent region P is extracted as a region similar to the encoding target block B.

例えば図４に示すように隣接領域Ｐに最も類似した領域の探索結果（ａ）として領域ａ１が得られたならば、符号化対象ブロックＢの類似領域は同サイズ（図４なら４×４画素サイズ）でかつ領域a1との相対位置関係が領域PとブロックＢとの相対的位置関係と一致する領域ａ２となり、隣接領域Ｐに最も類似した領域の探索結果が（ｂ）の領域ｂ１であれば符号化対象ブロックＢの類似領域は同様に領域ｂ２となる。 For example, as shown in FIG. 4, if region a1 is obtained as search result (a) for the region most similar to the adjacent region P, the similar region of the encoding target block B is region a2, which has the same size as block B (4 × 4 pixels in FIG. 4) and whose position relative to region a1 matches the position of block B relative to region P. Likewise, if the search result is region b1 of (b), the similar region of block B is region b2.

なお隣接領域Ｐによる類似領域の探索対象は復号画像領域内であって且つ、その探索結果として抽出される符号化対象ブロックＢの類似領域が全て復号画素領域に含まれるような範囲において行われるものとする。例えば図４に示す例では隣接領域Ｐの１画素上段の領域が類似領域として探索されると、符号化対象ブロックＢの類似領域として縦３画素×横４画素の領域が符号化済画素でない領域となってしまうので、このような領域は探索対象としない。 The search with the adjacent region P is carried out within the decoded pixel region, and only over the range in which the similar region of block B extracted from the search result lies entirely within the decoded pixel region. For example, in FIG. 4, if the region one pixel above the adjacent region P were chosen, a 3 (vertical) × 4 (horizontal) pixel part of the resulting similar region of block B would fall in the not-yet-encoded region, so such a region is excluded from the search.
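The search with the adjacent region P described above can be sketched as follows. The sketch makes several assumptions not fixed by the text: the frame is a list of rows of integer pixel values, blocks are decoded in block-wise raster-scan order, the block is not on the top or left frame border, P is the 9-pixel L-shaped region of the FIG. 4 example, and similarity is the plain sum of squared differences. All function names are hypothetical.

```python
N = 4  # block size (4 x 4, as in the FIG. 4 example)


def template_offsets(n=N):
    """Offsets (dy, dx), relative to a block's top-left pixel, of the L-shaped
    adjacent region P: the corner pixel and the row above, plus the column to
    the left (9 pixels for n = 4)."""
    return [(-1, dx) for dx in range(-1, n)] + [(dy, -1) for dy in range(n)]


def is_decoded(y, x, by, bx, n=N):
    """True if pixel (y, x) belongs to a block that precedes the block at
    (by, bx) in block-wise raster-scan order."""
    if y < 0 or x < 0:
        return False
    return (y // n, x // n) < (by // n, bx // n)


def find_similar_block(frame, by, bx, n=N):
    """Return the top-left (y, x) of the similar region of the block at
    (by, bx), or None if no valid candidate exists."""
    h, w = len(frame), len(frame[0])
    offs = template_offsets(n)
    block = [(dy, dx) for dy in range(n) for dx in range(n)]
    target = [frame[by + dy][bx + dx] for dy, dx in offs]
    best, best_cost = None, None
    for cy in range(1, h - n + 1):
        for cx in range(1, w - n + 1):
            # The matched template AND the extracted block must be decoded.
            if not all(is_decoded(cy + dy, cx + dx, by, bx, n)
                       for dy, dx in offs + block):
                continue
            cost = sum((frame[cy + dy][cx + dx] - t) ** 2
                       for (dy, dx), t in zip(offs, target))
            if best_cost is None or cost < best_cost:
                best, best_cost = (cy, cx), cost
    return best
```

The decoder can repeat exactly the same search, because it uses only decoded pixels; this is why no position information for the similar region has to be transmitted.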

隣接領域Ｐは符号化対象ブロックＢの左隣に位置する一列以上の画素、あるいは上隣に位置する一行以上の画素、あるいは両者の全て若しくは一部を利用することができる。図４に示す例では、隣接領域Ｐは４×４画素の符号化対象ブロックＢの隣接左隣１列４画素と、隣接上隣１行４画素と、該行および列の交点でブロックＢの対角位置にあたる１画素と、の計９画素となっているが、例えばブロックＢの隣接左隣２列を用いて４画素×２列の計８画素を隣接領域とするなどしてもよい。（なお図４における領域Ｐ２は後述のテンプレートマッチングの説明において用いる。） The adjacent region P can consist of one or more columns of pixels immediately to the left of the encoding target block B, one or more rows of pixels immediately above it, or all or part of both. In the example of FIG. 4, the adjacent region P for the 4 × 4 pixel block B consists of the 4 pixels of the adjacent left column, the 4 pixels of the adjacent upper row, and the 1 pixel diagonal to block B at the intersection of that row and column, 9 pixels in total. Alternatively, for example, the two adjacent left columns of block B may be used, giving an adjacent region of 4 pixels × 2 columns, 8 pixels in total. (The region P2 in FIG. 4 is used in the description of template matching below.)

また隣接領域Ｐの一実施形態として、Ｈ．２６４におけるIntra予測のように画面内で予測が行われる場合は、予測画素を生成するのに使われる参照画素を用いても良い。該参照画素はＨ．２６４の４×４Intra予測であれば周知のように、符号化対象ブロックの左隣１列の画素、上隣１列の画素、又はそれら両者を含む。 As one embodiment of the adjacent region P, when prediction is performed within the frame, as in H.264 Intra prediction, the reference pixels used to generate the prediction pixels may be used. As is well known, for H.264 4 × 4 Intra prediction these reference pixels comprise the column of pixels to the left of the encoding target block, the row of pixels above it, or both.

また隣接領域Ｐによる探索範囲の一実施形態として、符号化対象ブロックＢに画面内予測が採用されている場合は、符号化済領域（復号画素領域）においてブロックＢと同一の画面内予測モードが採用されている領域内に、ブロックＢの類似領域として判定される領域がブロック単位で一致するような探索範囲に限定することができる。該実施形態の場合、隣接領域Ｐによる探索範囲はブロックＢの大きさ分の離散格子状の領域のうち、対応領域がブロックＢと同一の画面内予測が採用されている一部となるので探索範囲が少なくなり処理時間の短縮が図れる。また該実施形態によれば、このように探索回数を限定するだけでなく類似領域における予測残差の算出を省略できることから処理時間の短縮を図ることもできる。すなわち後述のステップＳ３において算出する、対応する予測残差が、既に類似領域の符号化にあたって用いられた予測残差と一致するので、算出を省略できる。 As an embodiment of the search range by the adjacent region P, when intra prediction is adopted for the encoding target block B, the same intra prediction mode as that of the block B is used in the encoded region (decoded pixel region). Within the adopted area, it is possible to limit the search range so that the area determined as the similar area of the block B matches in block units. In the case of this embodiment, the search range by the adjacent region P is a part of the discrete lattice-like region corresponding to the size of the block B, and the corresponding region is a part where the same intra prediction as the block B is adopted. The range is reduced and the processing time can be shortened. Further, according to this embodiment, not only the number of searches can be limited in this way, but also the calculation of the prediction residual in the similar region can be omitted, so that the processing time can be shortened. That is, since the corresponding prediction residual calculated in step S3 described later matches the prediction residual already used for encoding the similar region, the calculation can be omitted.

また隣接領域Ｐによる探索範囲の一実施形態として、探索範囲を当該フレーム内の復号画素領域の全体とするのではなく、符号化対象ブロックＢから所定の距離以内にある領域又は符号化対象ブロックＢ付近の所定の領域に限定することができる。 As one embodiment of the search range used with the adjacent region P, the search range need not be the entire decoded pixel region of the frame; it can be limited to regions within a predetermined distance of the encoding target block B, or to a predetermined region near block B.

また隣接領域Ｐによる探索範囲の一実施形態として、探索結果として得られる符号化対象ブロックＢの類似領域が、変換手段２にて変換される単位ブロックに一致するような領域に探索範囲を限定することができる。例えば符号化対象ブロックＢが４×４画素であれば、該探索範囲は復号画素領域において４画素毎の格子点領域となる。なお該実施形態は別時刻フレームに探索範囲がある場合にも適用できる。また前述の同一画面内予測モードが採用されている領域に探索範囲を限定する実施形態は、該実施形態の１つである。 As an embodiment of the search range by the adjacent region P, the search range is limited to a region where the similar region of the encoding target block B obtained as a search result matches the unit block converted by the conversion means 2. be able to. For example, if the encoding target block B is 4 × 4 pixels, the search range is a lattice point region for every four pixels in the decoded pixel region. The embodiment can also be applied to a case where a search range exists in another time frame. Further, an embodiment in which the search range is limited to a region where the above-described intra-screen prediction mode is employed is one of the embodiments.
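Restricting the search to positions aligned with the transform blocks can be pictured with the following sketch, which simply enumerates the grid points. The function name is hypothetical, and in practice the list would still have to be filtered down to positions that lie inside the decoded pixel region.

```python
def grid_candidates(height, width, n=4):
    """Top-left positions of all n x n blocks aligned to the transform grid."""
    return [(y, x)
            for y in range(0, height - n + 1, n)
            for x in range(0, width - n + 1, n)]
```

For an 8 × 12 pixel area and 4 × 4 blocks this yields 6 candidates instead of the 45 positions of a pixel-by-pixel search, which is where the reduction in processing time comes from.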

なお、図４を用いて説明したような隣接領域Ｐは、より一般には符号化対象ブロックＢの近傍領域（必ずしも符号化対象ブロックＢに接している必要はない領域）すなわち探索用近傍領域Ｐを用いる場合における一実施形態とみなせる。この場合、上記の符号化対象ブロックＢの類似領域は、探索用近傍領域Ｐの類似領域の探索結果として得られたａ１やｂ１を近傍領域類似領域と呼ぶことにすると、探索用近傍領域Ｐを近傍領域類似領域に平行移動するのと等しい平行移動を符号化対象ブロックＢに対して適用することで移される領域、と表現することもできる。 Note that the adjacent area P described with reference to FIG. 4 is more generally a neighboring area of the encoding target block B (an area not necessarily in contact with the encoding target block B), that is, a searching neighboring area P. It can be regarded as one embodiment in the case of using. In this case, if the a1 and b1 obtained as the search result of the similar region of the search neighboring region P are referred to as the neighborhood region similar region, the similar region of the encoding target block B is defined as the search neighboring region P. It can also be expressed as an area that is moved by applying a translation that is equivalent to a translation to a neighboring area similar area to the encoding target block B.

また同一フレームから符号化対象領域と類似した領域を探索し且つ構築する別の実施形態として、以下の非特許文献４に開示されたテンプレートマッチングを用いることができる。テンプレートマッチングでは図４に示す符号化対象領域Ｂの類似領域の探索且つ構築のために、図４の隣接領域ＰのうちＢの左上側２×２画素領域に隣接する領域Ｐ２（縦２画素＋横２画素＋対角１画素）を探索に用いる。 As another embodiment for searching for and constructing a region similar to the encoding target region within the same frame, the template matching disclosed in Non-Patent Document 4 below can be used. In template matching, to search for and construct the similar region of the encoding target region B of FIG. 4, the search uses region P2 (2 vertical pixels + 2 horizontal pixels + 1 diagonal pixel), the part of the adjacent region P of FIG. 4 that adjoins the upper-left 2 × 2 pixel region of B.

テンプレートマッチングにおいてはＰ２の類似領域の探索結果から抽出する領域は図４のａ２やｂ２（４×４画素の領域）のうち左上の２×２画素の領域であり、Ｐ２の類似領域を上位４個探索して、該探索結果から得られる４個の２×２画素の領域を所定の順（例えばラスタースキャン順）に並べることによって、４×４画素の符号化対象ブロックＢの類似領域を構築する。符号化対象ブロックＢが４×４以外のサイズであっても同様に４分割ブロック上位４個によって類似領域を構築する。該テンプレートマッチングによれば符号化対象ブロックＢのテクスチャを考慮した類似領域を得ることができる等の効果がある。 In template matching, the region extracted from each search result for P2 is the upper-left 2 × 2 pixel region of a region such as a2 or b2 (4 × 4 pixels) in FIG. 4. The four best matches for P2 are found, and the four 2 × 2 pixel regions obtained from them are arranged in a predetermined order (for example, raster-scan order) to construct the similar region of the 4 × 4 pixel encoding target block B. When block B has a size other than 4 × 4, the similar region is likewise constructed from the top four matches for quarter-size blocks. Template matching has the advantage, among others, of yielding a similar region that takes the texture of block B into account.
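The construction step of this template matching variant can be sketched as follows. It assumes the search for P2 has already produced a list of (cost, patch) pairs, and it places the best-ranked patch at the top left and continues in raster order; that mapping from rank to position is an assumption for illustration, since the text only requires some predetermined order.

```python
def build_block_from_top4(candidates):
    """Tile the four best 2x2 patches into a 4x4 similar region.

    candidates: list of (cost, patch) pairs, where patch is a 2x2 list of
                pixel values extracted next to a region similar to P2 and
                a lower cost means a better match.
    """
    top4 = [p for _, p in sorted(candidates, key=lambda c: c[0])[:4]]
    if len(top4) < 4:
        raise ValueError("need at least four candidate patches")
    tl, tr, bl, br = top4  # raster order: top-left, top-right, bottom-left, bottom-right
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]
```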

（非特許文献４）T. K. Tan, C. S. Boon, and Y. Suzuki, "Intra prediction by template matching," IEEE International Conference on Image Processing, pp. 1693-1696, 2006. (Non-Patent Document 4) T. K. Tan, C. S. Boon, and Y. Suzuki, "Intra prediction by template matching," IEEE International Conference on Image Processing, pp. 1693-1696, 2006.

一方、図４では特に空間に注目して探索する各実施形態を説明したが、時間に注目して探索する、すなわち符号化対象ブロックが属するフレームとは時刻が異なるフレームから類似領域を探索する場合を説明する。この場合、予測手段９が符号化対象ブロックに対する予測情報としてフレーム間動き情報を算出していれば、動き情報の参照先（当該参照先は予測情報が与えられているので符号化済フレーム内の領域となる）をそのまま類似領域とする。 FIG. 4 illustrated embodiments that focus on space. The case of focusing on time, that is, searching for the similar region in a frame at a different time from the frame containing the encoding target block, is described next. In this case, if the prediction unit 9 has calculated inter-frame motion information as the prediction information for the encoding target block, the region referenced by the motion information is used directly as the similar region (since prediction information is available for it, the referenced region lies within an already-encoded frame).

該動き情報の参照先の利用によれば、図４の実施形態におけるような近傍領域を用いた探索が不要なため、計算量を削減することができる。予測手段９が動き情報を算出していない場合は、前述の図４の実施形態によって同一フレーム上近傍画素によって同一フレーム上から類似領域探索を実行する。 According to the use of the reference destination of the motion information, it is not necessary to perform a search using a neighboring region as in the embodiment of FIG. 4, so that the calculation amount can be reduced. When the prediction means 9 does not calculate motion information, a similar region search is executed from the same frame by neighboring pixels on the same frame according to the above-described embodiment of FIG.

さらに、類似領域探索の別実施形態として、画素信号の色空間に注目して探索する場合（色空間による実施形態その１）を説明する。該実施形態を模式的に図５に示す。なお、同図では同一時刻フレーム内に類似領域を探索した例を示すが、別時刻フレームで類似領域が探索されていてもよい。例えばＲ（赤）、Ｇ（緑）、Ｂ（青）の３つの色信号で画素が表現されている場合に所定時刻フレームの所定位置の符号化対象ブロックの類似領域を探索するとする。この場合、同図（Ａ）のＲ信号フレームにおける探索結果（ａ）に示すようにＲ信号のフレームにおいては上述の各実施形態のいずれか、すなわち空間又は時間に注目した探索によって類似領域が既に探索されているならば、残りのＧ、Ｂ信号において同一フレームおよび同位置の符号化対象ブロックの類似領域として、同図（Ｂ）、（Ｃ）の各探索結果（ｂ）、（ｃ）に示すようにＲ信号において探索された類似領域と同じ位置（フレーム内の空間位置およびフレーム間の時間位置）であって且つ各Ｇ、Ｂ信号に属する領域を用いる。 Furthermore, as another embodiment of the similar-region search, a search that focuses on the color space of the pixel signal (color-space embodiment 1) is described; it is shown schematically in FIG. 5. The figure shows an example in which the similar region is found in the same-time frame, but it may also be found in a frame at a different time. Suppose, for example, that pixels are represented by three color signals, R (red), G (green), and B (blue), and the similar region of an encoding target block at a given position in a frame at a given time is sought. If, as shown by search result (a) in the R signal frame of FIG. 5(A), a similar region has already been found in the R signal frame by any of the embodiments above, that is, by a search focusing on space or time, then for the encoding target blocks of the same frame and position in the remaining G and B signals, the similar regions are taken, as shown by search results (b) and (c) of FIG. 5(B) and (C), to be the regions at the same position (spatial position within the frame and temporal position among frames) as the similar region found for the R signal, but belonging to the G and B signals respectively.

なお色空間による実施形態その１では、Ｇ信号、Ｂ信号における符号化対象ブロックに対して後述のステップＳ３ではＲ信号利用による各Ｇ信号、Ｂ信号内の探索結果を用いて、各Ｇ信号、Ｂ信号フレーム内で（図７、図８のような）処理を行う。 In color-space embodiment 1, step S3 (described later) for the encoding target blocks of the G and B signals uses the search results within the G and B signals obtained by way of the R signal, and the processing (as in FIGS. 7 and 8) is carried out within the G and B signal frames respectively.

また色空間をさらに積極的に利用した別実施形態（色空間による実施形態その２）を説明する。該実施形態を模式的に図６に示す。例えばＲ、Ｇ、Ｂの３つの色信号で画素が表現されているとし、Ｒ信号のフレームにおいていずれかの実施形態（色空間利用以外の実施形態）により類似領域が探索され符号化が済んでいるとする。この場合、残りのＧ、Ｂ信号における符号化対象ブロックの類似領域を、同図の矢印（ａ）、（ｂ）に示すように同一フレーム同一ブロック位置の符号化済Ｒ信号のブロック自身とする。すなわちＧ信号、Ｂ信号の類似領域がＲ信号の画素値によって与えられる。 Another embodiment that exploits the color space more aggressively (color-space embodiment 2) is described next; it is shown schematically in FIG. 6. Suppose, for example, that pixels are represented by the three color signals R, G, and B, and that in the R signal frame a similar region has been found by one of the embodiments other than the color-space ones and encoding has been completed. In this case, the similar region of the encoding target block in the remaining G and B signals is taken to be the already-encoded R signal block at the same block position in the same frame, as indicated by arrows (a) and (b) in the figure. That is, the similar regions for the G and B signals are given by pixel values of the R signal.

なお色空間による実施形態その２において、Ｇ信号、Ｂ信号における符号化対象ブロックに対して後述のステップＳ３、Ｓ４では、その類似領域をＲ信号としたのでＲ信号フレーム内で処理を行って基底を算出する。この場合、前もって色空間利用以外の実施形態でＲ信号に対して基底を算出した処理とは異なり、Ｇ信号、Ｂ信号における基底算出処理では、類似領域として符号化対象ブロックと同位置のＲ信号画素（符号化済）がそのまま利用でき、かつステップＳ３で当該同位置のブロックを対象として対応する予測残差算出等を行い、ステップＳ４で基底を求める。またこの場合、Ｇ信号、Ｂ信号における処理は一致するので、Ｇ信号に対して求めた基底をＢ信号で用いることで計算を省略できる。 In the second embodiment using the color space, in steps S3 and S4 to be described later with respect to the encoding target block in the G signal and the B signal, the similar region is set as the R signal. Is calculated. In this case, unlike the processing for calculating the base for the R signal in the embodiment other than using the color space in advance, in the base calculation processing for the G signal and the B signal, the R signal at the same position as the encoding target block is used as a similar region. The pixel (encoded) can be used as it is, and the corresponding prediction residual is calculated for the block at the same position in step S3, and the base is obtained in step S4. In this case, since the processing for the G signal and the B signal is the same, the calculation can be omitted by using the basis obtained for the G signal for the B signal.

以上の色空間による実施形態その１、その２では信号間の高い相関を利用することにより符号化性能を上げ、かつ探索などを省いて計算量を抑制する。Ｒ信号の結果をＧ信号、Ｂ信号に用いるとしたが、その両方に用いてもいずれか一方のみに用いてもよい。これら実施形態はRGB信号を変換したその他の信号、例えばＹ（輝度信号）、Ｃ_ｂ（色差信号）、Ｃ_ｒ（色差信号）などにおいても同様に適用できる。 Color-space embodiments 1 and 2 above improve encoding performance by exploiting the high correlation between signals, and reduce computation by omitting searches and similar steps. Although the R signal result was described as being used for the G and B signals, it may be used for both or for only one of them. These embodiments apply equally to other signals obtained by converting RGB, for example Y (luminance), C_b (color difference), and C_r (color difference).

なお、以上説明した図３のステップＳ２の各実施形態において、探索範囲の中での画素との類似度判定には、画素値そのものの一致性として、差分二乗和を利用できる。ただし、後述するステップＳ４における最適な変換基底関数の導出に際しては、平均値の相違は許容できるため、平均値の影響を除く処理として、隣接画素の平均値を減算した後の数値と探索対象の平均値を減算した後の数値との差分二乗和を利用しても良い。あるいは、該隣接画素と探索対象との相関も利用できる。 In each embodiment of step S2 in FIG. 3 described above, the sum of squares of differences can be used as the similarity of the pixel value itself for similarity determination with the pixel in the search range. However, since the difference between the average values can be allowed when deriving the optimal conversion basis function in step S4, which will be described later, as a process for removing the influence of the average value, the numerical value after subtracting the average value of adjacent pixels and the search target You may use the sum of squared differences with the numerical value after subtracting an average value. Alternatively, the correlation between the adjacent pixels and the search target can also be used.

この平均値の影響を除いた後の差分二乗和又は相関を利用する実施形態においては、差分二乗和を用いる場合よりも類似と判定される領域が増えるため、より符号化対象ブロックに対する類似度の高い領域が選出される傾向が強くなる。且つ平均値の差などがあっても前述のようにステップＳ４では符号化対象ブロックに対して適切な基底を算出できるので符号化効率が高まる。 In the embodiment using the sum of squared differences or correlation after removing the influence of the average value, the number of regions determined to be similar is larger than when using the sum of squared differences. The tendency to select high areas becomes stronger. Even if there is a difference in average values, as described above, an appropriate base can be calculated for the block to be encoded in step S4, so that the encoding efficiency is increased.
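The three similarity measures mentioned for step S2 can be written down as follows for two equally long lists of pixel values. The function names are hypothetical, and the correlation is taken here to be the normalized (Pearson) correlation coefficient, which is one possible reading of "correlation" in the text.

```python
from math import sqrt


def ssd(a, b):
    """Sum of squared differences of the raw pixel values."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_removed_ssd(a, b):
    """Sum of squared differences after subtracting each region's own mean."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum(((x - ma) - (y - mb)) ** 2 for x, y in zip(a, b))


def correlation(a, b):
    """Normalized correlation; returns 0.0 if either region is flat."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0
```

For two regions that differ only by a constant brightness offset, for example `[10, 12, 14, 16]` and `[110, 112, 114, 116]`, the plain sum of squared differences is large, while the mean-removed version is zero and the correlation is one, so the last two measures accept such a region as a match.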

また前述の色空間による実施形態その１、その２では探索が省略される色空間に関しては、上述のようにステップＳ４において平均値の相違が許容されることから、特に色空間による実施形態その２において別の色空間の値を類似領域として採用する場合であって画素値の平均値自体に各色空間で差があるような場合であっても一般に各色空間の画素値間に相関はあるので、別の色空間の値による類似領域からも最適な基底関数を算出できるという効果がある。 In addition, regarding the color space in which the search is omitted in the first and second embodiments using the color space described above, the difference in average value is allowed in step S4 as described above. In general, there is a correlation between the pixel values of each color space even when the value of another color space is adopted as a similar region and there is a difference in the average value of the pixel values in each color space. There is an effect that an optimum basis function can be calculated from a similar region based on a value of another color space.

ステップＳ２に続いてステップＳ３では、変換手段２で変換する符号化対象ブロックに相当する情報を前記探索結果から算出する。符号化対象ブロック領域に画素値上で類似しているとしてステップＳ２で探索された類似領域であるが、変換手段２で変換するのは入力画像そのままの画素値ではなく符号化対象ブロック領域の画素から予測画素を差分器１で引いた予測残差である。変換手段２では該予測残差に対する適切な基底関数を用いる必要がある。よって探索結果の類似領域から基底関数構成手段８が基底関数を算出するにあたって、類似領域の画素からも対応する予測画像を減算して、対応する予測残差を求めておくのがステップＳ３である。 In step S3, following step S2, information corresponding to the encoding target block that the transform unit 2 will transform is calculated from the search result. The similar region found in step S2 resembles the encoding target block region in pixel values, but what the transform unit 2 transforms is not the raw pixel values of the input image; it is the prediction residual obtained by the subtractor 1 subtracting the prediction pixels from the pixels of the encoding target block region. The transform unit 2 needs a basis function appropriate to that prediction residual. Therefore, before the basis function construction unit 8 calculates the basis function from the similar region, step S3 subtracts the corresponding prediction image from the pixels of the similar region to obtain the corresponding prediction residual.

すなわちステップＳ３ではステップＳ２の探索結果得られた類似領域に対して符号化対象ブロックと同等の予測及び補償によって対応する予測残差を算出する。なお、前述のJPEGの実施形態のような場合には、入力画素からそのまま基底関数が算出されるので、ステップＳ３での処理は必要ない。 That is, in step S3, a corresponding prediction residual is calculated for the similar region obtained as a result of the search in step S2 by prediction and compensation equivalent to the encoding target block. In the case of the above-described JPEG embodiment, the basis function is directly calculated from the input pixel, so that the process in step S3 is not necessary.

例えば、予測手段９による予測情報を利用する場合、符号化対象ブロックが予測情報としてIntra予測を採用していれば、類似領域でも同様のIntra予測を適用し予測残差を算出する。これを模式的に図７に示す。 For example, when the prediction information by the prediction means 9 is used, if the encoding target block adopts Intra prediction as the prediction information, the same intra prediction is applied to the similar region to calculate the prediction residual. This is schematically shown in FIG.

図７に示すように、あるフレームにおいて符号化対象ブロックＢ１の類似領域がＢ２である。ブロックＢ１にはＨ．２６４の４×４Intra予測のモード０〜８のうち、例えば予測モード０が適用されており、ブロックＢ１の上部隣接１行の４×１画素領域Ｐ１０によってＢ１を予測する。よって符号化対象ブロックＢ１の予測残差は次の減算[式A]
（Ｂ１の画素）−（対応するＰ１０の画素） [式A]
により得られる。これに対応する類似領域Ｂ２から算出する予測残差は、ブロックＢ２における予測モード０の領域である、ブロックＢ２の上部隣接１行の４×１画素領域Ｐ２０の画素を用いて次[式B]で与えられる。
（Ｂ２の画素）−（対応するＰ２０の画素） [式B]
上記[式B]の値が、次のステップＳ４に送られる。なお前述のように、同一画面内予測モードが採用されている領域に探索範囲を限定する実施形態では、各予測モードにおける[式Ｂ]に対応する量が既に符号化にあたって算出済みであるため、本ステップＳ３にて改めて算出する必要がない。
As shown in FIG. 7, in a certain frame the similar region of the encoding target block B1 is B2. Suppose that, of modes 0 to 8 of H.264 4 × 4 Intra prediction, prediction mode 0 for example is applied to block B1, so that B1 is predicted from the 4 × 1 pixel region P10, the row immediately above block B1. The prediction residual of the encoding target block B1 is then obtained by the following subtraction [Formula A]:
(Pixel of B1) − (corresponding pixel of P10) [Formula A]
The corresponding prediction residual calculated from the similar region B2 is given by the following [Formula B], using the pixels of the 4 × 1 pixel region P20, the row immediately above block B2, which is the mode 0 reference region for block B2:
(Pixel of B2) − (corresponding pixel of P20) [Formula B]
The value of [Formula B] is sent to the next step S4. As described above, in the embodiment that limits the search range to regions using the same intra prediction mode, the quantity corresponding to [Formula B] for each prediction mode has already been calculated during encoding, so it need not be recalculated in step S3.
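A minimal sketch of [Formula B] for prediction mode 0 (vertical prediction) follows. It assumes the decoded frame is a list of rows of integer pixel values and that the similar region is not in the top row of the frame; the function name is hypothetical.

```python
def vertical_intra_residual(frame, top, left, n=4):
    """Residual of the n x n block at (top, left) under vertical prediction.

    Every pixel is predicted by the pixel directly above the block in the
    same column, i.e. by the row called P10 (for B1) or P20 (for B2).
    """
    pred_row = frame[top - 1][left:left + n]
    return [[frame[top + y][left + x] - pred_row[x] for x in range(n)]
            for y in range(n)]
```

Calling the function with the position of the similar region B2 gives the quantity passed on to step S4; calling it with the position of B1 on the original pixels would give [Formula A].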

あるいはステップＳ３において、符号化対象ブロックが予測手段９による予測情報として動き予測を採用し、動き予測の参照先を類似領域とする実施形態においては、参照先である類似領域においても同様に動き予測を適用し予測残差を算出する。これを模式的に図８に示す。 Alternatively, in the embodiment in which the encoding target block adopts motion prediction as prediction information by the prediction unit 9 and uses the motion prediction reference destination as the similar region in step S3, the motion prediction is similarly performed in the similar region as the reference destination. Is applied to calculate the prediction residual. This is schematically shown in FIG.

In FIG. 8, frames F1, F2, and F3 are in chronological order, and blocks in these frames are linked by motion prediction references. The encoding target block is block F3B of frame F3; frames F1 and F2 have already been encoded. The encoding target block F3B refers to block F2B of the past frame F2 through motion prediction (a). In this case, the prediction residual of the encoding target block F3B is given by [Formula C]:
(pixel of F3B) − (pixel of F2B) [Formula C]
The corresponding prediction residual of the similar region F2B is given by [Formula D], where F1B is the block of frame F1 referenced by the analogous motion prediction (b) for F2B, that is, the motion prediction that was applied when F2B was encoded:
(pixel of F2B) − (pixel of F1B) [Formula D]
The value of [Formula D] is passed to the next step S4.
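The chain F3B → F2B → F1B can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names `block_at`, `block_difference`, and `proxy_residual`, the frame contents, the block size, and the motion vector are all hypothetical.

```python
def block_at(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def block_difference(cur, ref):
    """Pixel-wise difference cur - ref between two equally sized blocks."""
    return [[c - r for c, r in zip(cr, rr)] for cr, rr in zip(cur, ref)]


def proxy_residual(f2, f1, f2b_pos, mv_b, size):
    """[Formula D]: residual of F2B against the block F1B that it referenced.

    f2b_pos is the (top, left) of F2B in frame F2; mv_b is the (dy, dx)
    motion vector that was used when F2B itself was encoded.
    """
    top, left = f2b_pos
    f2b = block_at(f2, top, left, size)
    f1b = block_at(f1, top + mv_b[0], left + mv_b[1], size)
    return block_difference(f2b, f1b)


# Hypothetical 4x4 encoded frames F1 and F2 (values for illustration only).
F1 = [[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33], [40, 41, 42, 43]]
F2 = [[12, 12, 15, 13], [21, 24, 22, 26], [33, 31, 35, 33], [40, 44, 42, 47]]

# F2B is the 2x2 block at (1, 1) of F2; its own motion vector points one
# pixel up and one pixel left into F1.
residual_F2B = proxy_residual(F2, F1, (1, 1), (-1, -1), 2)
print(residual_F2B)
```

Note that [Formula C] itself is not evaluated here: it needs the pixels of F3B, which a decoder does not yet have, whereas [Formula D] uses only frames that are already encoded.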

The example of FIG. 8 uses forward inter-frame prediction as the motion prediction. When bidirectional prediction refers to two frames at different times, the corresponding prediction residual can be computed in the same way as in the forward prediction example of FIG. 8 by selecting and using only one of the two frames. In this case, the reference may be a future frame.

The prediction information used in the descriptions of FIGS. 7 and 8, or in step S2 of FIG. 3, is obtained as follows. In FIG. 1 the input image a is fed directly to the prediction means 9, which compares it with the encoded pixels supplied via the adder 7 to compute the prediction information, and the result is passed to the encoding means 4 and to the compensation means 10 and beyond. Therefore, by the time the subtractor 1 computes the prediction residual of the encoding target block, the prediction information for that block has already been computed and supplied to the encoding means 4, so it can be used as described for FIGS. 7, 8, and 3 (that is, the prediction information of a block can be used while that block is being encoded). Likewise, in the corresponding decoding apparatus of FIG. 2, the prediction information becomes available once the decoding target block has been variable-length decoded by the decoding means 21, and it is sent to the compensation means 24.

Furthermore, suppose that the Intra prediction or motion prediction applied to the encoding target block cannot be used at the reference destination of the prediction information that serves as the similar region in the examples of FIGS. 7 and 8, so that the corresponding prediction residual cannot be computed. In that case the applied-basis flag (switching information) of the encoding target block is set to use a predetermined fixed basis, and the optimal transform basis calculation of the next step S4 is not performed. If it is agreed in advance that a fixed basis is used whenever the prediction is unavailable at the reference destination in this way, the switching information may be omitted, and the corresponding image decoding apparatus makes the same determination.

Returning to FIG. 3, in the final step S4 following step S3, the optimal transform basis function for the similar-region search result, or for its prediction residual, is computed by the KL (Karhunen-Loève) transform or by singular value decomposition. In the present invention, as described above, a quantity corresponding to the prediction error of the similar region is obtained from encoded pixels and is used in place of the prediction error of the encoding target block to derive the transform basis, so the transform basis does not need to be encoded. Consequently, besides the KL transform and singular value decomposition, any technique that (unlike the DCT and similar fixed transforms) can compute an optimal transform for each arbitrary input may be used.

For example, when the transform basis function is computed by singular value decomposition, the similar-region search result or its prediction residual is represented as an m-by-n matrix A, and A is factored into the following product of matrices:
$$A = U \Sigma V^{t}$$
Here U is an m-by-m orthogonal matrix, V is an n-by-n orthogonal matrix, and t denotes transposition ($V^{t}$ is the transpose of V). $\Sigma$ is an m-by-n diagonal matrix holding the singular values $\sigma_i$ of A ($1 \le i \le \operatorname{rank} A$) in descending order. Each singular value $\sigma_i$ is the positive square root of an eigenvalue $\lambda_i$ of $A A^{t}$ when m < n, or of $A^{t} A$ when m ≥ n. Concretely, for m ≥ n, the eigenvalues of $A^{t} A$ are found first and the singular values are computed from them. Next, since the orthogonal matrices U and V satisfy $U^{t} U = I$ and $V^{t} V = I$ by definition (I is the identity matrix), it follows that
$$A^{t} A V = V \Sigma^{2}$$
(where $\Sigma^{2}$ stands for the n-by-n diagonal matrix $\Sigma^{t} \Sigma$), so each column vector $v_i$ of V is obtained as the eigenvector of $A^{t} A$ for the eigenvalue $\sigma_i^{2}$:
$$A^{t} A v_i = \sigma_i^{2} v_i$$
U is then obtained as $U = A V \Sigma^{-1}$.
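The m ≥ n procedure above can be traced step by step with NumPy. This is a sketch under stated assumptions: the function name `basis_from_residual` and the sample matrix are hypothetical, the input is assumed to have full column rank so that Σ can be inverted, and for m > n the product A V Σ⁻¹ yields only the first n columns of U. In practice one would more likely call `numpy.linalg.svd` directly; the explicit route is used here only to mirror the text.

```python
import numpy as np


def basis_from_residual(A):
    """Derive U, the singular values, and V from A by the m >= n procedure.

    Steps: eigen-decompose A^t A, take square roots of the eigenvalues as
    singular values, use the eigenvectors as V, then form U = A V Sigma^-1.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    if m < n:
        raise ValueError("this sketch covers only the m >= n case")
    # A^t A is symmetric, so eigh applies; it returns ascending eigenvalues.
    eigvals, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, V = eigvals[order], V[:, order]
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    if np.any(sigma < 1e-12):
        raise ValueError("zero singular value: Sigma^-1 does not exist")
    # Dividing column j of A V by sigma[j] is the product A V Sigma^-1.
    U = (A @ V) / sigma
    return U, sigma, V


# Hypothetical 4x4 prediction residual of a similar region.
A = np.array([
    [4.0, -1.0, 0.0, 2.0],
    [1.0, 3.0, -2.0, 0.0],
    [0.0, 2.0, 5.0, -1.0],
    [-3.0, 0.0, 1.0, 2.0],
])
U, sigma, V = basis_from_residual(A)

print(np.allclose(U @ np.diag(sigma) @ V.T, A))   # checks A = U Sigma V^t
print(np.allclose(U.T @ A @ V, np.diag(sigma)))   # checks U^t A V = Sigma
```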

The orthogonal matrices $U^{t}$ and $V$ are sent to the transform means 2 as the basis functions, and the orthogonal matrices $U$ and $V^{t}$ are sent to the inverse transform means 6 as the inverse basis functions.
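One way to read this step, stated here as an assumption because the text does not spell out the formula, is as a separable two-sided transform. For a residual block X of the same size as A (with m = n, so that U and V are both square), the transform means would compute the coefficients Y and the inverse transform means would recover X. The last line shows the special case X = A, where the coefficients are exactly diagonal.

```latex
\begin{aligned}
Y &= U^{t} X V, \\
U Y V^{t} &= U U^{t} X V V^{t} = X, \\
U^{t} A V &= U^{t} \left( U \Sigma V^{t} \right) V = \Sigma .
\end{aligned}
```

The second line relies only on $U U^{t} = I$ and $V V^{t} = I$, which hold for square orthogonal matrices. When the actual residual X resembles A, Y is therefore expected to be close to diagonal, although how close depends on how good the similar region is.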

In the frequency transform using these basis functions, nonzero values tend to appear in the diagonal components, so the scan order after quantization should preferably give priority to the diagonal components. Alternatively, this may be combined with the adaptive scan order of the patent document (Japanese Patent Application No. 2010-090392).
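A diagonal-first scan could look like the following sketch. The text only says that diagonal components should be prioritized; the specific ordering used here (main diagonal first, then positions by increasing distance from it) and the names `diagonal_first_scan` and `scan` are hypothetical choices made for illustration.

```python
def diagonal_first_scan(n):
    """Return (row, col) positions of an n x n block, main diagonal first.

    Positions are ordered by distance |row - col| from the main diagonal,
    then by row + col, then by row, so the order is fully deterministic.
    """
    positions = [(r, c) for r in range(n) for c in range(n)]
    return sorted(positions, key=lambda rc: (abs(rc[0] - rc[1]), rc[0] + rc[1], rc[0]))


def scan(coeffs):
    """Flatten a square block of quantized coefficients in diagonal-first order."""
    n = len(coeffs)
    return [coeffs[r][c] for r, c in diagonal_first_scan(n)]


# Hypothetical quantized coefficients with most energy on the diagonal.
quantized = [
    [9, 1, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 1],
]
order = diagonal_first_scan(4)
flat = scan(quantized)
print(order[:4])
print(flat)
```

With such an order the nonzero values cluster at the start of the scanned sequence and the trailing run of zeros is long, which is the property an entropy coder can exploit.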

According to the image encoding apparatus with the configuration of FIG. 1 described above (including the JPEG and MPEG configurations mentioned earlier), when each unit block is transformed to the frequency domain, a similar region is searched for in the encoded region, and the optimal transform basis function for the search result is used as the transform basis function of the encoding target block, which improves coding efficiency. Moreover, although the transform basis function can change adaptively, no additional information about it needs to be stored, so high coding efficiency is achievable.

Next, the operation of the image decoding apparatus of FIG. 2, which corresponds to the image encoding apparatus of FIG. 1, is described. The code information b encoded by the encoding means 4 of FIG. 1 is input to the decoding means 21 of FIG. 2. The decoding means 21 performs variable-length decoding, the reverse of the procedure of the encoding means 4, and outputs the quantized values, the prediction information, and the switching information. The quantized values obtained by variable-length decoding are sent to the inverse quantization means 22, the prediction information to the compensation means 24, and the switching information to the basis function constructing means 26.

The inverse quantization means 22 inversely quantizes the quantized values by the reverse of the procedure of the quantization means 3 to obtain transform coefficients, and sends them to the inverse transform means 23. The inverse transform means 23 receives the inverse transform basis function from the basis function constructing means 26, converts the transform coefficients into a prediction residual, and sends it to the basis function constructing means 26 and the adder 25. The compensation means 24 derives the predicted pixels of the decoding target block from the prediction information and the decoded pixel information, and sends them to the adder 25. The adder 25 adds the predicted pixels and the prediction residual to obtain the decoded image c of the decoding target block, and sends it to the compensation means 24 and the basis function constructing means 26.

Like the basis function constructing means 8 (FIG. 1), the basis function constructing means 26 (FIG. 2) derives the inverse transform basis function with which the inverse transform means 23 converts the transform coefficients of the decoding target block into a prediction residual. Within the decoded image region supplied by the adder 25, it finds the similar region of the decoding target block in the same way as described for FIG. 1, computes the corresponding prediction residual, and derives the inverse transform basis function. It also uses the fixed basis where the switching information indicates it.

As in the description of FIG. 1, the compensation means 24, the adder 25, and so on use decoded pixel information as needed by referring to a frame memory or the like (not shown). Like FIG. 1, FIG. 2 assumes an H.264 decoding apparatus, but the invention is clearly also applicable to decoding apparatuses for MPEG, JPEG, and the like.

DESCRIPTION OF SYMBOLS 2 ... transform means, 3 ... quantization means, 4 ... encoding means, 5 ... inverse quantization means, 6 ... inverse transform means, 8 ... basis function constructing means, 9 ... prediction means, 10 ... compensation means, 21 ... decoding means, 22 ... inverse quantization means, 23 ... inverse transform means, 24 ... compensation means, 26 ... basis function constructing means

Claims

An image encoding apparatus that encodes each unit block, composed of a plurality of pixels, by sequentially applying orthogonal transform, quantization, and encoding to the pixels of the unit block, the apparatus comprising:
basis function constructing means for receiving encoded pixel information, constructing a transform basis function for the encoding target block, and sending it to orthogonal transform means,
wherein the basis function constructing means searches the region of encoded pixels for a similar region of the encoding target block using only encoded pixels and without using pixels of the encoding target block, and constructs the transform basis function with the found similar region as input and without pixels of the encoding target block as input, whereby encoding of information on the found similar region is omitted.

An image encoding apparatus that encodes each unit block, composed of a plurality of pixels, by sequentially applying orthogonal transform, quantization, and encoding to a prediction residual obtained by taking the difference between each pixel of the unit block and the corresponding pixel predicted from encoded pixels, the apparatus comprising:
basis function constructing means for receiving the encoded pixel information and prediction information for predicting each pixel, constructing a transform basis function for the encoding target block, and sending it to orthogonal transform means,
wherein the basis function constructing means searches the region of encoded pixels for a similar region of the encoding target block, and constructs the transform basis function using, as input, a prediction residual obtained by applying the prediction information to the similar region, and
wherein, when searching the region of encoded pixels for a similar region of the encoding target block, the basis function constructing means takes as a search neighboring region a predetermined region composed of pixels that are located near the encoding target block in the same frame and are included in the encoded pixels, searches the region of encoded pixels for a region similar to the search neighboring region and takes it as a neighboring-region similar region, and takes as the similar region found for the encoding target block the region reached by applying to the encoding target block the same translation that moves the search neighboring region onto the neighboring-region similar region.

The image encoding apparatus according to claim 1, wherein, when searching the region of encoded pixels for a similar region of the encoding target block, the basis function constructing means searches in units of the encoded unit blocks included in the region of encoded pixels.

The basis function constituting unit, when searching for a similar region of the encoding target block from the encoded pixel region, is located in the vicinity of the encoding target block in the same frame and included in the encoded pixel. A predetermined area composed of pixels to be searched is used as a neighborhood area for search, a similar area of the search neighborhood area is searched from the area of the encoded pixel, and is set as a neighborhood area similarity area, and the neighborhood area for search is used as the neighborhood area similarity area 3. The similar area searched for the block to be encoded is defined as a region to be transferred by performing a parallel shift equal to the parallel shift to the block to be encoded. Image coding apparatus.

The image encoding apparatus according to claim 4, wherein the search neighboring region consists only of an adjacent region in contact with the encoding target block.

The image encoding apparatus according to claim 2, wherein, when searching the region of encoded pixels for a similar region of the encoding target block, the basis function constructing means takes as a search neighboring region a predetermined region composed of pixels that are located near the encoding target block in the same frame and are included in the encoded pixels, searches the region of encoded pixels for a region similar to the search neighboring region and takes it as a neighboring-region similar region, and takes as the similar region found for the encoding target block the region reached by applying to the encoding target block the same translation that moves the search neighboring region onto the neighboring-region similar region, and
wherein the search neighboring region includes the reference pixels for the intra prediction included in the prediction information.

The image encoding apparatus according to claim 6, wherein the search range for the neighboring-region similar region is limited so that the similar region found for the encoding target block coincides, in block units, with a region whose type of intra prediction matches the type included in the prediction information.

The image encoding apparatus according to any one of claims 4 to 7, wherein, when searching for the neighboring-region similar region, the basis function constructing means adopts the neighboring-region similar region found for a first color space coordinate of the pixels as the neighboring-region similar region for at least one of a second color space coordinate and a third color space coordinate of the pixels.

The image encoding apparatus according to any one of claims 4 to 7, wherein, when searching for the neighboring-region similar region, the basis function constructing means searches on the basis of the sum of squared differences between region pixels.

The image encoding apparatus according to any one of claims 4 to 7, wherein, when searching for the neighboring-region similar region, the basis function constructing means searches on the basis of either the sum of squared differences between region pixels, computed after subtracting from each of the search source region and the search destination region (the region with which the moved search source region is matched) the mean pixel value of that region, or the correlation value between region pixels.

The image encoding apparatus according to any one of claims 1 to 10, wherein the basis function constructing means computes orthogonal matrices by singular value decomposition of the input and constructs the transform basis function from the orthogonal matrices.

An image decoding apparatus that decodes each unit block, composed of a plurality of pixels, by sequentially applying decoding, inverse quantization, and inverse orthogonal transform to code information obtained by encoding the pixels of the unit block, the apparatus comprising:
basis function constructing means for receiving decoded pixel information, constructing an inverse transform basis function for the decoding target block, and sending it to inverse orthogonal transform means,
wherein the basis function constructing means searches the region of decoded pixels for a similar region of the decoding target block using only decoded pixels and without using information decoded from the decoding target block, and constructs the inverse transform basis function with the similar region as input and without information decoded from the decoding target block as input.

An image decoding apparatus that decodes each unit block, composed of a plurality of pixels, by sequentially applying decoding, inverse quantization, and inverse orthogonal transform to code information obtained by encoding a prediction residual, the prediction residual being obtained by taking the difference between each pixel of the unit block and the corresponding pixel predicted from encoded pixels, the apparatus comprising:
basis function constructing means for receiving the decoded pixel information and prediction information for predicting each pixel, constructing an inverse transform basis function for the decoding target block, and sending it to inverse orthogonal transform means,
wherein the basis function constructing means searches the region of decoded pixels for a similar region of the decoding target block, and constructs the inverse transform basis function using, as input, a prediction residual obtained by applying the prediction information to the similar region, and
wherein, when searching the region of decoded pixels for a similar region of the decoding target block, the basis function constructing means takes as a search neighboring region a predetermined region composed of pixels that are located near the decoding target block in the same frame and are included in the decoded pixels, searches the region of decoded pixels for a region similar to the search neighboring region and takes it as a neighboring-region similar region, and takes as the similar region found for the decoding target block the region reached by applying to the decoding target block the same translation that moves the search neighboring region onto the neighboring-region similar region.