
JP5484377B2 - Decoding device and decoding method

Info

Publication number: JP5484377B2
Application number: JP2011037149A
Authority: JP
Inventors: 孝之仲地; 和正大串; 望浜田
Original assignee: Nippon Telegraph and Telephone Corp; Keio University
Current assignee: Nippon Telegraph and Telephone Corp; Keio University
Priority date: 2011-02-23
Filing date: 2011-02-23
Publication date: 2014-05-07
Anticipated expiration: 2031-02-23
Also published as: JP2012175526A

Description

The present invention relates to a decoding device and a decoding method. In particular, it relates to a decoding device and a decoding method that determine super-resolvable regions for distributed coding.

With the growing demand for multimedia content, video compression is being actively researched. In contrast to the currently widespread video compression methods such as H.264/AVC, a new approach called distributed video coding (DVC) has been proposed and is attracting attention.

In conventional video compression, the computationally heavy processing is performed on the encoding side, whereas in DVC it is performed on the decoding side. This property is expected to allow even terminals with low processing capability, such as mobile phones, to deliver video with coding efficiency equal to that of conventional methods. However, DVC research has a short history, and DVC has not yet achieved coding efficiency comparable to that of conventional video compression.

First, distributed video coding (DVC) and super-resolution (SR) are described.

(1) Distributed video coding (DVC)
FIG. 1 shows a block diagram of a general DVC encoder and decoder (see Non-Patent Document 1). The encoder intra-codes one frame out of every few frames of the video using a DCT or a wavelet transform. These intra-coded frames are called Key frames. The remaining frames, called Wyner-Ziv frames, are quantized after redundancy removal, and the Slepian-Wolf (SW) encoder generates a parity syndrome from them and transmits it. The SW decoder first decodes the Key frames and then obtains the frames between them from the decoded information by motion compensation. The image sequence obtained in this way is called side information. Using the parity syndrome, the SW decoder corrects the components that the side information could not predict. If correction fails, the decoder requests additional information from the encoder and receives a longer parity syndrome. The decoded signal is inverse-quantized, its redundancy is restored, and it is output.

(2) Robust super-resolution (SR)
The image acquisition model in super-resolution reconstruction is generally expressed as follows (see Non-Patent Document 2). This model is shown in FIG. 2.

Y_k is the [HW×1] vector obtained by lexicographically ordering an observed image of [H×W] pixels, and X is the [R²HW×1] vector obtained by lexicographically ordering the high-resolution image of [RH×RW] pixels. k, N, and R are the frame index, the number of observed images, and the resolution ratio between the observed images and the high-resolution image, respectively. D, H_k, and F_k are the matrices that apply downsampling, blur, and displacement, respectively. V_k is the additive noise vector. Assuming that the point spread function (PSF) of the blur is uniform over the whole video, H_k can be replaced by H. Assuming further that the displacement is purely translational, HF_k = F_kH holds. Setting Z = HX (Z is the image X with blur applied), the maximum likelihood estimation problem is expressed by the following equation.
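The equation images are not reproduced in this text. As a hedged reconstruction, assuming the model follows Non-Patent Document 2 and using only the symbols defined above, equations (1) and (2) would take the following form. The second line uses H_k = H and HF_k = F_kH, so that D H F_k X = D F_k H X = D F_k Z.

```latex
% Assumed form of equation (1): image acquisition model
Y_k = D H_k F_k X + V_k, \qquad k = 1, \dots, N

% Assumed form of equation (2): maximum likelihood estimate of Z = HX
\hat{Z} = \arg\min_{Z} \sum_{k=1}^{N} \left\| D F_k Z - Y_k \right\|_p^p
```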

Here, ‖·‖_p is the p-norm used as the loss function. Equation (2) is difficult to solve directly, and an approximate solution is generally obtained by iterative calculation using the gradient method. When solving by the gradient method, equation (3) is considered. First, when p = 1, equation (4) is obtained. Here, D^T and F_k^T, the transposes of D and F_k, are the matrices that apply upsampling and a shift in the direction opposite to F_k, respectively. Therefore D, F_k, D^T, and F_k^T do not change the values of Z; they only change their positions on the high-resolution grid. Equation (4) holds when, for each pixel Z(i), the numbers of +1 and -1 values of the sign function are equal, which means that the solution is the median of the pixel values aligned to that point.

Next, when p = 2, equation (5) is obtained. As before, D, F_k, D^T, and F_k^T do not change the values of Z. Equation (5) holds when the sum of the differences at each pixel equals 0, which means that its solution is the mean of the pixel values aligned to that point. This process of alignment followed by taking the mean or the median is called fusion.
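As a hedged reconstruction (the equation images are not reproduced here), setting the gradient of the cost in equation (2) with respect to Z to zero would give conditions of the following form for p = 1 and p = 2. They are consistent with the median and mean interpretations stated in the text, but the exact notation of the original equations is an assumption.

```latex
% Assumed form of equation (4), p = 1: as many +1 as -1 signs at each pixel (median)
\sum_{k=1}^{N} F_k^{T} D^{T} \, \mathrm{sign}\!\left( D F_k Z - Y_k \right) = 0

% Assumed form of equation (5), p = 2: differences sum to zero at each pixel (mean)
\sum_{k=1}^{N} F_k^{T} D^{T} \left( D F_k Z - Y_k \right) = 0
```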

FIG. 3 shows an example of alignment onto the high-resolution grid. The left side of the figure shows a sequence of observed images of [2×2] pixels, and the right side shows an example of the result of placing them on a high-resolution grid of twice the resolution.

After fusion, the high-resolution image X is obtained by simultaneously interpolating the pixels to which no pixel value was assigned during alignment and removing the blur. X is obtained by the following equation.

A is a weight matrix whose diagonal element at each position is the number of pixels aligned to Z(i) at that position. S_x^l shifts the image by l pixels in the x-axis direction, and S_y^m shifts the image by m pixels in the y-axis direction. The second term imposes the constraint that the image be locally flat (smooth). α, with 0 ≤ α ≤ 1, reduces the weight as the distance from the pixel of interest increases, and λ is the regularization parameter. Equation (6) is solved by iterative calculation.

β is the step size parameter.

Both the fusion step and the deblurring-plus-interpolation step can be computed quickly. The mean and the median used in the fusion step differ in computation speed and in robustness to outliers. The mean can be computed faster, whereas the median inherently takes more time as the number of samples grows. The median is more robust to outliers: the mean is affected by every outlier, but the median is hardly affected even when nearly half of the samples are outliers.
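The difference in robustness can be illustrated with a minimal sketch of the fusion of one high-resolution grid point. The function name `fuse` and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

def fuse(aligned_values, method="median"):
    """Fuse the pixel values aligned to one high-resolution grid point."""
    if not aligned_values:
        return None  # nothing was aligned here; left for later interpolation
    return median(aligned_values) if method == "median" else mean(aligned_values)

# Five frames contribute to one grid point; the last value is a misaligned outlier.
samples = [100, 102, 101, 99, 250]
fused_median = fuse(samples, "median")  # stays close to the consistent values
fused_mean = fuse(samples, "mean")      # pulled upward by the outlier
print(fused_median, fused_mean)
```

With only a few samples per grid point, even the median has little to work with, which is the situation the invention addresses with a per-pixel reliability check.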

Non-Patent Document 1: B. Girod, A. M. Aaron, S. Rane, and D. Rebollo-Monedero, "Distributed video coding", Proceedings of the IEEE, Vol. 93, No. 1, pp. 71-83, 2004.
Non-Patent Document 2: S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar, "Fast and robust multiframe super resolution", IEEE Transactions on Image Processing, Vol. 13, No. 10, pp. 1327-1344, 2004.

In general distributed video coding (DVC), Key frames are transmitted at intervals of several frames, and the Wyner-Ziv frames that are not transmitted are estimated from the Key frames. Because the decoder cannot know the original signal of a Wyner-Ziv frame, raising the estimation accuracy is important for improving coding efficiency. One possible method is to decode recursively from low to high frequencies and to use the low-frequency signal of the Wyner-Ziv frame decoded along the way to estimate the high-frequency bands. However, commonly used interpolation methods such as bicubic interpolation do not estimate high-frequency components, so their benefit is limited.

The present invention has been made in view of this problem, and its object is to use super-resolution (SR) within the DVC framework to increase the coding efficiency of DVC. Because super-resolution enlarges an image while estimating high-frequency components from multiple images, a high-resolution image can be expected to be estimated accurately even if the Key frames are reduced in size before transmission.

The SR reconstruction method described above achieves robustness by using, for super-resolution, the median of the pixel values aligned to a given pixel. This is effective when super-resolution uses a sufficiently large number of images, but it cannot be called robust when few images are available, because if even a single erroneous pixel value is aligned to a high-resolution grid point, that value is used for super-resolution. To solve this problem, it is necessary to consider whether each aligned pixel value is reliable. In the present invention, when images are aligned, whether the alignment obtained for each pixel is reliable is determined, and pixels are divided according to the result into those used for super-resolution and those not used. In addition, a reconstruction method is proposed in which pixels with higher reliability contribute more to super-resolution.

Thus, an object of the present invention is to provide a decoding device and a decoding method that determine super-resolvable regions in order to improve the coding efficiency of distributed coding.

The decoding device of the present invention is characterized by having:
a key frame decoding unit that receives and decodes key frames of a low-frequency-component image sequence from the encoding device;
a flat part determination unit that determines flat parts and non-flat parts of the decoded image sequence and requests the data of the parts determined to be non-flat from the encoding device;
a non-flat part decoding unit that receives and decodes the data of the non-flat parts; and
a super-resolution processing unit that calculates the similarity between images using the decoded non-flat part data and estimates the high-frequency components of the decoded images by applying a super-resolution method to the non-flat part pixels whose similarity is equal to or greater than a threshold.

The decoding method of the present invention is a decoding method in a decoding device that receives and decodes an image sequence from an encoding device, and is characterized by having:
a key frame decoding step of receiving and decoding key frames of a low-frequency-component image sequence from the encoding device;
a flat part determination step of determining flat parts and non-flat parts of the decoded image sequence and requesting the data of the parts determined to be non-flat from the encoding device;
a non-flat part decoding step of receiving and decoding the data of the non-flat parts; and
a super-resolution processing step of calculating the similarity between images using the decoded non-flat part data and estimating the high-frequency components of the decoded images by applying a super-resolution method to the non-flat part pixels whose similarity is equal to or greater than a threshold.

According to the present invention, the coding efficiency of distributed coding can be improved.

FIG. 1: Block diagram of a DVC encoder and decoder
FIG. 2: Diagram showing the image acquisition model in super-resolution reconstruction
FIG. 3: Diagram showing an example of alignment onto the high-resolution grid
FIG. 4: Configuration diagram of the encoding device and decoding device according to an embodiment of the present invention
FIG. 5: Diagram showing the 4-neighbor Laplacian kernel
FIG. 6: Diagram for explaining similarity in a video

Embodiments of the present invention are described in detail below.

FIG. 4 shows the configuration of the encoding device 10 and the decoding device 15 according to an embodiment of the present invention.

The embodiment of the present invention targets transmission of an image sequence at a low bit rate. Specifically, because most of the high-frequency information would be discarded during compression anyway, it is not transmitted. For this purpose, the encoding device 10 has a low-frequency component image sequence acquisition unit 101 that removes the high-frequency components from the image sequence and acquires the low-frequency-component image sequence.

The low-frequency component image sequence acquisition unit 101 applies a low-pass filter to the video to be transmitted and acquires its low-frequency components. One example of the low-pass filter is a Gaussian filter of size 3×3 with standard deviation σ = 0.8. The low-frequency-component image sequence is downsampled at rate r. Hereinafter, this downsampled image sequence is called the low-resolution image sequence Y_k.
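The low-pass and downsampling step can be sketched as follows. This is a minimal illustration, not the implementation of unit 101: the border handling (edge replication), the integer decimation, and all function names are assumptions.

```python
import math

def gaussian_kernel(size=3, sigma=0.8):
    """Return a normalized size x size Gaussian kernel as nested lists."""
    half = size // 2
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-half, half + 1)]
           for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def low_pass(image, kernel):
    """Filter a 2-D list image; borders use edge replication (an assumption)."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j, krow in enumerate(kernel):
                for i, kv in enumerate(krow):
                    yy = min(max(y + j - half, 0), h - 1)
                    xx = min(max(x + i - half, 0), w - 1)
                    acc += kv * image[yy][xx]
            out[y][x] = acc
    return out

def downsample(image, r):
    """Keep every r-th pixel in both directions (integer rate assumed)."""
    return [row[::r] for row in image[::r]]

kernel = gaussian_kernel(3, 0.8)
# 8x8 checkerboard test frame: almost pure high-frequency content.
frame = [[float((x + y) % 2) * 255.0 for x in range(8)] for y in range(8)]
low_res = downsample(low_pass(frame, kernel), 2)
```

Filtering before decimation suppresses the high-frequency content that would otherwise alias into the low-resolution frames.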

In the embodiment of the present invention, to compensate for the missing high-frequency information, the decoding device 15 estimates the high-frequency components accurately by SR and thereby raises the quality of the decoded images.

In the frame dividing unit 103, the low-resolution image sequence is divided into odd-numbered and even-numbered frames, and the Key frame encoding unit 105 transmits the odd-numbered frames as Key frames.

In the decoding device 15, the Key frame decoding unit 151 decodes the Key frames, and the motion estimation unit 153 calculates the motion vectors between adjacent Key frames. Letting m_k^f be the motion vector from Y_{2k-1} to Y_{2k+1} and m_k^b be the motion vector from Y_{2k+1} to Y_{2k-1}, the following relationship holds.

Using the motion vectors obtained in this way and the Key frames, the Wyner-Ziv frames are interpolated by bidirectional motion compensation. Letting Y'_k be the low-resolution image sequence obtained in the decoding device 15, it is given as follows.
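The interpolation equation is not reproduced in this text. The idea of bidirectional motion-compensated interpolation can still be illustrated with a toy one-dimensional sketch. It assumes linear motion, so that the missing frame lies half-way along the motion between its two neighboring Key frames, and it uses integer circular shifts in place of real motion compensation. All names are hypothetical.

```python
def shift(row, d):
    """Circularly shift a 1-D row by d samples to the right (toy motion compensation)."""
    n = len(row)
    return [row[(i - d) % n] for i in range(n)]

def interpolate_wz(prev_key, next_key, m_forward):
    """Estimate the frame midway between two Key frames.

    m_forward is the (integer, even) motion from prev_key to next_key.
    Assumption: motion is linear, so the middle frame lies half-way along it.
    """
    half = m_forward // 2
    from_prev = shift(prev_key, half)    # move the previous Key frame forward
    from_next = shift(next_key, -half)   # move the next Key frame backward
    return [(a + b) / 2.0 for a, b in zip(from_prev, from_next)]

frames = [[0, 0, 9, 5, 0, 0, 0, 0],   # frame 1 (Key)
          [0, 0, 0, 9, 5, 0, 0, 0],   # frame 2 (Wyner-Ziv, not transmitted)
          [0, 0, 0, 0, 9, 5, 0, 0]]   # frame 3 (Key)
key_frames = frames[0::2]             # odd-numbered frames (1-indexed)
estimate = interpolate_wz(key_frames[0], key_frames[1], m_forward=2)
```

In this toy case the estimate coincides with the untransmitted frame because the motion really is linear; in real video the residual error is what the non-flat part data later corrects.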

Through the above process, the decoding device 15 obtains the same number of images as the encoding device 10. Next, in the flat part determination unit 155, the decoding device 15 divides Y' into non-flat parts, for which super-resolution is effective, and flat parts. This is because alignment becomes inaccurate in flat parts, which are regions with little texture. Moreover, such regions contain few high-frequency components, so simple interpolation is sufficient for them. The flat part determination unit 155 requests the data of the parts determined to be non-flat (the data corresponding to the non-flat parts of the even-numbered frames) from the encoding device 10 and receives it from the non-flat part encoding unit 107 of the encoding device 10. This is because super-resolution requires accurate low-resolution images. In the decoding device 15, the non-flat part decoding unit 157 decodes the non-flat part data, and the enlargement processing unit 159 enlarges the image by applying the super-resolution method to the non-flat parts. The enlargement processing unit 159 applies simple interpolation to the flat parts.

<Registration>
Next, registration in the motion estimation unit 153 is described.

Registration in the motion estimation unit 153 is performed using block matching in the spatial domain and the phase-only correlation method. Phase-only correlation can estimate displacement with high accuracy, but it is computationally expensive and has a small search range. That is, it is unsuitable when an object moves a large distance between frames. To solve this problem, the embodiment of the present invention first calculates the coarse motion by block matching of luminance values and then calculates the precise displacement by phase-only correlation.
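The coarse stage can be sketched as an exhaustive integer block search. The matching cost (sum of absolute differences), the search range, and the function names are assumptions made for illustration; the text only states that block matching of luminance values is used.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))

def crop(image, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]

def block_match(ref, target, top, left, size, search):
    """Return the integer (dy, dx) that best maps a block of ref into target."""
    h, w = len(target), len(target[0])
    block = crop(ref, top, left, size)
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > h or l + size > w:
                continue  # candidate block would fall outside the image
            cost = sad(block, crop(target, t, l, size))
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

# Reference frame with distinct luminance values, and a copy moved
# down by 1 pixel and right by 2 pixels (uncovered pixels set to 0).
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
target = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(8)]
          for y in range(8)]
motion = block_match(ref, target, top=2, left=2, size=3, search=2)
```

The integer result only needs to bring the two blocks within the small capture range of phase-only correlation, which then refines the displacement.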

Consider two images F(x,y) and G(x,y) among the K images used for super-resolution, where G(x,y) is F(x,y) translated by m_x(x,y) and m_y(x,y). That is, the following holds.

F is assumed to be the frame to be super-resolved. m_x(x,y) and m_y(x,y) are real numbers, and each is considered as the sum of an integer part and a fractional part.

The integer part m^int can be obtained by simple block matching in the spatial domain. Next, the sub-pixel displacement is obtained for each pixel. Small regions of N₁×N₂ pixels are extracted from F and G, and the displacement is obtained by phase-only correlation (POC). Because the integer-level displacement has already been obtained, the small regions f and g are defined as follows,

and only the sub-pixel displacement between f and g needs to be obtained.
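Phase-only correlation itself can be illustrated with a one-dimensional sketch. It uses a naive DFT to stay dependency-free and locates only the integer peak; the embodiment operates on two-dimensional N₁×N₂ blocks and estimates sub-pixel displacement, which would additionally require fitting the peak shape and is omitted here. The function names and the small constant `eps` are assumptions.

```python
import cmath

def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (kept dependency-free for clarity)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def poc(f, g, eps=1e-12):
    """Phase-only correlation of two equal-length 1-D signals."""
    F, G = dft(f), dft(g)
    cross = [gk * fk.conjugate() for fk, gk in zip(F, G)]
    normalized = [c / (abs(c) + eps) for c in cross]   # keep phase, drop magnitude
    return [v.real for v in dft(normalized, inverse=True)]

f = [1, 4, 2, 8, 5, 7, 3, 6]
g = f[-3:] + f[:-3]            # f circularly shifted right by 3 samples
r = poc(f, g)
peak_value = max(r)            # close to 1 when g is a pure shift of f
peak_shift = r.index(peak_value)
```

The peak position gives the displacement, and the peak height indicates how well the two signals agree, which is why the POC maximum is a natural ingredient for a similarity value.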

Because the displacement obtained in this way is fine-grained, the flat part determination unit 155 can determine flat parts accurately.

If only the displacement is needed, the inter-image similarity value α_k(x,y) need not be calculated; however, because the similarity value is used by the enlargement processing unit 159, its calculation is described here. Letting G(x,y) be the k-th image, (x,y) the pixel of interest, and r(n₁,n₂) the POC maximum value of f and g, the similarity value α_k(x,y) is given by the following equation.

<Flat part determination>
Next, flat part determination in the flat part determination unit 155 is described.

In parts of an image with little texture, it is difficult to estimate the alignment accurately; conversely, alignment is accurate in regions rich in texture. Accurate displacement estimation is indispensable for super-resolution. In addition, flat regions can be enlarged accurately even by simple interpolation. In view of these points, it is effective to determine the flat parts and switch the enlargement method according to the result. Various methods are conceivable for determining where an image is flat. One conventional technique calculates the standard deviation of the luminance values of an image block and treats the block as non-flat if it exceeds a threshold. With this technique, however, a flat part may be judged non-flat in an image that mixes high-luminance and low-luminance parts.

In the embodiment of the present invention, a region with few high-frequency components is defined as a flat part. That is, the edges of the image are extracted, and regions with low edge energy are treated as flat parts. The flat part determination unit 155 applies a high-pass filter to the image sequence Y', which is generated from the Key frames and input from the motion estimation unit 153, to obtain the edge image sequence W.

Here, C is the 4-neighbor Laplacian kernel shown in FIG. 5.

Next, the obtained image sequence is processed block by block, with blocks of N₁×N₂ pixels. Let w(x,y) be the block of interest in the edge image sequence W, where x = 1, ..., N and y = 1, ..., N. Flat and non-flat parts are determined by thresholding the mean of the absolute values of the luminance values in the block of interest. The value representing the degree of non-flatness is defined as follows.

N₁ and N₂ are the height and width of the region of interest. If this value is larger than the threshold, the block is determined to be non-flat, and the flat part determination unit 155 requests its data from the encoding device 10. As described above, because the flat part determination unit 155 receives sub-pixel displacements from the motion estimation unit 153, it can determine flat parts more accurately than with ordinary displacements. The non-flat part encoding unit 107 of the encoding device 10 transmits the requested data to the decoding device 15.
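The flat part test can be sketched as follows. The kernel coefficients are the common 4-neighbor Laplacian and are an assumption, since FIG. 5 is not reproduced here; the threshold value, the border handling, and the function names are likewise hypothetical. For brevity the sketch filters each block on its own rather than filtering the whole frame and then splitting it into blocks.

```python
LAPLACIAN_4 = [[0, 1, 0],
               [1, -4, 1],
               [0, 1, 0]]  # assumed coefficients; FIG. 5 is not reproduced here

def high_pass(image, kernel=LAPLACIAN_4):
    """Apply a 3x3 kernel to a 2-D list image, replicating the borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(3):
                for i in range(3):
                    yy = min(max(y + j - 1, 0), h - 1)
                    xx = min(max(x + i - 1, 0), w - 1)
                    acc += kernel[j][i] * image[yy][xx]
            out[y][x] = acc
    return out

def non_flatness(edge_block):
    """Mean absolute value of an N1 x N2 block of the edge image."""
    n1, n2 = len(edge_block), len(edge_block[0])
    return sum(abs(v) for row in edge_block for v in row) / (n1 * n2)

def is_non_flat(block, threshold):
    """True if the block has enough edge energy to be worth super-resolving."""
    return non_flatness(high_pass(block)) > threshold

flat_block = [[128] * 8 for _ in range(8)]
textured_block = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
THRESHOLD = 10.0  # hypothetical value, chosen only for this example
```

Because the Laplacian removes the local mean, a uniformly bright or uniformly dark block scores zero, which avoids the weakness of the standard-deviation criterion mentioned above.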

<Selective SR in pixel units>
Next, pixel-by-pixel selective SR in the enlargement processing unit 159 is described.

When a sequence that changes greatly within a short time is super-resolved, undesirable pixel values may be used and an unnatural image may be generated. To avoid this, the embodiment of the present invention uses the similarity obtained during alignment. As described above, the decoding device 15 acquires the non-flat part data from the encoding device 10; by additionally applying the similarity within the acquired non-flat part data, the benefit of super-resolution can be increased.

FIG. 6 shows frames from the video sequence Suzie in which the woman blinks. Looking at the eyes, (a) and (b) are not similar, whereas (a) and (c) are similar. When the similarity value between (a) and (b) and that between (a) and (c) are calculated according to equation (13), the former is 8.7 and the latter is 9.9. Accordingly, it is preferable to use data with high similarity for super-resolution and not to use data with low similarity. In the embodiment of the present invention, data with higher similarity contributes more to super-resolution, and data whose similarity is below a threshold is not used for super-resolution. This can be achieved by changing A, the weight matrix in equation (6). In the conventional method, when a low-resolution image is aligned to the high-resolution grid, the value of the element A(x,y) corresponding to the destination pixel position is incremented by a uniform amount; in the embodiment of the present invention, the calculated similarity is added instead, so that the contribution to super-resolution is proportional to the similarity.

Here, r is the magnification factor and round is the rounding function.
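The update of the weight matrix A can be sketched as follows. The update equation is not reproduced in this text, so the destination formula round(r·(x + m_x)), the sign convention of the displacement, the threshold value 9.0, and all names are assumptions; only the use of a magnification factor r, a rounding function, a similarity threshold, and similarity-weighted accumulation comes from the description.

```python
def accumulate_weights(similarity, motion, r, hr_shape, threshold):
    """Build the diagonal of A as a 2-D map over the high-resolution grid.

    similarity[y][x] : similarity value alpha_k(x, y) of one low-resolution frame
    motion[y][x]     : (m_x, m_y) displacement of that pixel, in low-resolution pixels
    r                : magnification factor
    """
    hr_h, hr_w = hr_shape
    weights = [[0.0] * hr_w for _ in range(hr_h)]
    for y, row in enumerate(similarity):
        for x, alpha in enumerate(row):
            if alpha < threshold:
                continue                      # unreliable pixel: not used for SR
            m_x, m_y = motion[y][x]
            hx = round(r * (x + m_x))         # destination on the HR grid
            hy = round(r * (y + m_y))         # (Python rounds exact halves to even)
            if 0 <= hx < hr_w and 0 <= hy < hr_h:
                weights[hy][hx] += alpha      # contribution grows with similarity
    return weights

# One 2x2 low-resolution frame mapped onto a 4x4 grid (r = 2).
similarity = [[9.9, 8.7],
              [9.5, 9.8]]
motion = [[(0.5, 0.0), (0.0, 0.0)],
          [(0.0, 0.5), (0.3, 0.3)]]
weights = accumulate_weights(similarity, motion, r=2, hr_shape=(4, 4), threshold=9.0)
```

In this example the pixel with similarity 8.7 falls below the hypothetical threshold and leaves its destination weight at zero, while the other three pixels add their similarity values to the grid points they map to.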

<Effects of the embodiment of the present invention>
As described above, according to the embodiment of the present invention, high coding efficiency can be achieved in DVC and the quality of the restored images can be improved. In addition, the displacement estimation that is indispensable for super-resolution can be performed accurately, and robustness of the super-resolution method can be achieved.

For convenience of explanation, the encoding device and decoding device according to the embodiment of the present invention have been described using functional block diagrams, but they may be implemented in hardware, software, or a combination of the two. For example, each functional unit of the encoding device and decoding device may be implemented in software and provided as a program within the device. Two or more embodiments and their components may also be used in combination as necessary.

Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications and applications are possible within the scope of the claims.

DESCRIPTION OF SYMBOLS
10 Encoding device
101 Low-frequency component image sequence acquisition unit
103 Frame dividing unit
105 Key frame encoding unit
107 Non-flat part encoding unit
15 Decoding device
151 Key frame decoding unit
153 Motion estimation unit
155 Flat part determination unit
157 Non-flat part decoding unit
159 Enlargement processing unit

Claims

A decoding device that receives and decodes an image sequence from an encoding device, comprising:
a key frame decoding unit that receives and decodes key frames of a low-frequency-component image sequence from the encoding device;
a flat part determination unit that determines flat parts and non-flat parts of the decoded image sequence and requests the data of the parts determined to be non-flat from the encoding device;
a non-flat part decoding unit that receives and decodes the data of the non-flat parts; and
a super-resolution processing unit that calculates the similarity between images using the decoded non-flat part data and estimates the high-frequency components of the decoded images by applying a super-resolution method to the non-flat part pixels whose similarity is equal to or greater than a threshold.

The decoding device according to claim 1, wherein the flat part determination unit determines the flat parts and the non-flat parts by judging the magnitude of the high-frequency components through comparison of a threshold with the mean of the absolute values of the luminance values of a predetermined block in an edge image sequence obtained by applying a high-pass filter to the image sequence.

The decoding device according to claim 1 or 2, further comprising a motion estimation unit that performs motion estimation using block matching in the spatial domain and then performs motion estimation using the phase-only correlation method.

A decoding method in a decoding device that receives and decodes an image sequence from an encoding device, comprising:
a key frame decoding step of receiving and decoding key frames of a low-frequency-component image sequence from the encoding device;
a flat part determination step of determining flat parts and non-flat parts of the decoded image sequence and requesting the data of the parts determined to be non-flat from the encoding device;
a non-flat part decoding step of receiving and decoding the data of the non-flat parts; and
a super-resolution processing step of calculating the similarity between images using the decoded non-flat part data and estimating the high-frequency components of the decoded images by applying a super-resolution method to the non-flat part pixels whose similarity is equal to or greater than a threshold.

The decoding method according to claim 4, wherein, in the flat part determination step, the flat parts and the non-flat parts are determined by judging the magnitude of the high-frequency components through comparison of a threshold with the mean of the absolute values of the luminance values of a predetermined block in an edge image sequence obtained by applying a high-pass filter to the image sequence.