JP2009527978A

JP2009527978A - Image encoding and decoding method and apparatus

Info

Publication number: JP2009527978A
Application number: JP2008556247A
Authority: JP
Inventors: リー，サン−フン; リー，ヒョン−グック
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2006-02-24
Filing date: 2007-02-23
Publication date: 2009-07-30

Abstract

A method and apparatus are provided for encoding wavelet coefficients in consideration of the characteristics of the human visual system in the spatial domain and the frequency domain. A visual weight is generated as the product of a spatial weight, generated using the local bandwidth normalized with respect to the visual axis of the human eye, and a frequency weight, generated using the error sensitivity of each subband in the wavelet domain. By encoding and transmitting the wavelet coefficients in an encoding order determined from the generated visual weights, improved image quality can be provided even at low channel capacity.

Description

The present invention relates to a method and apparatus for encoding and decoding an image, and more particularly, to an image encoding method and apparatus, and a decoding method and apparatus, that encode a wavelet-transformed image using visual weights determined in consideration of the human visual system in the frequency domain and the spatial domain.

As the channel capacity of wide-area wireless networks has increased, many attempts have been made to improve the quality of images and applications served within the wireless network area. However, because channel capacity is variable, sufficient bandwidth to transmit all of the traffic is not guaranteed. Accordingly, various object-oriented and layered algorithms have been proposed that allocate additional coding resources to objects or regions of interest so that transmission adapts efficiently to a variable channel.

Recently, various wavelet-based image compression algorithms have been proposed. Conventional wavelet-based image compression algorithms exploit the correlation between coefficients in each band. Well-known representative methods for compressing wavelet coefficients include the EZW (Embedded image coding using Zerotrees of Wavelet coefficients) algorithm and the SPIHT (Set Partitioning In Hierarchical Trees) algorithm.

The hierarchical structure of wavelet decomposition is well suited to capturing global features of an image sequence. Because the wavelet domain has a hierarchical structure in which spatial and frequency information can be interpreted simultaneously, the characteristics of the whole image can be grasped from the information of a single subband. In addition, because the wavelet domain is inherently multi-resolution, it is advantageous for data transmission by a progressive image encoder.

Meanwhile, the spatial distribution of the optic nerves over the human retina is nonlinear. The optic nerves are most densely packed around the fovea, and their density falls off rapidly with distance from the fovea. Accordingly, the local visual frequency bandwidth sensed by the optic nerves also decreases rapidly with distance from the fovea.

Conventional image encoders have taken these characteristics of the human visual system (HVS) into account and have focused on improving subjective image quality by raising the channel transmission rate of visually important information. However, they do not provide a concrete criterion for selecting visually important information in consideration of the frequency and spatial resolution of the HVS.

The present invention has been devised to solve the above problem. Its object is to provide an image encoding method and apparatus, and a decoding method and apparatus, that can improve the quality of an encoded image even when channel capacity is small, by setting visual weights for wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain and determining a progressive encoding order of the wavelet transform coefficients based on those visual weights.

To solve the above problem, an image encoding method according to the present invention includes: performing a wavelet transform on an input image to generate wavelet transform coefficients; generating visual weights for the wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain; determining an encoding order of the wavelet transform coefficients using the generated visual weights; and encoding the wavelet transform coefficients in the determined encoding order.

An image encoding apparatus according to the present invention includes: a transform unit that performs a wavelet transform on an input image to generate wavelet transform coefficients; a visual weight generation unit that generates visual weights for the wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain; an encoding order determination unit that determines an encoding order of the wavelet transform coefficients using the generated visual weights; and a sequential wavelet coefficient encoding unit that encodes the wavelet transform coefficients in the determined encoding order.

An image decoding method according to the present invention includes: decoding wavelet transform coefficients that were encoded in order of the magnitude of visual weights generated in consideration of human visual sensitivity in the spatial domain and the frequency domain; performing an inverse wavelet transform on the decoded wavelet transform coefficients; and restoring an image using the inverse-wavelet-transformed coefficients of each subband.

An image decoding apparatus according to the present invention includes: a sequential wavelet coefficient decoding unit that decodes wavelet transform coefficients that were encoded in order of the magnitude of visual weights generated in consideration of human visual sensitivity in the spatial domain and the frequency domain; an inverse transform unit that performs an inverse wavelet transform on the decoded wavelet transform coefficients; and an image restoration unit that restores an image using the inverse-wavelet-transformed coefficients of each subband.

According to the present invention, wavelet coefficients are sequentially encoded and transmitted on the basis of visual weights generated in consideration of the HVS in the frequency domain and the spatial domain, so that an image of improved quality can be encoded and transmitted at low channel capacity.

Hereinafter, to aid understanding of the visual entropy used to set the visual weights of the wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain, the concept of entropy, visual entropy in the spatial domain, and visual entropy in the wavelet domain are described first, followed by the image encoding and decoding methods and apparatuses of the present invention.

(Definition of entropy)
During image encoding, a scalar quantizer Q quantizes a real-valued random variable X to generate a quantized random variable $\tilde{X} = Q(X)$.

Suppose the value of X lies in the range $[y_-, y_+]$ and this range is divided into M intervals, each expressed as $[y_{m-1}, y_m]$ ($1 \le m \le M$, $y_0 = y_-$, $y_M = y_+$). If $x \in [y_{m-1}, y_m]$, then $Q(x) = x_m$. The probability $p_m$ of the m-th interval is assumed to be

$$p_m = \Pr\{X \in [y_{m-1}, y_m]\}.$$

In this case, the entropy $H(\tilde{X})$ of the quantized random variable $\tilde{X}$ is

$$H(\tilde{X}) = -\sum_{m=1}^{M} p_m \log_2 p_m .$$

Here, $H(\tilde{X})$ is the minimum average number of bits required to encode the value of the quantized random variable $\tilde{X}$.

In general, if the probability density function (PDF) of a random variable X is P(x), the differential entropy $H_d(x)$ of X is defined by Equation (1):

$$H_d(x) = -\int P(x)\,\log_2 P(x)\,dx \qquad (1)$$

If the quantization error produced by the scalar quantizer Q is D, the following relation is known to hold:

$$H(\tilde{X}) \ge H_d(x) - \tfrac{1}{2}\log_2(12D).$$

Equality holds when a uniform quantizer is used as the scalar quantizer Q. That is, a uniform quantizer can be used to minimize the average number of bits required to encode the value of the quantized random variable $\tilde{X}$. When the size of one quantization bin of the uniform quantizer is $\Delta$, $D = \Delta^2/12$, and the minimum average bit rate $R_x$ is

$$R_x = H_d(x) - \tfrac{1}{2}\log_2(12D).$$
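The relationship between bin size and bit rate can be checked numerically. The following is a minimal sketch, assuming the standard high-resolution form of the rate, $H_d - \tfrac{1}{2}\log_2(12D)$ with $D = \Delta^2/12$; the function names `quantized_entropy` and `high_resolution_rate` are hypothetical and do not appear in the patent.

```python
import math
from collections import Counter


def quantized_entropy(samples, delta):
    """Empirical entropy (bits per sample) after uniform quantization with bin width delta."""
    bins = Counter(math.floor(x / delta) for x in samples)
    n = len(samples)
    return -sum((count / n) * math.log2(count / n) for count in bins.values())


def high_resolution_rate(h_d, delta):
    """Assumed rate H_d - 0.5*log2(12*D) with D = delta**2 / 12 (equals H_d - log2(delta))."""
    d = delta ** 2 / 12.0
    return h_d - 0.5 * math.log2(12.0 * d)


# Samples spread evenly over [0, 1); a uniform density on [0, 1) has H_d = 0 bits.
samples = [i / 1000 for i in range(1000)]
delta = 0.125

empirical_bits = quantized_entropy(samples, delta)
predicted_bits = high_resolution_rate(0.0, delta)
print(empirical_bits, predicted_bits)
```

For this evenly spread input the eight bins are equally likely, so the empirical entropy and the predicted rate should agree; for other densities the prediction is only an approximation that improves as the bin width shrinks.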

Suppose that a signal A can be expressed, using transform coefficients a[m] and orthonormal basis functions $g_m$, as

$$A = \sum_{m=0}^{N-1} a[m]\, g_m ,$$

where N is the number of samples of the signal A in the transform domain. The quantized version of a[m] is denoted $\hat{a}[m]$, and its entropy $H(\hat{a}[m])$. When the total quantization error of the quantized transform coefficients is D, the optimal bit allocation is the one that minimizes the total number of bits R required to encode the quantized transform coefficients, that is,

$$R = \sum_{m=0}^{N-1} R_m ,$$

where $R_m$ is the number of bits required to encode a[m]. Let $\bar{R} = R/N$ be the average number of bits generated per sample. Using a Lagrange multiplier, it can be shown that $\bar{R}$ is minimized when the quantization errors $D_m$ of the individual transform coefficients a[m] are all equal. The average differential entropy $\bar{H}_d$ is defined as the mean of the differential entropies of the N sampled transform coefficients, that is,

$$\bar{H}_d = \frac{1}{N}\sum_{m=0}^{N-1} H_d(a[m]).$$

If the signal A is a Gaussian random variable and the variance of the wavelet coefficient a[m] is $\sigma_m^2$, the entropy of the Gaussian random variable is given by Equation (2).

If a[m] is a Laplacian random variable, the entropy of the Laplacian random variable is given by Equation (3).
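Equations (2) and (3) are not reproduced in this text. For reference, the standard closed forms for the differential entropy of a zero-mean Gaussian and a zero-mean Laplacian variable with variance $\sigma_m^2$ are sketched below; the symbol $\sigma_m$ and the variance-based parameterization are assumptions, and the patent's own equations may be written differently.

```latex
% Gaussian coefficient with variance \sigma_m^2
H_d\bigl(a[m]\bigr) = \tfrac{1}{2}\log_2\!\bigl(2\pi e\,\sigma_m^2\bigr)
                    = \log_2\!\bigl(\sigma_m\sqrt{2\pi e}\bigr)

% Laplacian coefficient with density (1/(2b))\,e^{-|x|/b}, variance \sigma_m^2 = 2b^2
H_d\bigl(a[m]\bigr) = \log_2(2 e\, b)
                    = \log_2\!\bigl(\sqrt{2}\, e\, \sigma_m\bigr)
```

For equal variance the Gaussian entropy is the larger of the two, so a Laplacian model predicts a slightly lower bit cost for the same coefficient energy.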

(Visual entropy in the spatial domain)
As described above, the human eye samples information through the nonlinear distribution of optic nerves on the retina and therefore acquires visual data nonlinearly. The eye acquires data at a rate that varies nonlinearly with distance from the fixation point, and the image formed on the optic nerves of the retina is one from which high-frequency components have been nonlinearly removed by this sampling process. An image as perceived through the human optic nerves in this way is defined as a foveated image.

In general, the human fixation point may be a single point or several points, and may be a single object or several objects. Depending on the image content and the application, it may also be a specific region of the image.

FIGS. 1A and 1B show an example of an original image a(x) and the corresponding foveated image, respectively, for comparison.

Referring to FIGS. 1A and 1B, the observer's region of interest is assumed to be the tennis player. In this case, the foveated region is the area surrounding the tennis player. As shown in FIG. 1B, because of the nonlinear characteristics of the human optic nerves, the resolution of the image they perceive has a symmetric pattern centered on the retina and decays exponentially. The new coordinate system obtained from this nonlinear mapping is defined as the curvilinear coordinate system.

FIGS. 2A and 2B show the original image a(x) and the foveated image of FIGS. 1A and 1B, respectively, after mapping into the curvilinear coordinate system. That is, they show the original image and the foveated image of FIGS. 1A and 1B after coordinate transformation into a curvilinear coordinate system that reflects the concave curved structure of the human eye.

Comparing FIGS. 2A and 2B, the mapped original image and the mapped foveated image, as actually perceived by the human optic nerves, are almost visually identical.

In FIG. 1A, consider the spatial region of the original image, and let $A_o$ be the area occupied by the original image in the Cartesian coordinate system. The area $A_c$ of the mapped original image and the mapped foveated image in the curvilinear coordinate system of FIGS. 2A and 2B is then determined by the Jacobian of the coordinate transformation from x to the curvilinear coordinates. In the discrete domain this Jacobian is proportional to the square of the local frequency $f_n$, so $A_c$ is defined by Equation (4).

In Equation (4), c is a constant. If the transform coefficient value of one pixel of a given image is taken as the random variable X, $H_d(x)$ is obtained as in Equation (1). The total differential entropy of the image is given by Equation (5).

Similarly, the differential entropy and the total visual entropy of the foveated image transformed into the curvilinear coordinate system are defined by Equations (6) and (7), respectively.

Because the original image a(x) and the mapped foveated image in the curvilinear coordinate system are locally band-limited signals with local bandwidth $\Omega_o$, the probability density function and differential entropy of the original image in the Cartesian coordinate system can be assumed to be identical to those of the foveated image in the curvilinear coordinate system.

Accordingly, the difference in the amount of information needed to represent the original image in the Cartesian coordinate system and the foveated image obtained by transforming it into the curvilinear coordinate system, which reflects human visual characteristics, is determined by the difference between the area $A_o$ of the original image and the area $A_c$ of the foveated image in the curvilinear coordinate system. That is, encoding the foveated image in the curvilinear coordinate system saves an entropy of $(A_o - A_c)\,H(x)$, where $A_o \ge A_c$, compared with encoding the original image in the Cartesian coordinate system.

In theory, the saved entropy is an upper bound on the reduction achievable when encoding the image data without losing visual information. The normalized gain $G_m$ obtained by encoding the foveated image in the curvilinear coordinate system in consideration of human visual characteristics is therefore $(A_o - A_c)/A_o$.
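As a small worked illustration of the two quantities above, the sketch below computes the saved entropy $(A_o - A_c)\,H(x)$ and the normalized gain $(A_o - A_c)/A_o$. The function names and the numeric areas are hypothetical values chosen only for the example.

```python
def saved_entropy(area_original, area_foveated, entropy_per_unit_area):
    """Entropy saved by coding the foveated image: (A_o - A_c) * H(x)."""
    if not 0 <= area_foveated <= area_original:
        raise ValueError("expected 0 <= A_c <= A_o")
    return (area_original - area_foveated) * entropy_per_unit_area


def normalized_gain(area_original, area_foveated):
    """Normalized gain G_m = (A_o - A_c) / A_o."""
    if area_original <= 0 or not 0 <= area_foveated <= area_original:
        raise ValueError("expected A_o > 0 and 0 <= A_c <= A_o")
    return (area_original - area_foveated) / area_original


# Hypothetical example: a 256 x 256 image whose foveated area is one quarter of the original.
a_o = 256 * 256
a_c = a_o // 4
gain = normalized_gain(a_o, a_c)
saved = saved_entropy(a_o, a_c, 2.0)  # assuming 2 bits per unit area
print(gain, saved)
```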

(Differential entropy of wavelet coefficients)
First, let W(X) be the wavelet transform function. The original image a(X) of FIG. 1A is transformed into the wavelet domain by W(X). The wavelet coefficient a[m], where m is the index of the wavelet coefficient, can then be defined by Equation (8).

As described above, $g_m$ denotes an orthonormal basis function.

If the original image mapped into the curvilinear coordinate system and the mapped foveated image in the curvilinear coordinate system are assumed to be locally band-limited signals with local bandwidth $\Omega_o$, the one can be approximated by the other.

The wavelet coefficient b[m] of the original image mapped into the curvilinear coordinate system can be defined by Equation (9).

Using Equations (1) and (6), the wavelet transform coefficient a[m] in the Cartesian coordinate system and the wavelet transform coefficient b[m] in the curvilinear coordinate system are each expressed as in Equation (10).

(Visual entropy in the wavelet domain)
The visual weight set in consideration of human visual characteristics in the spatial domain and the frequency domain is defined as $\omega_m$. For a given visual weight $\omega_m$, the visual entropy can be expressed by Equation (11).

As described above, $\omega_m$ consists of a component related to the spatial domain and a component related to the frequency domain.

The local frequency $f_n$ described in Equation (4) can be used as the visual weight in the spatial domain. If $f_m$ denotes the local frequency in the wavelet domain, $f_m$ is expressed by Equation (12).

Here, m is the index of the wavelet coefficient a[m], and r denotes the display resolution. In Equation (12), $f_c$ is the critical frequency and $f_d$ is the display Nyquist frequency, both of which are described below.

Experiments measuring the contrast sensitivity of the HVS as a function of retinal eccentricity have shown that the relation in Equation (13) holds.

Here, f is the spatial frequency (cycles/degree), e is the retinal eccentricity (degrees), $CT_0$ is the minimum contrast threshold, $\alpha$ is the spatial frequency decay constant, $e_2$ is the half-resolution eccentricity constant, and CT(f, e) is the visually perceptible contrast threshold as a function of f and e. The contrast sensitivity CS(f, e) is defined as the reciprocal of the contrast threshold, 1/CT(f, e).

For a given eccentricity e, the critical frequency $f_c$ can be calculated from Equation (13). The critical frequency $f_c$ is the limit of the spatial frequencies a human can perceive visually; frequency components above $f_c$ cannot be perceived.

Assuming the maximum possible contrast and setting CT(f, e) to 1, the critical frequency $f_c$ is obtained from Equation (13) as given in Equation (14).
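The step from a contrast-threshold model to a critical frequency can be sketched in code. Equations (13) and (14) are not reproduced in this text, so the sketch assumes the exponential threshold model commonly used in the foveation literature, $CT(f, e) = CT_0 \exp(\alpha f (e + e_2)/e_2)$; setting $CT = 1$ and solving for f gives $f_c = e_2 \ln(1/CT_0) / (\alpha (e + e_2))$. The default parameter values are illustrative values often quoted for this model and are not taken from the patent.

```python
import math

# Illustrative model parameters (assumptions, not values stated in the patent).
CT0 = 1.0 / 64.0   # minimum contrast threshold
ALPHA = 0.106      # spatial frequency decay constant
E2 = 2.3           # half-resolution eccentricity constant (degrees)


def contrast_threshold(f, e, ct0=CT0, alpha=ALPHA, e2=E2):
    """Assumed model: CT(f, e) = CT0 * exp(alpha * f * (e + e2) / e2)."""
    return ct0 * math.exp(alpha * f * (e + e2) / e2)


def critical_frequency(e, ct0=CT0, alpha=ALPHA, e2=E2):
    """Frequency (cycles/degree) at which the assumed threshold reaches 1 for eccentricity e."""
    return e2 * math.log(1.0 / ct0) / (alpha * (e + e2))


for ecc in (0.0, 5.0, 20.0):
    print(ecc, critical_frequency(ecc))
```

Under this model the critical frequency falls monotonically as eccentricity grows, which is the behaviour the text relies on: coefficients far from the fixation point carry less perceivable detail.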

FIG. 3 illustrates retinal eccentricity and the viewing geometry. The observed image plane 300 is assumed to be N pixels wide, and the line connecting the fovea to the fixation point 310 is assumed to be perpendicular to the image plane 300. The distance from the fovea to the observer's eye, normalized by the image size, is denoted v.

Referring to FIG. 3, the eccentricity e is the angular difference arising from the difference between the positions at which the observer's fixation point 310 and an arbitrary point x 320, located a distance u (measured in units normalized by the image size) from the fixation point, are imaged on the retina. Accordingly, when the fixation point 310 of the image plane 300 is observed, the eccentricity e for an observer at a distance v from the image plane 300 is $e = \tan^{-1}(u/v)$.

Meanwhile, it is known that the maximum resolution perceivable in an actual digital image is also limited by the display resolution r, which is defined for the viewing geometry described above. By the sampling theorem, the display Nyquist frequency $f_d$, the maximum frequency that the display device can represent without aliasing, is half the resolution of the display device. The display Nyquist frequency $f_d$ is therefore given by Equation (15).

In the two-dimensional spatial domain, the square of the local frequency $f_m$, normalized as in Equation (16), can be used as the weight in the spatial domain.
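The chain from viewing geometry to a spatial weight can be sketched as follows. Several details are assumptions, because the corresponding equation images are not reproduced in this text: the eccentricity is taken as $\tan^{-1}(u/v)$, the display resolution as $r = \pi N v / 180$ pixels per degree, the local frequency as the smaller of $f_c$ and $f_d$, and the normalization in Equation (16) as division by the largest local frequency. All function names are hypothetical.

```python
import math


def eccentricity_deg(u, v):
    """Eccentricity in degrees for a point at normalized distance u, viewed from distance v."""
    return math.degrees(math.atan(u / v))


def display_nyquist(n_pixels, v):
    """Display Nyquist frequency f_d = r / 2, with r = pi * N * v / 180 assumed (pixels/degree)."""
    r = math.pi * n_pixels * v / 180.0
    return r / 2.0


def local_frequency(f_c, f_d):
    """Local frequency: the smaller of the critical and display Nyquist frequencies."""
    return min(f_c, f_d)


def spatial_weights(local_freqs):
    """Squared local frequencies, normalized by the largest one (assumed normalization)."""
    f_max = max(local_freqs)
    return [(f / f_max) ** 2 for f in local_freqs]


f_d = display_nyquist(512, 3.0)
freqs = [local_frequency(f_c, f_d) for f_c in (39.0, 12.0, 6.0)]
print(eccentricity_deg(0.5, 3.0), f_d, spatial_weights(freqs))
```

With these assumptions the coefficient nearest the fixation point receives weight 1 and the others fall off quadratically with their local frequency.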

FIG. 4 illustrates the wavelet decomposition structure. Referring to FIG. 4, the LL, HL, LH, and HH subbands are obtained by alternately applying horizontal and vertical wavelet decomposition. The LL subband can in turn be decomposed into smaller subbands, and this process can be repeated any number of times.
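The alternating horizontal and vertical decomposition can be illustrated with a short sketch. The patent does not name a wavelet filter, so the orthonormal Haar filter is used here purely as an assumption; the function names are hypothetical, the image sides must be even at every level, and the HL/LH naming convention (first letter horizontal, second vertical) is one of several in use.

```python
import math


def haar_1d(values):
    """One Haar level on a sequence of even length: (low-pass, high-pass) halves."""
    s = math.sqrt(2.0)
    pairs = range(0, len(values), 2)
    low = [(values[i] + values[i + 1]) / s for i in pairs]
    high = [(values[i] - values[i + 1]) / s for i in pairs]
    return low, high


def transpose(block):
    return [list(col) for col in zip(*block)]


def split_rows(block):
    """Apply haar_1d to every row; return the low-pass block and the high-pass block."""
    lows, highs = [], []
    for row in block:
        low, high = haar_1d(row)
        lows.append(low)
        highs.append(high)
    return lows, highs


def haar_2d(image):
    """One 2-D level: horizontal pass on rows, then vertical pass on columns."""
    h_low, h_high = split_rows(image)
    ll, lh = (transpose(b) for b in split_rows(transpose(h_low)))
    hl, hh = (transpose(b) for b in split_rows(transpose(h_high)))
    return ll, hl, lh, hh


def decompose(image, levels):
    """Repeatedly decompose the LL subband; returns the final LL and per-level details."""
    details = []
    ll = image
    for level in range(1, levels + 1):
        ll, hl, lh, hh = haar_2d(ll)
        details.append((level, hl, lh, hh))
    return ll, details


image = [[1.0] * 4 for _ in range(4)]
ll, hl, lh, hh = haar_2d(image)
final_ll, details = decompose(image, 2)
print(ll, final_ll)
```

A constant image is a convenient check: all of its energy ends up in the LL subband and every detail subband is zero.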

Wavelet coefficients in different subbands and at different positions have different perceptual importance to the HVS. The visual importance of each wavelet coefficient in the frequency domain must therefore be judged in consideration of the HVS. In the present invention, the frequency-domain weight, which is the frequency-domain component of the visual weight $\omega_m$, is determined by the subband of each wavelet coefficient. If the visually perceptible noise threshold is Y, experiments have shown that Y can be expressed as in Equation (17).

Here, $\theta$ is an index denoting the wavelet subband, f is the spatial frequency (cycles/degree), and $g_\theta$, $f_o$, and k are constants. For a given display resolution r and wavelet decomposition level $\lambda$, the spatial frequency is $f = r\,2^{-\lambda}$.

The error detection threshold $T_{\lambda,\theta}$ of a wavelet coefficient at an arbitrary wavelet decomposition level $\lambda$ and subband $\theta$ is then given by Equation (18).

Here, $A_{\lambda,\theta}$ denotes the amplitude of the basis function. The error sensitivity $S_\omega(\lambda, \theta)$ of a subband is therefore the reciprocal of the error detection threshold, $1/T_{\lambda,\theta}$.
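Equations (17) and (18) are not reproduced in this text. A widely used model of wavelet noise visibility that matches the symbols named above is sketched below as a working assumption; the leading constant $a$ is not named in the text and is introduced here only because that model includes it.

```latex
% Assumed noise-visibility model (constants a, k, f_o, g_\theta)
\log Y = \log a + k\,\bigl(\log f - \log (g_\theta f_o)\bigr)^2,
\qquad f = r\,2^{-\lambda}

% Error detection threshold and error sensitivity for level \lambda, subband \theta
T_{\lambda,\theta} = \frac{Y_{\lambda,\theta}}{A_{\lambda,\theta}}
  = \frac{a\,10^{\,k\,\bigl(\log_{10}\!\frac{2^{\lambda} f_o\, g_\theta}{r}\bigr)^{2}}}{A_{\lambda,\theta}},
\qquad
S_\omega(\lambda,\theta) = \frac{1}{T_{\lambda,\theta}}
```

The second expression follows from the first by substituting $f = r\,2^{-\lambda}$ and using the fact that the squared logarithm is unchanged when its argument is inverted.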

In the present invention, $S_\omega(\lambda, \theta)$ normalized as in Equation (19) is used as the weight in the frequency domain.

Using Equations (16) and (19), the visual weight $\omega_m$, set in consideration of human visual characteristics in the spatial domain and the frequency domain, is finally defined by Equation (20).

(Image encoding and decoding method and apparatus considering visual weights)
A method of encoding and decoding an image using the visual weight obtained as the product of the spatial-domain weight and the frequency-domain weight described above, and an image coder using the method, are described below.

FIG. 5 is a block diagram of an image encoding apparatus according to the present invention, and FIG. 6 is a flowchart of an image encoding method according to the present invention. Referring to FIG. 5, the image encoding apparatus 500 according to the present invention includes a transform unit 510, a visual weight generation unit 520, a region-of-interest determination unit 530, an encoding order determination unit 540, and a sequential wavelet coefficient encoding unit 550.

In step 610, the transform unit 510 performs a wavelet transform on the input image, divides the input image into low-frequency and high-frequency subbands, and obtains a wavelet transform coefficient for each pixel of the input image.

In step 620, the visual weight generation unit 520 generates visual weights for the wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain.

As described above, the visual weight generation unit 520 either uses the local frequency f_n given in Equation (4) as the visual weight in the spatial domain, or selects the smaller of the critical frequency f_c in the wavelet domain and the display Nyquist frequency f_d as the local frequency f_m in the wavelet domain and uses the square of f_m, normalized as in Equation (16), as the spatial-domain weight. That is, the visual weight generation unit 520 selects the minimum of the critical frequency in the wavelet domain and the display Nyquist frequency, which is the maximum frequency that the display device can represent without aliasing, and normalizes it as in Equation (16) to generate the spatial-domain weight. The visual weight generation unit 520 also generates the frequency-domain weight by normalizing, as in Equation (19), the error sensitivity S_ω(λ, θ), which is the reciprocal 1/T_{λ,θ} of the error detection threshold T_{λ,θ} in the subband. The visual weight generation unit 520 then multiplies the spatial-domain weight by the frequency-domain weight to generate the visual weight, which is the reference value for determining the encoding order of the wavelet coefficients.
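The weight computation described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's exact formulas. The expressions for the eccentricity, the critical frequency, and the display Nyquist frequency are the forms commonly used in the foveated-vision literature, and the constants `CT0`, `ALPHA`, and `E2` are typical literature values. The normalizations (dividing by the largest value) stand in for Equations (16) and (19) and may differ from them. All function names are hypothetical.

```python
import math

# Illustrative constants (assumed; typical values from the foveation literature).
CT0 = 1.0 / 64.0   # minimum contrast threshold
ALPHA = 0.106      # spatial frequency decay constant
E2 = 2.3           # half-resolution eccentricity constant, in degrees


def critical_frequency(d, n_pixels, v):
    """Assumed model: f_c = e2 * ln(1/CT0) / (alpha * (e + e2)), e in degrees."""
    eccentricity = math.degrees(math.atan(d / (n_pixels * v)))
    return E2 * math.log(1.0 / CT0) / (ALPHA * (eccentricity + E2))


def display_nyquist_frequency(n_pixels, v):
    """Assumed model: f_d = pi * N * v / 360, in cycles per degree."""
    return math.pi * n_pixels * v / 360.0


def spatial_weights(distances, n_pixels, v):
    """Square of f_m = min(f_c, f_d), normalized by the largest f_m (assumption)."""
    f_d = display_nyquist_frequency(n_pixels, v)
    f_m = [min(critical_frequency(d, n_pixels, v), f_d) for d in distances]
    f_max = max(f_m)
    return [(f / f_max) ** 2 for f in f_m]


def frequency_weights(thresholds):
    """Error sensitivity S = 1/T, normalized by the largest S (assumption)."""
    sensitivity = [1.0 / t for t in thresholds]
    s_max = max(sensitivity)
    return [s / s_max for s in sensitivity]


def visual_weights(w_spatial, w_frequency):
    """Visual weight of each coefficient: spatial weight times frequency weight."""
    return [ws * wf for ws, wf in zip(w_spatial, w_frequency)]


# Three coefficients at increasing distance (in pixels) from the fixation point,
# each in a subband with a progressively higher error detection threshold.
w_s = spatial_weights([0.0, 256.0, 512.0], n_pixels=512, v=3.0)
w_f = frequency_weights([2.0, 4.0, 8.0])
w = visual_weights(w_s, w_f)
```

With these inputs the coefficient at the fixation point is limited by the display Nyquist frequency rather than by the critical frequency, so its spatial weight is 1, and the weights fall off with distance from the fixation point.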

When the visual weights are generated, the region-of-interest determination unit 530 determines the region on which the human gaze is fixed, that is, the image region perceived by the human optic nerve (the foveated region). The region-of-interest determination unit 530 may detect a region of visually high motion activity in the image through motion detection, may detect the region of interest by tracking the movement of the observer's pupils, as is done in security camera applications, or may take a region entered by user selection as the region of interest.

In step 630, the encoding order determination unit 540 determines the encoding order of the wavelet transform coefficients using the generated visual weights. In step 640, the sequential wavelet coefficient encoding unit 550 quantizes and entropy-encodes the wavelet transform coefficients in the determined encoding order to generate a bitstream. For example, the encoding order determination unit 540 uses the visual weights generated by the visual weight generation unit 520 to rearrange the wavelet coefficients of each subband within one frame in order of the magnitude of their visual weights, and the sequential wavelet coefficient encoding unit 550 encodes and transmits the wavelet coefficients starting from those with the largest visual weights.
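The reordering in step 630 amounts to sorting coefficient indices by descending visual weight. The sketch below assumes the weights are already available as a flat list; the function name `encoding_order` and the sample values are hypothetical.

```python
def encoding_order(visual_weights):
    """Return coefficient indices sorted by descending visual weight.

    Python's sort is stable, so coefficients with equal weights keep
    their original relative order.
    """
    return sorted(range(len(visual_weights)), key=lambda i: -visual_weights[i])


coefficients = [4.0, -1.5, 0.25, 7.0, -3.0]
weights = [0.20, 0.90, 0.05, 0.60, 0.90]

order = encoding_order(weights)
reordered = [coefficients[i] for i in order]
```

Note that the coefficient with the largest magnitude (7.0) is not sent first: the order follows visual weight, not coefficient amplitude.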

The encoding order determination unit 540 may also use the current channel capacity and the differential entropy of the wavelet coefficients to calculate the total number of wavelet coefficients that can be transmitted at the current channel capacity, and then select that number of wavelet transform coefficients in descending order of the generated visual weights.
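A rough sketch of this capacity-limited selection is shown below. It assumes that the number of transmittable coefficients is the channel capacity divided by an average entropy per coefficient, rounded down. That rule is an assumption made for illustration and may differ from the exact form of the patent's equation. The function names and sample values are hypothetical.

```python
import math


def transmittable_count(capacity_bits, entropy_bits_per_coefficient, total):
    """Assumed rule: as many coefficients as fit into the capacity, capped at total."""
    count = math.floor(capacity_bits / entropy_bits_per_coefficient)
    return max(0, min(total, count))


def select_coefficients(weights, capacity_bits, entropy_bits_per_coefficient):
    """Pick the indices of the highest-weight coefficients that fit the channel."""
    count = transmittable_count(
        capacity_bits, entropy_bits_per_coefficient, len(weights)
    )
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    return order[:count]


weights = [0.20, 0.90, 0.05, 0.60, 0.90]
selected = select_coefficients(
    weights, capacity_bits=10.0, entropy_bits_per_coefficient=3.0
)
```

With 10 bits of capacity and 3 bits per coefficient, three of the five coefficients are selected, and they are the three with the largest visual weights.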

Meanwhile, the total amount of transmitted visual information is determined by the sum of the visual entropy of the transmitted data. To maximize the amount of visual information transmitted over a channel of limited capacity, it is more efficient, as in the present invention, to transmit the wavelet coefficients that carry relatively more visual information first. As described above, the visual information contained in one bit is evaluated by the visual weight, which is the product of the spatial weight and the frequency weight set in consideration of human visual characteristics in the frequency and spatial domains. Using Equation (20) above, the visual entropy is defined as in Equation (21).

When the given channel capacity is C, the total number M of wavelet coefficients that can be transmitted is calculated as in Equation (22).

Let k be the index denoting the order of the wavelet coefficients rearranged according to the visual weights of the present invention. In this case, the visual entropy that can be transmitted is calculated as in Equation (23).

In Equation (23), K denotes the maximum number of wavelet transform coefficients that can be transmitted when the channel capacity is limited to C. The visual entropy of the wavelet coefficients transmitted in order of visual importance is then given by Equation (24).

Here, C_ω denotes the sum of the visual entropy transmitted over the given channel capacity C. When the visual weights according to the present invention are used, a relative visual entropy gain G_t is obtained, as in Equation (25), subject to the condition stated with that equation. The total visual entropy appearing in Equation (25) is the entropy of the wavelet coefficients computed with the visual weights taken into account, evaluated over all of the wavelet coefficients, where M^T denotes the total number of wavelet coefficients.
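To illustrate why ordering by visual weight increases the visual entropy delivered under a fixed budget, the sketch below assumes that the visual entropy of a coefficient is its visual weight multiplied by its entropy. It compares sending the first two coefficients in their original order with sending the two highest-weight coefficients. The quantity `relative_gain` is an illustrative difference normalized by the total visual entropy and is not claimed to match Equation (25). All names and values are hypothetical.

```python
def transmitted_visual_entropy(weights, entropies, order, count):
    """Sum weight * entropy over the first `count` coefficients in `order`."""
    return sum(weights[i] * entropies[i] for i in order[:count])


weights = [0.1, 0.9, 0.3, 0.7]
entropies = [2.0, 2.0, 2.0, 2.0]   # bits per coefficient (assumed equal)
count = 2                          # coefficients that fit the channel

original_order = list(range(len(weights)))
weighted_order = sorted(original_order, key=lambda i: -weights[i])

total = transmitted_visual_entropy(
    weights, entropies, original_order, len(weights)
)
baseline = transmitted_visual_entropy(weights, entropies, original_order, count)
proposed = transmitted_visual_entropy(weights, entropies, weighted_order, count)

# Illustrative gain: extra visual entropy delivered, as a fraction of the total.
relative_gain = (proposed - baseline) / total
```

In this toy case the weight-ordered transmission delivers 3.2 of the 4.0 units of visual entropy, against 2.0 for the original order, for the same number of transmitted coefficients.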

FIG. 7 is a block diagram showing the configuration of an image decoding apparatus according to the present invention, and FIG. 8 is a flowchart of an image decoding method according to the present invention. Referring to FIG. 7, the image decoding apparatus 700 according to the present invention includes a sequential wavelet coefficient decoding unit 710, an inverse transform unit 720, and an image restoration unit 730.

In step 810, the sequential wavelet coefficient decoding unit 710 decodes wavelet transform coefficients that were encoded by the image encoding method described above, that is, in order of the magnitude of visual weights generated in consideration of human visual sensitivity in the spatial domain and the frequency domain. Specifically, the sequential wavelet coefficient decoding unit 710 entropy-decodes and inverse-quantizes the wavelet transform coefficients contained in the bitstream and outputs the resulting wavelet transform coefficients.

In step 820, the inverse transform unit 720 performs an inverse wavelet transform on the decoded wavelet transform coefficients and outputs the wavelet coefficients of each subband.

In step 830, the image restoration unit 730 restores the image using the inverse-wavelet-transformed coefficients of each subband.
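The reconstruction on the decoder side can be sketched with one level of an inverse 2-D Haar transform. As before, the Haar filter with averaging normalization and the function names are assumptions for illustration, since this section does not specify the wavelet filter. The sketch assumes the subbands were produced by filtering rows first and then columns.

```python
def inverse_haar_1d(low, high):
    """Rebuild a signal from averaging-Haar low and high coefficients."""
    signal = []
    for average, difference in zip(low, high):
        signal.extend([average + difference, average - difference])
    return signal


def inverse_columns(low_block, high_block):
    """Undo the column filtering for a pair of subbands."""
    columns = [
        inverse_haar_1d(list(lo), list(hi))
        for lo, hi in zip(zip(*low_block), zip(*high_block))
    ]
    # Transpose back so the result is a list of rows.
    return [list(row) for row in zip(*columns)]


def inverse_haar_2d(ll, lh, hl, hh):
    """Rebuild an image from four subbands (columns first, then rows)."""
    row_low = inverse_columns(ll, lh)
    row_high = inverse_columns(hl, hh)
    return [inverse_haar_1d(lo, hi) for lo, hi in zip(row_low, row_high)]


zeros = [[0.0, 0.0], [0.0, 0.0]]
restored = inverse_haar_2d([[10.0, 20.0], [30.0, 40.0]], zeros, zeros, zeros)
```

If some high-frequency coefficients were never transmitted because of limited channel capacity, a decoder following this sketch could treat them as zero, which yields a smoother approximation of the original image.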

FIG. 9A shows the image quality, measured at each target bit rate, of images restored after being encoded with the conventional SPIHT algorithm, and FIG. 9B shows the image quality, measured at each target bit rate, of images restored after being encoded in order of the magnitude of the visual weights according to the present invention.

Image quality was measured using PSNR (Peak Signal-to-Noise Ratio) and FWQI (Foveated Wavelet Image Quality Index). FWQI is described in detail in "A universal image quality index" (Z. Wang and A. C. Bovik, IEEE Signal Processing Letters), so a detailed description is omitted here.
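For reference, PSNR can be computed as in the following sketch, which assumes 8-bit images (peak value 255) stored as lists of rows. The function name `psnr` and the sample images are hypothetical. FWQI is not reproduced here.

```python
import math


def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_original = [p for row in original for p in row]
    flat_restored = [p for row in restored for p in row]
    mse = sum(
        (a - b) ** 2 for a, b in zip(flat_original, flat_restored)
    ) / len(flat_original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


original = [[100, 100], [100, 100]]
restored = [[101, 99], [100, 100]]
value = psnr(original, restored)
```

PSNR weights every pixel error equally, so it does not reflect the foveated, perceptually weighted quality that the visual weights are designed to improve.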

Comparing FIG. 9A and FIG. 9B confirms that, at low bit rates, an image restored after being encoded on the basis of the visual weights according to the present invention has better image quality than an image restored after being encoded with the conventional SPIHT algorithm. As the bit rate rises, the number of wavelet coefficients that can be transmitted increases, so the difference in quality between the restored images becomes small. The present invention therefore provides improved image quality particularly when the channel bandwidth is small.

FIG. 10 is a graph comparing, against a linear transmission scheme and as a function of channel capacity, the visual entropy obtained when the wavelet coefficients are rearranged by visual weight and transmitted according to the present invention and when they are transmitted with the conventional SPIHT algorithm. FIG. 11 is a graph showing, as a function of channel capacity, the visual entropy gain defined in Equation (25) for the same two cases. In FIG. 10 and FIG. 11, the x-axis is the normalized value of the channel capacity with the weights taken into account.

Referring to FIG. 10, with the image encoding method according to the present invention, the total amount of transmitted visual entropy rises sharply at low channel capacities and converges gradually at a channel capacity of 1. Referring to FIG. 11, it can be confirmed again that the image encoding method according to the present invention achieves a relatively high visual entropy gain at low channel capacities compared with the conventional SPIHT algorithm. When the channel capacity is about 0.1, the visual entropy gain rises sharply to about 0.23. For channel capacities of about 0.1 to 0.45, the visual entropy gain is about 0.2 higher than that of the conventional SPIHT algorithm.

The image encoding and decoding methods described above can be written as a computer program. The code and code segments that make up the program can be readily derived by computer programmers skilled in the art. The program is stored on a computer-readable information recording medium and is read and executed by a computer, thereby implementing the moving-image encoding and decoding methods. The information recording medium includes magnetic recording media, optical recording media, and carrier wave media.

The present invention has been described above with a focus on its preferred embodiments. Those skilled in the art will understand that the present invention can be embodied in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a limiting sense. The scope of the present invention is set out in the claims rather than in the foregoing description, and all differences within a scope equivalent to the claims should be construed as being included in the present invention.

A diagram showing an example of an original image a(x).
A diagram showing an example of a foveated image.
A diagram showing the original image a(x) of FIG. 1A mapped onto a curvilinear coordinate system.
A diagram showing the foveated image of FIG. 1B mapped onto a curvilinear coordinate system.
A diagram for explaining retinal eccentricity and the structure of visual perception.
A diagram showing a wavelet decomposition structure.
A block diagram of an image encoding apparatus according to the present invention.
A flowchart of an image encoding method according to the present invention.
A block diagram showing the configuration of an image decoding apparatus according to the present invention.
A flowchart of an image decoding method according to the present invention.
A diagram showing the image quality, measured at each target bit rate, of images restored after being encoded with the conventional SPIHT algorithm.
A diagram showing the image quality, measured at each target bit rate, of images restored after being encoded in order of the magnitude of the visual weights according to the present invention.
A graph comparing, against a linear transmission scheme and as a function of channel capacity, the visual entropy obtained when the wavelet coefficients are rearranged by visual weight and transmitted according to the present invention and when they are transmitted with the conventional SPIHT algorithm.
A graph showing, as a function of channel capacity, the visual entropy gain defined in Equation (25) when the wavelet coefficients are rearranged by visual weight and transmitted according to the present invention and when they are transmitted with the conventional SPIHT algorithm.

Claims

An image encoding method comprising:
performing a wavelet transform on an input image to generate wavelet transform coefficients;
generating visual weights of the wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain;
determining an encoding order of the wavelet transform coefficients using the generated visual weights; and
encoding the wavelet transform coefficients in the determined encoding order.

The image encoding method according to claim 1, wherein generating the visual weights of the wavelet transform coefficients comprises:
determining a spatial-domain weight of each wavelet transform coefficient by applying a local bandwidth normalized about a region of interest of the wavelet-transformed input image;
determining a frequency-domain weight of each wavelet transform coefficient using the error sensitivity in the subbands of the wavelet-transformed input image; and
calculating the product of the spatial-domain weight and the frequency-domain weight to generate the visual weight.

The image encoding method according to claim 2, wherein the spatial-domain weight is determined using the minimum of a critical frequency f_c, which is a value representing the limit of the spatial frequency that a human can visually perceive, and a display Nyquist frequency f_d, which is the maximum frequency at which the image can be represented without aliasing.

The image encoding method according to claim 3, wherein the eccentricity e is defined by an equation in which N is the number of pixels, v is the distance between the eye and the image normalized by the image size, and d is the distance between the pixel position corresponding to the wavelet transform coefficient and the foveation point;
the critical frequency f_c is defined by an equation in which CT_0 is the minimum contrast threshold, α is the spatial frequency decay constant, and e_2 is the half-resolution eccentricity constant;
the display Nyquist frequency f_d is defined by a corresponding equation; and
the spatial-domain weight has a value defined in terms of the local frequency f_m in the wavelet domain, which is the minimum of the critical frequency f_c and the display Nyquist frequency f_d, where m is the index of the wavelet coefficient.

The image encoding method according to claim 2, wherein the frequency-domain weight has a value obtained by normalizing the error sensitivity S_ω(λ, θ) in the subband to which the wavelet coefficient belongs, where λ is the wavelet decomposition level and θ is the index denoting the wavelet subband.

The image encoding method according to claim 5, wherein the error sensitivity S_ω(λ, θ) has a value obtained by normalizing the reciprocal of the error detection threshold T_{λ,θ} of the wavelet coefficient, the threshold being defined by an equation in which A_{λ,θ} is the amplitude of the basis function, f is the spatial frequency, g_θ, f_0, and k are constants, and r is the display resolution.

The image encoding method according to claim 2, wherein determining the encoding order of the wavelet transform coefficients comprises:
calculating the total number of wavelet coefficients that can be transmitted at the current channel capacity, using the current channel capacity and the differential entropy of the wavelet coefficients; and
selecting that number of wavelet transform coefficients in descending order of the generated visual weights.

The image encoding method according to claim 2, wherein the region of interest of the input image is determined by detecting, through motion detection, a region of visually high motion activity in the image, by tracking the movement of an observer's pupils, or by user selection.

An image encoding apparatus comprising:
a transform unit that performs a wavelet transform on an input image to generate wavelet transform coefficients;
a visual weight generation unit that generates visual weights of the wavelet transform coefficients in consideration of human visual sensitivity in the spatial domain and the frequency domain;
an encoding order determination unit that determines an encoding order of the wavelet transform coefficients using the generated visual weights; and
a sequential wavelet coefficient encoding unit that encodes the wavelet transform coefficients in the determined encoding order.

The image encoding apparatus according to claim 9, wherein the visual weight generation unit comprises:
a spatial-domain weight determination unit that determines a spatial-domain weight of each wavelet transform coefficient by applying a local bandwidth normalized about a region of interest of the wavelet-transformed input image;
a frequency-domain weight determination unit that determines a frequency-domain weight of each wavelet transform coefficient using the error sensitivity in the subbands of the wavelet-transformed input image; and
a multiplication unit that calculates a normalized product of the spatial-domain weight and the frequency-domain weight to generate the visual weight.

The image encoding apparatus according to claim 10, wherein the spatial-domain weight is determined using the minimum of a critical frequency f_c, which is a value representing the limit of the spatial frequency that a human can visually perceive, and a display Nyquist frequency f_d, which is the maximum frequency at which the image can be represented without aliasing.

The image encoding apparatus according to claim 11, wherein the eccentricity e is defined by an equation in which N is the number of pixels, v is the distance between the eye and the image normalized by the image size, and d is the distance between the pixel position corresponding to the wavelet transform coefficient and the foveation point;
the critical frequency f_c is defined by an equation in which CT_0 is the minimum contrast threshold, α is the spatial frequency decay constant, and e_2 is the half-resolution eccentricity constant;
the display Nyquist frequency f_d is defined by a corresponding equation; and
the spatial-domain weight has a value defined in terms of the local frequency f_m in the wavelet domain, which is the minimum of the critical frequency f_c and the display Nyquist frequency f_d, where m is the index of the wavelet coefficient.

The image encoding apparatus according to claim 10, wherein the frequency-domain weight has a value obtained by normalizing the error sensitivity S_ω(λ, θ) in the subband to which the wavelet coefficient belongs, where λ is the wavelet decomposition level and θ is the index denoting the wavelet subband.

The image encoding apparatus according to claim 13, wherein the error sensitivity S_ω(λ, θ) has a value obtained by normalizing the reciprocal of the error detection threshold T_{λ,θ} of the wavelet coefficient, the threshold being defined by an equation in which A_{λ,θ} is the amplitude of the basis function, f is the spatial frequency, g_θ, f_0, and k are constants, and r is the display resolution.

The image encoding apparatus according to claim 9, wherein the encoding order determination unit calculates the total number of wavelet coefficients that can be transmitted at the current channel capacity, using the current channel capacity and the differential entropy of the wavelet coefficients, and selects that number of wavelet transform coefficients in descending order of the generated visual weights.

The image encoding apparatus according to claim 9, further comprising a region-of-interest determination unit that determines a region of interest by detecting, through motion detection, a region of visually high motion activity in the image, by tracking the movement of an observer's pupils, or by user selection.

An image decoding method comprising:
decoding wavelet transform coefficients that were encoded in order of the magnitude of visual weights generated in consideration of human visual sensitivity in the spatial domain and the frequency domain;
performing an inverse wavelet transform on the decoded wavelet transform coefficients; and
restoring an image using the inverse-wavelet-transformed coefficients of each subband.

The image decoding method according to claim 17, wherein the visual weight is calculated as the product of:
a spatial-domain weight determined using the minimum of a critical frequency f_c, which is a value representing the limit of the spatial frequency that a human can visually perceive, and a display Nyquist frequency f_d, which is the maximum frequency at which the image can be represented without aliasing; and
a frequency-domain weight having a value obtained by normalizing the error sensitivity S_ω(λ, θ) in the subband to which the wavelet coefficient belongs, where λ is the wavelet decomposition level and θ is the index denoting the wavelet subband.

An image decoding apparatus comprising:
a sequential wavelet coefficient decoding unit that decodes wavelet transform coefficients that were encoded in order of the magnitude of visual weights generated in consideration of human visual sensitivity in the spatial domain and the frequency domain;
an inverse transform unit that performs an inverse wavelet transform on the decoded wavelet transform coefficients; and
an image restoration unit that restores an image using the inverse-wavelet-transformed coefficients of each subband.

The image decoding apparatus according to claim 19, wherein the visual weight is calculated as the product of:
a spatial-domain weight determined using the minimum of a critical frequency f_c, which is a value representing the limit of the spatial frequency that a human can visually perceive, and a display Nyquist frequency f_d, which is the maximum frequency at which the image can be represented without aliasing; and
a frequency-domain weight having a value obtained by normalizing the error sensitivity S_ω(λ, θ) in the subband to which the wavelet coefficient belongs, where λ is the wavelet decomposition level and θ is the index denoting the wavelet subband.