JP2009272969A - Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method - Google Patents


Info

Publication number
JP2009272969A
JP2009272969A (application number JP2008122851A)
Authority
JP
Japan
Prior art keywords
image
encoding
encoding mode
decoding
feature amount
Prior art date
Legal status
Pending
Application number
JP2008122851A
Other languages
Japanese (ja)
Inventor
Hiroo Ito
Masashi Takahashi
Muneaki Yamaguchi
浩朗 伊藤
宗明 山口
昌史 高橋
Original Assignee
Hitachi Ltd
株式会社日立製作所
Priority date
Filing date
Publication date
Application filed by Hitachi Ltd (株式会社日立製作所)
Priority to JP2008122851A
Publication of JP2009272969A
Application status: Pending


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 the unit being an image region, e.g. an object
    • H04N19/172 the region being a picture, frame or field

Abstract

[Problem]
To improve compression efficiency.
[Solution]
An encoding method for an encoding target image divided into a plurality of regions, comprising:
an image feature amount calculating step of calculating an image feature amount indicating a feature of the image in a region adjacent to the encoding target region of the encoding target image;
an encoding mode group selection step of selecting an encoding mode group for the encoding target region using the image feature amount calculated in the image feature amount calculating step;
an encoding mode selection step of selecting one encoding mode from among a plurality of encoding modes belonging to the encoding mode group selected in the encoding mode group selection step; and
a step of performing a predetermined conversion process on the prediction difference value calculated by prediction processing using the encoding mode selected in the encoding mode selection step, and including the result in an encoded stream.
[Selected Figure] FIG. 1

Description

The present invention relates to a moving picture coding technique for coding a moving picture and a moving picture decoding technique for decoding a moving picture.

  Encoding methods such as the MPEG (Moving Picture Experts Group) standards are known as techniques for recording and transmitting large amounts of moving image information as digital data. In particular, the MPEG standards and the H.264/AVC standard are widely known.

  As the number of encoding modes increases, the code amount of the overhead information indicating the selected mode also increases, lowering the encoding efficiency. Patent Document 1 is known as a technique for mitigating this.

[Patent Document 1] JP 2007-235991 A

  However, the technique disclosed in Patent Document 1 still needs to include in the encoded stream both information indicating a table of grouped encoding modes (an encoding mode table) and information indicating the selected encoding mode. It therefore cannot sufficiently suppress the increase in the code amount of the information indicating the encoding mode.

  An object of the present invention is to improve compression efficiency by reducing the code amount of the information indicating the encoding mode.

  In order to achieve the above object, an embodiment of the present invention may be configured as described in the claims, for example.

  According to the present invention, compression efficiency can be improved.

  Embodiments of the present invention will be described below with reference to the drawings.

  First, an example of encoding processing according to the conventional H.264/AVC standard will be described with reference to FIG. In the H.264/AVC standard, the encoding target image is encoded in units of 16 × 16-pixel macroblocks in raster scan order (301), and inter-picture prediction (302) and intra-picture prediction (303) are executed for the encoding target macroblock.

  In intra-picture prediction, prediction is executed by referring to the decoded images of the already-encoded blocks adjacent to the left, upper left, top, and upper right of the encoding target block (blocks for which encoding and decoding processing has been performed before the encoding target block; hereinafter simply referred to as "already-encoded blocks" or "already-decoded blocks"), mainly by copying the reference pixel values in a specific direction. The H.264/AVC standard allows prediction with the macroblock divided into smaller blocks, and also prepares a plurality of candidate prediction directions. Prediction is therefore executed here for each block size (304) and each prediction direction (305).

  In inter-picture prediction, on the other hand, prediction is performed by referring to an already-encoded image and searching, by motion search, for a region in the reference image that is similar to the encoding target block. Since the block size used for prediction can also be selected in this case, motion search (307) is executed for all candidate block sizes (306).
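The motion-search step described above can be sketched as an exhaustive block-matching search. The following is a minimal illustration, not the H.264/AVC search itself: the function names and the search window are invented for this sketch, and a plain sum of absolute differences (SAD) stands in for the real matching cost.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def motion_search(ref, block, top, left, radius=2):
    """Exhaustive block matching: find the (dy, dx) offset within +/-radius
    of (top, left) in the reference frame that minimizes SAD for `block`."""
    h, w = block.shape
    best = (0, 0, sad(ref[top:top + h, left:left + w], block))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate block would fall outside the frame
            cost = sad(ref[y:y + h, x:x + w], block)
            if cost < best[2]:
                best = (dy, dx, cost)
    return best  # (dy, dx, minimal SAD)
```

For a block copied verbatim from a shifted position in the reference frame, the search recovers that shift with zero residual cost.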

  Finally, the optimal one is selected (308) from the plurality of candidate encoding modes (combinations of prediction method and block size).

  FIG. 10 shows examples of lists of encoding mode types. The left side lists the encoding modes of the conventional H.264/AVC standard, and the right side lists the encoding modes according to an embodiment of the present invention.

  As described above, in the H.264/AVC standard, prediction is performed for all prediction methods and block sizes, and the optimal mode is selected from the large number of encoding modes expressed as their combinations. An example (1001) of the encoding modes in the H.264/AVC standard is shown. In the H.264/AVC standard, a symbol representing the selected encoding mode is encoded together with the prediction difference and included in the encoded stream. That is, a unique number is assigned to each of the plurality of available encoding modes shown in example (1001), and this number is variable-length encoded. Since the H.264/AVC standard has many candidate encoding modes, however, a large number of bits is required to represent this number.

  In contrast, in the example (1002) of the encoding processing of the embodiment of the present invention in FIG. 10, sets of encoding modes with similar properties are grouped into "encoding mode groups", and the final encoding mode is selected by a two-step procedure: first selecting an encoding mode group from the plurality of encoding mode groups, and then selecting the optimal mode from the plurality of encoding modes belonging to the selected group.
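The two-step selection can be illustrated as follows. The group contents below are hypothetical stand-ins (the actual grouping in example (1002) is defined by the embodiment), and `cost_fn` is an invented placeholder for whatever rate-distortion measure ranks the modes.

```python
# Hypothetical coding-mode groups; names loosely follow the H.264/AVC-style
# modes (intra/inter prediction at several block sizes) discussed in the text.
MODE_GROUPS = {
    0: ["intra_16x16", "intra_8x8", "intra_4x4"],
    1: ["inter_16x16", "inter_16x8", "inter_8x16"],
    2: ["inter_8x8", "inter_8x4", "inter_4x8", "inter_4x4"],
}

def select_mode(group_id, cost_fn):
    """Within the chosen group, pick the mode with the smallest cost.
    Returns (mode_name, index_within_group); only the small within-group
    index needs signalling, since the decoder re-derives group_id itself."""
    modes = MODE_GROUPS[group_id]
    best_idx = min(range(len(modes)), key=lambda i: cost_fn(modes[i]))
    return modes[best_idx], best_idx
```

For example, with invented costs `{"inter_16x16": 5, "inter_16x8": 3, "inter_8x16": 7}`, `select_mode(1, ...)` returns `("inter_16x8", 1)`.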

  In the decoding process according to an embodiment of the present invention, the encoding mode group selected in the encoding process and the encoding mode selected within it are identified, and decoding processing corresponding to the identified encoding mode is performed.

  In the present embodiment, the unit of encoding and decoding processing is described as the macroblock, as in the conventional H.264/AVC standard, but any regions obtained by dividing the encoding target image into a plurality of parts may be used; the unit need not be the macroblock.

  In the selection of the encoding mode group in the encoding process according to the embodiment of the present invention, the encoding mode group is selected based on information from the already-encoded blocks located around the encoding target region. Similarly, in the decoding process according to the embodiment of the present invention, the encoding mode group is identified based on information from the already-decoded blocks located around the decoding target region. As a result, the encoding mode group can be identified in the decoding process without including in the encoded stream a flag indicating the group selected in the encoding process.

  In the encoding process of an embodiment of the present invention, a flag indicating one encoding mode belonging to the selected encoding mode group is stored in the encoded stream. Since an encoding mode group is a grouping of the plurality of available encoding modes, the number of encoding modes in each group is smaller than the total number of available encoding modes.

  Therefore, compared with the conventional case, in which a unique number is assigned to each of the available encoding modes and a flag indicating that number is stored in the encoded stream, the encoding process of the embodiment of the present invention reduces the amount of flag information stored in the encoded stream.
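As a rough back-of-the-envelope illustration (with invented mode counts, and assuming fixed-length rather than variable-length flags), the saving can be seen by comparing index widths:

```python
import math

def flag_bits(n_choices):
    """Bits needed for a fixed-length index over n_choices options."""
    return math.ceil(math.log2(n_choices))

total_modes = 10      # size of a flat, H.264/AVC-style mode list (illustrative)
modes_per_group = 3   # modes inside one group (illustrative)

flat_cost = flag_bits(total_modes)        # index into the full list: 4 bits
grouped_cost = flag_bits(modes_per_group) # within-group index only: 2 bits
```

`grouped_cost` excludes any group flag because, in this scheme, the decoder derives the group from the neighbouring blocks rather than from the stream.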

  FIG. 6 shows an example of a method for selecting an encoding mode group. As shown in the adjacent block example (601), the already-encoded blocks adjacent to the left, top, upper right, and upper left of the target block are denoted adjacent block A, adjacent block B, adjacent block C, and adjacent block D, respectively.

  Furthermore, if the image feature amounts obtained by applying predetermined image processing to the decoded images of those blocks are denoted ICA, ICB, ICC, and ICD, the encoding mode group of the target block is expressed, as in equation (602), by a function g that takes these values as arguments.

  By using this function g, the encoding mode group used for the target region can be selected from information about the already-encoded blocks. Moreover, by using the same function g not only in the encoding process but also on the decoding side, there is no need to encode and transmit a number representing the encoding mode group, so the bit amount (603) needed to indicate the group can be reduced to zero.
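As a concrete but hypothetical illustration, g might be a simple threshold rule over the neighbours' feature amounts; the patent leaves the form of g open (it may equally be a learned model, as described later), so the thresholds and the flat/medium/detailed interpretation below are pure assumptions.

```python
def g(ic_a, ic_b, ic_c, ic_d):
    """Hypothetical stand-in for the function g of equation (602): maps the
    image feature amounts of adjacent blocks A-D (here, scalar edge
    strengths) to an encoding mode group number via simple thresholds."""
    mean_ic = (ic_a + ic_b + ic_c + ic_d) / 4.0
    if mean_ic < 10.0:    # flat neighbourhood  -> e.g. large-block group
        return 0
    elif mean_ic < 50.0:  # moderate detail     -> e.g. medium-block group
        return 1
    return 2              # strong edges        -> e.g. small-block group
```

Because the rule reads only already-decoded neighbour pixels, encoder and decoder evaluating the same g on the same inputs always agree on the group.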

  As the image feature amount, for example, edge information (edge strength or edge angle) may be used, or variance information of the pixel values (luminance values or per-color intensities) may be used. A combination of these kinds of information may also be used.

  The predetermined image processing may use a technique suited to the image feature amount being used. For example, if edge information is used, edge detection processing using the Sobel filters shown in FIG. 7 is effective.

  When Sobel filters are used, edges in each direction are detected using two filters: a vertical filter (701) and a horizontal filter (702). Prewitt filters may be used instead; in that case, in addition to the vertical filter (703) and horizontal filter (704), diagonal filters (705) and (706) are prepared. As a simpler filter, a MIN-MAX filter is conceivable, which prepares a rectangular window of a specific size and computes the difference between the maximum and minimum density values within it.
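For reference, the standard 3 × 3 Sobel kernels and their application to a single pixel can be written as follows; the helper name and the test image are illustrative only (the exact coefficients of filters (701)/(702) appear in FIG. 7).

```python
import numpy as np

# Standard 3x3 Sobel kernels: horizontal-derivative (responds to vertical
# edges) and vertical-derivative (responds to horizontal edges).
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

def apply_kernel(img, kernel, y, x):
    """Correlate a 3x3 kernel with img, centred at pixel (y, x)."""
    patch = img[y - 1:y + 2, x - 1:x + 2].astype(np.int32)
    return int((patch * kernel).sum())
```

On a vertical step edge, SOBEL_X responds strongly while SOBEL_Y stays at zero, which is how the two filters separate edge directions.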

  FIG. 9 shows an example of calculating edge strength and edge angle using the Sobel filters (701) and (702), and of calculating the variance of pixel values. Here, the vertical filter (701) and horizontal filter (702) are applied to each pixel (pixels 1 to m × n) (902) of the decoded images of the already-encoded blocks A, B, C, and D (901) adjacent to the target block on the left, top, upper right, and upper left, respectively. If the values obtained by applying the horizontal filter and the vertical filter to pixel i (i = 1, ..., m × n) are fx(i) and fy(i), respectively, the edge strength can be calculated as in equation (903) and the edge angle as in equation (904). The variance of the pixel values can be calculated as in equation (905). As the image feature amounts ICA, ICB, ICC, and ICD, these values may be used as they are, or they may be normalized, or a plurality of feature amounts may be combined. The above example assumes a block size of m × n pixels; in the H.264/AVC standard, m = 16 and n = 16.
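A plausible reading of equations (903), (904), and (905) (the exact formulas appear only in FIG. 9, so the reductions below are assumptions) takes the per-pixel filter outputs fx(i) and fy(i) and reduces them over the block:

```python
import math

def edge_strength(fx, fy):
    """Block edge strength: sum of per-pixel gradient magnitudes,
    one plausible reading of equation (903)."""
    return sum(math.hypot(a, b) for a, b in zip(fx, fy))

def edge_angle(fx, fy):
    """Dominant edge angle from the summed gradient components,
    one plausible reading of equation (904)."""
    return math.atan2(sum(fy), sum(fx))

def variance(pixels):
    """Variance of the block's pixel values, as in equation (905)."""
    m = sum(pixels) / len(pixels)
    return sum((p - m) ** 2 for p in pixels) / len(pixels)
```

Any of these scalars (possibly normalized or combined) can then serve as the feature amounts ICA through ICD fed to the function g.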

  Any function g that outputs the encoding mode group of the target block may be used, but it is effective to realize g using, for example, the machine learning capability of a neural network.

  An example in which the function g is realized using a neural network will be described with reference to FIG. A neural network is a network in which a plurality of threshold logic units are arranged hierarchically from an input layer to an output layer. In a feed-forward network, connections between units exist only between adjacent layers and run in one direction, from the input layer toward the output layer. A connection weight is assigned to each pair of connected units, and the input to a unit in an upper layer is the weighted sum of the values output by the units in the layer below. During learning, these weights are adjusted so that the desired result is obtained at the output layer. Here, the neural network (702) is trained in advance so that, when the normalized edge strengths and edge angles of the adjacent blocks A to D, or the normalized variances of their pixel values, are input as image feature amounts (701), it calculates and outputs a likelihood for each encoding mode group number n (703). The likelihood in the present application is an index of the probability that the code amount obtained when the target block having the input image feature amounts is encoded with a mode belonging to one encoding mode group is the smallest, compared with the code amounts obtained with modes belonging to the other available encoding mode groups. If the function that returns the number of the encoding mode group with the highest likelihood is taken as the function g (704), encoding and decoding can then be performed by the method shown in FIG.
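A toy version of such a network, with hand-picked placeholder weights instead of learned ones, can make the likelihood-and-argmax structure concrete; the layer sizes, softmax normalization, and weight values here are all assumptions for illustration.

```python
import math

def g_neural(features, w_hidden, w_out):
    """Toy feed-forward realization of g: feature amounts of blocks A-D go
    in, a likelihood per mode group comes out, and g returns the group with
    the highest likelihood. In the actual scheme the weights would be
    learned offline (e.g. by back-propagation) and shared by both sides."""
    # Hidden layer: tanh threshold units over weighted sums of the inputs.
    hidden = [math.tanh(sum(w * f for w, f in zip(row, features)))
              for row in w_hidden]
    # Output layer: one score per mode group, softmax-normalized.
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    exp = [math.exp(s) for s in scores]
    likelihoods = [e / sum(exp) for e in exp]
    return max(range(len(likelihoods)), key=likelihoods.__getitem__)
```

With symmetric placeholder weights, flipping the sign of the input features flips which group wins, showing how the argmax over likelihoods selects the group.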

The learning method is not particularly limited; for example, the back-propagation method (BP method) is highly effective. The BP method is described in detail in, for example, Reference Document 1.
[Reference Document 1] JP 2003-44827 A
In addition to the above, the function g may broadly use machine learning techniques, for example a simple polynomial in variables such as edge strength, edge angle, and variance, kernel methods, SVM (Support Vector Machine), the k-nearest neighbor method, linear discriminant analysis, Bayesian estimation, hidden Markov models, and decision tree learning.

  A plurality of discriminators may also be combined by means such as boosting. The model used to realize the function g may be determined in advance. Regarding the input and output of the function g, a correspondence table may be provided in advance on both the encoding side and the decoding side. Alternatively, information on the function g may be stored in the encoded stream.

  In the above embodiment, the variance of the pixel values in the surrounding blocks and their edge strengths and angles are used as variables, but any block information may be used, such as the mean and standard deviation of the pixel values of the surrounding blocks, the encoding method, or the encoding mode, and parameters related to encoding conditions, such as the QP (Quantization Parameter) and the screen resolution, may be added. Further, although this embodiment uses information from already-encoded blocks on the same screen as the target image, image feature amounts from an image other than the target image, such as the previous frame, may be used.

  Next, an embodiment of the moving picture encoding apparatus according to the present invention will be described with reference to FIG. 1.

  The moving image encoding apparatus includes: an input image memory (102) that holds the input original image (101); a block dividing unit (103) that divides the input image into small regions; a motion search unit (104) that detects motion in units of blocks; an intra-picture prediction unit (105) that performs intra-picture prediction in units of blocks; an inter-picture prediction unit (106) that performs inter-picture prediction in units of blocks based on the amount of motion detected by the motion search unit (104); an image feature amount calculation unit (117) that applies predetermined image processing to the decoded images of the already-encoded blocks located around the encoding target block and calculates image feature amounts; a mode group selection unit (108) that selects the encoding mode group of the encoding target block using the image feature amounts calculated by the image feature amount calculation unit (117); a mode selection unit (107) that selects an encoding mode (prediction method and block size) from the encoding mode group selected by the mode group selection unit (108); a subtraction unit (109) that generates the prediction difference; a frequency transform unit (110) and a quantization processing unit (111) that encode the prediction difference; a variable-length encoding unit (112) that encodes according to the occurrence probabilities of symbols; an inverse quantization processing unit (113) and an inverse frequency transform unit (114) that decode the once-encoded prediction difference; an addition unit (115) that generates a decoded image using the decoded prediction difference; and a reference image memory (116) that holds the decoded image for use in later prediction. The operation of each unit is described in detail below.

  The input image memory (102) holds one image from the original images (101) as the encoding target image; the block dividing unit (103) divides it into fine blocks and passes them to the motion search unit (104), the intra-picture prediction unit (105), and the inter-picture prediction unit (106). The motion search unit (104) calculates the amount of motion of the corresponding block using the decoded images stored in the reference image memory (116) and passes the motion vector to the inter-picture prediction unit (106). The intra-picture prediction unit (105) and the inter-picture prediction unit (106) execute intra-picture prediction processing and inter-picture prediction processing in units of blocks of several sizes.

  The image feature amount calculation unit (117) receives from the reference image memory (116) the decoded images of the already-encoded blocks located around the target block and performs the predetermined image processing described with reference to FIG. 6 to obtain the image feature amounts, and the mode group selection unit (108) selects an encoding mode group using the function g shown in equation (602). The mode selection unit (107) selects the optimal mode from the encoding modes included in the selected encoding mode group.

  Subsequently, the subtraction unit (109) generates the prediction difference according to the selected encoding mode and passes it to the frequency transform unit (110). The frequency transform unit (110) and the quantization processing unit (111) apply, to the received prediction difference, a frequency transform such as the DCT (Discrete Cosine Transform) and quantization, respectively, in units of blocks of the specified size, and pass the result to the variable-length encoding processing unit (112) and the inverse quantization processing unit (113).

  The variable-length encoding processing unit (112) performs variable-length encoding, based on the occurrence probabilities of symbols, of the prediction difference information represented by the frequency transform coefficients together with the information necessary for predictive decoding, such as the encoding mode group number, the encoding mode number, the prediction direction in intra-picture predictive encoding, and the motion vector in inter-picture predictive encoding, and generates the encoded stream. Meanwhile, the inverse quantization processing unit (113) and the inverse frequency transform unit (114) apply inverse quantization and an inverse frequency transform such as the IDCT (Inverse DCT) to the quantized frequency transform coefficients to decode the prediction difference, and send it to the addition unit (115). The addition unit (115) then adds the prediction difference to the predicted values to generate a decoded image, which is stored in the reference image memory (116).

  Next, a method of encoding one frame of a moving image in the embodiment of the moving image encoding apparatus shown in FIG. 1 will be described with reference to FIG. 2. First, as shown by loop 1 (201), the following processing is performed on all blocks in the frame to be encoded. That is, the image feature amounts of the already-encoded blocks located around the target block are calculated using a Sobel filter or the like (202), and the encoding mode group is selected by the function g shown in equation (602) of FIG. 6. Subsequently, prediction is executed for all encoding modes included in the selected encoding mode group (204), the prediction differences are calculated (206), and the encoding mode with the highest prediction accuracy is selected from among them (207). The prediction difference generated with the selected encoding mode then undergoes frequency transform (208) and quantization (209), followed by variable-length encoding, and the variable-length-encoded data is included in the encoded stream and output (210). At this time, information indicating the selected encoding mode is also included in the encoded stream. Meanwhile, the quantized frequency transform coefficients are subjected to inverse quantization (211) and inverse frequency transform (212) to decode the prediction difference, and a decoded image is generated and stored in the reference image memory (213). When this processing has been completed for all blocks, encoding of one frame of the image is complete (214).
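The per-block encoder flow above can be condensed into a sketch. All helper names, the group table, and the residual-energy cost below are invented for illustration; the real apparatus uses the units of FIG. 1 and a proper transform/entropy-coding path.

```python
MODE_GROUPS_ENC = {0: ["intra_16x16", "intra_4x4"],
                   1: ["inter_16x16", "inter_8x8"]}  # illustrative groups

def derive_group(neighbor_features):
    """Stand-in for the feature calculation and function g: pick the mode
    group from the neighbouring blocks' feature amounts (costs no bits)."""
    return 0 if sum(neighbor_features) < 40 else 1

def encode_block(block, neighbor_features, predictors):
    """Sketch of the per-block flow: derive the group from neighbours, try
    every mode in that group, keep the one with the smallest residual
    energy (a stand-in for prediction accuracy), and return the small
    within-group index that would be written to the stream."""
    group_id = derive_group(neighbor_features)  # signalled with zero bits
    modes = MODE_GROUPS_ENC[group_id]
    def cost(mode):
        pred = predictors[mode](block)  # predictors: mode -> predicted block
        return sum((a - b) ** 2 for a, b in zip(block, pred))
    best_idx = min(range(len(modes)), key=lambda i: cost(modes[i]))
    return group_id, best_idx, modes[best_idx]
```

A mode whose predictor reproduces the block exactly wins with zero residual, and only `best_idx` (not `group_id`) would need entropy coding.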

  According to the moving picture encoding apparatus and moving picture encoding method of the present embodiment described above, the selected encoding mode is divided into encoding mode group information and information indicating one of the encoding modes belonging to that group; the encoding mode group is selected based on the image features of the blocks adjacent to the encoding target block, and only the information indicating one encoding mode within the group, not the encoding mode group information, is included in the encoded stream.

  As a result, compared with the case where information indicating one encoding mode selected from all the available encoding modes is included in the encoded stream as flag information assigning a unique number to each mode, the code amount of the information indicating the encoding mode can be reduced and the compression efficiency improved.

  Next, an embodiment of the moving picture decoding apparatus according to the present invention will be described with reference to FIG. 4. The moving picture decoding apparatus includes, for example: a variable-length decoding unit (402) that performs the reverse of variable-length encoding on the encoded stream (401) generated by the moving picture encoding apparatus shown in FIG. 1; an inverse quantization processing unit (403) and an inverse frequency transform unit (404) that decode the prediction difference; an image feature amount calculation unit (410) that calculates image feature amounts by applying predetermined image processing to the decoded images of the already-decoded blocks located around the target block; a mode group specifying unit (411) that specifies the encoding mode group of the decoding target block using the image feature amounts calculated by the image feature amount calculation unit (410); a mode specifying unit (405) that specifies, from among the encoding modes belonging to the encoding mode group specified by the mode group specifying unit (411), the encoding mode used for the corresponding block; an intra-picture prediction unit (406) that performs intra-picture prediction; an inter-picture prediction unit (407) that performs inter-picture prediction; an addition unit (408) that obtains the decoded image; and a reference image memory (409) that temporarily stores the decoded image. The operation of each unit is described in detail below.

  The variable-length decoding unit (402) performs variable-length decoding on the encoded stream (401) and acquires the frequency transform coefficient components of the prediction difference and the information necessary for prediction processing, such as the block size and the motion vector. The prediction difference information is sent to the inverse quantization processing unit (403), and the information necessary for prediction processing is sent, depending on the prediction means, to the intra-picture prediction unit (406) or the inter-picture prediction unit (407). Subsequently, the inverse quantization processing unit (403) and the inverse frequency transform unit (404) decode the prediction difference information by applying inverse quantization and an inverse frequency transform, respectively.

  The image feature amount calculation unit (410) receives from the reference image memory (409) the decoded images of the already-decoded blocks located around the target block and performs on them the predetermined image processing described with reference to FIG. 6 to calculate the image feature amounts. The mode group specifying unit (411) specifies the encoding mode group of the decoding target block based on the image feature amounts calculated by the image feature amount calculation unit (410) and the function g shown in equation (602) of FIG. 6. The mode specifying unit (405) specifies the encoding mode used for the corresponding block from the information of the encoding mode group specified by the mode group specifying unit (411) and the encoding mode number included in the encoded stream, and transmits information on the specified encoding mode to the intra-picture prediction unit (406) or the inter-picture prediction unit (407).

  Subsequently, the intra-picture prediction unit (406) or the inter-picture prediction unit (407) performs the prediction processing corresponding to the encoding mode specified by the mode specifying unit (405), referring to the information sent from the variable-length decoding unit (402) and to the decoded images stored in the reference image memory (409). The addition unit (408) adds the reference image obtained by the prediction processing to the prediction difference decoded by the inverse quantization processing unit (403) and the inverse frequency transform unit (404) to generate the decoded image, which is stored in the reference image memory (409).

  FIG. 5 shows a method of decoding one frame of a moving picture in the embodiment of the moving picture decoding apparatus shown in FIG. 4. First, as shown by loop 1 (501), the following processing is performed for all blocks in one frame. That is, variable-length decoding is performed on the input stream (502), and inverse quantization (503) and inverse frequency transform (504) are performed to decode the prediction difference of the decoding target region. Subsequently, the image feature amounts of the already-decoded blocks located around the target block are calculated using a Sobel filter or the like (505), and the encoding mode group is specified by the function g shown in equation (602) of FIG. 6 (506). The encoding mode is then specified from the specified encoding mode group and the encoding mode number included in the encoded stream (507). Furthermore, the prediction processing corresponding to the specified encoding mode is performed, and a decoded image is generated by combining the reference image obtained by the prediction processing with the decoded prediction difference; the generated decoded image is stored in the reference image memory (508). When this processing has been completed for all blocks in the frame, decoding of one frame of the image is complete (509).
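The decoder-side counterpart can be sketched the same way; the key point is that the group-derivation rule is shared with the encoder, so only the within-group index is read from the stream. The group table and the rule below are illustrative assumptions, not the embodiment's actual function g.

```python
MODE_GROUPS_DEC = {0: ["intra_16x16", "intra_4x4"],
                   1: ["inter_16x16", "inter_8x8"]}  # must match the encoder

def derive_group_dec(neighbor_features):
    """The decoder applies the same rule as the encoder to the feature
    amounts of already-decoded neighbours, so the group number itself
    never appears in the stream."""
    return 0 if sum(neighbor_features) < 40 else 1

def decode_mode(neighbor_features, mode_index_from_stream):
    """Recover the full encoding mode from the re-derived group and the
    within-group index parsed from the encoded stream."""
    group_id = derive_group_dec(neighbor_features)
    return MODE_GROUPS_DEC[group_id][mode_index_from_stream]
```

Because both sides compute the group from the same reconstructed pixels, the short within-group index is unambiguous even though the group flag is absent.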

  According to the image decoding apparatus and image decoding method of the present embodiment described above, even when the input encoded stream was generated by dividing the selected encoding mode into encoding mode group information and information indicating one encoding mode belonging to that group, and the stream does not contain the encoding mode group information, the encoding mode group can be identified from the image features of the blocks adjacent to the decoding target block. It is therefore possible to decode an encoded stream of higher compression efficiency in which the code amount of the information indicating the encoding mode has been reduced.

  In the embodiments described above, the DCT is cited as an example of frequency transformation, but any orthogonal transformation used to remove correlation between pixels may be used, such as the DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loeve Transform). Alternatively, the prediction difference itself may be encoded without performing any frequency transform. Furthermore, variable length coding is not strictly required.
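As a sketch of why any orthonormal transform is interchangeable here, the following builds the orthonormal DCT-II basis matrix and verifies that the inverse transform (the transpose) reconstructs the prediction difference exactly. The 8-point size is an arbitrary choice for illustration.

```python
import numpy as np

# Sketch: an orthonormal transform applied to the prediction difference can
# always be inverted losslessly, which is why the DCT in the embodiment can
# be replaced by the DST, DFT, KLT, etc.  Here the DCT-II matrix stands in
# for any such transform.

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an n-by-n matrix (rows = basis vectors)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    t = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return t

n = 8
T = dct_matrix(n)
residual = np.random.default_rng(0).integers(-32, 32, size=n).astype(float)
coeffs = T @ residual                 # forward transform
restored = T.T @ coeffs               # inverse transform = transpose
assert np.allclose(T @ T.T, np.eye(n))   # the basis is orthonormal
assert np.allclose(restored, residual)   # lossless round trip
```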

  In the embodiment, the encoding mode number is variable-length encoded. However, the encoding mode number may also be selected using the image feature amounts of the surrounding blocks, as shown in FIG. 6, in the same way as the encoding mode group number. In this case, neither the encoding mode group number nor the encoding mode number needs to be encoded, and a further improvement of the compression rate can be expected.
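The zero-signalling variant above can be sketched as follows. The mode groups and the secondary selection rule are illustrative assumptions (loosely modelled on the intra mode names of FIG. 10), not values taken from the patent; the point is only that encoder and decoder apply the same deterministic rule to identical decoded-neighbour features.

```python
# Sketch of the variant in which not only the mode group but also the mode
# itself is derived from neighbouring-block feature amounts, so neither
# number is written into the stream.  Groups and rules are assumptions.

MODE_GROUPS = {
    0: ["DC", "Planar-like"],             # flat regions
    1: ["Vertical", "Vertical-left"],     # dominant vertical edges
    2: ["Horizontal", "Horizontal-up"],   # dominant horizontal edges
}

def derive_mode(edge_strength: float, edge_angle: float) -> str:
    """Deterministic rule shared by encoder and decoder: pick the group
    from the features, then pick the mode within the group."""
    if edge_strength < 8.0:
        group = 0
    elif edge_angle < 45.0 or edge_angle >= 135.0:
        group = 1
    else:
        group = 2
    candidates = MODE_GROUPS[group]
    # Within the group, choose by a secondary feature test (assumed rule).
    return candidates[0] if edge_strength < 50.0 else candidates[1]

# Encoder and decoder call derive_mode() with identical features computed
# from decoded pixels, so they agree on the mode with zero bits signalled.
assert derive_mode(3.0, 0.0) == "DC"
assert derive_mode(120.0, 0.0) == "Vertical-left"
assert derive_mode(20.0, 90.0) == "Horizontal"
```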

  Although the embodiment describes the case of encoding a moving image, the present invention is also effective for encoding still images. For example, removing the motion search unit (104) and the inter-screen prediction unit (106) from the block diagram of FIG. 1 yields the block diagram of an encoding device specialized for still images.

  In the embodiment, encoding is performed in units of blocks. However, the present invention may also be used when encoding is performed in units of objects separated from the background of an image.

Brief Description of the Drawings

FIG. 1: Block diagram of the image encoding apparatus used in this embodiment
FIG. 2: Flowchart of the image encoding apparatus used in this embodiment
FIG. 3: Conceptual diagram of encoding processing according to the H.264/AVC standard
FIG. 4: Block diagram of the image decoding apparatus used in this embodiment
FIG. 5: Flowchart of the image decoding apparatus used in this embodiment
FIG. 6: Explanatory diagram of an example of the encoding mode group selection method used in this embodiment
FIG. 7: Explanatory diagram of examples of filters used for edge detection
FIG. 8: Explanatory diagram of an example of likelihood calculation for encoding mode groups
FIG. 9: Explanatory diagram of an example of a method for calculating an image feature amount
FIG. 10: Explanatory diagram of an example of a list of the encoding mode types used in the H.264/AVC standard and in this embodiment

Explanation of symbols

101: original image, 102: original image memory, 103: block division unit, 104: motion search unit, 105: intra-screen prediction unit, 106: inter-screen prediction unit, 107: mode selection unit, 108: mode group selection unit, 109: subtraction unit, 110: frequency transform unit, 111: quantization processing unit, 112: variable length coding unit, 113: inverse quantization processing unit, 114: inverse frequency transform unit, 115: addition unit, 116: reference image memory, 117: image feature amount calculation unit, 401: encoded stream, 402: variable length decoding unit, 403: inverse quantization processing unit, 404: inverse frequency transform unit, 405: mode specifying unit, 406: intra-screen prediction unit, 407: inter-screen prediction unit, 408: addition unit, 409: reference image memory, 410: image feature amount calculation unit, 411: encoding mode group specifying unit

Claims (12)

  1. An image encoding device that encodes an encoding target image divided into a plurality of regions, the image encoding device comprising:
    An image feature amount calculating unit that calculates an image feature amount indicating the feature of an image in an area adjacent to the encoding target area of the encoding target image;
    An encoding mode group selection unit that selects an encoding mode group of the encoding target region using the image feature amount calculated by the image feature amount calculation unit;
    An encoding mode selection unit that selects one encoding mode among a plurality of encoding modes belonging to the encoding mode group selected by the encoding mode group selection unit;
    An output unit that performs a predetermined conversion process on the prediction difference value calculated by the prediction process using the encoding mode selected by the encoding mode selection unit, and outputs the result in an encoded stream.
  2.   The image encoding apparatus according to claim 1, wherein the image feature amount calculation unit performs a predetermined filtering process on an image of an already-encoded area among the areas adjacent to the encoding target area of the encoding target image, and calculates, as the image feature amount, one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof.
  3.   The image encoding apparatus according to claim 1, wherein, of the encoding mode group selected by the encoding mode group selection unit and the encoding mode selected by the encoding mode selection unit, the output unit includes only the information indicating the encoding mode in the encoded stream and outputs it.
  4. An image encoding method for encoding an encoding target image divided into a plurality of regions, comprising:
    An image feature amount calculating step for calculating an image feature amount indicating a feature of an image in an area adjacent to the encoding target area of the encoding target image;
    An encoding mode group selection step of selecting an encoding mode group of the encoding target region using the image feature amount calculated in the image feature amount calculation step;
    An encoding mode selection step of selecting one encoding mode among a plurality of encoding modes belonging to the encoding mode group selected in the encoding mode group selection step;
    An output step of performing a predetermined conversion process on the prediction difference value calculated by the prediction process using the encoding mode selected in the encoding mode selection step, and including the result in an encoded stream.
  5.   The image encoding method according to claim 4, wherein, in the image feature amount calculating step, a predetermined filtering process is performed on an image of an area encoded before the encoding target area among the areas adjacent to the encoding target area of the encoding target image, and one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof is calculated as the image feature amount.
  6. The image encoding method according to claim 4, wherein, in the output step, of the encoding mode group selected in the encoding mode group selection step and the encoding mode selected in the encoding mode selection step, only the information indicating the encoding mode is included in the encoded stream and output.
  7. An image decoding device for decoding an encoded stream in which an image divided into a plurality of regions is encoded, the image decoding device comprising:
    A prediction difference decoding unit that performs predetermined processing on the encoded stream and decodes a prediction difference value of a decoding target region;
    An image feature quantity calculating unit for calculating an image feature quantity indicating the feature of the image of the area adjacent to the decoding target area of the decoding target image;
    An encoding mode group specifying unit that specifies an encoding mode group of the decoding target area using the image feature amount calculated by the image feature amount calculating unit;
    An encoding mode specifying unit that specifies one encoding mode among a plurality of encoding modes belonging to the encoding mode group specified by the encoding mode group specifying unit;
    A decoded image generation unit that generates a decoded image by synthesizing the reference image acquired by the prediction process corresponding to the encoding mode specified by the encoding mode specifying unit and the prediction difference decoded by the prediction difference decoding unit.
  8.   The image decoding apparatus according to claim 7, wherein the image feature amount calculation unit performs a predetermined filtering process on an image of an area decoded before the decoding target area among the areas adjacent to the decoding target area of the decoding target image, and calculates, as the image feature amount, one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof.
  9.   The image decoding apparatus according to claim 7, wherein the encoding mode specifying unit specifies the encoding mode using the information on the encoding mode group specified by the encoding mode group specifying unit and the encoding mode number included in the encoded stream.
  10. An image decoding method for decoding an encoded stream in which an image divided into a plurality of regions is encoded, comprising:
    A prediction difference decoding step for performing a predetermined process on the encoded stream and decoding a prediction difference value of a decoding target region;
    An image feature amount calculating step for calculating an image feature amount indicating a feature of an image in a region adjacent to the decoding target region of the decoding target image;
    An encoding mode group specifying step for specifying an encoding mode group of the decoding target area using the image feature amount calculated in the image feature amount calculating step;
    An encoding mode specifying step of specifying one encoding mode among a plurality of encoding modes belonging to the encoding mode group specified in the encoding mode group specifying step;
    A decoded image generation step of generating a decoded image by synthesizing the reference image acquired by the prediction process corresponding to the encoding mode specified in the encoding mode specifying step and the prediction difference decoded in the prediction difference decoding step.
  11.   The image decoding method according to claim 10, wherein, in the image feature amount calculating step, a predetermined filtering process is performed on an image of an area decoded before the decoding target area among the areas adjacent to the decoding target area of the decoding target image, and one of the variance of the pixel values included in the image of the area, the edge strength included in the image of the area, the edge angle included in the image of the area, or a combination thereof is calculated as the image feature amount.
  12.   The image decoding method according to claim 10, wherein, in the encoding mode specifying step, the encoding mode is specified using the information on the encoding mode group specified in the encoding mode group specifying step and the encoding mode number included in the encoded stream.
JP2008122851A 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method Pending JP2009272969A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2008122851A JP2009272969A (en) 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008122851A JP2009272969A (en) 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method
PCT/JP2009/001838 WO2009136475A1 (en) 2008-05-09 2009-04-22 Image coding device and image coding method, image decoding device and image decoding method

Publications (1)

Publication Number Publication Date
JP2009272969A true JP2009272969A (en) 2009-11-19

Family

ID=41264532

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2008122851A Pending JP2009272969A (en) 2008-05-09 2008-05-09 Image encoding apparatus and image encoding method, and image decoding apparatus and image decoding method

Country Status (2)

Country Link
JP (1) JP2009272969A (en)
WO (1) WO2009136475A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012175543A (en) * 2011-02-23 2012-09-10 Fujitsu Ltd Motion vector detection apparatus, motion vector detection method and moving image encoding device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4995789B2 (en) * 2008-08-27 2012-08-08 日本電信電話株式会社 Intra-screen predictive encoding method, intra-screen predictive decoding method, these devices, their programs, and recording media recording the programs
JP5292343B2 (en) * 2010-03-24 2013-09-18 日本電信電話株式会社 Image quality objective evaluation apparatus, method and program

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3208101B2 (en) * 1996-11-07 2001-09-10 松下電器産業株式会社 Picture coding method and the picture coding apparatus and a recording medium recording an image encoding program
JP2003324731A (en) * 2002-04-26 2003-11-14 Sony Corp Encoder, decoder, image processing apparatus, method and program for them
MXPA04012133A (en) * 2002-06-11 2005-04-19 Nokia Corp Spatial prediction based intra coding.
BRPI0411765A (en) * 2003-06-25 2006-08-08 Thomson Licensing fast modal interframe decision coding
JP4763422B2 (en) * 2004-12-03 2011-08-31 パナソニック株式会社 Intra prediction device
JP4889231B2 (en) * 2005-03-31 2012-03-07 三洋電機株式会社 Image encoding method and apparatus, and image decoding method
JP2007104117A (en) * 2005-09-30 2007-04-19 Seiko Epson Corp Image processing apparatus and program for allowing computer to execute image processing method
JP2007208543A (en) * 2006-01-31 2007-08-16 Victor Co Of Japan Ltd Moving image encoder


Also Published As

Publication number Publication date
WO2009136475A1 (en) 2009-11-12
