JP6585776B2

JP6585776B2 - Processing method

Info

Publication number: JP6585776B2
Application number: JP2018117566A
Authority: JP
Inventors: 昌史高橋; 智一村上; 宗明山口; 浩朗伊藤
Original assignee: Maxell Ltd
Current assignee: Maxell Ltd
Priority date: 2009-09-16
Filing date: 2018-06-21
Publication date: 2019-10-02
Anticipated expiration: 2030-07-20
Also published as: JP5363581B2; JP6837110B2; JP2014007759A; JP2016067062A; JP2018164299A; JP5882416B2; JP2017103810A; JP6088080B2; JPWO2011033853A1; JP2020005294A; WO2011033853A1; JP6360214B2; JP5611432B2; JP2014207713A

Description

The present invention relates to a moving picture decoding technique for decoding moving pictures.

The MPEG (Moving Picture Experts Group) system and other coding systems have been established as techniques for converting large volumes of moving picture information into digital data for recording and transmission. These standards predict the image to be encoded block by block using image information that has already been encoded, and encode the difference from the original image (the prediction difference), thereby removing the redundancy in the moving picture and reducing the code amount.

In particular, inter-screen prediction, which refers to an image other than the target image, achieves highly accurate prediction by searching the reference image for a block that is highly correlated with the block to be encoded. To encode the prediction difference, a frequency transform such as the discrete cosine transform (DCT) is first applied so that the values are concentrated in a small number of coefficients, and the transformed coefficient values are then quantized. Because the prediction difference also has strong local correlation, the frequency transform is likewise applied in units of blocks obtained by finely dividing the image.

However, because these systems use a fixed-size block (macroblock) as the basic unit of encoding, they cannot use a block larger than a macroblock or a block that spans multiple macroblocks, and this has hindered improvement of compression efficiency.

In response, Patent Document 1, for example, discloses in paragraphs 0003 to 0005 a technique intended to “improve encoding efficiency” for “video material in which regions regarded as having the same amount of motion are large, such as high-definition moving pictures”. According to that description, “a moving picture encoding apparatus that encodes a moving picture by performing motion prediction comprises means for optimally determining the upper limit of the macroblock size of the picture to be encoded based on feature quantities of the picture immediately preceding that picture and/or of the macroblocks of that picture, so that the upper limit of the macroblock size used for motion prediction can be selected arbitrarily on a picture or macroblock basis”.

Patent Document 1: JP 2006-339774 A

The technique disclosed in Patent Document 1 enlarges the block used for prediction, so it has the problem that prediction accuracy decreases. Lower prediction accuracy causes noise that is noticeable to the human eye and degrades subjective image quality.

The present invention has been made in view of the above problem, and its object is to reduce the code amount and improve subjective image quality.

A moving picture decoding method according to one aspect of the present invention performs the following processing. An encoded stream is input. Variable-length decoding is performed on the input encoded stream. Inverse quantization and an inverse frequency transform are performed, in units of a first block, on the data that has undergone variable-length decoding to generate a prediction difference. Prediction processing is performed in units of a second block. A decoded image is generated based on the generated prediction difference and the result of the prediction processing. The first block unit is larger than the second block unit.

According to the present invention, the code amount can be reduced and subjective image quality improved more suitably.

FIG. 1 is a block diagram of the image encoding apparatus used in the present embodiment.
FIG. 2 is a block diagram of the image decoding apparatus used in the present embodiment.
FIG. 3 is a conceptual explanatory diagram of the H.264/AVC encoding method.
FIG. 4 is a conceptual explanatory diagram of the H.264/AVC encoding method.
FIG. 5 is a conceptual explanatory diagram of the H.264/AVC encoding method.
FIG. 6 is a conceptual explanatory diagram of the H.264/AVC encoding method.
FIG. 7 is a conceptual explanatory diagram of the H.264/AVC encoding method.
FIG. 8 is a conceptual explanatory diagram of the encoding method of the present embodiment.
FIG. 9 is a conceptual explanatory diagram of the encoding method of the present embodiment.
FIG. 10 is an example of a variable-length code table used in the present embodiment.
FIG. 11 is a conceptual explanatory diagram of the encoding method of the present embodiment.
FIG. 12 is a conceptual explanatory diagram of the encoding method of the present embodiment.
FIG. 13 is a conceptual explanatory diagram of the encoding method of the present embodiment.
FIG. 14 is an example of an encoded stream generated by the present embodiment.
FIG. 15 is a flowchart of the image encoding apparatus used in the present embodiment.
FIG. 16 is a flowchart of the image decoding apparatus used in the present embodiment.
FIG. 17 is a conceptual explanatory diagram of the encoding method of the present embodiment.
FIG. 18 is an example of a variable-length code table used in the present embodiment.
FIG. 19 is an example of an encoded stream generated by the present embodiment.
FIG. 20 is a flowchart of the image encoding apparatus used in the present embodiment.
FIG. 21 is a flowchart of the image decoding apparatus used in the present embodiment.

Embodiment 1.
Embodiment 1 of the present invention is described below in comparison with processing in H.264/AVC. First, H.264/AVC predicts the image to be encoded using image information that has already been encoded, and encodes the prediction difference from the original image, thereby reducing the redundancy in the moving picture and the code amount. To exploit the local properties of the moving picture, prediction is performed in units of blocks obtained by finely dividing the image.

As shown in FIG. 3, encoding is performed on the target image 305 in units of macroblocks 302, each consisting of 16×16 pixels, following the raster scan order (arrow) 301. In FIG. 3, the target image 305 consists of an already-encoded region 306 and a not-yet-encoded region 307. Prediction is broadly divided into intra-screen prediction and inter-screen prediction.

FIG. 4 conceptually shows the operation of inter-screen prediction in H.264/AVC. In inter-screen prediction, the decoded image of an already-encoded image that belongs to the same video 401 as the encoding target image 403 is used as the reference image 402, and the reference image 402 is searched for a block (predicted image) 405 that is highly correlated with the target block 404 in the target image.

In addition to the prediction difference, which is calculated as the difference between the two blocks, the motion vector 406, which is the difference between the coordinates of the two blocks, is encoded as side information required for prediction. Decoding follows the reverse procedure: the decoded image is obtained by adding the decoded prediction difference to the block (predicted image) 405 in the reference image.
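The relationship between the motion vector, the predicted block, and the prediction difference can be illustrated with a minimal Python sketch. The function names, the (dy, dx) vector layout, and the sample pixel values are assumptions made for illustration and do not appear in the specification; quantization is omitted, so the round trip here is lossless.

```python
# Illustrative sketch only: names, data layout, and pixel values are assumptions.

def predict_block(reference, top, left, mv, size):
    """Return the size x size block of `reference` that the motion vector points to.

    (top, left) is the position of the target block; mv = (dy, dx) is the
    coordinate difference between the predicted block and the target block.
    """
    dy, dx = mv
    return [row[left + dx:left + dx + size]
            for row in reference[top + dy:top + dy + size]]

def prediction_difference(target_block, predicted):
    """Encoder side: per-pixel difference between target and predicted block."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target_block, predicted)]

def reconstruct(residual, predicted):
    """Decoder side: add the prediction difference to the predicted block."""
    return [[r + p for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(residual, predicted)]

reference = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
target_block = [[22, 24], [31, 33]]   # 2x2 target block located at (0, 0)
mv = (1, 1)                           # best match lies one pixel down and right

predicted = predict_block(reference, 0, 0, mv, 2)
residual = prediction_difference(target_block, predicted)
decoded = reconstruct(residual, predicted)
```

Only `mv` and `residual` would need to be transmitted; the decoder rebuilds `predicted` from its own copy of the reference image.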

H.264/AVC can also divide a macroblock into smaller blocks for this prediction. FIG. 5 shows the macroblock division patterns allowed in inter-screen prediction. That is, when predicting each macroblock 502 in the target image 501, H.264/AVC can select the optimum pattern from among predefined division patterns (macroblock division patterns) 503 ranging from 4×4 pixels to 16×16 pixels. Information indicating which division pattern was used is encoded for each macroblock.

The prediction difference generated by the above prediction processing is decomposed into frequency components by the DCT (discrete cosine transform), which is one frequency transform technique, and the resulting coefficient values are encoded. FIG. 6 conceptually shows how the prediction difference is decomposed into frequency components by the DCT. The DCT is a frequency transform that represents an input signal as a weighted sum of basis signals 603, with the coefficient values as weights. When the DCT is applied to the prediction difference 601, the coefficient values 602 tend to be concentrated in the low-frequency components, so variable-length coding can be performed efficiently.
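The concentration of coefficients in the low-frequency components can be seen in a small Python sketch. It assumes the orthonormal DCT-II applied separably to rows and then columns; the specification does not state which DCT variant or normalization is used, and the function names and sample blocks are illustrative only.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (assumed variant)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols is indexed [horizontal freq][vertical freq]; transpose back.
    return [list(r) for r in zip(*cols)]

# A flat prediction difference: all energy ends up in the DC coefficient.
flat = [[8] * 4 for _ in range(4)]
coeffs_flat = dct_2d(flat)

# A gentle horizontal ramp: only the first row of coefficients is non-zero.
ramp = [[4 + c for c in range(4)] for _ in range(4)]
coeffs_ramp = dct_2d(ramp)
```

In both sample blocks the non-zero coefficients sit at or near index (0, 0), which is the property that later makes the variable-length coding efficient.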

In H.264/AVC, the DCT can also be applied to the prediction difference after dividing the macroblock into smaller blocks, but the block size used for the DCT is fixed. For example, the Baseline profile of H.264/AVC specifies a size of 4×4 pixels, as shown in FIG. 7. In FIG. 7, the macroblock 702 of the prediction difference 701 is divided into small blocks of 4×4 pixels (703 in FIG. 7).

As described above, H.264/AVC achieves high performance by adaptively dividing the image into fine blocks for encoding. However, because H.264/AVC uses the macroblock as the basic unit of encoding, it cannot handle a block larger than a macroblock or a block that straddles multiple macroblocks. This restriction on block shape has been one of the factors hindering improvement of compression efficiency.

In general, small blocks allow fine-grained processing, so prediction and DCT accuracy improve and image quality rises. On the other hand, small blocks increase the code amount because the number of blocks in the image increases. In inter-screen prediction, for example, the motion vector required for prediction must be encoded for each block, so the number of motion vectors, and hence the code amount, grows with the number of blocks.
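The growth in the number of blocks can be made concrete with a short Python calculation. The 1920×1088 frame size and the assumption of one motion vector per block are illustrative choices, not values from the specification.

```python
import math

def blocks_per_frame(width, height, block_size):
    """Number of block_size x block_size blocks needed to cover the frame."""
    return math.ceil(width / block_size) * math.ceil(height / block_size)

frame = (1920, 1088)  # hypothetical frame size, a multiple of 64 in both directions

# Assuming one motion vector per block, this is also the motion vector count.
counts = {size: blocks_per_frame(*frame, size) for size in (4, 16, 64)}
# counts == {4: 130560, 16: 8160, 64: 510}
```

Under these assumptions, moving from 16×16 blocks to 4×4 blocks multiplies the number of motion vectors to be encoded by 16.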

Likewise, in the DCT, an increase in the number of blocks increases the number of significant low-frequency components among the DCT coefficients, which lowers the efficiency of VLC (variable length coding) and increases the code amount. Determining an appropriate block size therefore requires weighing this trade-off between image quality and code amount.

Meanwhile, demand has grown in recent years for high-definition video beyond HDTV, such as digital cinema and Super HD, and a method for efficiently encoding such video is desired. High-resolution, high-definition video generally has high correlation within the screen, so it is known that image quality degrades little even when large blocks are used.

Therefore, if the encoding target is limited to high-resolution video, the compression ratio can be improved dramatically by enlarging the block that serves as the processing unit of encoding. For example, the technique of Patent Document 1 makes the macroblock size variable and adaptively changes its upper limit according to feature quantities of the already-encoded region. This method allows the macroblock to be enlarged according to the properties of the image and can raise the compression efficiency of high-definition video in particular.

However, because this method enlarges the block used for prediction, prediction accuracy decreases. Lower prediction accuracy causes noise that is noticeable to the human eye and degrades subjective image quality.

The present embodiment addresses this problem and further reduces the code amount while better maintaining subjective image quality. Specifically, inter-screen prediction is performed finely in units of small blocks, for example macroblocks, while the block size to which the frequency transform of the prediction difference is applied (the present embodiment uses the DCT as an example) can be enlarged.

FIG. 8, for example, shows a prediction difference 801. As FIG. 8 shows, even when the target image has a complex texture made up of multiple objects, high prediction accuracy yields a prediction difference with a smooth distribution dominated by low-frequency components, and image quality degrades little even if the DCT is applied in units of large blocks. For a region 802 with high prediction accuracy, the prediction differences of multiple adjacent blocks are therefore merged into one large block before the DCT is applied, which greatly reduces the code amount of the DCT coefficients.

In contrast, in a region 803 with low prediction accuracy, such as part of an object with complex motion, the distribution of the prediction difference becomes complex and high-frequency components increase, so image quality degradation is noticeable if the DCT is applied in units of large blocks. For such regions, blocks are not merged, and image quality is maintained by applying the DCT in units of blocks that are the same size as, or smaller than, the blocks used for prediction. Merging the prediction differences of multiple blocks before applying the DCT in this way raises the compression ratio and reduces the code amount.

The present embodiment is described in detail below. The processing described here is assumed to apply to frames that can be inter-coded (P slices or B slices in H.264/AVC). For frames in which every region is intra-coded (I slices in H.264/AVC), the processing described in the following embodiments may or may not be applied.

Embodiment 1 describes, as an example, the case where every region of a frame that can be inter-coded (a P slice or B slice in H.264/AVC) is inter-coded, that is, the case where the screen contains only inter macroblocks (macroblocks that are inter-coded) and no intra macroblocks (macroblocks that are intra-coded).

FIG. 9 shows an example of the block sizes used for the DCT in the present embodiment. The prediction difference 901 is generated, for example, as follows. Block division is performed in units of 16×16-pixel macroblocks in the same manner as in H.264/AVC (FIG. 5), inter-screen prediction is performed for each macroblock, and the resulting prediction differences are combined for one screen. When the DCT is applied to this prediction difference, a block group 902 (64×64 pixels) is formed, for example by merging 16 adjacent macroblocks, and block division is performed for each block group.

The size of the block group 902 is not limited to 64×64 pixels; any size formed by merging multiple macroblocks 903, such as 32×32 or 128×128, may be used. In one preferred method, many division patterns 903 of the block group 902, such as 8×8, 16×16, 32×32, and 64×64 pixels, are prepared in advance, and the DCT is applied using the optimum pattern selected from among them.

Information indicating which pattern was selected is then encoded for each block group using, for example, a code table such as the one shown in FIG. 10. Assigning short codes to frequently selected patterns reduces the overall code amount. An effective way to select the block pattern is, for example, to use the cost function of Equation 1 and judge the division pattern that minimizes it to be optimal.

In Equation 1, Dist is the sum of errors between the original image and the decoded image, Rate is the sum of the code amount of the DCT coefficients and the code amount of the block division pattern, and Weight is a weighting coefficient. Adjusting the value of Weight controls the trade-off between image quality and code amount. For example, to reduce the code amount substantially at the expense of some image quality, Weight is set higher so that the code amount contributes more to the cost value.
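Equation 1 itself is not reproduced in this text. The Python sketch below therefore assumes the additive form Cost = Dist + Weight × Rate, which is consistent with the definitions above but is an assumption rather than the equation as filed. The candidate patterns and their (Dist, Rate) values are invented for illustration.

```python
def cost(dist, rate, weight):
    """Assumed form of the cost function: distortion plus weighted code amount."""
    return dist + weight * rate

def select_pattern(candidates, weight):
    """Return the division pattern whose cost is smallest.

    `candidates` maps a pattern name to a (dist, rate) pair measured by
    trial-encoding the block group with that pattern.
    """
    return min(candidates, key=lambda name: cost(*candidates[name], weight))

# Hypothetical measurements: smaller blocks lower Dist but raise Rate.
candidates = {
    "64x64": (900, 100),
    "32x32": (600, 220),
    "16x16": (450, 400),
    "8x8": (400, 700),
}

balanced = select_pattern(candidates, weight=0.5)   # moderate emphasis on Rate
low_rate = select_pattern(candidates, weight=4.0)   # strong emphasis on Rate
```

With these sample values, raising Weight shifts the choice from the 16×16 pattern to the 64×64 pattern, which matches the behavior described for a larger Weight.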

FIG. 11 shows the procedure for encoding the prediction difference of each block. First, the DCT is applied to the prediction difference 1101 of the target block to obtain DCT coefficients 1102. The DCT coefficients 1102 are then quantized to reduce the number of elements to be encoded. When the DCT is applied with a large block size, as in the present embodiment, many DCT coefficients arise in the high-frequency components and the code amount increases. Setting quantization step weights 1103 so that, for example, a larger quantization step is applied to the high-frequency components of the DCT coefficients greatly reduces the high-frequency components and allows efficient encoding. In FIG. 11, the reference quantization step is denoted Q.
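Frequency-dependent quantization of this kind can be sketched in Python as follows. The weight function `step_weight` is hypothetical; the actual weights 1103 are defined in FIG. 11 and are not given in the text. Only the idea that the step is the reference step Q scaled by a weight that grows with frequency is taken from the description.

```python
def step_weight(u, v):
    """Hypothetical weight that grows with the frequency index (u, v)."""
    return 1 + (u + v) // 2

def quantize(coeffs, q):
    """Divide each coefficient by Q times its weight and round to an integer."""
    return [[round(c / (q * step_weight(u, v))) for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]

def dequantize(levels, q):
    """Decoder side: scale each level back by the same weighted step."""
    return [[level * q * step_weight(u, v) for v, level in enumerate(row)]
            for u, row in enumerate(levels)]

Q = 8  # reference quantization step
coeffs = [
    [120, 50, 10, 10],
    [50, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

levels = quantize(coeffs, Q)
restored = dequantize(levels, Q)
```

With a uniform step of 8, every coefficient of value 10 would survive as level 1; with the weighted steps, the six highest-frequency positions become 0 and drop out of the data to be encoded.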

The quantized DCT coefficients 1104 are then expanded into one dimension by a two-dimensional zigzag scan from the low-frequency components toward the high-frequency components (1105), and VLC is applied to generate codewords (1106). This processing is repeated for every block obtained by dividing the block group.
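The one-dimensional expansion can be illustrated in Python. The sketch assumes the conventional zigzag pattern over anti-diagonals, as used in JPEG-style coders; the exact scan drawn in FIG. 11 is not given in the text, and the function names and sample levels are illustrative.

```python
def zigzag_order(n):
    """Return the (row, col) visiting order of a conventional n x n zigzag scan."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col of one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()                 # even diagonals run upward
        order.extend(diagonal)
    return order

def zigzag_scan(block):
    """Flatten a square block from low-frequency to high-frequency positions."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

def trim_trailing_zeros(values):
    """Drop the zero run at the end, which carries no information to encode."""
    end = len(values)
    while end > 0 and values[end - 1] == 0:
        end -= 1
    return values[:end]

# Hypothetical quantized coefficients, largest near the top-left (low frequency).
levels = [
    [15, 6, 1, 1],
    [6, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

scanned = zigzag_scan(levels)
significant = trim_trailing_zeros(scanned)
```

Because the scan visits low frequencies first, the zeros gather into a single run at the end of `scanned`, which a variable-length coder can represent very compactly.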

The block groups may be processed in any order; FIG. 12 shows one example, in which the block groups are processed in raster scan order. First, the block group 1201 at the upper-left corner of the screen is processed, followed by the block group 1202 adjacent to its right. Processing then continues with the block groups 1203 and 1204 further to the right, and when it reaches the right edge of the screen, the block group 1205 adjacent below the block group 1201 is processed. This continues until the lower-right corner of the screen is reached. The macroblocks within one block group may also be processed in any order, but processing them along the zigzag direction 1210, for example, is effective.

Because the present embodiment merges the prediction differences of multiple macroblocks before applying the DCT, a memory is needed to hold the prediction differences temporarily. The region stored in this memory at one time is called an “access group”. Prediction and the DCT are each performed per access group. For example, when the entire screen is set as one access group, prediction is performed on all macroblocks in the screen and the resulting prediction differences are stored in the memory one after another.

The prediction differences for one screen stored in the memory are then divided into blocks for each block group, and the DCT is applied. The access group may cover any range, but encoding can be performed efficiently if, for example, one access group consists of one line of block groups as shown in FIG. 13.

In this case, prediction and the DCT are first performed on the access group 1311, which consists of the block groups 1301 to 1304 on the top line of the screen, and then on the access group 1312, which consists of the block groups 1305 to 1308 on the next line. Continuing in this way until the bottom line of the screen is reached completes the encoding of one frame.
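The two-phase processing per access group (predict every macroblock first, then transform each block group) can be sketched as a schedule generator in Python. The function names, the 256×128 frame, and the row-major macroblock order inside a block group are assumptions for illustration; frame dimensions are assumed to be multiples of the block group size.

```python
def access_groups(frame_w, frame_h, group_size=64):
    """Yield one access group per line of block groups, top to bottom.

    Each access group is a list of (top, left) block group origins.
    """
    for top in range(0, frame_h, group_size):
        yield [(top, left) for left in range(0, frame_w, group_size)]

def macroblocks_in(group_origin, group_size=64, mb_size=16):
    """Origins of the macroblocks inside one block group (row-major here)."""
    top, left = group_origin
    return [(top + y, left + x)
            for y in range(0, group_size, mb_size)
            for x in range(0, group_size, mb_size)]

def encoding_schedule(frame_w, frame_h):
    """List the operations in the order they would be performed."""
    schedule = []
    for groups in access_groups(frame_w, frame_h):
        # Phase 1: predict all macroblocks; their prediction differences
        # would be buffered in the prediction difference memory.
        for group in groups:
            for mb in macroblocks_in(group):
                schedule.append(("predict", mb))
        # Phase 2: transform the buffered differences, one block group at a time.
        for group in groups:
            schedule.append(("transform", group))
    return schedule

# Hypothetical frame with 4 block groups per line and 2 lines, as in FIG. 13.
schedule = encoding_schedule(256, 128)
```

The memory then only has to hold the prediction differences for one line of block groups rather than for the whole screen.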

FIG. 14 shows an example of the structure of the encoded stream in the present embodiment (for one block group). Here, the case where the block group contains 16 macroblocks, each a basic unit of prediction, is described. First, for the first macroblock (macroblock 1), the prediction mode 1401, which is expressed as a combination of the prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, and so on) and its block division pattern, is encoded, and then the motion vector of each block is encoded as the side information 1402 required for prediction.

Next, the prediction mode 1403 for the second macroblock (macroblock 2) and the motion vectors 1404 of the blocks into which that macroblock is divided are encoded. This is repeated for all macroblocks in the block group. Then the block division pattern 1405 used when applying the DCT to the prediction difference of the block group and the DCT coefficients 1406 of each block are encoded. The block size for the DCT may instead be fixed, for example at 64×64, in which case the division pattern 1405 of the block group need not be encoded.
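The ordering of syntax elements for one block group can be summarized in a Python sketch. The element names, the tuple representation, and the sample values are hypothetical; only the ordering (per-macroblock prediction mode and motion vectors, then the optional DCT division pattern, then the DCT coefficients) follows the description, and actual entropy coding is omitted.

```python
def block_group_syntax(macroblocks, dct_split_pattern, dct_coeffs,
                       fixed_dct_size=False):
    """Return the (name, value) syntax elements of one block group in stream order.

    `macroblocks` is a list of (prediction_mode, motion_vectors) pairs.
    """
    elements = []
    for index, (mode, motion_vectors) in enumerate(macroblocks, start=1):
        elements.append((f"mb{index}_prediction_mode", mode))
        elements.append((f"mb{index}_motion_vectors", motion_vectors))
    if not fixed_dct_size:
        # Only signalled when the DCT block size is selectable.
        elements.append(("dct_split_pattern", dct_split_pattern))
    elements.append(("dct_coefficients", dct_coeffs))
    return elements

# Hypothetical block group: 16 macroblocks, each predicted as one 16x16 block.
macroblocks = [("forward_16x16", [(0, 1)]) for _ in range(16)]

adaptive = block_group_syntax(macroblocks, "32x32", ["...coefficients..."])
fixed = block_group_syntax(macroblocks, None, ["...coefficients..."],
                           fixed_dct_size=True)
```

With a fixed DCT block size the stream for each block group is shorter by exactly one element, the division pattern.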

FIG. 1 shows an example of the moving picture encoding apparatus of the present embodiment. The apparatus has an input image memory 102 that holds the input original image 101, a block division unit 103 that divides the image in the input image memory 102 into blocks, an intra-screen prediction unit 104 that performs intra-screen prediction block by block, an inter-screen prediction unit 106 that performs inter-screen prediction based on the motion vector detected by a motion search unit 105, and a prediction method/block determination unit 107 that determines the prediction method and block shape suited to the properties of the image.

The moving picture encoding apparatus further has a subtraction unit 108 for generating the prediction difference; a DCT unit 110 that applies a frequency transform to the prediction difference and a frequency transform block determination unit 116 that determines the frequency transform block shape suited to the properties of the prediction difference; a quantization processing unit 111 that quantizes the frequency-transformed coefficient values and a variable-length coding processing unit 112 that performs coding according to symbol occurrence probabilities; an inverse quantization processing unit 113 and an inverse DCT unit 114 for decoding the encoded prediction difference; an addition unit 115 for generating a decoded image from the decoded prediction difference; and a reference image memory 117 that holds the decoded image for use in later prediction.

The input image memory 102 holds one image from the original images 101 as the image to be encoded. The block division unit 103 divides the image data into blocks of appropriate size and sends them to the intra-screen prediction unit 104, the motion search unit 105, the inter-screen prediction unit 106, and the subtraction unit 108. The motion search unit 105 calculates the amount of motion of the block using the decoded image stored in the reference image memory 117 and sends the motion vector to the inter-screen prediction unit 106. The intra-screen prediction unit 104 and the inter-screen prediction unit 106 perform intra-screen and inter-screen prediction in units of blocks of several shapes. The prediction method/block determination unit 107 selects the optimum prediction method and block shape (macroblock division pattern).

Next, the subtraction unit 108 uses the original image and the prediction result to generate the prediction difference given by the optimal prediction coding means, and sends it to the prediction difference memory 109. Once the prediction differences for one access group have accumulated, the prediction difference memory 109 sends them to the DCT unit 110. The DCT unit 110 and the quantization processing unit 111 divide each block group into blocks of several shapes, apply a frequency transform such as DCT and quantization, respectively, and send the result to the variable-length coding processing unit 112 and the inverse quantization processing unit 113. The inverse quantization processing unit 113 and the inverse DCT unit 114 apply inverse quantization and an inverse frequency transform (for example, IDCT: inverse DCT), respectively, to the quantized frequency transform coefficients, recover the prediction difference, and send it to the addition unit 115.

Next, the addition unit 115 generates the decoded image. The frequency transform block determination unit 116 and the reference image memory 117 store the decoded image. The frequency transform block determination unit 116 determines the optimal block shape for the frequency transform (the division pattern of the block group) and sends that information to the variable-length coding processing unit 112. The variable-length coding processing unit 112 then generates the encoded stream by variable-length coding, based on symbol occurrence probabilities, the following: the optimal block shape information for prediction and for the frequency transform (the division patterns of the macroblock and of the block group), the frequency transform coefficients obtained with the optimal block shape (the prediction difference information), and the side information needed for prediction at decoding time (for example, the prediction direction for intra-screen prediction and the motion vector for inter-screen prediction).
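The reason the encoder contains its own inverse quantization and addition units can be illustrated with a small sketch. The code below is a simplified illustration, not the apparatus of FIG. 1: it assumes a uniform scalar quantizer, omits the frequency transform, and works on a flat list of samples. The function names `encode_and_reconstruct` and `decode` are hypothetical.

```python
def encode_and_reconstruct(original, prediction, step):
    """Toy encoder loop: residual -> quantize -> local decode.

    Assumptions (not from the source): uniform scalar quantizer with
    step size `step`, and no frequency transform.
    """
    # Role of subtraction unit 108: prediction difference.
    residual = [o - p for o, p in zip(original, prediction)]
    # Role of quantization processing unit 111 (transform omitted).
    levels = [round(r / step) for r in residual]
    # Role of inverse quantization unit 113: what the decoder will recover.
    decoded_residual = [lv * step for lv in levels]
    # Role of addition unit 115: the image kept as a prediction reference.
    reconstructed = [p + d for p, d in zip(prediction, decoded_residual)]
    return levels, reconstructed


def decode(prediction, levels, step):
    """Toy decoder: inverse quantize and add to the prediction."""
    return [p + lv * step for p, lv in zip(prediction, levels)]


original = [101, 105, 97, 90]
prediction = [98, 98, 98, 98]
levels, encoder_reference = encode_and_reconstruct(original, prediction, step=4)
decoder_output = decode(prediction, levels, step=4)
```

Because the encoder stores the reconstructed samples rather than the original ones, `encoder_reference` and `decoder_output` are identical, so later predictions on both sides start from the same reference.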

FIG. 2 shows an example of a moving image decoding apparatus according to this embodiment. The moving image decoding apparatus includes: a variable-length decoding unit 202 that decodes the various kinds of information by applying the reverse of the variable-length coding procedure to an encoded stream 201 generated, for example, by the moving image encoding apparatus shown in FIG. 1; an inverse quantization processing unit 203 and an inverse DCT unit 204 that decode the prediction difference information; a prediction difference memory 205 that stores the prediction differences for one access group; an inter-screen prediction unit 206 that performs inter-screen prediction; an intra-screen prediction unit 207 that performs intra-screen prediction; an addition unit 208 that obtains the decoded image; and a reference image memory 209 that temporarily stores the decoded image.

The variable-length decoding unit 202 variable-length decodes the encoded stream 201 and obtains the block shape information used for prediction and for the frequency transform, the prediction difference information, and the side information needed for prediction at decoding time. Of these, the block shape information for the frequency transform (the division pattern of the block group) and the prediction difference information are sent to the inverse quantization processing unit 203, while the block shape information for prediction (the division pattern of the macroblock) and the side information needed for prediction are sent to the inter-screen prediction unit 206 or the intra-screen prediction unit 207.

Next, the inverse quantization processing unit 203 and the inverse DCT unit 204 decode the prediction difference information by applying inverse quantization and an inverse frequency transform such as inverse DCT, respectively, using the block shape specified for each block group (the division pattern of the block group), and send the result to the prediction difference memory 205. The inter-screen prediction unit 206 or the intra-screen prediction unit 207 then executes prediction with the specified block shape (macroblock division pattern), referring to the reference image memory 209 on the basis of the information sent from the variable-length decoding unit 202. The addition unit 208 generates the decoded image from the prediction result and the prediction differences for one access group stored in the prediction difference memory 205, and stores the decoded image in the reference image memory 209.

FIG. 15 shows the procedure for encoding one frame in this embodiment. First, the following processing is performed for every region in the frame to be encoded (1501). That is, for every macroblock in the access group concerned (1502), prediction is carried out with every available prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, and so on) and every available block shape (macroblock division pattern) (1503), and the prediction difference is calculated.

The optimal combination is then selected from the results of prediction with all prediction methods and block shapes (1504), the information identifying that combination is encoded, and the prediction difference is stored in memory. "Optimal" here means that both the prediction difference and the code amount are small. One effective way to select the combination of prediction method and block shape is to calculate the coding cost (Cost) expressed by Equation 2 for every combination and choose the combination with the smallest cost.

Here, SAD (Sum of Absolute Differences) is the sum of the absolute values of the prediction differences, and R (Rate) is the estimated code amount when the prediction differences are encoded. λ is a weighting constant. Its optimal value depends on the prediction method (intra-screen or inter-screen prediction), the quantization parameter, and similar factors, so it is effective to use different values accordingly. The code amount estimate should preferably take into account not only the prediction difference information but also the code amount of the block shape information, motion vectors, and so on.
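The selection rule can be sketched as follows. Equation 2 itself is not reproduced in this text, so the sketch assumes the common Lagrangian form Cost = SAD + λ·R, which matches the roles given to SAD, R, and λ above but is an assumption rather than the patent's stated formula. The candidate labels and bit counts are hypothetical.

```python
def sad(original, predicted):
    """Sum of absolute differences between original and predicted samples."""
    return sum(abs(o - p) for o, p in zip(original, predicted))


def coding_cost(original, predicted, rate_bits, lam):
    """Assumed form of the coding cost: SAD + lambda * R."""
    return sad(original, predicted) + lam * rate_bits


def select_best(original, candidates, lam):
    """Return the label of the cheapest candidate.

    candidates: list of (label, predicted_samples, estimated_rate_bits).
    """
    best = min(candidates, key=lambda c: coding_cost(original, c[1], c[2], lam))
    return best[0]


original = [10, 12, 14, 16]
candidates = [
    ("inter_16x16", [10, 12, 13, 15], 20),  # accurate but expensive to signal
    ("intra_dc", [13, 13, 13, 13], 6),      # coarse but cheap to signal
]
choice_high_lambda = select_best(original, candidates, lam=0.5)
choice_low_lambda = select_best(original, candidates, lam=0.1)
```

With the larger λ the cheaper-to-signal candidate wins, and with the smaller λ the more accurate one wins, which is why the text recommends choosing λ according to the prediction method and quantization parameter.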

Once the above processing has finished for every macroblock in the access group, the prediction differences for that access group held in memory are processed next (1505): for each block group, DCT (1506), quantization (1507), and variable-length coding (1508) are performed with every available block shape (block group division pattern). The quantized DCT coefficients are then inverse quantized (1509) and inverse DCT transformed (1510) to decode the prediction difference information, Equation 1 is used to select the optimal block shape (block group division pattern) (1511), and that shape information is encoded.

Besides Equation 1, the shape information can also be selected using, for example, the RD-Optimization method, which determines the optimal coding mode from the relationship between image quality distortion and code amount. RD-Optimization is a widely known technique, so a detailed description is omitted here. For details, see, for example, Reference 1 (Reference 1: G. Sullivan and T. Wiegand, "Rate-Distortion Optimization for Video Compression", IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, 1998).

Next, the decoded image is obtained by adding the decoded prediction difference to the predicted image (1512) and is stored in the reference image memory. When the above processing has been completed for all access groups, encoding of one frame of the image ends (1513).

FIG. 16 shows the procedure for decoding one frame in this embodiment. First, the following processing is performed for every access group in the frame (1601). That is, for every block group in the access group (1602), variable-length decoding is performed (1603), and inverse quantization (1604) and inverse DCT (1605) are applied with the specified block shape (block group division pattern) to decode the prediction difference, which is stored in memory.

When the above processing has been completed for every block group in the access group, prediction (1607) is performed for the same access group (1606) based on the variable-length decoded prediction method and the block shape for prediction (macroblock division pattern), and the decoded image is obtained by adding the result to the prediction difference stored in memory (1608). When the above processing has been completed for every access group in the frame, decoding of one frame of the image ends (1609).
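The two-pass structure of FIG. 16 (first decode all prediction differences of an access group into memory, then predict and add) can be sketched as below. This is a toy model under stated assumptions: one sample per macroblock, a caller-supplied `inverse_transform` standing in for steps 1603 to 1605, and a caller-supplied `predict` standing in for step 1607. All names and the data layout are hypothetical.

```python
def decode_access_group(block_groups, macroblocks, inverse_transform, predict):
    """Decode one access group in two passes (toy model)."""
    # Pass 1 (steps 1602-1605): decode every block group's prediction
    # difference and keep it in memory, keyed by macroblock id.
    residual_memory = {}
    for group in block_groups:
        residual_memory.update(inverse_transform(group))

    # Pass 2 (steps 1606-1608): predict each macroblock and add the
    # stored prediction difference to obtain the decoded sample.
    decoded = {}
    for mb in macroblocks:
        decoded[mb["id"]] = predict(mb) + residual_memory[mb["id"]]
    return decoded


# Hypothetical inputs: each block group maps macroblock ids to quantized levels.
groups = [
    {"step": 4, "levels": {"mb0": 1, "mb1": -2}},
    {"step": 4, "levels": {"mb2": 0, "mb3": 3}},
]
mbs = [{"id": f"mb{i}", "reference_sample": 100} for i in range(4)]

picture = decode_access_group(
    groups,
    mbs,
    inverse_transform=lambda g: {k: v * g["step"] for k, v in g["levels"].items()},
    predict=lambda mb: mb["reference_sample"],
)
```

Separating the passes is what allows a single transform block to span several macroblocks: the prediction difference for the whole block group is available before any macroblock in it is reconstructed.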

This embodiment uses DCT as an example of the frequency transform, but any orthogonal transform used to remove inter-pixel correlation may be used, such as DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loève Transform). The prediction difference itself may also be encoded without applying any frequency transform. Variable-length coding may likewise be omitted. This embodiment may also be used in combination with other methods.
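As a concrete instance of such an orthogonal transform, the following sketch implements a one-dimensional orthonormal DCT-II and its inverse in plain Python. It is an illustration only: the patent applies the transform to two-dimensional blocks, and the function names `dct` and `idct` are chosen here for convenience.

```python
import math


def _scale(k, n):
    """Normalization factor that makes the DCT-II basis orthonormal."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            _scale(k, n) * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


flat = dct([5, 5, 5, 5])              # strongly correlated input
samples = [3, -1, 4, 1, -5, 9, 2, -6]
round_trip = idct(dct(samples))       # should reproduce the input
```

For the flat input, all of the energy lands in the first coefficient and the rest are zero up to rounding error, which is the correlation-removal property the text refers to.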

According to the moving image encoding apparatus and encoding method and the moving image decoding apparatus and decoding method of Embodiment 1 described above, when encoding and decoding a frame made up of inter macroblocks, the frequency transform is performed in block units larger than the macroblocks used for prediction. This makes it possible to realize a moving image encoding apparatus and encoding method and a moving image decoding apparatus and decoding method that further reduce the code amount while better maintaining subjective image quality.

Embodiment 2.
Embodiment 1 was described using the case in which every region of a frame that allows inter-screen coding (a P slice or B slice in H.264/AVC terms) is inter-screen coded, that is, the case in which the picture contains only inter macroblocks (macroblocks coded with inter-screen coding) and no intra macroblocks (macroblocks coded with intra-screen coding).

Embodiment 2, by contrast, describes the case in which intra-screen coding can be applied within a frame that allows inter-screen coding (a P slice or B slice in H.264/AVC terms), that is, the case in which inter macroblocks (macroblocks coded with inter-screen coding) and intra macroblocks (macroblocks coded with intra-screen coding) can be mixed during encoding.

An intra macroblock is predicted from the decoded image of blocks that have already been encoded, so DCT processing of the adjacent blocks must be complete before the target block can be predicted. It is therefore not possible to predict all the macroblocks of one block group at once and then apply DCT to their combined prediction differences.

For this reason, prediction and DCT are performed macroblock by macroblock for any block group that contains even one intra macroblock. In this case DCT is performed in units of blocks obtained by dividing a macroblock, so the block size is no larger than a macroblock. Sizes such as 32×32 and 64×64 in FIG. 9 therefore cannot be used as the DCT block size (the division pattern of the macroblock for DCT). The division pattern is then encoded using, for example, the code table shown in FIG. 18.

FIG. 17 conceptually shows an example of the encoding method for each block group. This example shows the case in which an access group coincides with a block group. Encoding begins by predicting every macroblock in block group 1701 at the upper-left corner of the image. If all of them are inter macroblocks, the division pattern of the block group is determined and DCT is applied by the same means as in Embodiment 1 (FIG. 9). The same processing is then applied to block group 1702, adjacent to the right of block group 1701, provided that it contains no intra macroblock.

The same processing continues for block groups 1703 and 1704, and when the right edge of the screen is reached, block group 1705, adjacent below block group 1701, is processed. If, for example, one or more macroblocks in block group 1706 are intra macroblocks, prediction and DCT are performed macroblock by macroblock for that block group. In this way, this embodiment switches the unit of DCT between the macroblock and the macroblock group according to whether the macroblock group concerned contains an intra macroblock. When processing reaches the lower-right corner of the screen, encoding of one screen ends.
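The switching rule and the left-to-right, top-to-bottom traversal described for FIG. 17 can be sketched as follows. The representation of a block group as a list of `"intra"` / `"inter"` labels and the function names are hypothetical and chosen only for illustration.

```python
def transform_unit(block_group):
    """Choose the unit for DCT from the macroblock types in a block group.

    block_group: list of "intra" / "inter" labels, one per macroblock.
    """
    if any(kind == "intra" for kind in block_group):
        return "macroblock"   # predict and transform macroblock by macroblock
    return "block_group"      # transform may span the whole block group


def raster_order(groups_wide, groups_high):
    """Visit block groups left to right, then top to bottom."""
    return [(row, col) for row in range(groups_high) for col in range(groups_wide)]


all_inter = ["inter"] * 16
one_intra = ["inter"] * 15 + ["intra"]
units = [transform_unit(all_inter), transform_unit(one_intra)]
order = raster_order(groups_wide=2, groups_high=2)
```

A single intra macroblock is enough to force the whole block group onto the macroblock unit, which mirrors the dependency on already decoded neighbors explained above.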

FIG. 19 shows an example of the structure of the encoded stream for a block group containing one or more intra macroblocks in this embodiment. The case in which the block group contains 16 macroblocks is described here. First, for the first macroblock (macroblock 1), a prediction mode 1901 is encoded, expressed as the combination of the prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, and so on) and its division pattern. Next, as side information 1902 needed for prediction, the motion vector is encoded for an inter macroblock, and information on the prediction direction is encoded for an intra macroblock.

Next, the macroblock division pattern 1903 used when applying DCT to the same macroblock and the DCT coefficients 1904 of each block are encoded. The above processing is performed for every macroblock in the block group. The DCT block size may instead be fixed, for example at 8×8, in which case the per-macroblock division pattern need not be encoded. The structure of the encoded stream for a block group containing no intra macroblock is the same as in Embodiment 1 (FIG. 14).
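The ordering of syntax elements described for FIG. 19 can be sketched with a small data structure. This is an illustrative model only: the class name, field names, and example values are hypothetical, and the actual binary syntax and code tables are not specified here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MacroblockRecord:
    prediction_mode: str          # 1901: prediction method plus division pattern
    side_info: Tuple[int, ...]    # 1902: motion vector (inter) or prediction direction (intra)
    dct_partition: str            # 1903: division pattern of the macroblock for DCT
    dct_coefficients: List[int]   # 1904: coefficients of each DCT block


def field_order(records, fixed_dct_size=False):
    """List the elements written for one block group, in stream order.

    When the DCT block size is fixed, the per-macroblock division
    pattern (1903) is not written.
    """
    order = []
    for index, _record in enumerate(records, start=1):
        order.append((index, "prediction_mode"))
        order.append((index, "side_info"))
        if not fixed_dct_size:
            order.append((index, "dct_partition"))
        order.append((index, "dct_coefficients"))
    return order


records = [
    MacroblockRecord("intra_16x16", (2,), "8x8", [12, -3, 0, 1]),
    MacroblockRecord("inter_forward_16x16", (4, -1), "8x8", [5, 0, 0, 0]),
]
variable_layout = field_order(records)
fixed_layout = field_order(records, fixed_dct_size=True)
```

All four elements of one macroblock are written before the next macroblock begins, in contrast to a block group without intra macroblocks, where the stream structure follows Embodiment 1.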

FIG. 20 shows the encoding procedure in this embodiment for a block group containing one or more intra macroblocks. For every macroblock in the block group (2001), prediction is carried out with every available prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, and so on) and every available block shape (macroblock division pattern) (2002), and the prediction difference is calculated.

A suitable combination is then selected from the results of prediction with all prediction methods and block shapes (2003), and the information identifying that combination is encoded. "Suitable" here means that both the prediction difference and the code amount are small, and it is effective to evaluate this with the cost function expressed by Equation 2 or by another formula.

Next, DCT (2004), quantization (2005), and variable-length coding (2006) are performed on the prediction difference of the same macroblock with each available block shape (the division pattern of the macroblock for DCT). The quantized DCT coefficients are then inverse quantized (2007) and inverse DCT transformed (2008) to decode the prediction difference information, Equation 1 is used to select the optimal block shape (the division pattern of the macroblock for DCT) (2009), and that shape information is encoded. The shape information may instead be selected with the method of Reference 1 mentioned above or with some other method.

Next, the decoded image is obtained by adding the decoded prediction difference to the predicted image (2010) and is stored in the reference image memory. When the above processing has been completed for every macroblock in the block group, encoding of the block group ends. The encoding procedure for a block group containing no intra macroblock is the same as in Embodiment 1 (FIG. 15).

FIG. 21 shows the decoding procedure in this embodiment for a block group containing one or more intra macroblocks. For every macroblock in the block group (2101), variable-length decoding is performed (2102), and inverse quantization (2103) and inverse DCT (2104) are applied with the specified block shape to decode the prediction difference, which is stored in memory.

Next, prediction (2105) is performed based on the variable-length decoded prediction method and the block shape for prediction (the division pattern of the macroblock for DCT), and the decoded image is obtained by adding the result to the prediction difference stored in memory (2106). When the above processing has been completed for every macroblock in the block group, decoding of the block group ends. The decoding procedure for a block group containing no intra macroblock is the same as in Embodiment 1 (FIG. 16).

The moving image encoding apparatus and moving image decoding apparatus that perform the processing of Embodiment 2 can be obtained by changing the operation of each component of the Embodiment 1 apparatuses shown in FIGS. 1 and 2 to the operation described above, so a description of the configuration itself is omitted.

This embodiment uses DCT as an example of the frequency transform, but any orthogonal transform used to remove inter-pixel correlation may be used, such as DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loève Transform). The prediction difference itself may also be encoded without applying any frequency transform. Variable-length coding may likewise be omitted. This embodiment may also be used in combination with other methods.

According to the moving image encoding apparatus and encoding method and the moving image decoding apparatus and decoding method of Embodiment 2 described above, even when encoding and decoding a frame in which inter macroblocks and intra macroblocks are mixed, the frequency transform is performed in block units larger than the macroblocks used for prediction. This makes it possible to realize a moving image encoding apparatus and encoding method and a moving image decoding apparatus and decoding method that further reduce the code amount while better maintaining subjective image quality.

The present invention has been described above in detail with reference to the accompanying drawings, but it is not limited to these specific configurations and includes various modifications and equivalent configurations within the spirit of the appended claims.

Representative methods according to aspects of the present invention other than those recited in the claims include the following processing.

(1) An encoded stream is input. Variable-length decoding is performed on the input encoded stream. A prediction difference is generated by performing inverse quantization and an inverse frequency transform on the variable-length decoded data in units of a first block or in units of a second block. Here, when the block group contains an intra-screen prediction block, the inverse quantization and the inverse frequency transform are performed in that block group in units of the second block, and when the block group contains no intra-screen prediction block, the inverse quantization and the inverse frequency transform are performed in that block group in units of the first block. Prediction is performed in units of the second block. A decoded image is generated based on the generated prediction difference and the prediction result. One block in the first block unit is a single block obtained by integrating a block group consisting of a plurality of blocks in the second block unit.

(2) An encoded stream is input. Variable-length decoding is performed on the input encoded stream. A prediction difference is generated by performing inverse quantization and an inverse frequency transform on the variable-length decoded data in units of a first block or in units of a second block. Prediction is performed in units of the second block. A decoded image is generated based on the generated prediction difference and the prediction result. One block in the first block unit is a single block obtained by integrating a block group consisting of a plurality of blocks in the second block unit. When the block group contains an intra-screen prediction block, the inverse quantization and the inverse frequency transform are performed in that block group in units of the second block. When the block group contains no intra-screen prediction block, the inverse quantization and the inverse frequency transform are performed in that block group in units of the first block. In the input encoded stream, the stream structure of a block group that contains an intra-screen prediction block includes prediction mode information in units of the second block and frequency transform coefficients in units of the second block, and the stream structure of a block group that contains no intra-screen prediction block includes prediction mode information in units of the second block and frequency transform coefficients in units of the first block.

(3) An input image is input. A prediction difference is generated by performing prediction on the input image in units of a first block. Quantized data is generated by performing a frequency transform and quantization on the generated prediction difference. Here, when a block group consisting of a plurality of blocks contains an intra-screen prediction block, the frequency transform and the quantization are performed in that block group in units of the first block, and when the block group contains no intra-screen prediction block, the frequency transform and the quantization are performed in that block group in units of a second block whose size is that of a plurality of first-unit blocks integrated together. An encoded stream is generated by variable-length coding the generated quantized data.

(4) An input image is input. A prediction difference is generated by performing prediction on the input image in units of a first block. Quantized data is generated by performing a frequency transform and quantization on the generated prediction difference. An encoded stream is generated by variable-length coding the generated quantized data. When a block group consisting of a plurality of blocks contains an intra-screen prediction block, the frequency transform and the quantization are performed in that block group in units of the first block, and when the block group contains no intra-screen prediction block, the frequency transform and the quantization are performed in that block group in units of a second block whose size is that of a plurality of first-unit blocks integrated together. In the stream structure of the generated encoded stream for the block group consisting of the plurality of blocks, when the block group contains an intra-screen prediction block, the stream structure of that block group includes prediction mode information in units of the first block and frequency transform coefficients in units of the first block, and when the block group contains no intra-screen prediction block, it includes prediction mode information in units of the first block and frequency transform coefficients in units of the second block.

As described above, the present invention is applicable to the encoding and decoding of moving images, and in particular to encoding and decoding performed in units of blocks.

Claims

A processing method in a system including a moving image encoding device and a moving image decoding device, the processing method comprising:
an encoded stream generation step of generating, in the moving image encoding device, an encoded stream by encoding a moving image;
an input step of inputting, in the moving image decoding device, the encoded stream;
a variable length decoding step of performing, in the moving image decoding device, a variable length decoding process on the encoded stream input in the input step;
an inverse quantization / inverse frequency transform step of performing, in the moving image decoding device, an inverse quantization process and an inverse frequency transform process, in units of a first block or in units of a second block, on the variable-length-decoded data obtained in the variable length decoding step, to generate a prediction difference;
a prediction step of performing, in the moving image decoding device, a prediction process in units of the second block; and
a decoded image generation step of generating, in the moving image decoding device, a decoded image based on the prediction difference generated in the inverse quantization / inverse frequency transform step and the result of the prediction process in the prediction step,
wherein one block of the first block unit is a single block into which a block group including a plurality of blocks of the second block unit is merged, and
wherein, in the moving image decoding device, the possible processing states of the inverse quantization / inverse frequency transform step include:
a first processing state in which, when the block group includes an intra prediction block, the inverse quantization process and the inverse frequency transform process are performed on the block group in units of the second block, proceeding in the order of the upper-left block, the upper-right block, the lower-left block, and the lower-right block; and
a second processing state in which, when the block group includes no intra prediction block, the inverse quantization process and the inverse frequency transform process are performed on the block group in units of the first block.
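
For illustration only, the two processing states recited above can be sketched as a scheduling function. Note that in the claim the first block is the merged block and the second block is the smaller prediction-unit block. The function name `inverse_transform_schedule`, its parameters, and the assumption that a block group is a 2x2 arrangement of equal-sized square second-unit blocks (inferred from the four positions named in the claim) are hypothetical.

```python
from typing import List, Tuple


def inverse_transform_schedule(
    x0: int, y0: int, small_size: int, group_has_intra: bool
) -> List[Tuple[int, int, int]]:
    """Return (x, y, size) units in the order they would be processed.

    (x0, y0) is the top-left corner of the block group and small_size is the
    edge length of one second-unit block. Assumes a 2x2 block group.
    """
    s = small_size
    if group_has_intra:
        # First processing state: second-unit blocks in the order
        # upper left, upper right, lower left, lower right.
        return [(x0, y0, s), (x0 + s, y0, s), (x0, y0 + s, s), (x0 + s, y0 + s, s)]
    # Second processing state: one first-unit (merged) block.
    return [(x0, y0, 2 * s)]
```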