JP2008219360A - Predictive encoding device

Info

Publication number: JP2008219360A
Application number: JP2007052807A
Authority: JP
Inventors: Koji Tsuchie (土江孝二); Yuji Okuda (奥田裕二)
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2007-03-02
Filing date: 2007-03-02
Publication date: 2008-09-18

Abstract

PROBLEM TO BE SOLVED: To provide a predictive coding apparatus capable of performing moving image compression at high speed.

SOLUTION: The predictive coding apparatus includes an original image transform unit that generates an original image having a first frequency transform pattern by applying a first frequency transform to the original image, and a predicted image transform unit that generates a predicted image having the first frequency transform pattern by applying the first frequency transform to the predicted image.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a predictive coding apparatus.

In recent years, various predictive coding apparatuses have been developed to perform predictive coding, one of the moving image compression techniques; the H.264/AVC encoder is a representative example. In such an apparatus, prediction performance improves when the optimal prediction mode is determined from a sum of absolute errors computed after an orthogonal transform such as the Hadamard transform. However, determining the prediction mode with an orthogonal transform requires a large amount of processing, which makes high-speed processing difficult.

The following documents relate to predictive coding apparatuses.
T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC Video Coding Standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, July 2003
I. G. Richardson, H.264 and MPEG-4 Video Compression, Wiley, 2003
Joint Video Team (JVT) of ITU-T VCEG and ISO/IEC MPEG, Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification, ITU-T Rec. H.264 and ISO/IEC 14496-10 AVC, May 2003

An object of the present invention is to provide a predictive coding apparatus capable of performing moving image compression at high speed.

A predictive coding apparatus according to one aspect of the present invention includes:
a predicted image creation unit that creates a predicted image from a reference image based on a prediction mode selected from a plurality of different prediction modes;
an original image transform unit that generates an original image having a first frequency transform pattern by performing a first frequency transform on the original image;
a predicted image transform unit that generates a predicted image having the first frequency transform pattern by performing the first frequency transform on the predicted image;
a first residual image transform unit that generates a residual image having the first frequency transform pattern by calculating the difference between the original image having the first frequency transform pattern and the predicted image having the first frequency transform pattern;
a mode determination unit that determines, from the plurality of prediction modes, the prediction mode that yields a small code amount, based on the residual image having the first frequency transform pattern;
a second residual image transform unit that generates a residual image having a second frequency transform pattern by performing a second frequency transform on the residual image between the original image and the predicted image;
an image quality adjustment unit that performs image quality adjustment on the residual image having the second frequency transform pattern;
a residual image inverse transform unit that generates a real-space residual image by performing the inverse of the second frequency transform on the quality-adjusted residual image having the second frequency transform pattern; and
a reference image creation unit that creates the reference image using the residual image and the predicted image and outputs the reference image to the predicted image creation unit.

According to the predictive coding apparatus of the present invention, moving image compression can be performed at high speed.

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 shows the configuration of a predictive coding apparatus 10 according to an embodiment of the present invention. The predicted image creation unit 140 creates a predicted image by executing intra-frame predictive coding, which predicts from surrounding pixels, based on the reference image output from the reference image creation unit 150 and the prediction mode output from the mode selection unit 80. It outputs the predicted image to the predicted image Hadamard transform unit 90, the residual image creation unit 100, and the reference image creation unit 150.

The original image Hadamard transform unit 50, serving as the original image transform unit, performs, for example, a two-dimensional Hadamard transform (the first frequency transform) on the original image and outputs the Hadamard-transformed original image (the original image having the first frequency transform pattern) to the residual image Hadamard transform unit 60. For example, let $X$ be an arbitrary 4×4 matrix and $T$ the 4×4 Hadamard transform matrix. The two-dimensional Hadamard transform of $X$ is then expressed as

$$Y = T X T^{\mathsf{T}}$$
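As an illustration of the matrix form above, the following minimal sketch applies a two-dimensional Hadamard transform to one 4×4 block in plain Python. The particular matrix `T` (entries of +1 and -1, sequency order, no normalization) and the sample block are assumptions made for this example; the source does not reproduce the matrix it uses.

```python
# Assumed 4x4 Hadamard matrix: +1/-1 entries, sequency order, unnormalized.
T = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def hadamard_2d(x):
    """Two-dimensional Hadamard transform T * X * T^T of a 4x4 block."""
    return matmul(matmul(T, x), transpose(T))


# Hypothetical 4x4 block of sample values.
block = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coeffs = hadamard_2d(block)
# coeffs[0][0] is the sum of all 16 samples of the block.
```

With this unnormalized matrix all arithmetic stays in integers, which is convenient when the coefficients are only used for a cost comparison.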

The predicted image Hadamard transform unit 90, serving as the predicted image transform unit, performs, for example, a zero-, one-, or two-dimensional Hadamard transform (the first frequency transform) on the predicted image and outputs the Hadamard-transformed predicted image (the predicted image having the first frequency transform pattern) to the residual image Hadamard transform unit 60. The dimension of the Hadamard transform is chosen according to the nature of the predicted image, that is, its effective number of dimensions. For example, a zero-dimensional Hadamard transform is used if the predicted image is zero-dimensional as in FIG. 2(a), a one-dimensional Hadamard transform if it is one-dimensional as in FIG. 2(b), and a two-dimensional Hadamard transform if it is two-dimensional as in FIG. 2(c).
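The choice of transform dimension can be sketched as follows. The example assumes an unnormalized 4×4 Hadamard matrix `T`, and the helper names (`transform_dc_prediction`, `transform_vertical_prediction`, `transform_horizontal_prediction`) are hypothetical. It shows that a flat block needs only one scaled value and that a block constant along one direction needs only a single one-dimensional transform, and it checks both against the full two-dimensional transform.

```python
# Assumed 4x4 Hadamard matrix: +1/-1 entries, unnormalized.
T = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def hadamard_1d(v):
    """One-dimensional 4-point Hadamard transform of a vector."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def hadamard_2d(x):
    """Full two-dimensional transform: rows first, then columns."""
    rows = [hadamard_1d(r) for r in x]
    cols = [hadamard_1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def transform_dc_prediction(dc):
    """Zero-dimensional case: a flat block has a single nonzero coefficient."""
    out = [[0] * 4 for _ in range(4)]
    out[0][0] = 16 * dc
    return out


def transform_vertical_prediction(top):
    """One-dimensional case: every row equals `top`, so only row 0 is nonzero."""
    out = [[0] * 4 for _ in range(4)]
    out[0] = [4 * c for c in hadamard_1d(top)]
    return out


def transform_horizontal_prediction(left):
    """One-dimensional case: row i is filled with left[i], so only column 0 is nonzero."""
    out = [[0] * 4 for _ in range(4)]
    for i, c in enumerate(hadamard_1d(left)):
        out[i][0] = 4 * c
    return out


# Hypothetical neighbouring pixel values.
top = [1, 2, 3, 4]
left = [1, 5, 9, 13]
dc = 8

dc_ok = transform_dc_prediction(dc) == hadamard_2d([[dc] * 4 for _ in range(4)])
vertical_ok = transform_vertical_prediction(top) == hadamard_2d([list(top) for _ in range(4)])
horizontal_ok = transform_horizontal_prediction(left) == hadamard_2d([[v] * 4 for v in left])
```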

Note that the first frequency transform need not be the Hadamard transform; various other orthogonal transforms may be used as the first orthogonal transform.

The residual image Hadamard transform unit 60, serving as the first residual image transform unit, calculates the difference between the Hadamard-transformed original image and the Hadamard-transformed predicted image. It thereby derives the Hadamard transform pattern of the residual image (the residual image having the first frequency transform pattern) and outputs it to the SATD calculation unit 70.
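Taking the difference in the transform domain works because the Hadamard transform is linear: subtracting the two transformed images gives the same coefficients as transforming the pixel-domain residual. The sketch below checks this on hypothetical sample values, assuming an unnormalized 4×4 Hadamard matrix.

```python
# Assumed 4x4 Hadamard matrix: +1/-1 entries, unnormalized.
T = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def hadamard_1d(v):
    """One-dimensional 4-point Hadamard transform of a vector."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def hadamard_2d(x):
    """Two-dimensional transform: rows first, then columns."""
    rows = [hadamard_1d(r) for r in x]
    cols = [hadamard_1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def subtract(a, b):
    """Element-wise difference a - b of two 4x4 blocks."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical original block and a flat predicted block.
original = [
    [52, 55, 61, 66],
    [70, 61, 64, 73],
    [63, 59, 55, 90],
    [67, 61, 68, 104],
]
predicted = [[60] * 4 for _ in range(4)]

# Difference taken after transforming each image separately.
transform_domain_residual = subtract(hadamard_2d(original), hadamard_2d(predicted))
# Difference taken in the pixel domain and transformed afterwards.
transformed_pixel_residual = hadamard_2d(subtract(original, predicted))
```

Since the original image is the same for every candidate mode, its transform only has to be computed once and can be reused for each comparison.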

The SATD (Sum of Absolute Transformed Differences) calculation unit 70 calculates the sum of the absolute values of all elements of the Hadamard-transformed data. If this sum, that is, the SATD, is small, the amount of code generated by predictive coding is expected to be small, so the SATD is used as an index for determining the prediction mode.
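The SATD computation itself is a plain sum of absolute values. A minimal sketch follows; the function name `satd` and the coefficient values are made up for illustration.

```python
def satd(transformed_residual):
    """Sum of the absolute values of all transform-domain residual coefficients."""
    return sum(abs(c) for row in transformed_residual for c in row)


# Hypothetical Hadamard-domain residual for one 4x4 block.
example_coeffs = [
    [12, -4, 0, 0],
    [-6,  2, 0, 0],
    [0,   0, 0, 0],
    [0,   0, 0, 0],
]
example_cost = satd(example_coeffs)
```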

Based on the SATD, the mode selection unit 80 selects, from the plurality of different prediction modes, the optimal prediction mode that yields a small code amount and outputs it to the predicted image creation unit 140. The SATD calculation unit 70 and the mode selection unit 80 together form the mode determination unit 40.
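The selection step can be sketched as picking the mode with the smallest SATD between the transformed original and each transformed prediction. The function names and all coefficient values below are hypothetical; the source does not specify how ties are broken, and this sketch simply keeps the first minimum.

```python
def satd_of_difference(a, b):
    """SATD of the transform-domain residual a - b."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def select_mode(original_coeffs, predicted_coeffs_by_mode):
    """Return the mode with the smallest SATD and the cost of every mode."""
    costs = {mode: satd_of_difference(original_coeffs, coeffs)
             for mode, coeffs in predicted_coeffs_by_mode.items()}
    best_mode = min(costs, key=costs.get)
    return best_mode, costs


ZERO_ROW = [0, 0, 0, 0]

# Hypothetical Hadamard-domain coefficients of the original block.
original_coeffs = [
    [136, -16, 0, -8],
    [-64,   0, 0,  0],
    [0,     0, 0,  0],
    [-32,   0, 0,  0],
]

# Hypothetical Hadamard-domain coefficients of three candidate predictions.
predicted_coeffs_by_mode = {
    "vertical": [[40, -16, 0, -8], ZERO_ROW, ZERO_ROW, ZERO_ROW],
    "horizontal": [[112, 0, 0, 0], [-64, 0, 0, 0], ZERO_ROW, [-32, 0, 0, 0]],
    "dc": [[128, 0, 0, 0], ZERO_ROW, ZERO_ROW, ZERO_ROW],
}

best_mode, costs = select_mode(original_coeffs, predicted_coeffs_by_mode)
```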

The residual image creation unit 100 calculates the residual between the original image and the predicted image and outputs the resulting residual image to the residual image DCT unit 110.

The residual image DCT unit 110 performs, for example, a DCT (Discrete Cosine Transform) (the second frequency transform) on this residual image and outputs the DCT-transformed residual image (the residual image having the second frequency transform pattern) to the image quality adjustment unit 120. For example, let $X$ be an arbitrary 4×4 matrix and $C$ the DCT transform matrix. The two-dimensional DCT of $X$ is then expressed as

$$Y = C X C^{\mathsf{T}}$$
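For illustration, the following sketch builds a 4×4 DCT matrix and applies the two-dimensional transform and its inverse. It assumes the orthonormal floating-point DCT-II; the source does not reproduce its matrix, and an H.264/AVC encoder would normally use an integer approximation instead.

```python
import math

N = 4

# Assumed orthonormal DCT-II matrix: C[k][n] = a(k) * cos((2n + 1) * k * pi / (2N)).
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    """Multiply two N x N matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(r) for r in zip(*m)]


def dct_2d(x):
    """Two-dimensional DCT, C * X * C^T."""
    return matmul(matmul(C, x), transpose(C))


def idct_2d(y):
    """Inverse two-dimensional DCT, C^T * Y * C (C is orthonormal)."""
    return matmul(matmul(transpose(C), y), C)


# Hypothetical residual block.
residual = [
    [-8, -5,  1,  6],
    [10,  1,  4, 13],
    [3,  -1, -5, 30],
    [7,   1,  8, 44],
]
coeffs = dct_2d(residual)
restored = idct_2d(coeffs)
```

Because the assumed matrix is orthonormal, the inverse transform is just the transposed product, so `restored` matches `residual` up to floating-point error.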

Note that the second frequency transform need not be the DCT; various other orthogonal transforms that use a transform matrix different from that of the first orthogonal transform may be used as the second orthogonal transform. The residual image creation unit 100 and the residual image DCT unit 110 together form the second residual image transform unit.

The image quality adjustment unit 120 adjusts the image quality and the amount of coded data of the residual image that has been converted into a frequency-space pattern. The residual image IDCT unit 130, serving as the residual image inverse transform unit, applies an IDCT (Inverse Discrete Cosine Transform) to inversely transform the quality-adjusted frequency-space pattern into a real-space residual image. The reference image creation unit 150 creates a reference image from the inversely transformed real-space residual image and the predicted image of the optimal prediction mode, and outputs it to the predicted image creation unit 140.
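The reconstruction path described here can be sketched as follows. The source does not say how the image quality adjustment is performed; this example assumes it is a uniform quantization with a hypothetical step size, and it assumes an orthonormal floating-point DCT. All function names and sample values are illustrative.

```python
import math

N = 4

# Assumed orthonormal DCT-II matrix.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    """Multiply two N x N matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(r) for r in zip(*m)]


def dct_2d(x):
    """Two-dimensional DCT, C * X * C^T."""
    return matmul(matmul(C, x), transpose(C))


def idct_2d(y):
    """Inverse two-dimensional DCT, C^T * Y * C."""
    return matmul(matmul(transpose(C), y), C)


def quantize(coeffs, step):
    """Assumed quality adjustment: uniform quantization to integer levels."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Scale the integer levels back to coefficient values."""
    return [[v * step for v in row] for row in levels]


def reconstruct(predicted, residual, step):
    """Transform, quantize, inverse-transform, then add the prediction back."""
    levels = quantize(dct_2d(residual), step)
    decoded_residual = idct_2d(dequantize(levels, step))
    return [[round(p + r) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, decoded_residual)]


# Hypothetical flat prediction and flat residual.
predicted = [[100] * 4 for _ in range(4)]
residual = [[10] * 4 for _ in range(4)]
reference = reconstruct(predicted, residual, step=4)
```

Building the reference from the decoded residual rather than from the original image keeps the encoder's reference identical to what a decoder could reconstruct.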

Thus, according to the present embodiment, when the optimal mode is determined from a plurality of different prediction modes, the transform of the original image and the transform of the predicted image are performed separately, which reduces the amount of processing required for optimal mode determination.

FIG. 3 shows the configuration of a predictive coding apparatus 200 as a comparative example for the present embodiment. Elements identical to those shown in FIG. 1 are given the same reference numerals, and their description is omitted. In the predictive coding apparatus 200 of this comparative example, the residual image creation unit 100 calculates the residual between the original image and the predicted image and outputs the resulting residual image to the residual image Hadamard transform unit 210. The residual image Hadamard transform unit 210 performs a two-dimensional Hadamard transform on the residual image.

In the predictive coding apparatus 200 of this comparative example, the two-dimensional Hadamard transform must be performed multiple times to select the prediction mode. As a result, the amount of processing is large and high-speed processing is difficult.

In the comparative example, a residual image is created and a two-dimensional Hadamard transform is performed for each prediction mode. In the present embodiment, by contrast, the Hadamard transform of the original image and the Hadamard transform of the predicted image are performed separately. This reduces the amount of computation needed to obtain the Hadamard transform pattern of the residual image.

That is, if the predicted image is zero-dimensional or one-dimensional, the Hadamard transform can be replaced with a zero-dimensional or one-dimensional transform, respectively, which reduces the amount of processing. Here, a zero-dimensional predicted image is, for example, the DC predicted image in H.264/AVC, and a one-dimensional predicted image is a vertical or horizontal predicted image. For reference, FIG. 4 shows examples of intra prediction modes in H.264/AVC.
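The reduction in dimension can be made explicit with a short derivation. It is a sketch under the assumption that $T$ is an unnormalized 4×4 Hadamard matrix whose first row is all ones, so that every other row sums to zero. The symbols $\mathbf{1}$, $e_0$, $d$, $p$, and $q$ are introduced here for illustration and do not appear in the source.

```latex
% 1 = (1,1,1,1)^T, e_0 = (1,0,0,0)^T, and T 1 = 4 e_0 for the assumed T.
\begin{align*}
\text{DC prediction (all pixels equal to } d\text{):}\quad
  P &= d\,\mathbf{1}\mathbf{1}^{\mathsf{T}}, &
  T P T^{\mathsf{T}} &= d\,(T\mathbf{1})(T\mathbf{1})^{\mathsf{T}} = 16\,d\,e_0 e_0^{\mathsf{T}} \\
\text{Vertical prediction (every row equal to } p^{\mathsf{T}}\text{):}\quad
  P &= \mathbf{1}\,p^{\mathsf{T}}, &
  T P T^{\mathsf{T}} &= (T\mathbf{1})(T p)^{\mathsf{T}} = 4\,e_0\,(T p)^{\mathsf{T}} \\
\text{Horizontal prediction (row } i \text{ filled with } q_i\text{):}\quad
  P &= q\,\mathbf{1}^{\mathsf{T}}, &
  T P T^{\mathsf{T}} &= (T q)(T\mathbf{1})^{\mathsf{T}} = 4\,(T q)\,e_0^{\mathsf{T}}
\end{align*}
```

Under these assumptions the DC case needs only a single scaled value, and the vertical and horizontal cases need only one 4-point transform each, placed in the first row or the first column of the coefficient block.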

In FIGS. 4(a) to 4(c), FIG. 4(a) shows the Intra 4×4 vertical prediction mode, FIG. 4(b) the Intra 4×4 horizontal prediction mode, and FIG. 4(c) the Intra 4×4 DC prediction mode. These prediction modes can thus be frequency-transformed in zero or one dimension.
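A minimal sketch of how the three predicted blocks might be formed from neighbouring pixels is given below. The function names are hypothetical, and the DC rounding rule (sum of the eight neighbours plus 4, shifted right by 3) is the usual H.264/AVC rule when both the top and left neighbours are available; the source itself does not spell it out.

```python
def predict_vertical(top):
    """Vertical mode: copy the four pixels above the block down every row."""
    return [list(top) for _ in range(4)]


def predict_horizontal(left):
    """Horizontal mode: copy each left-neighbour pixel across its row."""
    return [[v] * 4 for v in left]


def predict_dc(top, left):
    """DC mode: fill the block with the rounded mean of the eight neighbours."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


# Hypothetical neighbouring pixel values.
top = [1, 2, 3, 4]
left = [5, 6, 7, 8]

vertical_block = predict_vertical(top)
horizontal_block = predict_horizontal(left)
dc_block = predict_dc(top, left)
```

The vertical and horizontal blocks vary along one direction only, and the DC block does not vary at all, which is what makes the lower-dimensional transforms applicable.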

The embodiment described above is an example and does not limit the present invention. For example, the number of prediction modes need not be three; it suffices that the prediction mode is selected from at least two prediction modes.

FIG. 1 is a block diagram showing the configuration of a predictive coding apparatus according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram showing the dimensions of predicted images.
FIG. 3 is a block diagram showing the configuration of a predictive coding apparatus according to a comparative example.
FIG. 4 is an explanatory diagram showing Intra 4x4 prediction modes in H.264/AVC.

Explanation of symbols

10 Predictive coding apparatus
50 Original image Hadamard transform unit
60 Residual image Hadamard transform unit
70 SATD calculation unit
80 Mode selection unit
90 Predicted image Hadamard transform unit
100 Residual image creation unit
110 Residual image DCT unit
120 Image quality adjustment unit
130 Residual image IDCT unit
140 Predicted image creation unit
150 Reference image creation unit

Claims

1. A predictive coding apparatus comprising:
a predicted image creation unit that creates a predicted image from a reference image based on a prediction mode selected from a plurality of different prediction modes;
an original image transform unit that generates an original image having a first frequency transform pattern by performing a first frequency transform on the original image;
a predicted image transform unit that generates a predicted image having the first frequency transform pattern by performing the first frequency transform on the predicted image;
a first residual image transform unit that generates a residual image having the first frequency transform pattern by calculating the difference between the original image having the first frequency transform pattern and the predicted image having the first frequency transform pattern;
a mode determination unit that determines, from the plurality of prediction modes, the prediction mode that yields a small code amount, based on the residual image having the first frequency transform pattern;
a second residual image transform unit that generates a residual image having a second frequency transform pattern by performing a second frequency transform on the residual image between the original image and the predicted image;
an image quality adjustment unit that performs image quality adjustment on the residual image having the second frequency transform pattern;
a residual image inverse transform unit that generates a real-space residual image by performing the inverse of the second frequency transform on the quality-adjusted residual image having the second frequency transform pattern; and
a reference image creation unit that creates the reference image using the residual image and the predicted image and outputs the reference image to the predicted image creation unit.

2. The predictive coding apparatus according to claim 1, wherein the first frequency transform is a first orthogonal transform, and the second frequency transform is a second orthogonal transform using a transform matrix different from that of the first orthogonal transform.

3. The predictive coding apparatus according to claim 1, wherein the first frequency transform is a Hadamard transform, and the second frequency transform is a DCT.

4. The predictive coding apparatus according to claim 1, wherein the prediction mode is a prediction mode that can be frequency-transformed in zero or one dimension.

5. The predictive coding apparatus according to claim 1, wherein the prediction mode is any one of a vertical prediction mode, a horizontal prediction mode, and a DC prediction mode.