JP2007180862A

JP2007180862A - Image encoding device and image coding method

Info

Publication number: JP2007180862A
Application number: JP2005376569A
Authority: JP
Inventors: Yuji Nagaishi; 裕二永石; Tatsuro Shigesato; 達郎重里
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2005-12-27
Filing date: 2005-12-27
Publication date: 2007-07-12

Abstract

PROBLEM TO BE SOLVED: To provide an image encoding device and the like capable of high-speed encoding by reducing the number of cycles required for processing without degrading image quality.
SOLUTION: A quantization unit 230 quantizes the transform coefficients in each orthogonal transform block in accordance with a quantization parameter. An inverse quantization unit 240 restores the orthogonal transform blocks from the transform coefficients obtained by inverse quantization of the orthogonal transform coefficients that were quantized in accordance with the quantization parameter. The inverse quantization of the orthogonal transform coefficients is processed in parallel (for example, 2-way or 4-way parallel processing). An inverse orthogonal transform unit 250 uses the restored orthogonal transform blocks to restore a difference macroblock. When restoring the difference macroblock, the inverse orthogonal transform unit 250, on instruction from an overall control unit 260, preferentially executes the inverse orthogonal transform operation for the pixels in the rightmost column of each 4×4 block constituting the difference macroblock, which are used in intra prediction of the right-adjacent 4×4 block.
COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to image coding technology conforming to standards such as the MPEG series and H.264/AVC, and more particularly to a technique for improving the efficiency of image coding processing that uses intra prediction.

As a compression technique for digital images (including digital moving images; hereinafter simply “images”), H.264/AVC (ITU-T Recommendation H.264 and ISO/IEC 14496-10 Advanced Video Coding), which offers higher coding efficiency, has been standardized. Image coding and decoding methods conforming to H.264/AVC (hereinafter abbreviated as “H.264”) achieve nearly twice the coding efficiency of conventional MPEG-2 (ISO/IEC 13818-2). However, because they require an enormous amount of computation, they are generally applied to picture sizes with a small number of pixels.

FIG. 6 is an example of a block diagram showing the functional configuration of a conventional image encoding device 400 conforming to H.264. As shown in FIG. 6, the image encoding device 400 includes a block dividing unit 10, a difference processing unit 15, an orthogonal transform unit 20, a quantization unit 30, an encoding unit 130, an accumulation buffer 140, an inverse quantization unit 40, an inverse orthogonal transform unit 50, an addition processing unit 55, a deblocking filter 60, frame memories 70 and 71, a motion estimation unit 80, an intra estimation unit 90, a mode selection unit 100, a mode switch 105, an intra prediction unit 110, a motion compensation unit 120, a rate control unit 150, and an overall control unit 460.

In general, each picture of the input image 11 is divided into unit blocks composed of a predetermined number of pixels, and encoding is performed in these units. This unit block is called a macroblock. It consists of one luminance signal (Y signal) block of 16 pixels × 16 lines (hereinafter “16 × 16 pixels”) and two spatially corresponding color difference signal (Cr signal and Cb signal) blocks of 8 pixels × 8 lines (hereinafter “8 × 8 pixels”).

Upon receiving the input image 11, the block dividing unit 10 divides each picture into macroblocks (hereinafter “input macroblocks”) and outputs them to the difference processing unit 15.

The difference processing unit 15 performs difference processing between each pixel of the input macroblock and the spatially corresponding pixel of the macroblock generated by the intra prediction unit 110 or the motion compensation unit 120 (hereinafter “prediction macroblock”), and outputs the result (hereinafter “difference macroblock”) to the orthogonal transform unit 20.

The orthogonal transform unit 20 performs orthogonal transform processing on the difference macroblock acquired from the difference processing unit 15, converting the pixel-domain representation into a frequency-domain representation in units of orthogonal transform blocks. This orthogonal transform processing may be performed by pipeline processing such as 2-way or 4-way parallel processing. The size of the orthogonal transform block is 8 × 8 pixels in the conventional MPEG system, but in H.264 it is 4 pixels × 4 lines (hereinafter “4 × 4 pixels”) or 8 × 8 pixels.

The orthogonal transform unit 20 also forms, for each signal component, an orthogonal transform block that collects only the DC components of the 4 × 4 pixel orthogonal transform blocks, and performs a further orthogonal transform on it. The transform coefficients in each orthogonal transform block are output to the quantization unit 30.

The quantization unit 30 quantizes the transform coefficients in each orthogonal transform block according to the quantization parameter acquired from the rate control unit 150 and outputs them to the encoding unit 130. The quantized transform coefficients (hereinafter “orthogonal transform coefficients”) are supplied to the inverse quantization unit 40 at the same time as they are supplied to the encoding unit 130.

The encoding unit 130 receives the quantized transform coefficients from the quantization unit 30 and encodes them. The encoding unit 130 conforming to H.264 performs encoding using either CAVLC (Context-based Adaptive Variable Length Coding), which uses variable length coding, or CABAC (Context-based Adaptive Binary Arithmetic Coding). The encoded image signal is stored in the accumulation buffer 140.

The inverse quantization unit 40 restores the orthogonal transform block from the transform coefficients obtained by inverse-quantizing the orthogonal transform coefficients, which were quantized according to the quantization parameter acquired from the rate control unit 150.

The inverse orthogonal transform unit 50 restores the difference macroblock using the restored orthogonal transform blocks. The restored difference macroblock is input to the addition processing unit 55 together with the prediction macroblock.

The addition processing unit 55 adds the restored difference macroblock and the prediction macroblock pixel by pixel to generate a reproduced macroblock. This reproduced macroblock is sent to the deblocking filter 60 and is also stored in the frame memory 70 for use in subsequent prediction processing.

The deblocking filter 60 removes block distortion contained in the decoded image composed of reproduced macroblocks. The decoded image from which block distortion has been removed is stored in the frame memory 71.

Next, the prediction methods and prediction types used to generate the prediction macroblock will be described.

There are broadly two prediction methods: intra prediction (also called “intra-picture prediction”) and motion compensation prediction (also called “motion-compensated inter-frame prediction” or “inter prediction”). An outline of intra prediction is given below.

Intra prediction in H.264 is a method of predicting the pixels in a macroblock using already-encoded pixels within the frame of the decoded image, and it has two prediction types. The two prediction types are distinguished by the block size used for intra prediction (4 × 4 pixels or 16 × 16 pixels) and are called “intra 4 × 4 prediction” and “intra 16 × 16 prediction”, respectively.

In intra prediction, prediction is performed in units of these 4 × 4 pixel or 16 × 16 pixel blocks using the pixels of neighboring blocks.

FIG. 7(a) is a diagram showing the arrangement of the encoding target pixels predicted by intra 4 × 4 prediction (the 16 pixels a through p) and the already-encoded adjacent pixels used for intra prediction (the 13 pixels A through M). Here, the encoding target pixels are pixels in the encoding target macroblock output from the block dividing unit 10. The already-encoded adjacent pixels, on the other hand, are pixels of a decoded and restored macroblock or block that have been read from the frame memory 70.

FIGS. 7(b) to 7(j) are diagrams showing the prediction directions in intra 4 × 4 prediction. The target pixels are calculated along the prediction direction from the pixel values of the already-encoded adjacent pixels, using the arithmetic expressions defined in H.264. The prediction direction is indicated by a prediction mode number (prediction mode 0 to prediction mode 8): prediction mode 0 in FIG. 7(b) is vertical (downward), prediction mode 1 in FIG. 7(c) is horizontal, prediction mode 2 in FIG. 7(d) is DC (average), prediction mode 3 in FIG. 7(e) is diagonal down-left, prediction mode 4 in FIG. 7(f) is diagonal down-right, prediction mode 5 in FIG. 7(g) is vertical-right, prediction mode 6 in FIG. 7(h) is horizontal-down, prediction mode 7 in FIG. 7(i) is vertical-left, and prediction mode 8 in FIG. 7(j) is horizontal-up.

Intra 4 × 4 prediction is applied to prediction of the luminance signal. In prediction mode 0, for example, vertical downward prediction is performed using the decoded pixel data adjacent to the upper side of the 4 × 4 pixel block to be predicted, and a predicted image is generated. Prediction mode 0 is effective when the image area to be predicted contains vertical edges or boundaries. Each of the other prediction modes is likewise effective for edges or boundaries in a particular direction, and each performs part of the intra prediction processing using decoded pixels of adjacent blocks.
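The three simplest prediction modes can be illustrated with a minimal sketch. It assumes that the four decoded pixels above the block and the four decoded pixels to its left are both available, and that the DC mode uses the rounded average defined in H.264 for that case. The function name `intra4x4_predict` and its argument layout are hypothetical and are not taken from this publication.

```python
def intra4x4_predict(mode, above, left):
    """Predict a 4x4 luma block from decoded neighbouring pixels.

    above: the 4 decoded pixels directly above the block (A..D in FIG. 7(a))
    left:  the 4 decoded pixels directly to the left (I..L in FIG. 7(a))
    Only modes 0 (vertical), 1 (horizontal) and 2 (DC) are sketched.
    """
    if mode == 0:
        # Vertical: every row repeats the pixels above the block.
        return [list(above) for _ in range(4)]
    if mode == 1:
        # Horizontal: every row is filled with the pixel to its left.
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:
        # DC: rounded average of the 8 neighbouring pixels.
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")


above = [10, 20, 30, 40]
left = [1, 2, 3, 4]
vertical = intra4x4_predict(0, above, left)
horizontal = intra4x4_predict(1, above, left)
dc_block = intra4x4_predict(2, above, left)
```

Modes 1 and 2 read the `left` pixels, which are the rightmost column of the block to the left. This is the dependency that the following paragraphs are concerned with.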

Thus, because intra prediction uses the computation results for neighboring blocks, the inverse orthogonal transform processing for those neighboring blocks must already be complete. If the inverse orthogonal transform processing of a neighboring block has not finished, prediction has to wait for it to finish, which introduces a processing delay.

FIG. 8 is a diagram for explaining the order of intra prediction among the orthogonal transform blocks constituting a macroblock. As shown in FIG. 8, when intra prediction is performed in units of 4 × 4 blocks, the blocks are processed in the order “block 0”, “block 1”, “block 2”, ..., “block 15”. To perform intra prediction of block 1, the neighboring block information needed for block 1 becomes available only after intra prediction, orthogonal transform, quantization, inverse quantization, and inverse orthogonal transform processing have been applied to block 0 (the blocks above block 1 are assumed to have been processed already). In other words, while orthogonal transform, quantization, inverse quantization, and inverse orthogonal transform processing are being performed on block 0, intra prediction of block 1 cannot proceed, and a processing wait occurs. In the case of block 4, by contrast, processing of block 1, which is needed for its intra prediction, is already complete, so block 4 can be processed immediately after block 3.
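The dependency described above can be made concrete with a short sketch. FIG. 8 is not reproduced in this text, so the sketch assumes the usual H.264 scan order of the sixteen 4 × 4 luma blocks within a macroblock, which is consistent with the statement that block 4 depends on block 1. Only the left neighbor is considered, as in the description; the names `BLOCK_ORDER` and `left_neighbour` are hypothetical.

```python
# Assumed (x, y) position, in units of 4x4 blocks, of block 0..15 in a macroblock.
BLOCK_ORDER = [
    (0, 0), (1, 0), (0, 1), (1, 1),
    (2, 0), (3, 0), (2, 1), (3, 1),
    (0, 2), (1, 2), (0, 3), (1, 3),
    (2, 2), (3, 2), (2, 3), (3, 3),
]


def left_neighbour(idx):
    """Return the index of the block to the left inside the macroblock, or None."""
    x, y = BLOCK_ORDER[idx]
    if x == 0:
        return None  # the left neighbour belongs to the previous macroblock
    return BLOCK_ORDER.index((x - 1, y))


# Blocks whose left neighbour is the block processed immediately before them.
# These must wait for that block's reconstruction to finish.
stalling_blocks = [b for b in range(16) if left_neighbour(b) == b - 1]
```

Under this assumed order, every odd-numbered block waits on the block just before it, while block 4 depends on block 1, which finished several blocks earlier.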

Thus, in intra prediction, some blocks can be processed back to back and others cannot. If the processing order is chosen carefully, more blocks can be processed continuously, pipeline processing can run efficiently, and the processing time can be shortened.

Patent Document 1 proposes shortening the pipeline processing time by efficiently changing the processing order of intra prediction in this way.
Patent Document 1: JP 2004-140473 A

Here, consider the relationship between the processing time and the processing load required for H.264. For example, when high-definition video (1920 × 1088 pixels) is compressed and encoded with H.264, the number of macroblocks of 16 × 16 pixels (hereinafter also “MB”) to be processed per picture is
(1920 × 1088) ÷ (16 × 16) = 8,160 MB (1)
and, at a frame rate of 30 frames per second, the number per second is
8,160 × 30 fps = 244,800 MB/sec (2)

If the operating frequency of the CPU is assumed to be 100 MHz, the number of cycles that can be spent on compression encoding of one macroblock is therefore
100 MHz ÷ 244,800 ≈ 408 cycles (3)

Consider now the processing cycles when orthogonal transform processing is performed in units of 4 × 4 pixels. Assuming that four pixels are processed simultaneously, a two-dimensional orthogonal transform simply requires operations on four rows in the horizontal direction and four columns in the vertical direction, or about 4 + 4 = 8 cycles. One macroblock requires 16 orthogonal transform operations, which is 16 × 8 = 128 cycles, and when inverse orthogonal transform processing is included, orthogonal transform processing consumes 128 + 128 = 256 cycles.

When quantization and inverse quantization processing are also taken into account, the 408 cycles above are exceeded, which means the encoding processing cannot be realized as is.
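The arithmetic above can be reproduced with a few lines of code. The following is only a sketch of the calculation in the text; the function name `cycle_budget` and the returned field names are hypothetical, and the result is rounded down to whole cycles as in the text.

```python
def cycle_budget(width, height, fps, clock_hz):
    """Reproduce the per-macroblock cycle budget estimated in the description."""
    mb_per_picture = (width * height) // (16 * 16)   # macroblocks per picture
    mb_per_second = mb_per_picture * fps             # macroblocks per second
    cycles_per_mb = clock_hz // mb_per_second        # whole cycles available per MB

    cycles_per_4x4 = 4 + 4                           # 4 row steps + 4 column steps
    forward = 16 * cycles_per_4x4                    # 16 blocks per macroblock
    forward_and_inverse = 2 * forward                # forward plus inverse transform
    return {
        "mb_per_picture": mb_per_picture,
        "mb_per_second": mb_per_second,
        "cycles_per_mb": cycles_per_mb,
        "transform_cycles": forward_and_inverse,
        "cycles_left": cycles_per_mb - forward_and_inverse,
    }


budget = cycle_budget(1920, 1088, 30, 100_000_000)
```

With these inputs, the transforms alone take 256 of the 408 available cycles, leaving 152 cycles for everything else, including quantization and inverse quantization.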

Thus, compression encoding of a large amount of data such as high-definition images with H.264, which requires an enormous number of operations, is difficult to realize by straightforward processing alone. Processing efficiency must therefore be improved by refining the computation algorithm, increasing the degree of parallelism, or raising the clock frequency.

For example, as shown in FIG. 9, orthogonal transform processing and inverse orthogonal transform processing are generally performed in units of rows or columns (that is, in units of 4 pixels, 8 pixels, and so on), whereas quantization processing is performed in units of pixels. A buffer must therefore be provided between the blocks to absorb the difference in processing time.

FIG. 10(a) shows the expression for the orthogonal transform operation in H.264, and FIG. 10(b) shows the expression for the inverse orthogonal transform operation in H.264.

FIG. 11(a) is a diagram showing the matrix operations in the row direction and the column direction in the orthogonal transform operation, and FIG. 12(a) is a diagram showing how this orthogonal transform operation is realized by butterfly operations.

FIG. 11(b) is a diagram showing the matrix operations in the row direction and the column direction in the inverse orthogonal transform operation, and FIG. 12(b) is a diagram showing how this inverse orthogonal transform operation is realized by butterfly operations.

As shown in FIGS. 11(a) and 11(b), in the conventional butterfly operation the calculations in the row direction and the column direction are always performed in ascending order (that is, i = 0..3 and j = 0..3).
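FIGS. 10 to 12 are not reproduced in this text, so the following sketch assumes the 4 × 4 forward core transform matrix commonly given for H.264 and shows how a row pass and a column pass in ascending order (i = 0..3, then j = 0..3) can be realized with butterfly operations. The names `CF`, `butterfly_1d`, `forward_4x4`, and `forward_4x4_matrix` are hypothetical, and scaling and quantization are omitted.

```python
# Forward 4x4 core transform matrix commonly given for H.264 (an assumption here).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def butterfly_1d(x):
    """One 4-point butterfly: additions, subtractions and doublings only."""
    s0, s1 = x[0] + x[3], x[1] + x[2]
    d0, d1 = x[0] - x[3], x[1] - x[2]
    return [s0 + s1, 2 * d0 + d1, s0 - s1, d0 - 2 * d1]


def forward_4x4(block):
    """Row pass for i = 0..3, then column pass for j = 0..3 (ascending order)."""
    rows = [butterfly_1d(block[i]) for i in range(4)]
    cols = [butterfly_1d([rows[i][j] for i in range(4)]) for j in range(4)]
    # cols[j] holds output column j; transpose back to row-major layout.
    return [[cols[j][i] for j in range(4)] for i in range(4)]


def forward_4x4_matrix(block):
    """Reference result computed directly as CF * X * CF^T."""
    tmp = [[sum(CF[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * CF[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]


block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
coeffs = forward_4x4(block)
```

The butterfly form gives the same coefficients as the matrix product while avoiding multiplications, which is why the cost of the transform is counted per row and per column in the cycle estimates above.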

The intra prediction described in Patent Document 1 is one technique for improving processing efficiency. It makes pipeline processing more efficient by changing the order of intra processing: rather than following a simply fixed processing order, it processes the blocks in an order that shortens the processing wait time.

That method, however, shortens the time required for pipeline processing by optimizing only the intra prediction processing; the orthogonal transform processing and quantization processing outside intra prediction are not considered.

Furthermore, to speed up the pipeline processing, some of the neighboring macroblocks to be referenced are replaced with information from other macroblocks or are treated as unavailable for reference, which results in image quality degradation.

The present invention has been made in view of the above problems, and its object is to provide an image encoding device and the like that, in an encoding method such as H.264, reduce the number of cycles required for processing without degrading image quality and enable high-speed encoding.

To solve the above problems, an image encoding device according to the present invention encodes an encoding target picture for each unit block composed of a predetermined number of pixels, and includes a first process and a second process that can each operate on pixel data in units of rows or columns. The device comprises first processing means for preferentially executing the operations on the row-unit or column-unit pixel data used in the second process, and second processing means for starting the second process as soon as the operations in the first process on the row-unit or column-unit pixel data needed to start the second process have finished.

This makes it possible to provide an image encoding device and the like that, in an encoding method such as H.264, reduce the number of cycles required for processing without degrading image quality and enable high-speed encoding.

In one aspect, the first processing means preferentially executes the part of the inverse orthogonal transform operations, within the inverse orthogonal transform processing, that applies to the pixel data of the rightmost column of a first unit block, which is used in intra prediction processing. The second processing means then starts intra prediction processing for the pixels of a second unit block, immediately to the right of the first unit block, using the pixel data of the rightmost column of the first unit block for which the inverse orthogonal transform operation has finished.

In another aspect, the first processing means executes the column-unit inverse quantization processing in the first unit block with a degree of parallelism N, meaning parallel processing in units of N horizontally adjacent pixels, so that the row-unit pixel data of the first unit block used in the inverse orthogonal transform processing is calculated preferentially. The second processing means starts the inverse orthogonal transform processing for the first unit block using the row-unit pixel data for which inverse quantization in the first unit block has finished.

The second processing means may further preferentially execute the inverse orthogonal transform processing on the pixel data of the rightmost column of the first unit block, which is used in intra prediction processing.

The image encoding device may further include third processing means for executing quantization processing on the column-unit pixel data in the first unit block with the degree of parallelism N.

Furthermore, the present invention can be realized as an image encoding method whose steps correspond to the characteristic constituent means of the above image encoding device, or as a program that causes a personal computer or the like to execute those steps. Needless to say, such a program can be widely distributed via a recording medium such as a DVD or a transmission medium such as the Internet.

As described above, the image encoding device according to the present invention controls the order of operations so that, in the orthogonal transform processing and quantization processing, the minimum data needed by the subsequent processing block (for example, intra coding processing) is computed preferentially. This speeds up the parallel operations and provides the notable effect that encoding can be performed more efficiently.

Hereinafter, an embodiment of the image encoding device according to the present invention will be described with reference to the drawings. Although the present invention is described in the following embodiment with reference to the drawings, the invention is not intended to be limited to them.

FIG. 1 is a block diagram showing the functional configuration of an image encoding device 200 according to the present embodiment. As shown in FIG. 1, the image encoding device 200 includes a block dividing unit 10, a difference processing unit 15, an orthogonal transform unit 20, a quantization unit 230, an encoding unit 130, an accumulation buffer 140, an inverse quantization unit 240, an inverse orthogonal transform unit 250, an addition processing unit 55, a deblocking filter 60, frame memories 70 and 71, a motion estimation unit 80, an intra estimation unit 90, a mode selection unit 100, a mode switch 105, an intra prediction unit 110, a motion compensation unit 120, a rate control unit 150, and an overall control unit 260. Functional components that are the same as those of the conventional image encoding device 400 are given the same reference numerals, and their description is omitted.

The quantization unit 230 quantizes the transform coefficients in each orthogonal transform block according to the quantization parameter acquired from the rate control unit 150 and outputs them to the encoding unit 130. On instruction from the overall control unit 260, this quantization is performed as parallel processing in units of N horizontally adjacent pixels (specifically, pipeline processing such as 2-way or 4-way parallel processing in units of multiple adjacent pixels). N in this case is called the “degree of parallelism”. In this embodiment, the quantization processing is an example of the third process of the present invention, and the quantization unit is an example of the third processing means of the present invention.

The inverse quantization unit 240 restores the orthogonal transform block from the transform coefficients obtained by inverse-quantizing the orthogonal transform coefficients, which were quantized according to the quantization parameter acquired from the rate control unit 150. In relation to the intra prediction processing and the intra prediction unit described later, the inverse orthogonal transform processing in this embodiment is an example of the first process of the present invention, and the inverse orthogonal transform unit is an example of the first processing means of the present invention. In relation to the quantization processing and the quantization unit described above, the inverse quantization processing in this embodiment is an example of the first process of the present invention, and the inverse quantization unit is an example of the first processing means of the present invention.

As described in detail later, the quantization unit 230 and the inverse quantization unit 240 each perform parallel processing that takes the processing in the following stage into account, so the processing time (the number of cycles required for processing) can be shortened compared with conventional, simple parallel processing.
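One way to picture parallel processing in units of N horizontally adjacent pixels is as a schedule over a 4 × 4 block. The sketch below is only an illustration under stated assumptions: it assumes that one group of N coefficients is handled per cycle and that rows are visited from top to bottom. Neither the cycle model nor the names `dequant_schedule` and `row_ready_cycle` come from this publication.

```python
def dequant_schedule(n, size=4):
    """List, cycle by cycle, the (row, col) coefficients handled together.

    n is the degree of parallelism: n horizontally adjacent coefficients
    are assumed to be processed in the same cycle.
    """
    if size % n != 0:
        raise ValueError("n must divide the block width")
    return [
        [(row, col + k) for k in range(n)]
        for row in range(size)
        for col in range(0, size, n)
    ]


def row_ready_cycle(n, row, size=4):
    """Cycle (counted from 1) at which the given row is fully processed."""
    return (row + 1) * (size // n)


two_way = dequant_schedule(2)
four_way = dequant_schedule(4)
```

Under this model, 4-way parallelism completes one full row per cycle, so a row-direction inverse transform could start on row 0 after a single cycle instead of waiting for the whole block.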

The inverse orthogonal transform unit 250 restores the difference macroblock using the restored orthogonal transform blocks. When restoring the difference macroblock, the inverse orthogonal transform unit 250, on instruction from the overall control unit 260, executes first, for each 4 × 4 block constituting the difference macroblock, the inverse orthogonal transform operation for the pixels of the rightmost column, which are the column-unit pixels used in intra prediction processing of the 4 × 4 block immediately to its right. For example, when the column-direction inverse orthogonal transform operation for a 4 × 4 block is executed 4-way parallel in 8 cycles, executing the inverse orthogonal transform operation for the rightmost column first (as described later, by performing the column-direction inverse orthogonal transform in the order j = 3..0) saves 6 cycles compared with the conventional approach, in which the operation for the leftmost column was executed first (that is, in the order j = 0..3).
In relation to the intra prediction processing and the intra prediction unit, the inverse orthogonal transform processing in this embodiment is an example of the first process of the present invention, and the inverse orthogonal transform unit is an example of the first processing means of the present invention. In relation to the inverse quantization processing and the inverse quantization unit described above, the inverse orthogonal transform processing in this embodiment is an example of the second process of the present invention, and the inverse orthogonal transform unit is an example of the second processing means of the present invention.

The intra prediction unit 110 predicts the pixels in a macroblock using already-coded pixels in the frame of the decoded image. For example, it starts the intra prediction processing as soon as the inverse orthogonal transform processing has finished the operations on the row-unit pixel data needed to start the intra prediction processing. In this embodiment, the intra prediction processing is an example of the second process of the present invention, and the intra prediction unit is an example of the second processing means of the present invention.

The overall control unit 260 is, for example, a microcomputer including a RAM and a ROM that stores a control program, and it controls the entire image coding apparatus 200. In particular, the overall control unit 260 determines the degree of parallelism in the quantization unit 230 and the inverse quantization unit 240, and the operation order used by the inverse orthogonal transform unit 250 when restoring the difference macroblock (specifically, it sets the column-direction operation order to j = 3..0).

FIGS. 2(a) to 2(c) are diagrams for explaining the difference in calculation speed depending on the degree of parallelism in the quantization processing of the quantization unit 230 and the inverse quantization processing of the inverse quantization unit 240 according to this embodiment. In general, quantization and inverse quantization are pixel-by-pixel processes, so without parallel processing one cycle is required to process one pixel. When 8×8 pixels are processed 2-parallel, 8 × 8 ÷ 2 = 32 cycles are required; likewise, when 8×8 pixels are processed 4-parallel, 8 × 8 ÷ 4 = 16 cycles are required.

FIG. 2(a) shows an example in which quantization or inverse quantization is performed on an orthogonal transform block of 8×8 pixels with parallelism 1 (that is, without parallel processing).

In this case, expressed in pixel numbers, the quantization processing proceeds in the order "00" → "10" → ... → "70" → "01" → ... → "02" → ... → "07" → ... → "77". The inverse quantization processing then proceeds in the same order as the quantization processing. In the inverse quantization processing, once pixel number "07" has been processed (that is, once inverse quantization has finished for all pixels in the top row, the eight pixels numbered "00" to "07"), the next-stage inverse orthogonal transform processing can be started. In other words, the inverse orthogonal transform processing can be started even though inverse quantization has not yet finished for the seven shaded pixels numbered "17" to "77" in FIG. 2(a).

On the other hand, as shown in FIG. 2(b), when inverse quantization is performed with parallelism 2 (that is, two pixels are processed in parallel at a time), the next-stage inverse orthogonal transform processing can be started once pixel numbers "06 & 07" have been processed (once inverse quantization has finished for all pixels in the top row). In other words, the inverse orthogonal transform processing can be started even though inverse quantization has not yet finished for the 14 shaded pixels numbered "16" to "76" and "17" to "77" in FIG. 2(b).

Similarly, as shown in FIG. 2(c), when inverse quantization is performed with parallelism 4 (that is, four pixels are processed in parallel at a time), the next-stage inverse orthogonal transform processing can be started once pixel numbers "04 & 05 & 06 & 07" have been processed. In other words, the inverse orthogonal transform processing can be started even though inverse quantization has not yet finished for the 28 shaded pixels numbered "14" to "74", "15" to "75", "16" to "76", and "17" to "77" in FIG. 2(c).

Comparing the case described above, in which horizontally adjacent pixels are processed in parallel, with the case in which vertically adjacent pixels are processed in parallel, a difference of 4 cycles arises for parallelism 2 and of 6 cycles for parallelism 4. That is, processing horizontally adjacent pixels in parallel, as in the present invention, allows the next-stage inverse orthogonal transform processing to start 4 cycles earlier for parallelism 2 and 6 cycles earlier for parallelism 4.
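The 4-cycle and 6-cycle figures can be checked with a small scheduling model. The following Python sketch is illustrative only: the function name `first_row_ready_cycle` and the exact sweep order of the vertical grouping (columns taken left to right, groups stacked within one column) are assumptions that are not stated in the text, chosen because they reproduce the stated differences. It counts the cycle in which the last pixel of the top row of an 8×8 block is processed.

```python
def first_row_ready_cycle(parallelism, horizontal, n=8):
    """Return the 1-based cycle in which the top row of an n x n block is complete.

    Pixels are (row, col) pairs; one group of `parallelism` pixels is processed per cycle.
    horizontal=True : a group holds left-right neighbours; groups are swept down
                      the block before moving on to the next group of columns.
    horizontal=False: a group holds up-down neighbours inside one column; columns
                      are taken left to right (assumed order for the comparison case).
    """
    schedule = []
    if horizontal:
        for c0 in range(0, n, parallelism):
            for r in range(n):
                schedule.append([(r, c) for c in range(c0, c0 + parallelism)])
    else:
        for c in range(n):
            for r0 in range(0, n, parallelism):
                schedule.append([(r, c) for r in range(r0, r0 + parallelism)])

    top_row = {(0, c) for c in range(n)}
    done = set()
    for cycle, group in enumerate(schedule, start=1):
        done.update(group)
        if top_row <= done:  # every pixel of row 0 has been processed
            return cycle
    raise ValueError("top row never completed")


for p in (2, 4):
    side_by_side = first_row_ready_cycle(p, horizontal=True)
    stacked = first_row_ready_cycle(p, horizontal=False)
    print(p, side_by_side, stacked, stacked - side_by_side)
# prints "2 25 29 4" and then "4 9 15 6"
```

Under this model the top row is complete after 25 of 32 cycles for parallelism 2 (14 pixels still pending) and after 9 of 16 cycles for parallelism 4 (28 pixels still pending), which agrees with the shaded pixel counts given for FIGS. 2(b) and 2(c).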

As described above, when the quantization processing and inverse quantization processing of this embodiment (for example, on an 8×8 pixel block) are executed with parallelism 2 in 32 cycles, each can be shortened by 4 cycles compared with conventional, simple 2-parallel processing. Likewise, when they are executed with parallelism 4 in 16 cycles, each can be shortened by 6 cycles compared with conventional, simple 4-parallel processing.

FIGS. 3(a) and 3(b) are diagrams showing the inverse orthogonal transform processing in the inverse orthogonal transform unit 250 according to this embodiment. As shown in FIG. 3(a), when restoring the difference macroblock, the inverse orthogonal transform unit 250, as instructed by the overall control unit 260, first executes, in each 4×4 pixel orthogonal transform block of the difference macroblock, the inverse orthogonal transform operation on the rightmost column of pixels, which is used for intra prediction of the orthogonal transform block to its right.

As shown in FIG. 3(a), for the row-direction operations of the inverse orthogonal transform processing in the inverse orthogonal transform unit 250, there is no difference in calculation speed whether the operations are performed row by row from top to bottom (that is, i = 0..3) or from bottom to top (that is, i = 3..0). For the column-direction operations, however, a difference of 3 cycles in calculation speed arises between performing the operations column by column from left to right (that is, j = 0..3) and from right to left (that is, j = 3..0). Referring to FIG. 3(a), in inverse orthogonal transform processing (and likewise in orthogonal transform processing), four pixels in the column direction (or row direction) are computed at once. Conventionally, the operations start from the left end, so the rightmost column is the fourth column to be completed. In the present invention, the operations start from the right end, so the rightmost column is completed as the first column, and the processing time can be reduced by (4 − 1) = 3 cycles. Therefore, the column-direction operations of the inverse orthogonal transform processing are performed column by column from right to left (that is, j = 3..0) (see FIG. 3(b)).
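The reordering is possible because each column of the column-direction pass is computed independently of the others. The Python sketch below illustrates this under stated assumptions: it uses the standard H.264 4×4 inverse core transform butterfly with the final (x + 32) >> 6 rounding, takes one column per cycle, and uses the hypothetical names `inverse_1d` and `inverse_4x4`. The input block is arbitrary test data, and none of this is taken from the patent's own figures.

```python
def inverse_1d(d0, d1, d2, d3):
    """One 4-point inverse butterfly of the H.264 4x4 core transform."""
    e0 = d0 + d2
    e1 = d0 - d2
    e2 = (d1 >> 1) - d3
    e3 = d1 + (d3 >> 1)
    return (e0 + e3, e1 + e2, e1 - e2, e0 - e3)


def inverse_4x4(coeffs, column_order):
    """Row pass, then column pass in the given column order.

    Returns the residual block and, for each column j, the cycle of the
    column pass in which it was finished (one column per cycle is assumed).
    """
    # Row-direction pass: each row is transformed independently.
    tmp = [inverse_1d(*row) for row in coeffs]
    out = [[0] * 4 for _ in range(4)]
    finished_in_cycle = {}
    # Column-direction pass: columns are independent, so any order is valid.
    for cycle, j in enumerate(column_order, start=1):
        col = inverse_1d(tmp[0][j], tmp[1][j], tmp[2][j], tmp[3][j])
        for i in range(4):
            out[i][j] = (col[i] + 32) >> 6
        finished_in_cycle[j] = cycle
    return out, finished_in_cycle


block = [[640, -128, 64, 0],
         [-192, 64, 0, 0],
         [64, 0, 0, 0],
         [0, 0, 0, -64]]

left_to_right, t_lr = inverse_4x4(block, [0, 1, 2, 3])
right_to_left, t_rl = inverse_4x4(block, [3, 2, 1, 0])

# Same residual either way; only the time at which column 3 is available differs.
print(left_to_right == right_to_left, t_lr[3] - t_rl[3])
```

In this model the rightmost column (j = 3) is finished in the fourth cycle of the column pass when going left to right and in the first cycle when going right to left, which matches the (4 − 1) = 3 cycle reduction described above.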

FIG. 4 is a flowchart showing the flow of operation of the image coding apparatus 200 according to this embodiment.

First, the overall control unit 260 sets the column-direction operation order in the inverse orthogonal transform unit 250 to descending order (for example, j = 3..0). The overall control unit 260 also specifies the degree of parallelism in the quantization unit 230 and the inverse quantization unit 240. The degree of parallelism may be defined in advance, or a change may be accepted from the user.

After these settings have been made, the overall control unit 260 executes, for each macroblock of the picture to be coded, orthogonal transform processing and quantization processing (S103), inverse quantization processing and inverse orthogonal transform processing (S104), and intra prediction processing (S105) (S102 to S106). Coding processing is performed on each macroblock for which the orthogonal transform processing and quantization processing (S103) have finished (S107).
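The control flow of FIG. 4 can be sketched as a simple loop. This Python sketch is a hypothetical outline only: the function name `encode_picture`, the log format, and the placement of S107 immediately after S105 inside the loop are assumptions, the pairing of inverse orthogonal transform processing with S104 is inferred from the processing chain, and every step is a stub that merely records that it ran.

```python
def encode_picture(macroblocks, parallelism=2):
    """Record the order of the FIG. 4 steps for a list of macroblocks."""
    log = []
    # Setup by the overall control unit: descending column order, then parallelism.
    column_order = (3, 2, 1, 0)
    log.append(("setup", column_order, parallelism))
    for mb in macroblocks:  # loop S102 to S106
        log.append(("S103 orthogonal transform + quantization", mb))
        log.append(("S104 inverse quantization + inverse orthogonal transform", mb))
        log.append(("S105 intra prediction", mb))
        # Assumed placement: coding of the quantized macroblock follows here.
        log.append(("S107 coding", mb))
    return log


for entry in encode_picture(["MB0", "MB1"]):
    print(entry)
```

In an actual apparatus the steps would overlap in time, as the timing discussion of FIG. 5 indicates; the sequential loop only shows the dependency order.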

FIG. 5 is a time chart showing an outline of the calculation cycles of the image coding apparatus 200 according to this embodiment. FIG. 5 shows the orthogonal transform processing, quantization processing, inverse quantization processing, and part of the inverse orthogonal transform processing for an 8×8 pixel block. In FIG. 5, the orthogonal transform processing is performed in parallel using two processors (TA and TB). Similarly, the inverse orthogonal transform processing is performed in parallel using two processors (ITB and ITA). The orthogonal transform processing and inverse orthogonal transform processing take two cycles for each row (for example, the first row, denoted "h0", consists of pixel numbers "00" to "07") and for each column (for example, the first column, denoted "v0", consists of pixel numbers "00" to "70"); for example, "h0" of the orthogonal transform processing is handled in "cycle 0" and "cycle 1".

The quantization processing (Q1 and Q2) and the inverse quantization processing (IQ1 and IQ2) in FIG. 5 are each executed with parallelism 2 using two processors. In the case of FIG. 5, therefore, at least six processors are used.

As FIG. 5 also shows, the inverse quantization processing of the top row is complete at "cycle 37", so the inverse orthogonal transform processing can be started at "cycle 38".

As described above, the image coding apparatus according to the present invention completes early the part of the processing that affects the following stage, so the calculation speed of the coding processing as a whole can be improved.

The present invention makes it possible, with a simple configuration, to improve processing efficiency for H.264/AVC, which is expected to be adopted as a compression method as digital broadcasting and high-definition movies become widespread, and it will be used in an increasing number of applications.

A block diagram showing the functional configuration of the image coding apparatus 200 according to the present invention.
(a) to (c) are diagrams for explaining the difference in calculation speed depending on the degree of parallelism in the quantization processing and inverse quantization processing of the present invention.
(a) is a diagram showing the order of the inverse orthogonal transform processing in the inverse orthogonal transform unit according to the present invention. (b) is a diagram showing the matrix operation of the inverse orthogonal transform processing in the inverse orthogonal transform unit according to the present invention.
A flowchart showing the flow of operation of the image coding apparatus according to the present invention.
A time chart showing an outline of the calculation cycles of the image coding apparatus according to the present invention.
An example of a block diagram showing the functional configuration of a conventional image coding apparatus compliant with H.264.
(a) is a diagram showing the arrangement of the pixels to be coded that are predicted by intra 4×4 prediction and the already-coded adjacent pixels used for intra prediction. (b) to (j) are diagrams showing the prediction direction of each prediction mode in intra 4×4 prediction.
A diagram for explaining the order of intra prediction in the orthogonal transform blocks constituting a macroblock.
An example of a block diagram showing a functional configuration capable of parallel processing in a conventional image coding apparatus.
(a) is an equation representing the orthogonal transform operation in H.264. (b) is an equation representing the inverse orthogonal transform operation in H.264.
(a) is an equation representing the matrix operation of the orthogonal transform operation in H.264. (b) is an equation representing the matrix operation of the inverse orthogonal transform operation in H.264.
(a) is a diagram showing how the orthogonal transform operation is realized by butterfly operations. (b) is a diagram showing how the inverse orthogonal transform operation is realized by butterfly operations.

Explanation of symbols

10 Block division unit
11 Input image
12 Bit stream
15 Difference processing unit
20 Orthogonal transform unit
30, 230 Quantization unit
40, 240 Inverse quantization unit
50, 250 Inverse orthogonal transform unit
55 Addition processing unit
60 Deblocking filter
70, 71 Frame memory
80 Motion estimation unit
90 Intra estimation unit
100 Mode selection unit
105 Mode switch
110 Intra prediction unit
120 Motion compensation unit
130 Coding unit
140 Accumulation buffer
150 Rate control unit
200, 400 Image coding apparatus
260, 460 Overall control unit

Claims

An image coding apparatus that includes a first process and a second process, each executable on pixel data in units of rows or columns, and that encodes a picture to be coded for each unit block composed of a predetermined number of pixels, the image coding apparatus comprising:
first processing means for preferentially executing, in the first process, the operations on the row-unit or column-unit pixel data used in the second process; and
second processing means for starting the second process as soon as the operations in the first process on the row-unit or column-unit pixel data needed to start the second process are completed.

The image coding apparatus according to claim 1, wherein the first processing means preferentially executes, as part of the inverse orthogonal transform operations in inverse orthogonal transform processing, the operations on the pixel data of the rightmost column of a first unit block that are used in intra prediction processing, and
the second processing means starts intra prediction processing for a second unit block adjacent to the right of the first unit block, using the pixel data of the rightmost column of the first unit block after the inverse orthogonal transform operation.

The image coding apparatus according to claim 1, wherein the first processing means executes column-unit inverse quantization processing on a first unit block with a degree of parallelism N, meaning that N horizontally adjacent pixels are processed in parallel, so that the row-unit pixel data of the first unit block used in inverse orthogonal transform processing is computed preferentially, and
the second processing means starts inverse orthogonal transform processing for the first unit block using the row-unit pixel data of the first unit block for which the inverse quantization processing has been completed.

The image coding apparatus according to claim 3, wherein the second processing means further executes the inverse orthogonal transform processing preferentially on the pixel data of the rightmost column of the first unit block that is used in intra prediction processing.

The image coding apparatus according to claim 3, further comprising third processing means for executing quantization processing on the column-unit pixel data of the first unit block with the degree of parallelism N.

An image coding method that includes a first process and a second process, each executable on pixel data in units of rows or columns, and that encodes a picture to be coded for each unit block composed of a predetermined number of pixels, the image coding method comprising:
a first processing step of preferentially executing, in the first process, the operations on the row-unit or column-unit pixel data used in the second process; and
a second processing step of starting the second process as soon as the operations in the first process on the row-unit or column-unit pixel data needed to start the second process are completed.

A program for an image coding apparatus that includes a first process and a second process, each executable on pixel data in units of rows or columns, and that encodes a picture to be coded for each unit block composed of a predetermined number of pixels, the program causing a computer to execute:
a first processing step of preferentially executing, in the first process, the operations on the row-unit or column-unit pixel data used in the second process; and
a second processing step of starting the second process as soon as the operations in the first process on the row-unit or column-unit pixel data needed to start the second process are completed.

An integrated circuit that includes a first process and a second process, each executable on pixel data in units of rows or columns, and that encodes a picture to be coded for each unit block composed of a predetermined number of pixels, the integrated circuit comprising:
first processing means for preferentially executing, in the first process, the operations on the row-unit or column-unit pixel data used in the second process; and
second processing means for starting the second process as soon as the operations in the first process on the row-unit or column-unit pixel data needed to start the second process are completed.