WO2012096184A1 - Image encoding apparatus, image encoding method, program, image decoding apparatus, image decoding method, and program - Google Patents


Info

Publication number
WO2012096184A1
Authority
WO
WIPO (PCT)
Prior art keywords
transform
pixels
block
unit
size
Application number
PCT/JP2012/000154
Other languages
French (fr)
Inventor
Mitsuru Maeda
Original Assignee
Canon Kabushiki Kaisha
Application filed by Canon Kabushiki Kaisha filed Critical Canon Kabushiki Kaisha
Publication of WO2012096184A1 publication Critical patent/WO2012096184A1/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 - Tree coding, e.g. quad-tree coding
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 - Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/134 - Adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 - Data rate or code amount at the encoder output
    • H04N19/147 - Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 - Adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 - Adaptive coding where the coding unit is an image region, e.g. an object
    • H04N19/176 - Adaptive coding where the coding unit is an image region, the region being a block, e.g. a macroblock
    • H04N19/46 - Embedding additional information in the video signal during the compression process

Definitions

  • the present invention relates to an image encoding apparatus, an image encoding method, a program, an image decoding apparatus, an image decoding method, and a program, and more particularly to a coding method for coding/decoding an image by dividing it into blocks of a plurality of sizes.
  • As a moving image compression and recording method, H.264/Moving Picture Experts Group (MPEG)-4 AVC (Advanced Video Coding) (hereinafter, H.264) is known.
  • The H.264 coding method is widely used, for example, in one-segment terrestrial digital broadcasting.
  • In H.264, a plurality of intra predictions are prepared, and an integer transform is carried out in units of 4 x 4 pixels.
  • Macroblocks of 16 x 16 pixels are used as coding units, and their inside is divided into 4 x 4 blocks that are used as units for the transform.
  • A part of intra coding employs a transform in units of 8 x 8 blocks.
  • ISO/IEC 14496-10:2004, Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding; ITU-T H.264, Advanced video coding for generic audiovisual services
  • an image coding apparatus includes a setting unit configured to set a coding control parameter, a first transform unit configured to frequency-transform pixel data of a block having a first size and to execute the transform by substituting predetermined values for the transform coefficients except a part of them, a second transform unit configured to frequency-transform pixel data of a block having a second size smaller than the first size, a block size determination unit configured to limit use of the first transform unit based on the coding control parameter, and a coding unit configured to control and code one of the outputs of the first transform unit and the second transform unit based on a determined block size.
  • the present invention enables improvement of image quality by limiting, to appropriately sized blocks, the substitution of transform coefficients with fixed values.
  • a block size can be suitably limited by determining the block size in association with quantization parameters which greatly contribute to generation of coefficients. Further, by limiting the block size, coded data indicating block division can be reduced.
  • Fig. 1 is a block diagram illustrating a configuration of an image encoding apparatus according to a first exemplary embodiment.
  • Fig. 2 is a block diagram illustrating a configuration of a transform block size determination unit according to the first exemplary embodiment.
  • Fig. 3A illustrates an arrangement of values of coefficients of transform blocks.
  • Fig. 3B illustrates an arrangement of values of coefficients of transform blocks.
  • Fig. 4 illustrates an example of an arrangement of the transform blocks.
  • Fig. 5 illustrates an example of an arrangement of the transform blocks.
  • Fig. 6 is a block diagram illustrating in detail an orthogonal transform unit according to the first exemplary embodiment.
  • Fig. 7 is a block diagram illustrating another configuration of the image coding apparatus according to the first exemplary embodiment.
  • Fig. 8 is a block diagram illustrating another configuration of the transform block size determination unit according to the first exemplary embodiment.
  • Fig. 9 is a flowchart illustrating an operation of image encoding according to the first exemplary embodiment.
  • Fig. 10 is a block diagram illustrating another configuration of the image encoding apparatus according to the first exemplary embodiment.
  • Fig. 11 is a flowchart illustrating an operation of division flag encoding according to the first exemplary embodiment.
  • Fig. 12 is a block diagram illustrating a configuration of an image encoding apparatus according to a second exemplary embodiment.
  • Fig. 13 is a flowchart illustrating an operation of division flag encoding according to the second exemplary embodiment.
  • Fig. 14 is a block diagram illustrating another configuration of the image encoding apparatus according to the second exemplary embodiment.
  • Fig. 15 is a block diagram illustrating another configuration of the image encoding apparatus according to the second exemplary embodiment.
  • Fig. 16 is a block diagram illustrating a configuration of an image decoding apparatus according to a third exemplary embodiment.
  • Fig. 17 is a block diagram illustrating in detail an inverse orthogonal transform unit according to the third exemplary embodiment.
  • Fig. 18 is a flowchart illustrating an operation of image decoding according to the third exemplary embodiment.
  • Fig. 19 is a flowchart illustrating an operation of division flag decoding according to the third exemplary embodiment.
  • Fig. 20 is a block diagram illustrating a configuration of an image decoding apparatus according to a fourth exemplary embodiment.
  • Fig. 21 is a flowchart illustrating an operation of division flag decoding according to the fourth exemplary embodiment.
  • Fig. 22 is a block diagram illustrating another configuration of the image decoding apparatus according to the fourth exemplary embodiment.
  • Fig. 23 is a block diagram illustrating a hardware configuration example of a computer applicable to an image coding apparatus and an image decoding apparatus according to the present invention.
  • Fig. 1 is a block diagram illustrating a configuration of an image encoding apparatus according to a first exemplary embodiment of the present invention.
  • the image encoding apparatus includes a terminal 1 that receives image data, a buffer 2 that temporarily stores the image data, and a basic block dividing unit 3 that divides an image into a plurality of basic blocks.
  • the present embodiment is directed to a basic block size of 32 pixels x 32 pixels.
  • the image coding apparatus includes a transform block size/division determination unit 4 that determines the sizes of the transform blocks into which the basic block is divided for orthogonal transform.
  • the basic block is divided into transform blocks of 32 pixels x 32 pixels (no block division), 16 pixels x 16 pixels, 8 pixels x 8 pixels, or 4 pixels x 4 pixels.
  • a division is not limited to this method.
  • a prediction unit 8 executes intra prediction or motion compensation prediction for each divided transform block to calculate a prediction error.
  • An orthogonal transform unit 9 executes orthogonal transform for the prediction error to acquire a spatial frequency coefficient.
  • the present embodiment is described by taking a discrete cosine transform (DCT) as an example of the frequency transform.
  • However, the transform is not limited to this; a transform such as the Hadamard transform can also be employed.
  • the orthogonal transform unit 9 has a function of executing orthogonal transform corresponding to the plurality of transform block sizes produced by the transform block size/division determination unit 4. However, for orthogonal transform of 32 pixels x 32 pixels and 16 pixels x 16 pixels, the orthogonal transform unit 9 calculates only the low-frequency coefficients of an 8 x 8 pixel portion and outputs 0 for the other coefficients.
  • Figs. 3A and 3B illustrate an arrangement of values of coefficients of the transform blocks.
  • Fig. 3A illustrates coefficients after orthogonal transform of 32 pixels x 32 pixels
  • Fig 3B illustrates coefficients after orthogonal transform of 16 pixels x 16 pixels.
  • black pixels hold coefficients acquired by the orthogonal transform, while the values at the other (white) pixels are 0.
  • the arrangement of the held coefficients is not limited to this.
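  • As an illustrative sketch only (Python, using scipy's 2-D DCT as a stand-in for the apparatus's transform), the behavior described above can be expressed as keeping the low-frequency 8 x 8 corner of a large-block transform and substituting 0 elsewhere; the function name and block contents are hypothetical.

```python
import numpy as np
from scipy.fft import dctn

def transform_large_block(block, kept=8):
    """Forward 2-D DCT of a 16x16 or 32x32 prediction-error block,
    retaining only the top-left `kept` x `kept` (low-frequency) coefficients."""
    coeffs = dctn(block, norm='ortho')   # full 2-D DCT of the block
    mask = np.zeros_like(coeffs)
    mask[:kept, :kept] = 1.0             # low-frequency corner only (Figs. 3A/3B)
    return coeffs * mask                 # all other coefficients become 0

block32 = np.random.randn(32, 32)        # placeholder prediction error
print(np.count_nonzero(transform_large_block(block32)))  # at most 64 non-zero values
```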
  • a quantization unit 10 quantizes the acquired coefficients according to quantization parameters (coding control parameters) to acquire quantized coefficients.
  • An entropy coding unit 11 codes the quantization parameters of the various headers and blocks, and entropy-codes the quantized coefficients.
  • the entropy coding method is not specifically limited; for example, arithmetic coding or Huffman coding is used.
  • a terminal 5 receives an instruction from a user (not illustrated) to control image quality.
  • a quantization parameter setting unit 6 generates an initial value of a quantization parameter set at the quantization unit 10 according to the instruction.
  • a maximum transform block size determination unit 7 determines, by referring to the initial value of the quantization parameter, a maximum transform block size during division by the transform block dividing unit.
  • An inverse quantization unit 15 restores the coefficients from the quantized coefficients generated by the quantization unit 10.
  • An inverse orthogonal transform unit 16 carries out inverse transform of the orthogonal transform unit 9 for the restored coefficients to restore the prediction error.
  • a prediction error addition unit 17 adds together the prediction error and a prediction result.
  • a frame memory 18 stores the restored image.
  • a motion compensation unit 19 executes motion compensation, in which the content of the frame memory 18 is compared with the image data of an input transform block to detect motion.
  • a motion vector coding unit 20 codes a motion vector that is a result of the motion detection to generate a motion vector code.
  • a block division information encoding unit 21 encodes information regarding the state where the transform block size/division determination unit 4 has divided the basic block into the transform blocks to generate a block division code.
  • a multiplexing unit 12 integrates and lines up outputs from the entropy coding unit 11, the motion vector encoding unit 20, and the block division information encoding unit 21, to output a bit stream.
  • a terminal 13 outputs the bit stream to the outside.
  • a rate control unit 14 controls the quantization parameters based on a code amount of the bit stream.
  • the quantization parameter setting unit 6 sets a quantization parameter of a small value when the user desires high image quality, and sets a quantization parameter of a large value when the user desires bit reduction even with low image quality.
  • a high image-quality mode for setting a quantization parameter QP to a small value QS and a high-efficiency mode for setting the quantization parameter QP to a large value QL can be set.
  • a setting method is not limited to this.
  • the quantization parameter set by the quantization parameter setting unit 6 is input to the maximum transform block size determination unit 7 and the rate control unit 14.
  • the maximum transform block size determination unit 7 sets a transform block size up to 32 pixels x 32 pixels when a value of the input quantization parameter QP is QL.
  • the maximum transform block size determination unit 7 sets a transform block size up to 8 pixels x 8 pixels when a value of the input quantization parameter QP is QS.
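  • A minimal sketch of the size-selection rule just described follows; the concrete numeric values used for QS and QL are assumptions made only for illustration.

```python
QS, QL = 12, 36   # example quantization parameters (assumed values)

def max_transform_block_size(qp, threshold=QL):
    """Largest transform block size permitted for this quantization parameter.
    Coarse quantization (QP >= QL): large blocks with zeroed high-frequency
    coefficients are acceptable. Fine quantization (e.g. QP == QS): restrict to
    8x8 so that no coefficient is forcibly substituted with 0."""
    return 32 if qp >= threshold else 8

assert max_transform_block_size(QL) == 32
assert max_transform_block_size(QS) == 8
```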
  • Image data are sequentially input from the outside via the terminal 1 to be stored in the buffer 2.
  • the basic block dividing unit 3 divides the image into blocks of 32 pixels x 32 pixels in input order and inputs them to the transform block size/division determination unit 4.
  • the transform block size/division determination unit 4 calculates the coding cost of each candidate block size and selects the division combination with the smallest cost.
  • Fig. 2 is a block diagram illustrating an example of a detailed configuration of the transform block size/division determination unit 4.
  • a terminal 31 receives pixel data of a basic block from the basic block dividing unit 3, and a buffer 32 stores the input pixel data.
  • a 32 x 32 transform unit 33 carries out DCT for 32 pixels x 32 pixels stored in the buffer 32.
  • a 16 x 16 transform unit 34 divides the 32 pixels x 32 pixels stored in the buffer 32 into four blocks, and carries out DCT for each block of 16 pixels x 16 pixels.
  • an 8 x 8 transform unit 35 divides the 32 pixels x 32 pixels stored in the buffer 32 into sixteen blocks, and carries out DCT for each block of 8 pixels x 8 pixels.
  • a 4 x 4 transform unit 36 divides the 32 pixels x 32 pixels stored in the buffer 32 into sixty-four blocks, and carries out DCT for each block of 4 pixels x 4 pixels.
  • Cost calculation units 37 to 40 receive input orthogonal transform results to calculate coding costs.
  • As a calculation method, there is, for example, a method of calculating block costs by a rate-distortion cost using a Lagrange multiplier.
  • However, the calculation method is not limited to this.
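  • For illustration only, a simple rate-distortion cost of the form D + lambda * R that such a cost calculation unit could use is sketched below; the distortion and rate estimates are simplified placeholders, not the formulas of the present apparatus.

```python
import numpy as np

def block_cost(coeffs, qstep, lam):
    """Lagrangian block cost: distortion plus lambda times an estimated rate."""
    quantized = np.round(coeffs / qstep)
    recon = quantized * qstep
    distortion = float(np.sum((coeffs - recon) ** 2))   # squared error D
    rate = int(np.count_nonzero(quantized))              # crude rate proxy R
    return distortion + lam * rate                       # cost = D + lambda * R
```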
  • Cost buffers 41 to 44 store the calculated costs by transform blocks corresponding to the cost calculation units 37 to 40.
  • a transform block division determination unit 45 reads the costs of the transform blocks stored in the cost buffers to select the combination of smallest cost within the basic block.
  • a terminal 46 outputs, as division information of the transform blocks, information regarding the selected combination output from the transform block division determination unit 45 to the prediction unit 8, the orthogonal transform unit 9, the motion compensation unit 19, and the block division information coding unit 21 illustrated in Fig. 1.
  • a transform block dividing unit 47 receives the transform block division information, and divides and outputs transform blocks from the buffer 32 according to the transform block division information.
  • a terminal 48 sequentially outputs pixel data of the divided transform blocks
  • a terminal 49 receives a maximum size of the transform block from the maximum transform block size determination unit 7.
  • 32 is input when the value of the quantization parameter is QL
  • 8 is input when the value is QS.
  • a controller 50 controls, according to the input from the terminal 49, whether to operate the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42.
  • a control signal is input, in addition to these units, to the transform block division determination unit 45.
  • the maximum size of the transform block is input through the terminal 49 to the controller 50.
  • the controller 50 sets, when the maximum size of the transform block is 32, the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42 in operational states.
  • When the maximum size of the transform block is 8, the controller 50 sets the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42 in stopped states. In this case, the power supplied to the corresponding hardware can be cut off.
  • the pixel data (32 pixels x 32 pixels) of the basic block is input from the terminal 31 to be stored in the buffer 32. It is presumed that the controller 50 operates the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42. In this case, the 32 x 32 transform unit 33, the 16 x 16 transform unit 34, the 8 x 8 transform unit 35, and the 4 x 4 transform unit 36 read the data of the buffer 32 by transform block sizes. DCT is carried out by each size, and its transform coefficients are input to the cost calculation units 37 to 40 connected to each unit.
  • the cost calculation units 37 to 40 calculate coding costs from the coefficients to store them in the connected cost buffers.
  • the cost buffer 41 stores cost of one block when the transform block is 32 pixels x 32 pixels.
  • the cost buffer 42 stores, when the transform block is 16 pixels x 16 pixels, cost of four blocks according to positions thereof.
  • the cost buffer 43 stores, when the transform block is 8 pixels x 8 pixels, cost of sixteen blocks according to positions thereof.
  • the cost buffer 44 stores, when the transform block is 4 pixels x 4 pixels, cost of sixty four blocks according to positions thereof.
  • When the maximum size of the transform block is 8, the controller 50 stops the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42.
  • the 8 x 8 transform unit 35 and the 4 x 4 transform unit 36 read the data of the buffer 32 by transform block sizes. DCT is carried out by each size, and its transform coefficients are input to the cost calculation units 39 to 40 connected to each unit.
  • the cost calculation units 39 to 40 calculate coding costs from the coefficients to store them in the cost buffers connected to each unit.
  • the cost buffer 43 stores, when the transform block is 8 pixels x 8 pixels, cost of sixteen blocks according to positions thereof.
  • the cost buffer 44 stores, when the transform block is 4 pixels x 4 pixels, cost of sixty four blocks according to positions thereof.
  • the transform block division determination unit 45 compares the costs to select a combination of division of smallest costs.
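  • One way the smallest-cost combination could be selected from the per-block costs held in the cost buffers is a bottom-up quadtree comparison, sketched below; the cost-table layout and the tree representation (an integer for an undivided block, a four-element list for a divided one) are assumptions made only for illustration.

```python
def best_division(costs, x=0, y=0, size=32, min_size=4):
    """Return (total_cost, tree). `costs[size][(x, y)]` is assumed to hold the
    coding cost of the block of that size at offset (x, y) inside the basic
    block, for all four sizes. `tree` is either `size` (not divided) or a list
    of four subtrees in upper-left, upper-right, lower-left, lower-right order."""
    own = costs[size][(x, y)]
    if size == min_size:
        return own, size                       # 4x4 blocks cannot be divided
    half = size // 2
    children = [best_division(costs, x + dx, y + dy, half, min_size)
                for dy in (0, half) for dx in (0, half)]
    split_cost = sum(c for c, _ in children)
    if split_cost < own:                       # dividing is cheaper: keep subtrees
        return split_cost, [t for _, t in children]
    return own, size                           # keeping the block whole is cheaper
```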
  • Fig. 4 illustrates an example of transform block division when the quantization parameter setting unit 6 selects the high-efficiency mode, and a maximum size of the transform block becomes 32.
  • the block of 32 pixels x 32 pixels is divided into transform blocks A and D of 16 pixels x 16 pixels, transform blocks BA, BB, BC, BD, CB, and CD of 8 pixels x 8 pixels, and transform blocks CAA, CAB, CAC, CAD, CCA, CCB, CCC, and CCD of 4 pixels x 4 pixels.
  • the state and the size of the block division are expressed as follows: the block of 32 pixels x 32 pixels is divided into 16 pixel x 16 pixel blocks; block A of 16 pixels x 16 pixels is not divided; block B of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks; blocks BA, BB, BC, and BD of 8 pixels x 8 pixels are not divided; block C of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks; block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks; block CB of 8 pixels x 8 pixels is not divided; block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks; block CD of 8 pixels x 8 pixels is not divided; and block D of 16 pixels x 16 pixels is not divided. These statements become the division information of the transform blocks.
  • Fig. 5 illustrates an example of transform block division when the quantization parameter setting unit 6 selects the high image-quality mode, and a maximum size of the transform block becomes 8.
  • the block is divided into transform blocks AA, AB, AC, AD, BA, BB, BC, BD, CB, CD, DA, DB, DC, and DD of 8 pixels x 8 pixels, and transform blocks CAA, CAB, CAC, CAD, CCA, CCB, CCC, and CCD of 4 pixels x 4 pixels.
  • the transform blocks of 16 pixels x 16 pixels illustrated in Fig. 4 are instead divided into transform blocks of 8 pixels x 8 pixels. In this case, the division information of the transform blocks illustrated in Fig. 5 is generated.
  • the division information of the transform blocks is output from the transform block division determination unit 45 through the terminal 46 to the prediction unit 8, the orthogonal transform unit 9, the motion compensation unit 19, and the block division information coding unit 21 illustrated in Fig. 1, and simultaneously input to the transform block dividing unit 47.
  • the transform block dividing unit 47 sequentially reads, based on the division information of the transform blocks, data of the transform blocks from the buffer 32 to output it from the terminal 48 to the prediction unit 8 and the motion compensation unit 19 illustrated in Fig. 1.
  • the division information of the transform blocks and the pixel data of the transform blocks generated at the transform block size/division determination unit 4 are input to the prediction unit 8 and the motion compensation unit 19.
  • the motion compensation unit 19 predicts motion of the transform blocks from the frame memory 18 based on the data of the pixel blocks, and generates a motion vector and a reference pixel block to input them to the prediction unit 8.
  • the motion vector is coded by the motion vector coding unit 20, and input as motion vector coded data to the multiplexing unit 12.
  • the prediction unit 8 executes intra prediction for the input pixel data, compares it with a prediction result of motion compensation input from the motion compensation unit 19, and selects the smallest prediction error to determine a prediction mode.
  • the prediction unit 8 acquires a difference between the selected prediction result and the input pixel block to generate prediction error data.
  • the prediction unit 8 inputs the generated prediction error data to the orthogonal transform unit 9.
  • the prediction result used in this case is input to the prediction error addition unit 17.
  • the prediction mode is output to the multiplexing unit 12.
  • the orthogonal transform unit 9 carries out orthogonal transform for the prediction error according to the division information of the transform blocks.
  • Fig. 6 illustrates the orthogonal transform unit 9 in detail.
  • a terminal 60 receives the division information of the transform blocks from the transform block size/division determination unit 4.
  • a terminal 61 receives the maximum size of the transform block from the maximum transform block size determination unit 7.
  • a controller 62 controls, as in the case of the controller 50 illustrated in Fig. 2, whether to operate a 32 x 32 transform unit 65 and a 16 x 16 transform unit 66 according to an input from the terminal 61.
  • a terminal 63 receives the prediction error data from the prediction unit 8.
  • Selectors 64 and 69 select an input destination or an output destination based on the transform block size of the division information of the transform blocks input from the terminal 60.
  • the 32 x 32 transform unit 65 carries out DCT for the input transform blocks of 32 pixels x 32 pixels to acquire coefficient data of 32 x 32.
  • the 16 x 16 transform unit 66 carries out DCT for the input transform blocks of 16 pixels x 16 pixels to acquire coefficient data of 16 x 16.
  • an 8 x 8 transform unit 67 carries out DCT for the input transform blocks of 8 pixels x 8 pixels to acquire coefficient data of 8 x 8.
  • a 4 x 4 transform unit 68 carries out DCT for the input transform blocks of 4 pixels x 4 pixels to acquire coefficient data of 4 x 4.
  • a terminal 70 outputs the coefficient data acquired by each transform. An operation of orthogonal transform in this configuration is described.
  • the maximum size of the transform block is input through the terminal 61 to the controller 62.
  • the controller 62 sets, when the maximum size of the transform block is 32, the 32 x 32 transform unit 65 and the 16 x 16 transform unit 66 in operational states.
  • When the maximum size of the transform block is 8, the controller 62 sets the 32 x 32 transform unit 65 and the 16 x 16 transform unit 66 in stopped states. In this case, the power supplied to the corresponding hardware can be cut off.
  • the division information of the transform blocks is input through the terminal 60 to the selectors 64 and 69.
  • When the transform block is 32 pixels x 32 pixels, the selector 64 sets the 32 x 32 transform unit 65 as its output destination while the selector 69 sets the 32 x 32 transform unit 65 as its input destination.
  • When the transform block is 16 pixels x 16 pixels, the selector 64 sets the 16 x 16 transform unit 66 as its output destination while the selector 69 sets the 16 x 16 transform unit 66 as its input destination.
  • When the transform block is 8 pixels x 8 pixels, the selector 64 sets the 8 x 8 transform unit 67 as its output destination while the selector 69 sets the 8 x 8 transform unit 67 as its input destination.
  • When the transform block is 4 pixels x 4 pixels, the selector 64 sets the 4 x 4 transform unit 68 as its output destination while the selector 69 sets the 4 x 4 transform unit 68 as its input destination.
  • the selector 64 passes the pixel data of the transform blocks input from the terminal 63 to the transform unit of the appropriate size.
  • the selector 69 outputs coefficient data of the transform block acquired by transform at each transform unit from the terminal 70.
  • the 32 x 32 transform unit 65 calculates a value of an 8 x 8 portion illustrated in Fig. 3A, and other white portions are transformed to be 0.
  • the 16 x 16 transform unit 66 calculates a value of an 8 x 8 portion illustrated in Fig. 3B, and other white portions are transformed to be 0.
  • the coefficient data generated by the orthogonal transform unit 9 is input to the quantization unit 10.
  • the quantization unit 10 carries out quantization by using a quantization parameter determined by the rate control unit 14, and generates quantization coefficient data to output it to the entropy coding unit 11.
  • the rate control unit 14 also outputs the quantization parameter used in the processing to the entropy coding unit 11.
  • the entropy coding unit 11 codes the input quantization parameter and the quantization coefficients.
  • the coding method is not specifically limited. For example, for the quantization parameter, the difference from the quantization parameter used for the previously quantized transform block is subjected to Huffman coding.
  • the quantization coefficients are arranged one-dimensionally, as in the H.264 coding method, and coded by variable length coding.
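  • A hedged sketch of the two steps just mentioned follows: difference coding of the quantization parameter and a one-dimensional (zigzag) arrangement of the quantized coefficients. The diagonal scan order and the omitted variable-length codes are conventional choices assumed here, not taken from the text.

```python
def zigzag_order(n):
    """Diagonal (zigzag) scan order for an n x n coefficient block."""
    idx = [(i, j) for i in range(n) for j in range(n)]
    return sorted(idx, key=lambda p: (p[0] + p[1],
                                      p[0] if (p[0] + p[1]) % 2 else p[1]))

def code_block(qp, prev_qp, quantized):
    """Return the QP difference and the one-dimensional coefficient scan;
    run-length / variable-length coding of `scan` would follow here."""
    delta_qp = qp - prev_qp
    scan = [quantized[i][j] for i, j in zigzag_order(len(quantized))]
    return delta_qp, scan
```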
  • the quantization coefficient data acquired by the quantization unit 10 is input to the inverse quantization unit 15. Inverse quantization is executed by using the quantization parameter to restore the coefficient data.
  • the restored coefficient data is input to the inverse orthogonal transform unit 16 to restore the prediction error data.
  • the restored prediction error data is input to the prediction error addition unit 17.
  • the prediction value of the intra prediction generated by the prediction unit 8 or the prediction value acquired by the motion compensation is input, added to the prediction error, and the result is stored in the corresponding area of the frame memory 18.
  • the division information of the transform blocks determined by the transform block size/division determination unit 4 is coded by the block division information coding unit 21. It is presumed that the transform block division information is sequentially coded in the order of upper left, upper right, lower left, and lower right. The division information is coded by allocating 1 when a block is divided, and 0 when it is not divided.
  • the basic block has been divided, and hence a basic block division flag indicating the division of the basic block is set to 1.
  • When the basic block is not divided, the basic block division flag is set to 0.
  • information indicating whether the transform block A of 16 pixels x 16 pixels has been divided is indicated by an A block division flag. In other words, this value is 1 when divided, and 0 when not divided.
  • Similarly, for each transform block of 8 pixels x 8 pixels, a block division flag indicating whether the block is divided into blocks of 4 pixels x 4 pixels is set.
  • Block division illustrated in Fig. 4 is coded as follows:
    1: block of 32 pixels x 32 pixels is divided into 16 pixel x 16 pixel blocks
    0: block A of 16 pixels x 16 pixels is not divided
    1: block B of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
    0: block BA of 8 pixels x 8 pixels is not divided
    0: block BB of 8 pixels x 8 pixels is not divided
    0: block BC of 8 pixels x 8 pixels is not divided
    0: block BD of 8 pixels x 8 pixels is not divided
    1: block C of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
    1: block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
    0: block CB of 8 pixels x 8 pixels is not divided
    1: block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
    0: block CD of 8 pixels x 8 pixels is not divided
    0: block D of 16 pixels x 16 pixels is not divided
  • An order of the codes is not limited to this. For example, the division flag of the 32 pixels x 32 pixels block can be coded first, followed by the 4 bits of division flags of the 16 pixels x 16 pixels blocks. There are no transform blocks of a size below 4 pixels x 4 pixels, and thus no division flag is generated for that size.
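  • The division-flag coding described above can be sketched as a depth-first walk that emits 1 for a divided block and 0 otherwise; the tree representation matches the assumed one used in the division-selection sketch earlier and is likewise an illustrative assumption.

```python
def encode_division_flags(tree, size=32, min_size=4, bits=None):
    """Emit division flags for one basic block, depth-first, upper-left to
    lower-right. `tree` is an int (undivided block size) or a list of four
    subtrees; 4x4 blocks never carry a flag."""
    if bits is None:
        bits = []
    if size == min_size:
        return bits                 # 4x4 blocks are never divided: no flag
    if isinstance(tree, list):
        bits.append(1)              # divided
        for child in tree:
            encode_division_flags(child, size // 2, min_size, bits)
    else:
        bits.append(0)              # not divided
    return bits

# The division of Fig. 4 could be written, for example, as:
fig4 = [16, [8, 8, 8, 8], [[4, 4, 4, 4], 8, [4, 4, 4, 4], 8], 16]
print(encode_division_flags(fig4))  # -> [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0]
```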
  • step S051 determination is made as to whether the basic block has been divided into transform blocks of 16 pixels x 16 pixels.
  • the processing proceeds to step S053 when divided (YES in step S051), and to step S052 when not divided (NO in step S051).
  • step S052 since the basic block has not been divided into transform blocks of 16 pixels x 16 pixels, a code "0" is allocated to the division flag. Then, since there is no division of transform blocks, the coding is ended.
  • step S053 since the basic block has been divided into transform blocks of 16 pixels x 16 pixels, a code "1" is allocated to the division flag. Then, processing in the blocks of 16 pixels x 16 pixels is carried out.
  • step S054 determination is made as to whether the transform blocks of 16 pixels x 16 pixels targeted in an order from a left upper side to right lower side have been divided into transform blocks of 8 pixels x 8 pixels.
  • the processing proceeds to step S056 when divided (YES in step S054), and to step S055 when not divided (NO in step S054).
  • step S055 since the transform block of 16 pixels x 16 pixels has not been divided into transform blocks of 8 pixels x 8 pixels, a code "0" is allocated to the division flag. Then, the processing proceeds to step S061.
  • step S056 since the transform block of 16 pixels x 16 pixels has been divided into transform blocks of 8 pixels x 8 pixels, a code "1" is allocated to the division flag, and the processing proceeds to step S057.
  • step S057 determination is made as to whether the transform blocks of 8 pixels x 8 pixels targeted in an order from a left upper side to right lower side have been divided into transform blocks of 4 pixels x 4 pixels.
  • the processing proceeds to step S059 when the transform blocks have been divided (YES in step S057), and to step S058 when the transform blocks have not been divided (NO in step S057).
  • step S058 since the transform block of 8 pixels x 8 pixels has not been divided into transform blocks of 4 pixels x 4 pixels, a code "0" is allocated to the division flag. Then, the processing proceeds to step S060.
  • step S059 since the transform block of 8 pixels x 8 pixels has been divided into transform blocks of 4 pixels x 4 pixels, a code "1" is allocated to the division flag, and the processing proceeds to step S060.
  • step S060 determination is made as to whether the division flag coding has ended for all the transform blocks of 8 pixels x 8 pixels in the processed transform block of 16 pixels x 16 pixels.
  • the processing proceeds to step S061 when ended (YES in step S060).
  • When the encoding has not ended (NO in step S060), the processing returns to step S057 to process the next transform block of 8 pixels x 8 pixels.
  • step S061 determination is made as to whether the division flag coding has ended for all the transform blocks of 16 pixels x 16 pixels in the basic block of 32 pixels x 32 pixels.
  • the division flag encoding is ended when the encoding is ended (YES in step S061).
  • When the encoding has not ended (NO in step S061), the processing returns to step S054 to process the next transform block of 16 pixels x 16 pixels.
  • the coded data of the quantization parameter and the quantization coefficients from the entropy coding unit 11, the prediction mode from the prediction unit 8, the motion vector from the motion vector coding unit 20, and the division flag coded by the block division information coding unit 21 are input to the multiplexing unit 12.
  • the multiplexing unit 12 lines up the coded data in a predetermined order to output it as a bit stream from the terminal 13.
  • the bit stream is input to the rate control unit 14, and a new quantization parameter is generated by referring to the quantization parameter set by the quantization parameter setting unit 6. For example, a permitted range of change from the quantization parameter set by the quantization parameter setting unit 6 is defined, and control is performed within that range.
  • Fig. 9 is a flowchart illustrating overall coding processing.
  • step S001 to start the processing, a quantization parameter determining image quality is set.
  • step S002 a maximum size of the transform block to be used is determined by using the set quantization parameter. For example, the set quantization parameter QP is compared with a threshold value Th. When QP is larger than Th, the maximum size of the transform block is set to the size of the basic block. When QP is equal to or less than Th, the maximum size of the transform block is limited to a block size in which no coefficient is substituted with 0.
  • step S003 image data of one frame is read.
  • step S004 the read one-frame image data is divided into basic blocks (in the present embodiment, blocks of 32 pixels x 32 pixels).
  • step S005 one basic block is read, the coding costs of block division at each size are calculated, and a division method of the transform blocks is determined so that the total cost becomes minimum.
  • step S006 information of the determined block division is coded.
  • step S007 the divided transform blocks are sequentially read, and intra prediction or motion compensation is carried out to calculate a prediction value.
  • step S008 the calculated prediction value of the transform blocks is compared with an input to calculate a prediction error.
  • the prediction error is subjected to orthogonal transform of each size.
  • quantization is executed with a quantization parameter to code a quantization result.
  • step S009 inverse quantization and inverse orthogonal transform are performed on the quantization result to reproduce the prediction error.
  • the prediction error and the prediction value calculated in step S007 are added together to generate a reproduced image, and the reproduced image is held.
  • step S010 determination is made as to whether the processing from steps S007 to S009 has ended for all the transform blocks in the basic block.
  • the processing proceeds to step S011 when the processing has ended (YES in step S010).
  • the processing executes steps S007 to S009 when the processing has not ended (NO in step S010), targeting a next transform block.
  • step S011 the coded quantization parameter, the coded data of the quantization coefficients, and other coded data are output.
  • step S012 determination is made as to whether the processing from steps S005 to S011 has ended for all the basic blocks in the frame. The processing proceeds to step S013 when the processing has ended (YES in step S012). When the processing has not ended (NO in step S012), the processing returns to step S005, targeting a next basic block.
  • step S013 determination is made as to whether there is a next frame of image data.
  • the processing returns to step S003 when there is a next frame (YES in step S013).
  • the processing ends when there is no next frame (NO in step S013).
  • a maximum transform block size is determined during division to limit use of a large transform block where a coefficient is forcibly set to 0.
  • image quality deterioration can be efficiently suppressed.
  • control is performed based on a quantization parameter which controls the generation of quantization coefficients, so that good image quality control can be performed.
  • a decoding side can reproduce an image by executing normal decoding.
  • a basic block size and a block dividing method are not limited to those described above.
  • As a basic block size, a larger size such as 64 pixels x 64 pixels can be employed.
  • As a dividing method, for example, rectangular blocks of 8 pixels x 4 pixels can also be employed.
  • the maximum transform block size is determined based on the quantization parameter set by the quantization parameter setting unit 6, and the quantization parameter is set not to greatly fluctuate by the rate control unit 14.
  • the present invention is not limited to this.
  • a new quantization parameter determined by a rate control unit 22 can be input to a maximum transform block size determination unit 23, and a maximum transform block size can be determined by a frame or a slice based on the input quantization parameter.
  • the configuration of the transform block size/division determination unit 4 is as illustrated in Fig. 2.
  • a configuration is not limited to this.
  • a configuration illustrated in Fig. 8 can be employed. This is a method of generating a prediction error at each block size before the cost calculation for each block size is made.
  • a terminal 80 receives image data of a reference frame necessary for motion prediction from the frame memory 18 illustrated in Fig. 1.
  • a 32 x 32 prediction unit 81 carries out, by 32 pixels x 32 pixels, prediction by intra prediction or motion compensation.
  • a 32 x 32 motion compensation unit 85 searches for a motion vector in units of 32 pixels x 32 pixels to calculate prediction data.
  • a 16 x 16 prediction unit 82 and a 16 x 16 motion compensation unit 86 are arranged before the 16 x 16 transform unit 34.
  • a 8 x 8 prediction unit 83 and a 8 x 8 motion compensation unit 87 are arranged before the 8 x 8 transform unit 35.
  • a 4 x 4 prediction unit 84 and a 4 x 4 motion compensation unit 88 are arranged before the 4 x 4 transform unit 36. Further, by supplying the generated prediction error to subsequent processing, the prediction unit 8 and the motion compensation unit 19 illustrated in Fig. 1 can be omitted.
  • step S002 has been described by taking, as an example, comparison of the set quantization parameter QP with the threshold value Th. However, not only the quantization parameter QP but also quantization matrices QM [0 to i] (i is given by the transform block size) can be taken into consideration.
  • a quantization matrix setting unit 110 sets the quantization matrices QM [0 to i].
  • a coefficient Qsum is calculated by using the set quantization matrices QM [0 to i] and the quantization parameter QP. The coefficient Qsum is compared with a threshold value Thm.
  • When the coefficient Qsum is larger than the threshold value Thm, the maximum size of the transform block is set to the size of the basic block.
  • Otherwise, the maximum size of the transform block is limited to a block size in which no coefficient is substituted with 0 after the transform.
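  • The text does not define how Qsum is computed from QM [0 to i] and QP; the sketch below uses one plausible definition (QP scaled by the average matrix entry) purely to illustrate the comparison against Thm, and every name in it is an assumption.

```python
def compute_qsum(qp, matrices):
    """matrices: list of 2-D quantization matrices QM[0..i], one per block size.
    One possible reading: QP scaled by the mean quantization-matrix entry."""
    avg = sum(sum(sum(row) for row in m) / (len(m) * len(m[0])) for m in matrices)
    return qp * avg / len(matrices)

def max_transform_block_size_qm(qp, matrices, thm, basic=32, limited=8):
    """Choose the maximum transform block size from Qsum and a threshold Thm."""
    return basic if compute_qsum(qp, matrices) > thm else limited
```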
  • the present embodiment has been described by taking the example of coding the moving image using a difference between frames.
  • the coding is not limited to this. It is apparent that the present invention can be used for coding a still image using only the intra prediction.
  • the first exemplary embodiment is directed to limiting the sizes of the transform blocks by using the maximum transform block size.
  • a second exemplary embodiment is directed to controlling a coding method of block division by using a maximum transform block size.
  • a maximum transform block size coding unit 107 codes a maximum transform block size determined by a maximum transform block size determination unit 7.
  • a block division information coding unit 121 is different from the block division information coding unit 21 of the first exemplary embodiment in that a maximum block size is input from the maximum transform block size determination unit 7.
  • a multiplexing unit 112 is also included.
  • Referring to Fig. 13, the processing of the block division information coding unit 121 is described.
  • steps similar to those illustrated in Fig. 11 are denoted by similar reference numerals, and description thereof is omitted.
  • step S151 determination is made as to whether a maximum size of a transform block input from the maximum transform block size determination unit 7 is a transform block of 16 pixels x 16 pixels.
  • the processing proceeds to step S054 when the maximum size of the input transform block is a transform block of 16 pixels x 16 pixels (YES in step S151).
  • the processing proceeds to step S152 when not (NO in step S151).
  • step S152 determination is made as to whether the maximum size of the transform block is a transform block of 8 pixels x 8 pixels.
  • the processing proceeds to step S057 when the maximum size of the transform block is a transform block of 8 pixels x 8 pixels (YES in step S152).
  • the processing proceeds to step S051 when not (NO in step S152).
  • Step S051 is executed when the maximum transform block size is 32 pixels x 32 pixels.
  • processing from steps S051 to S061 is carried out to code a division flag.
  • When the maximum transform block size is 16 pixels x 16 pixels, the processing from steps S054 to S061 is executed to code the division flag.
  • the division flag information of divisions with respect to the transform block of 16 pixels x 16 pixels (16 pixels x 16 pixels, 8 pixels x 8 pixels, and 4 pixels x 4 pixels) is coded.
  • When the maximum transform block size is 8 pixels x 8 pixels, the processing from steps S057 to S061 is executed to code the division flag.
  • the division flag information of divisions with respect to the transform block of 8 pixels x 8 pixels (8 pixels x 8 pixels and 4 pixels x 4 pixels) is coded.
  • the block division illustrated in Fig. 5 is coded as follows when the maximum transform block size is 8 pixels x 8 pixels:
    0: block AA of 8 pixels x 8 pixels is not divided
    0: block AB of 8 pixels x 8 pixels is not divided
    0: block AC of 8 pixels x 8 pixels is not divided
    0: block AD of 8 pixels x 8 pixels is not divided
    0: block BA of 8 pixels x 8 pixels is not divided
    0: block BB of 8 pixels x 8 pixels is not divided
    0: block BC of 8 pixels x 8 pixels is not divided
    0: block BD of 8 pixels x 8 pixels is not divided
    1: block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
    0: block CB of 8 pixels x 8 pixels is not divided
    1: block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
    0: block CD of 8 pixels x 8 pixels is not divided
    0: block DA of 8 pixels x 8 pixels is not divided
    0: block DB of 8 pixels x 8 pixels is not divided
    0: block DC of 8 pixels x 8 pixels is not divided
    0: block DD of 8 pixels x 8 pixels is not divided
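  • A sketch of this second-embodiment idea follows: flags are emitted only from the maximum transform block size downwards, so that for Fig. 5 only the sixteen 8 x 8 flags above appear. It reuses encode_division_flags from the earlier sketch and the same assumed tree representation.

```python
def encode_flags_with_max_size(tree, basic=32, max_size=8, min_size=4):
    """Division above `max_size` is implied and carries no flag; below it,
    flags are coded exactly as in encode_division_flags."""
    if basic == max_size:
        return encode_division_flags(tree, basic, min_size)
    bits = []
    for child in tree:            # the split down to max_size is implied
        bits += encode_flags_with_max_size(child, basic // 2, max_size, min_size)
    return bits

# Fig. 5 with a maximum transform block size of 8: sixteen flags, CA and CC set.
fig5 = [[8, 8, 8, 8], [8, 8, 8, 8], [[4, 4, 4, 4], 8, [4, 4, 4, 4], 8], [8, 8, 8, 8]]
print(encode_flags_with_max_size(fig5))
# -> [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
```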
  • the maximum transform block size is coded by the maximum transform block size coding unit 107.
  • the multiplexing unit 112 lines up coded data in a predetermined order, and the coded data is output as a bit stream from the terminal 13.
  • codes indicating block divisions are determined according to the maximum transform block size during division.
  • image quality deterioration can be efficiently suppressed with a small coding amount.
  • the maximum block size is determined by a quantization parameter. However, a determination method is not limited to this. Maximum block sizes can be individually set.
  • the first and second exemplary embodiments are directed to the example where the maximum transform block size is either the basic block size or the size at which no coefficient is forcibly set to 0. However, the maximum transform block size is not limited to these; an intermediate value such as 16 pixels x 16 pixels can also be set, as described above.
  • In the above description, a maximum transform block size is always set. However, the configuration is not limited to this; a configuration illustrated in Fig. 14 can also be employed.
  • a terminal 151 receives an instruction whether to execute coding by using a maximum transform block size set by a user (not illustrated).
  • a maximum transform block size determination unit 158 is different from the maximum transform block size determination unit 7 of the first exemplary embodiment in that determination of a maximum transform block size is controlled according to an instruction from the terminal 151.
  • a maximum transform block size coding unit 157 is different from the above-mentioned maximum transform block size coding unit 107 in that an input from the terminal 151 is received.
  • a multiplexing unit 122 and the multiplexing unit 112 illustrated in Fig. 14 receive coded data of a profile.
  • When the instruction from the terminal 151 indicates that the maximum transform block size is not to be used, the maximum transform block size determination unit 158 fixes the maximum transform block size to the basic block size of 32 pixels x 32 pixels.
  • In this case, the maximum transform block size coding unit 157 does not operate, and the maximum transform block size is not coded. Thus, compatibility with the conventional method can be maintained.
  • Whether to execute coding by using the maximum transform block size can also be achieved by including it in header information indicating attributes of a bit stream.
  • For example, a supplemental enhancement information (SEI) message included in the H.264 coding method, or a profile using the present invention, can be used.
  • a profile setting unit 160 sets a profile (e.g., an extension profile) when a maximum transform block size is set from the terminal 151 to execute coding, while a basic profile is set when such coding is not executed. Codes for identifying these profiles are generated and input to the multiplexing unit 122, and embedded in the bit stream.
  • Fig. 16 is a block diagram illustrating a configuration of an image decoding apparatus according to a third exemplary embodiment of the present invention.
  • the present exemplary embodiment is described by an example of decoding of the coded data generated in the first exemplary embodiment.
  • a terminal 201 receives the bit stream generated in the first exemplary embodiment.
  • a separation unit 202 separates the bit stream into each code by processing inverse to the multiplexing unit 12 of the first exemplary embodiment illustrated in Fig. 1.
  • a quantization parameter decoding unit 203 receives a code of a quantization parameter from the separation unit 202 to decode it.
  • a maximum transform block size setting unit 204 calculates a maximum transform block size from the decoded quantization parameter.
  • An entropy decoding unit (frequency decoding unit) 205 decodes coded data of a quantization coefficient.
  • An inverse quantization unit (inverse frequency transform unit) 206 executes inverse quantization by using the quantization parameter decoded by the quantization parameter decoding unit 203.
  • An inverse orthogonal transform unit 207 inversely transforms the coefficient data acquired by the inverse quantization to reproduce prediction error data.
  • a prediction unit 208 decodes a prediction code separated by the separation unit 202, and reads a decoded image from a frame memory 213 to execute prediction.
  • the prediction unit 208 functions as in the case of the prediction unit 8 of the first exemplary embodiment illustrated in Fig. 1.
  • a motion vector decoding unit 209 decodes a motion vector code, and reproduces a motion vector to input it to a motion compensation unit 210.
  • the motion compensation unit 210 executes motion compensation from the frame memory 213 based on the reproduced motion vector.
  • a prediction error addition unit 211 adds together prediction data generated by the prediction unit 208 and the prediction error data reproduced by the inverse orthogonal transform unit 207 to generate a decoded image.
  • a basic block synthesizing unit 212 synthesizes reproduced image data of transform blocks acquired by decoding, into a basic block.
  • the frame memory 213 stores the decoded image.
  • a block division information decoding unit 214 decodes the data coded by the block division information coding unit 21 of the first exemplary embodiment illustrated in Fig. 1.
  • a terminal 215 outputs the decoded image data to the outside.
  • the bit stream is input through the terminal 201 to the separation unit 202.
  • the separation unit 202 transmits a code regarding a quantization parameter to the quantization parameter decoding unit 203.
  • Coded data of a quantized coefficient is transmitted to the entropy decoding unit 205.
  • Coded data relating to a prediction mode and a prediction method is transmitted to the prediction unit 208.
  • a coded data indicating information of block division is input to the block division information decoding unit 214.
  • the coded data of the quantization parameter is first read and decoded to reproduce the quantization parameter QP.
  • When the value of the quantization parameter QP is QL, the maximum transform block size setting unit 204 sets the maximum transform block size to 32 pixels x 32 pixels.
  • When the value is QS, the maximum transform block size setting unit 204 sets the maximum transform block size to 8 pixels x 8 pixels.
  • the maximum value of the transform block size is input to the entropy decoding unit 205, the inverse quantization unit 206, the inverse orthogonal transform unit 207, the motion compensation unit 210, and the basic block synthesizing unit 212.
  • the coded data of the quantized coefficient is input to the entropy decoding unit 205 and decoded to reproduce the quantization result.
  • the quantization result is input to the inverse quantization unit 206.
  • Inverse quantization is carried out using the quantization parameter decoded by the quantization parameter decoding unit 203 to acquire coefficient data.
  • the coefficient data is input to the inverse orthogonal transform unit 207.
  • the inverse orthogonal transform unit 207 executes inverse transform similar to that of the inverse orthogonal transform unit 16 of the first exemplary embodiment illustrated in Fig. 1 to reproduce a prediction error.
  • the reproduced prediction error is input to the prediction error addition unit 211.
  • the inverse orthogonal transform unit 207 is described in detail referring to Fig. 17.
  • a terminal 260 receives the division information of the transform blocks from the block division information decoding unit 214 illustrated in Fig. 16.
  • a terminal 261 receives the maximum size of the transform block from the maximum transform block size setting unit 204 illustrated in Fig. 16.
  • a controller 262 performs control, as in the case of the controller 50 illustrated in Fig. 2, to determine whether to operate a 32 x 32 inverse transform unit 265 and a 16 x 16 inverse transform unit 266 according to an input from the terminal 261.
  • a terminal 263 receives the coefficient data from the inverse quantization unit 206 illustrated in Fig. 16.
  • Selectors 264 and 269 select an input destination or an output destination based on the transform block size of the division information of the transform blocks input from the terminal 260.
  • a 32 x 32 inverse transform unit 265 carries out inverse DCT for the input transform blocks of 32 pixels x 32 pixels to acquire image data of 32 x 32.
  • a 16 x 16 inverse transform unit 266 carries out inverse DCT for the input transform blocks of 16 pixels x 16 pixels to acquire image data of 16 x 16.
  • an 8 x 8 inverse transform unit 267 carries out inverse DCT for the input transform blocks of 8 pixels x 8 pixels to acquire image data of 8 x 8.
  • a 4 x 4 inverse transform unit 268 carries out inverse DCT for the input transform blocks of 4 pixels x 4 pixels to acquire image data of 4 x 4.
  • a terminal 270 outputs the prediction error data acquired by each inverse transform. An operation of inverse orthogonal transform in this configuration is described.
  • the maximum size of the transform block is input through the terminal 261 to the controller 262.
  • the controller 262 sets, when the maximum size of the transform block is 32, the 32 x 32 inverse transform unit 265 and the 16 x 16 inverse transform unit 266 in operational states.
  • When the maximum size of the transform block is 8, the controller 262 sets the 32 x 32 inverse transform unit 265 and the 16 x 16 inverse transform unit 266 in stopped states. In this case, the power supplied to the corresponding hardware can be cut off.
  • the division information of the transform blocks is input through the terminal 260 to the selectors 264 and 269.
  • When the transform block is 32 pixels x 32 pixels, the selector 264 sets the 32 x 32 inverse transform unit 265 as its output destination while the selector 269 sets the 32 x 32 inverse transform unit 265 as its input destination.
  • When the transform block is 16 pixels x 16 pixels, the selector 264 sets the 16 x 16 inverse transform unit 266 as its output destination while the selector 269 sets the 16 x 16 inverse transform unit 266 as its input destination.
  • When the transform block is 8 pixels x 8 pixels, the selector 264 sets the 8 x 8 inverse transform unit 267 as its output destination while the selector 269 sets the 8 x 8 inverse transform unit 267 as its input destination.
  • When the transform block is 4 pixels x 4 pixels, the selector 264 sets the 4 x 4 inverse transform unit 268 as its output destination while the selector 269 sets the 4 x 4 inverse transform unit 268 as its input destination.
  • the selector 264 inputs the coefficient data of the transform blocks input from the terminal 263 to an inverse transform unit of an appropriate size.
  • the selector 269 outputs the prediction error of the transform block acquired by transform at each inverse transform unit from the terminal 270.
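The routing described in the bullets above can be illustrated with a minimal Python sketch. This is not part of the original specification: the function and variable names are assumptions, and a textbook orthonormal DCT stands in for whatever transform an actual implementation would use. It routes each coefficient block to an inverse DCT of matching size and leaves the 32 x 32 and 16 x 16 paths disabled when the maximum transform block size is 8.

    import numpy as np

    def dct_matrix(n):
        # Orthonormal DCT-II basis of size n x n.
        k = np.arange(n)[:, None]
        i = np.arange(n)[None, :]
        c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        c[0, :] *= np.sqrt(1.0 / n)
        c[1:, :] *= np.sqrt(2.0 / n)
        return c

    def inverse_dct_2d(coeff):
        c = dct_matrix(coeff.shape[0])
        return c.T @ coeff @ c

    def build_inverse_transform_units(max_transform_size):
        # When the maximum transform block size is 8, the 32 x 32 and
        # 16 x 16 inverse transform units are left out (in hardware,
        # power to those circuits would be cut off).
        sizes = [4, 8] if max_transform_size == 8 else [4, 8, 16, 32]
        return {s: inverse_dct_2d for s in sizes}

    def inverse_transform_block(coeff_block, units):
        # The selectors route the block to the unit of matching size.
        size = coeff_block.shape[0]
        if size not in units:
            raise ValueError("inverse transform unit for this size is stopped")
        return units[size](coeff_block)   # reproduced prediction error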
  • the prediction unit 208 receives codes of a prediction method and a prediction mode to decode them.
  • pixels used for prediction are selected from the frame memory 213 in the prediction mode to reproduce the prediction data.
  • the motion vector decoding unit 209 decodes coded data of a motion vector. Based on this motion vector, the motion compensation unit 210 generates prediction data from the frame memory 213.
  • the block division information decoding unit 214 decodes a division flag indicating block divisions separated by the separation unit 209 to acquire a division status in a basic block.
  • the prediction data is input to the prediction error addition unit 211, and added to the prediction error to acquire reproduced images.
  • the basic block synthesizing unit 212 synthesizes the reproduced images as a basic block based on the division status in the basic block. Then, the reproduced images are stored by basic blocks in the frame memory 213. The reproduced images are output via the terminal 215.
  • Fig. 18 is a flowchart illustrating image decoding in the image decoding apparatus according to the third exemplary embodiment.
  • step S201 a value of a quantization parameter QP of a first block of a frame is decoded.
  • step S202 when the value of the decoded quantization parameter QP is QL, its maximum transform block size is set to 32 pixels x 32 pixels.
  • when the value of the decoded quantization parameter QP is QS, the maximum transform block size is set to 8 pixels x 8 pixels.
  • step S203 block division information of the basic block is decoded by a method described below to acquire sizes and positions of the transform blocks.
  • step S204 prediction data is generated by intra prediction or motion compensation.
  • step S205 coded data of coefficients is decoded to acquire a quantization result of a prediction error.
  • step S206 the acquired quantization result of the prediction error is inversely quantized, and inversely transformed to reproduce the prediction error.
  • the reproduced prediction error is added to the prediction data generated in step S204 to restore image data.
  • step S207 for the restored image data of the transform blocks, basic blocks are synthesized based on the block division information acquired in step S203.
  • step S208 determination is made as to whether processing has ended for all the transform blocks in the basic block.
  • the processing proceeds to step S209 when decoding has ended for all the transform blocks (YES in step S208).
  • the processing proceeds to step S204 to process a next transform block when decoding has not ended (NO in step S208).
  • step S209 determination is made as to whether processing has ended for all the basic blocks in the frame.
  • the processing proceeds to step S210 when decoding has ended for all the basic blocks (YES in step S209).
  • the processing proceeds to step S203 to process a next basic block when decoding has not ended (NO in step S209).
  • step S210 decoded image data of one frame is output.
  • step S211 determination is made as to whether there is a frame to be decoded next.
  • the image decoding is ended when the decoding has ended for all the frames (YES in step S211).
  • the processing proceeds to step S201 to process a next frame when the decoding has not ended for all the frames (NO in step S211).
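Steps S201 and S202 tie the maximum transform block size directly to the decoded quantization parameter. A minimal sketch of that decision follows; it is illustrative only, and the names qp_large / qp_small stand for the two values QL and QS used in this embodiment.

    def max_transform_block_size(decoded_qp, qp_large, qp_small):
        # Step S202: QL (high-efficiency mode) allows transforms up to
        # 32 x 32 pixels; QS (high image-quality mode) limits the
        # maximum transform block size to 8 x 8 pixels.
        if decoded_qp == qp_large:
            return 32
        if decoded_qp == qp_small:
            return 8
        # Only the two modes are defined here; other values would need
        # an additional rule such as a threshold comparison.
        raise ValueError("unsupported quantization parameter")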
  • Fig. 19 is a flowchart illustrating a decoding flow of the division flag.
  • step S251 determination is made as to whether an input division flag is "0" or "1". The processing proceeds to step S252 when it is "0". The processing proceeds to step S253 when it is not "0".
  • step S252 the basic block is not divided. Transform blocks are processed by 32 pixels x 32 pixels and the decoding of the division flag is ended.
  • step S253 since the division flag is "1", it is determined that the basic block has been divided into transform blocks of 16 pixels x 16 pixels.
  • step S254 determination is made as to whether a division flag input next is "0" or "1". The processing proceeds to step S255 when it is "0". The processing proceeds to step S256 when it is not "0".
  • step S255 the transform blocks of 16 pixels x 16 pixels are not divided.
  • the transform blocks are processed by 16 pixels x 16 pixels and the processing proceeds to step S261.
  • step S256 since the division flag is "1", it is determined that the transform blocks have been divided into transform blocks of 8 pixels x 8 pixels.
  • step S257 determination is made as to whether a division flag input next is "0" or "1". The processing proceeds to step S258 when it is "0". The processing proceeds to step S259 when it is not "0".
  • step S258 the transform blocks of 8 pixels x 8 pixels are not divided.
  • the transform blocks are processed by 8 pixels x 8 pixels and the processing proceeds to step S260.
  • step S259 since the division flag is "1", it is determined that the transform blocks have been divided into transform blocks of 4 pixels x 4 pixels.
  • step S260 determination is made as to whether the decoding of the division flags has ended for all the transform blocks of 8 pixels x 8 pixels (including any transform blocks of 4 pixels x 4 pixels therein) in the transform block of 16 pixels x 16 pixels.
  • the processing proceeds to step S261 when the decoding has ended (YES in step S260).
  • the processing proceeds to step S257 to process the next transform block of 8 pixels x 8 pixels when the decoding has not ended (NO in step S260).
  • step S261 determination is made as to whether the decoding of the division flags has ended for all the transform blocks of 16 pixels x 16 pixels (including the transform blocks of 8 pixels x 8 pixels and smaller contained in them) in the basic block.
  • the decoding of the division flag is ended when the decoding has ended (YES in step S261).
  • the processing proceeds to step S254 to process the next transform block of 16 pixels x 16 pixels when the decoding has not ended (NO in step S261).
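The decoding order of Fig. 19 can be written as a short recursive sketch (illustrative, not the specification itself): each "1" flag splits the current block into four, blocks of 4 x 4 pixels carry no flag, and the sub-blocks are visited in the upper-left, upper-right, lower-left, lower-right order.

    def decode_division_flags(read_flag, x=0, y=0, size=32):
        # Return a list of (x, y, size) transform blocks for one basic
        # block. read_flag is a callable returning the next division
        # flag (0 or 1) from the decoded bit stream.
        if size == 4 or read_flag() == 0:
            # 4 x 4 blocks are never divided, so no flag is coded for
            # them (steps S252, S255, S258).
            return [(x, y, size)]
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks += decode_division_flags(read_flag, x + dx, y + dy, half)
        return blocks

    # Example: the flag sequence for the block division of Fig. 4
    # (1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0) yields two blocks of
    # 16 x 16, six blocks of 8 x 8, and eight blocks of 4 x 4.
    flags = iter([1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0])
    blocks = decode_division_flags(lambda: next(flags))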
  • the motion compensation unit 210 acquires the maximum transform block size to adaptively secure buffers when reading out the data from the frame memory 213.
  • Fig. 20 is a block diagram illustrating a configuration of an image decoding apparatus according to a fourth exemplary embodiment of the present invention.
  • the coded data generated in the second exemplary embodiment is decoded.
  • a separation unit 302 separates data into codes by the processing inverse to that of the multiplexing unit 112 of the second exemplary embodiment illustrated in Fig. 14.
  • a maximum transform block size decoding unit 304 decodes a code indicating a maximum transform block size separated by the separation unit 302.
  • a block division information decoding unit 314 acquires information of the maximum transform block size from the maximum transform block size decoding unit to decode block division information.
  • a bit stream is input through a terminal 201 to the separation unit 302.
  • the separation unit 302 receives header information, a code of a quantization parameter, a code of a maximum transform block size, a code of coefficient data, a code of a motion vector, and a code of block division information, and separates the codes to output them to a subsequent stage.
  • the code of the maximum transform block size is input to the maximum transform block size decoding unit 304 and decoded.
  • the maximum transform block size is coded as an index. This code is decoded to acquire information as to whether the maximum transform block size is 32 pixels x 32 pixels or 8 pixels x 8 pixels.
  • the acquired information of the maximum transform block size is input to each related portion and the block division information decoding unit 314.
  • the block division information decoding unit 314 receives the maximum transform block size from the maximum transform block size decoding unit 304, and a division flag indicating block division information from the separation unit 302.
  • Fig. 21 is a flowchart illustrating detailed processing.
  • steps having functions similar to those illustrated in Fig. 19 are denoted by similar reference numerals, and description thereof is omitted.
  • step S301 determination is made as to whether a maximum transform block size is 16 pixels x 16 pixels.
  • the processing proceeds to step S254 when it is 16 pixels x 16 pixels (YES in step S301), and to step S302 when it is not 16 pixels x 16 pixels (NO in step S301).
  • step S302 determination is made as to whether a maximum transform block size is 8 pixels x 8 pixels. The processing proceeds to step S257 when it is 8 pixels x 8 pixels (YES in step S302), and to step S251 when it is not 8 pixels x 8 pixels (NO in step S302).
  • the processing from steps S251 to S261 is executed when the maximum transform block size is 32 pixels x 32 pixels.
  • the processing from steps S254 to S261 is executed when the maximum transform block size is 16 pixels x 16 pixels.
  • the processing from steps S257 to S261 is executed when the maximum transform block size is 8 pixels x 8 pixels.
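The entry point of Fig. 21 only changes the size at which flag decoding starts. A hedged sketch, reusing decode_division_flags from the earlier sketch (all names are illustrative):

    def decode_basic_block_division(read_flag, max_transform_size, basic_block=32):
        # Steps S301/S302: start at the largest size allowed by the
        # maximum transform block size. With a maximum of 16 or 8, the
        # larger levels are implicit and their flags are never coded.
        start = min(max_transform_size, basic_block)
        blocks = []
        for dy in range(0, basic_block, start):
            for dx in range(0, basic_block, start):
                blocks += decode_division_flags(read_flag, dx, dy, start)
        return blocks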
  • a calculation amount and power consumption can be reduced even on the decoding side by setting the maximum transform block size based on the quantization parameter.
  • although the maximum transform block size is coded, similar effects can be acquired even with shorter codes as compared with the codes of the first exemplary embodiment.
  • the second exemplary embodiment is directed to the method for including, in the header information indicating the attributes of the bit stream, information regarding whether the data is coded by using the maximum transform block size.
  • this code can also be decoded by a configuration illustrated in Fig. 22.
  • a separation unit 402 separates a code for identifying a profile buried in the bit stream to input it to a profile decoding unit 453.
  • the profile decoding unit 453 decodes the profile: an extension profile when the maximum transform block size code is used, or a basic profile when such coding is not executed.
  • in the case of the extension profile, a maximum transform block size decoding unit 404 is operated to acquire the maximum transform block size. Otherwise, the maximum transform block size is fixed at 32 and output, thereby acquiring similar effects.
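A minimal sketch of the profile-dependent behaviour described above (the profile identifiers and the parser interface are assumptions made for illustration):

    def decode_max_transform_block_size(profile, read_size_code):
        # Extension profile: a maximum transform block size code is
        # present in the bit stream and decoded (e.g., an index that
        # selects 32 x 32 or 8 x 8). Basic profile: no such code is
        # present and the maximum size is fixed at 32.
        if profile == "extension":
            return read_size_code()
        return 32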
  • each processing unit is hardware.
  • the processing carried out at each processing unit can be realized by a computer program.
  • Fig. 23 is a block diagram illustrating a hardware configuration example of a computer applicable to an image display apparatus according to each of the exemplary embodiments.
  • a central processing unit (CPU) 2301 controls the computer overall by using a computer program or data stored in a random access memory (RAM) 2302 or a read-only memory (ROM) 2303, and executes each of the abovementioned processes carried out by the image processing apparatus according to each exemplary embodiment.
  • the CPU 2301 functions as each processing unit.
  • the RAM 2302 has an area for temporarily storing a computer program or data loaded from an external storage device 2306, or data acquired from the outside via an interface (I/F) 2307.
  • the RAM 2302 also has a work area used when the CPU 2301 executes various processes. In other words, for example, the RAM 2302 can be allocated to a frame memory, or can appropriately provide various other areas.
  • the ROM 2303 stores setting data of the computer or a boot program.
  • An operation unit 2304 includes a keyboard and a mouse.
  • a user of the computer can input various instructions to the CPU 2301 by operating the operation unit 2304.
  • a display unit 2305 displays a processing result of the CPU 2301.
  • the display unit 2305 includes, for example, a hold type display such as a liquid crystal display or an impulse type display such as a field emission type display.
  • the external storage device 2306 is a large-capacity information storage device represented by a hard disk drive.
  • the external storage device 2306 stores an operating system (OS) or the computer program for causing the CPU 2301 to execute the function of each unit.
  • the external storage device 2306 can store each image data as a processing target.
  • the computer program or the data stored in the external storage device 2306 is appropriately loaded, under control of the CPU 2301, to the RAM 2302 to become a processing target of the CPU 2301.
  • a network such as a local area network (LAN) or the Internet, and other devices such as a projector or a display can be connected to the I/F 2307.
  • the computer can acquire or transmit various pieces of information via the I/F 2307.
  • a bus 2308 interconnects the units.
  • An object of the present invention can be achieved by supplying, to a system, a recording medium that records the codes of the computer program for realizing the functions, and by reading and executing the codes of the computer program by the system.
  • the codes of the computer program read from the storage medium themselves achieve the functions of the exemplary embodiments, and the storage medium storing the codes of the computer program is within the present invention.
  • the OS operating on the computer executes some or all of actual processes to achieve the functions. This is also within the present invention.
  • the functions can also be achieved as follows.
  • the codes of the computer program read from the storage medium are written in a memory included in a function extension card inserted into the computer or a function extension unit connected to the computer. Based on instructions of the codes of the computer program, a CPU included in the function extension card or the function extension unit executes some or all of actual processes to achieve the functions. This is also within the present invention.
  • the storage medium stores the codes of the computer program corresponding to the flowcharts.

Abstract

An image coding apparatus includes a setting unit configured to set a coding control parameter, a first transform unit configured to transform a frequency of pixel data of a block having a first size, and execute transform by substituting transform coefficients with predetermined values except a part of the transform coefficients, a second transform unit configured to transform a frequency of pixel data of a block having a second size smaller than the first size, a block size determination unit configured to limit use of the first transform unit based on the coding control parameter, and a coding unit configured to control and code one of outputs of the first transform unit and the second transform unit based on a determined block size.

Description

IMAGE ENCODING APPARATUS, IMAGE ENCODING METHOD, PROGRAM, IMAGE DECODING APPARATUS, IMAGE DECODING METHOD, AND PROGRAM
The present invention relates to an image encoding apparatus, an image encoding method, a program, an image decoding apparatus, an image decoding method, and a program, and more particularly to a coding method for coding/decoding an image by dividing it into blocks of a plurality of sizes.
As a moving image compressing and recording method, there is known a H. 264/Moving Picture Experts Group (MPEG)-4 AVC (Advanced Video Coding) (hereinafter, H. 264). The H. 264 coding method is widely used, for example, in one-segment terrestrial digital broadcasting. As a feature of the H. 264 coding method, in addition to a conventional coding method, a plurality of intra predictions are prepared via integer transform by 4 x 4 pixels. In the H. 264 coding method, 16 x 16 macroblocks are used as units, and their inside is divided into 4 x 4 blocks to be used as units for transform. A part of intra coding employs transform by 8 x 8 blocks. As an improved technology based on the H. 264 coding method, for example, there is a technology of increasing the block size of H. 264 up to a block of 64 pixels x 64 pixels. (JCT-VC contribution JCTVC-G405)
However, in a large block such as 64 pixels x 64 pixels or 32 pixels x 32 pixels, costs of calculation of orthogonal transform itself are high, and many coefficients are generated. Thus, sequential coding takes a long time. As a method for executing coding by reducing the coefficients and decreasing a coding amount, there is known a method for forcibly setting high-frequency coefficients to 0 after orthogonal transform. Further, a method for transmitting only 64 coefficients in the case of the block of 32 pixels x 32 pixels can be used. When employing these methods, coefficients transmitted by the large block can be reduced, thereby improving compression efficiency.
However, if the high-frequency coefficients are uniformly decreased, reproduction of high frequencies naturally cannot be achieved, causing great deterioration of image quality. Residual components generated by motion compensation or intra predictions employed by H. 264 contain more high frequencies, which further deteriorates the image quality. In addition, when a quantization step is small, the number of quantization coefficients having values other than 0 after quantization is much larger. However, because the high frequencies are uniformly set to 0, image quality is not improved even when the quantization step is reduced.
There are many pixel inputs/outputs and many inputs/outputs of transform coefficients in transform of a large block size, and calculation costs are accordingly very high.
ISO/IEC14496-10: 2004 Information technology - - Coding of audio-visual objects - - Part 10: Advanced Video Coding, ITU-T H. 264 Advanced Video coding for generic audiovisual services
According to an aspect of the present invention, an image coding apparatus includes a setting unit configured to set a coding control parameter, a first transform unit configured to transform a frequency of pixel data of a block having a first size, and execute transform by substituting transform coefficients with predetermined values except a part of the transform coefficients, a second transform unit configured to transform a frequency of pixel data of a block having a second size smaller than the first size, a block size determination unit configured to limit use of the first transform unit based on the coding control parameter, and a coding unit configured to control and code one of outputs of the first transform unit and the second transform unit based on a determined block size.
The present invention enables improvement of image quality by limiting substitution by fixed values of transform coefficients in the appropriately sized block. Particularly, a block size can be suitably limited by determining the block size in association with quantization parameters which greatly contribute to generation of coefficients. Further, by limiting the block size, coded data indicating block division can be reduced.
Further features and aspects of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the attached drawings.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a block diagram illustrating a configuration of an image encoding apparatus according to a first exemplary embodiment. Fig. 2 is a block diagram illustrating a configuration of a transform block size determination unit according to the first exemplary embodiment. Fig. 3A illustrates an arrangement of values of coefficients of transform blocks. Fig. 3B illustrates an arrangement of values of coefficients of transform blocks. Fig. 4 illustrates an example of an arrangement of the transform blocks. Fig. 5 illustrates an example of an arrangement of the transform blocks. Fig. 6 is a block diagram illustrating in detail an orthogonal transform unit according to the first exemplary embodiment. Fig. 7 is a block diagram illustrating another configuration of the image coding apparatus according to the first exemplary embodiment. Fig. 8 is a block diagram illustrating another configuration of the transform block size determination unit according to the first exemplary embodiment. Fig. 9 is a flowchart illustrating an operation of image encoding according to the first exemplary embodiment. Fig. 10 is a block diagram illustrating another configuration of the image encoding apparatus according to the first exemplary embodiment. Fig. 11 is a flowchart illustrating an operation of division flag encoding according to the first exemplary embodiment. Fig. 12 is a block diagram illustrating a configuration of an image encoding apparatus according to a second exemplary embodiment. Fig. 13 is a flowchart illustrating an operation of division flag encoding according to the second exemplary embodiment. Fig. 14 is a block diagram illustrating another configuration of the image encoding apparatus according to the second exemplary embodiment. Fig. 15 is a block diagram illustrating another configuration of the image encoding apparatus according to the second exemplary embodiment. Fig. 16 is a block diagram illustrating a configuration of an image decoding apparatus according to a third exemplary embodiment. Fig. 17 is a block diagram illustrating in detail an inverse orthogonal transform unit according to the third exemplary embodiment. Fig. 18 is a flowchart illustrating an operation of image decoding according to the third exemplary embodiment. Fig. 19 is a flowchart illustrating an operation of division flag decoding according to the third exemplary embodiment. Fig. 20 is a block diagram illustrating a configuration of an image decoding apparatus according to a fourth exemplary embodiment. Fig. 21 is a flowchart illustrating an operation of division flag decoding according to the fourth exemplary embodiment. Fig. 22 is a block diagram illustrating another configuration of the image decoding apparatus according to the fourth exemplary embodiment. Fig. 23 is a block diagram illustrating a hardware configuration example of a computer applicable to an image coding apparatus and an image decoding apparatus according to the present invention.
Various exemplary embodiments, features, and aspects of the invention will be described in detail below with reference to the drawings.
Configurations of exemplary embodiments described below are only examples, to which the present invention is not limited.
Exemplary embodiments of the present invention are described referring to Figs. 1 to 11. Fig. 1 is a block diagram illustrating a configuration of an image encoding apparatus according to a first exemplary embodiment of the present invention. In Fig. 1, the image encoding apparatus includes a terminal 1 that receives image data, a buffer 2 that temporarily stores the image data, and a basic block dividing unit 3 that divides an image into a plurality of basic blocks. For simplicity, the present embodiment is directed to a basic block size of 32 pixels x 32 pixels. However, the present invention is not limited to this basic block size. The image coding apparatus includes a transform block size/division determination unit 4 that determines the sizes of the transform blocks into which the basic block is divided to execute orthogonal transform. In this case, the basic block is divided into transform blocks of 32 pixels x 32 pixels (where no block division is carried out), 16 pixels x 16 pixels, 8 pixels x 8 pixels, or 4 pixels x 4 pixels. However, the division is not limited to this method.
A prediction unit 8 executes intra prediction or motion compensation prediction by the divided transform blocks to calculate a prediction error. An orthogonal transform unit 9 executes orthogonal transform for the prediction error to acquire a spatial frequency coefficient. The present embodiment is described by taking an example of a discrete cosine transform (DCT) as frequency transform. However, transform is not limited to this. Transform such as Hadamard transform can also be employed. The orthogonal transform unit 9 has a function of executing orthogonal transform corresponding to the plurality of transform block sizes divided by the transform block size/division determination unit 4. However, the orthogonal transform unit 9 calculates, for orthogonal transform of 32 pixels x 32 pixels and 16 pixels x 16 pixels, only coefficients of 8 pixels x 8 pixels of low frequencies among coefficients, and outputs 0 for the others.
Figs. 3A and 3B illustrate an arrangement of values of coefficients of the transform blocks. Fig. 3A illustrates coefficients after orthogonal transform of 32 pixels x 32 pixels, and similarly Fig. 3B illustrates coefficients after orthogonal transform of 16 pixels x 16 pixels. In both drawings, black pixels hold coefficients acquired by orthogonal transform, while values are 0 at other white pixels. The arrangement of the held coefficients is not limited to this.
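The behaviour shown in Figs. 3A and 3B can be sketched as follows. This is illustrative only: a textbook orthonormal DCT is used in place of the transform an actual codec would employ, and the function names are assumptions.

    import numpy as np

    def dct_matrix(n):
        # Orthonormal DCT-II basis of size n x n.
        k = np.arange(n)[:, None]
        i = np.arange(n)[None, :]
        c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        c[0, :] *= np.sqrt(1.0 / n)
        c[1:, :] *= np.sqrt(2.0 / n)
        return c

    def transform_keep_low_frequency(block, kept=8):
        # Forward 2D DCT of a 32 x 32 (or 16 x 16) block; only the
        # low-frequency kept x kept coefficients are retained (black
        # area of Figs. 3A and 3B), all others are substituted with 0.
        n = block.shape[0]
        c = dct_matrix(n)
        coeff = c @ block @ c.T
        out = np.zeros_like(coeff)
        out[:kept, :kept] = coeff[:kept, :kept]
        return out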
A quantization unit 10 quantizes the acquired coefficients according to quantization parameters (coding control parameters) to acquire quantized coefficients. An entropy coding unit 11 codes the quantization parameters of various headers or the blocks, and executes entropy-coding for the quantized coefficients. The entropy coding is not limited to any specific method; for example, the coding is carried out by arithmetic codes or Huffman codes.
A terminal 5 receives an instruction from a user (not illustrated) to control image quality. A quantization parameter setting unit 6 generates an initial value of a quantization parameter set at the quantization unit 10 according to the instruction. A maximum transform block size determination unit 7 determines, by referring to the initial value of the quantization parameter, a maximum transform block size during division by the transform block dividing unit.
An inverse quantization unit 15 restores the coefficients from the quantized coefficients generated by the quantization unit 10. An inverse orthogonal transform unit 16 carries out inverse transform of the orthogonal transform unit 9 for the restored coefficients to restore the prediction error. A prediction error addition unit 17 adds together the prediction error and a prediction result. A frame memory 18 stores the restored image. A motion compensation unit 19 executes motion compensation where a content of the frame memory 18 is compared with image data of an input transform block unit to detect a motion. A motion vector coding unit 20 codes a motion vector that is a result of the motion detection to generate a motion vector code. A block division information encoding unit 21 encodes information regarding the state where the transform block size/division determination unit 4 has divided the basic block into the transform blocks to generate a block division code. A multiplexing unit 12 integrates and lines up outputs from the entropy coding unit 11, the motion vector encoding unit 20, and the block division information encoding unit 21, to output a bit stream. A terminal 13 outputs the bit stream to the outside. A rate control unit 14 controls the quantization parameters based on a code amount of the bit stream.
An operation of image encoding in the above configuration is described. Before the processing, the user (not illustrated) transmits, regarding image quality for coding, an instruction to the quantization parameter setting unit 6 via the terminal 5. The quantization parameter setting unit 6 sets a quantization parameter of a small value when the user desires high image quality, and sets a quantization parameter of a large value when the user desires bit reduction even with low image quality. In the present exemplary embodiment, for simpler description, a high image-quality mode for setting a quantization parameter QP to a small value QS and a high-efficiency mode for setting the quantization parameter QP to a large value QL can be set. However, a setting method is not limited to this.
The quantization parameter set by the quantization parameter setting unit 6 is input to the maximum transform block size determination unit 7 and the rate control unit 14. The maximum transform block size determination unit 7 sets a transform block size up to 32 pixels x 32 pixels when a value of the input quantization parameter QP is QL. The maximum transform block size determination unit 7 sets a transform block size up to 8 pixels x 8 pixels when a value of the input quantization parameter QP is QS.
Image data are sequentially input from the outside via the terminal 1 to be stored in the buffer 2. The basic block dividing unit 3 divides blocks of 32 pixels x 32 pixels in an input order to input them to the transform block size/division determination unit 4.
The transform block size/division determination unit 4 calculates coding costs when dividing each block size to select a division combination of small costs. Fig. 2 is a block diagram illustrating an example of a detailed configuration of the transform block size/division determination unit 4. In Fig. 2, a terminal 31 receives pixel data of a basic block from the basic block dividing unit 3, and a buffer 32 stores the input pixel data.
A 32 x 32 transform unit 33 carries out DCT for the 32 pixels x 32 pixels stored in the buffer 32. A 16 x 16 transform unit 34 vertically and horizontally divides the 32 pixels x 32 pixels stored in the buffer 32 into four blocks, and carries out DCT for each block of 16 pixels x 16 pixels. A 8 x 8 transform unit 35 vertically and horizontally divides the 32 pixels x 32 pixels stored in the buffer 32 into sixteen blocks, and carries out DCT for each block of 8 pixels x 8 pixels. A 4 x 4 transform unit 36 vertically and horizontally divides the 32 pixels x 32 pixels stored in the buffer 32 into sixty-four blocks, and carries out DCT for each block of 4 pixels x 4 pixels.
Cost calculation units 37 to 40 receive input orthogonal transform results to calculate coding costs. As a calculation method, there is a method for calculating block costs by using a Lagrange multiplier. The cost calculation method is not limited to this. For example, there is a method for making a determination using an activity such as dispersion at each transform block or an edge amount.
Cost buffers 41 to 44 store the calculated costs by transform blocks corresponding to the cost calculation units 37 to 40. A transform block division determination unit 45 reads the costs of the transform blocks stored in the cost buffers to select a combination of smallest costs among the basic blocks. A terminal 46 outputs, as division information of the transform blocks, information regarding the selected combination output from the transform block division determination unit 45 to the prediction unit 8, the orthogonal transform unit 9, the motion compensation unit 19, and the block division information coding unit 21 illustrated in Fig. 1.
A transform block dividing unit 47 receives the transform block division information, and divides and outputs transform blocks from the buffer 32 according to the transform block division information. A terminal 48 sequentially outputs pixel data of the divided transform blocks.
A terminal 49 receives a maximum size of the transform block from the maximum transform block size determination unit 7. According to the present exemplary embodiment, 32 is input when the value of the quantization parameter is QL, and 8 is input when the value is QS. A controller 50 controls, according to the input from the terminal 49, whether to operate the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42. A control signal is input, in addition to these units, to the transform block division determination unit 45.
An operation of transform block division in the above configuration is described. The maximum size of the transform block is input through the terminal 49 to the controller 50. The controller 50 sets, when the maximum size of the transform block is 32, the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42 in operational states. When the maximum size of the transform block is 8, the controller 50 sets the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42 in stopped states. In this case, power supplied to the hardware is cut off.
The pixel data (32 pixels x 32 pixels) of the basic block is input from the terminal 31 to be stored in the buffer 32. It is presumed that the controller 50 operates the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42. In this case, the 32 x 32 transform unit 33, the 16 x 16 transform unit 34, the 8 x 8 transform unit 35, and the 4 x 4 transform unit 36 read the data of the buffer 32 by transform block sizes. DCT is carried out by each size, and its transform coefficients are input to the cost calculation units 37 to 40 connected to each unit.
The cost calculation units 37 to 40 calculate coding costs from the coefficients to store them in the connected cost buffers. The cost buffer 41 stores cost of one block when the transform block is 32 pixels x 32 pixels. The cost buffer 42 stores, when the transform block is 16 pixels x 16 pixels, cost of four blocks according to positions thereof. The cost buffer 43 stores, when the transform block is 8 pixels x 8 pixels, cost of sixteen blocks according to positions thereof. The cost buffer 44 stores, when the transform block is 4 pixels x 4 pixels, cost of sixty-four blocks according to positions thereof.
It is presumed that the controller 50 stops the 32 x 32 transform unit 33, the cost calculation unit 37, the cost buffer 41, the 16 x 16 transform unit 34, the cost calculation unit 38, and the cost buffer 42. In this case, the 8 x 8 transform unit 35 and the 4 x 4 transform unit 36 read the data of the buffer 32 by transform block sizes. DCT is carried out by each size, and its transform coefficients are input to the cost calculation units 39 to 40 connected to each unit.
The cost calculation units 39 and 40 calculate coding costs from the coefficients to store them in the cost buffers connected to each unit. The cost buffer 43 stores, when the transform block is 8 pixels x 8 pixels, cost of sixteen blocks according to positions thereof. The cost buffer 44 stores, when the transform block is 4 pixels x 4 pixels, cost of sixty-four blocks according to positions thereof.
After all the costs have been input to the cost buffers 41 to 44, if the controller 50 has set all the cost buffers 41 to 44 in operational states, the transform block division determination unit 45 compares the costs to select a combination of division of smallest costs.
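The selection performed by the transform block division determination unit 45 can be sketched as a bottom-up cost minimisation over the quadtree of transform blocks. This is illustrative only; the Lagrangian cost J = D + lambda * R is just one possible cost measure, as noted above, and cost_of stands for a lookup into the cost buffers 41 to 44.

    def rd_cost(distortion, rate, lam):
        # Lagrangian coding cost of one candidate transform block.
        return distortion + lam * rate

    def choose_division(cost_of, x=0, y=0, size=32, min_size=4, max_size=32):
        # Return (best_cost, [(x, y, size), ...]) for one basic block.
        # cost_of(x, y, size) gives the coding cost of transforming the
        # block at (x, y) with the given size. Sizes above max_size are
        # never evaluated, which is how the 32 x 32 and 16 x 16 paths
        # are skipped in the high image-quality mode.
        undivided = None
        if size <= max_size:
            undivided = (cost_of(x, y, size), [(x, y, size)])
        if size == min_size:
            return undivided
        half = size // 2
        divided_cost, divided_blocks = 0.0, []
        for dy in (0, half):
            for dx in (0, half):
                c, b = choose_division(cost_of, x + dx, y + dy, half,
                                       min_size, max_size)
                divided_cost += c
                divided_blocks += b
        if undivided is None or divided_cost < undivided[0]:
            return divided_cost, divided_blocks
        return undivided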
Fig. 4 illustrates an example of transform block division when the quantization parameter setting unit 6 selects the high-efficiency mode, and a maximum size of the transform block becomes 32. The block of 32 pixels x 32 pixels is divided into transform blocks A and D of 16 pixels x 16 pixels, transform blocks BA, BB, BC, BD, CB, and CD of 8 pixels x 8 pixels, and transform blocks CAA, CAB, CAC, CAD, CBA, CBB, CBC, and CBD of 4 pixels x 4 pixels.
Specifically, the state and the size of block division are expressed as follows: "block of 32 pixels x 32 pixels is divided into 16 pixel x 16 pixel blocks", "block A of 16 pixels x 16 pixels is not divided", "block B of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks", "block BA of 8 pixels x 8 pixels is not divided", "block BB of 8 pixels x 8 pixels is not divided", "block BC of 8 pixels x 8 pixels is not divided", "block BD of 8 pixels x 8 pixels is not divided", "block C of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks", "block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks", "block CB of 8 pixels x 8 pixels is not divided", "block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks", "block CD of 8 pixels x 8 pixels is not divided", and "block D of 16 pixels x 16 pixels is not divided". These become division information of the transform blocks.
Fig. 5 illustrates an example of transform block division when the quantization parameter setting unit 6 selects the high image-quality mode, and a maximum size of the transform block becomes 8. The block is divided into transform blocks AA, AB, AC, AD, BA, BB, BC, BD, CB, CD, DA, DB, DC, and DD of 8 pixels x 8 pixels, and transform blocks CAA, CAB, CAC, CAD, CBA, CBB, CBC, and CBD of 4 pixels x 4 pixels. The transform blocks of 16 pixels x 16 pixels illustrated in Fig. 4 are divided again into the transform blocks of 8 pixels x 8 pixels. In this case, division information of the transform blocks illustrated in Fig. 5 is as follows: "block of 32 pixels x 32 pixels is divided into 16 pixel x 16 pixel blocks", "block A of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks", "block AA of 8 pixels x 8 pixels is not divided", "block AB of 8 pixels x 8 pixels is not divided", "block AC of 8 pixels x 8 pixels is not divided", "block AD of 8 pixels x 8 pixels is not divided", "block B of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks", "block BA of 8 pixels x 8 pixels is not divided", "block BB of 8 pixels x 8 pixels is not divided", "block BC of 8 pixels x 8 pixels is not divided", "block BD of 8 pixels x 8 pixels is not divided", "block C of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks", "block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks", "block CB of 8 pixels x 8 pixels is not divided", "block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks", "block CD of 8 pixels x 8 pixels is not divided", "block D of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks", "block DA of 8 pixels x 8 pixels is not divided", "block DB of 8 pixels x 8 pixels is not divided", "block DC of 8 pixels x 8 pixels is not divided", and "block DD of 8 pixels x 8 pixels is not divided".
The division information of the transform blocks is output from the transform block division determination unit 45 through the terminal 46 to the prediction unit 8, the orthogonal transform unit 9, the motion compensation unit 19, and the block division information coding unit 21 illustrated in Fig. 1, and simultaneously input to the transform block dividing unit 47. The transform block dividing unit 47 sequentially reads, based on the division information of the transform blocks, data of the transform blocks from the buffer 32 to output it from the terminal 48 to the prediction unit 8 and the motion compensation unit 19 illustrated in Fig. 1.
Referring back to Fig. 1, the division information of the transform blocks and the pixel data of the transform blocks generated at the transform block size/division determination unit 4 are input to the prediction unit 8 and the motion compensation unit 19. The motion compensation unit 19 predicts motion of the transform blocks from the frame memory 18 based on the data of the pixel blocks, generates a motion vector and a pixel block to be referred to, and inputs them to the prediction unit 8. The motion vector is coded by the motion vector coding unit 20, and input as motion vector coded data to the multiplexing unit 12.
The prediction unit 8 executes intra prediction for the input pixel data, compares it with a prediction result of motion compensation input from the motion compensation unit 19, and selects the smallest prediction error to determine a prediction mode. The prediction unit 8 acquires a difference between the selected prediction result and the input pixel block to generate prediction error data. The prediction unit 8 inputs the generated prediction error data to the orthogonal transform unit 9. The prediction result used in this case is input to the prediction error addition unit 17. The prediction mode is output to the multiplexing unit 12.
The orthogonal transform unit 9 carries out, for the prediction error, orthogonal transform according to the division information of the transform blocks. Fig. 6 illustrates the orthogonal transform unit 9 in detail. In Fig. 6, a terminal 60 receives the division information of the transform blocks from the transform block size/division determination unit 4. A terminal 61 receives the maximum size of the transform block from the maximum transform block size determination unit 7. A controller 62 controls, as in the case of the controller 50 illustrated in Fig. 2, whether to operate a 32 x 32 transform unit 65 and a 16 x 16 transform unit 66 according to an input from the terminal 61.
A terminal 63 receives the prediction error data from the prediction unit 8. Selectors 64 and 69 select an input destination or an output destination based on the transform block size of the division information of the transform blocks input from the terminal 60.
The 32 x 32 transform unit 65 carries out DCT for the input transform blocks of 32 pixels x 32 pixels to acquire coefficient data of 32 x 32. The 16 x 16 transform unit 66 carries out DCT for the input transform blocks of 16 pixels x 16 pixels to acquire coefficient data of 16 x 16. A 8 x 8 transform unit 67 carries out DCT for the input transform blocks of 8 pixels x 8 pixels to acquire coefficient data of 8 x 8. A 4 x 4 transform unit 68 carries out DCT for the input transform blocks of 4 pixels x 4 pixels to acquire coefficient data of 4 x 4.
A terminal 70 outputs the coefficient data acquired by each transform. An operation of orthogonal transform in this configuration is described.
Before processing, the maximum size of the transform block is input through the terminal 61 to the controller 62. The controller 62 sets, when the maximum size of the transform block is 32, the 32 x 32 transform unit 65 and the 16 x 16 transform unit 66 in operational states. When the maximum size of the transform block is 8, the controller 62 sets the 32 x 32 transform unit 65 and the 16 x 16 transform unit 66 in stopped states. In this case, power supplied to the hardware is cut off.
The division information of the transform blocks is input through the terminal 60 to the selectors 64 and 69. When the input transform block size is 32, the selector 64 sets the 32 x 32 transform unit 65 as its output destination while the selector 69 sets the 32 x 32 transform unit 65 as its input destination. Similarly, when the input transform block size is 16, the selector 64 sets the 16 x 16 transform unit 66 as its output destination while the selector 69 sets the 16 x 16 transform unit 66 as its input destination. When the input transform block size is 8, the selector 64 sets the 8 x 8 transform unit 67 as its output destination while the selector 69 sets the 8 x 8 transform unit 67 as its input destination. When the input transform block size is 4, the selector 64 sets the 4 x 4 transform unit 68 as its output destination while the selector 69 sets the 4 x 4 transform unit 68 as its input destination.
The selector 64 inputs the pixel data of the transform blocks input from the terminal 63 to a transform unit of an appropriate size. The selector 69 outputs coefficient data of the transform block acquired by transform at each transform unit from the terminal 70. In this case, the 32 x 32 transform unit 65 calculates a value of an 8 x 8 portion illustrated in Fig. 3A, and other white portions are transformed to be 0. Similarly, the 16 x 16 transform unit 66 calculates a value of an 8 x 8 portion illustrated in Fig. 3B, and other white portions are transformed to be 0.
Referring back to Fig. 1, the coefficient data generated by the orthogonal transform unit 9 is input to the quantization unit 10. The quantization unit 10 carries out quantization by using a quantization parameter determined by the rate control unit 14, and generates quantization coefficient data to output it to the entropy coding unit 11. The rate control unit 14 also outputs the quantization parameter used in the processing to the entropy coding unit 11. The entropy coding unit 11 codes the input quantization parameter and the quantization coefficients. A coding method is not specifically limited. However, for example, for the quantization parameter, its difference from a quantization parameter used during last quantization of the transform blocks is subjected to Huffman coding. The quantization coefficients are arranged one-dimensionally as in the case of the H. 264 coding method, and coded by variable length coding.
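The one-dimensional arrangement mentioned above can be illustrated with a simple zig-zag scan. The actual scan order of the embodiment is not specified here, so this is only an assumed, H.264-like ordering.

    def zigzag_scan(block):
        # Arrange an n x n block of quantized coefficients into a
        # one-dimensional list by walking the anti-diagonals in
        # alternating directions, low frequencies first.
        n = len(block)
        out = []
        for s in range(2 * n - 1):
            diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
            if s % 2 == 0:
                diagonal.reverse()
            out.extend(block[i][j] for i, j in diagonal)
        return out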
The quantization coefficient data acquired by the quantization unit 10 is input to the inverse quantization unit 15. Inverse quantization is executed by using the quantization parameter to restore the coefficient data. The restored coefficient data is input to the inverse orthogonal transform unit 16 to restore the prediction error data. The restored prediction error data is input to the prediction error addition unit 17. The prediction value of the intra prediction generated by the prediction unit 8 or the prediction value acquired by the motion compensation is also input, and the two are added together and stored in a corresponding area of the frame memory 18.
The division information of the transform blocks determined by the transform block size/division determination unit 4 is coded by the block division information coding unit 21. It is presumed that the transform block division information is sequentially coded left upward, right upward, left downward, and right downward in this order. The division information is coded by allocating 1 when divided, and 0 when not divided.
First, the basic block has been divided, and hence a basic block division flag indicating the division of the basic block is set to 1. When transform of 32 pixels x 32 pixels is carried out while the basic block is not divided, the basic block division flag is set to 0. Then, information indicating whether the transform block A of 16 pixels x 16 pixels has been divided is indicated by an A block division flag. In other words, this value is 1 when divided, and 0 when not divided. When the block A is divided again into blocks of 8 pixels x 8 pixels, a block division flag indicating whether the block is divided into blocks of 4 pixels x 4 pixels is set.
Block division illustrated in Fig. 4 is set as follows:
1 block of 32 pixels x 32 pixels is divided into 16 pixel x 16 pixel blocks
0 block A of 16 pixels x 16 pixels is not divided
1 block B of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
0 block BA of 8 pixels x 8 pixels is not divided
0 block BB of 8 pixels x 8 pixels is not divided
0 block BC of 8 pixels x 8 pixels is not divided
0 block BD of 8 pixels x 8 pixels is not divided
1 block C of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
1 block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
0 block CB of 8 pixels x 8 pixels is not divided
1 block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
0 block CD of 8 pixels x 8 pixels is not divided
0 block D of 16 pixels x 16 pixels is not divided
Similarly, for block division illustrated in Fig. 5, the following transform block flags are generated:
1 block of 32 pixels x 32 pixels is divided into 16 pixel x 16 pixel blocks
1 block A of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel block
0 block AA of 8 pixels x 8 pixels is not divided
0 block AB of 8 pixels x 8 pixels is not divided
0 block AC of 8 pixels x 8 pixels is not divided
0 block AD of 8 pixels x 8 pixels is not divided
1 block B of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
0 block BA of 8 pixels x 8 pixels is not divided
0 block BB of 8 pixels x 8 pixels is not divided
0 block BC of 8 pixels x 8 pixels is not divided
0 block BD of 8 pixels x 8 pixels is not divided
1 block C of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
1 block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
0 block CB of 8 pixels x 8 pixels is not divided
1 block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
0 block CD of 8 pixels x 8 pixels is not divided
1 block D of 16 pixels x 16 pixels is divided into 8 pixel x 8 pixel blocks
0 block DA of 8 pixels x 8 pixels is not divided
0 block DB of 8 pixels x 8 pixels is not divided
0 block DC of 8 pixels x 8 pixels is not divided
0 block DD of 8 pixels x 8 pixels is not divided
An order of the codes is not limited to this. For example, the division flag of the 32 pixels x 32 pixels block indicating division into 16 pixels x 16 pixels blocks can come first, followed by the 4 bits of division flags of the 16 pixels x 16 pixels blocks. There are no transform blocks of a size smaller than 4 pixels x 4 pixels, and thus no division flag information is generated for them.
Referring to a flowchart of Fig. 11, a coding flow of the division flag is described. In step S051, determination is made as to whether the basic block has been divided into transform blocks of 16 pixels x 16 pixels. The processing proceeds to step S053 when divided (YES in step S051), and to step S052 when not divided (NO in step S051).
In step S052, since the basic block has not been divided into transform blocks of 16 pixels x 16 pixels, a code "0" is allocated to the division flag. Then, since there is no division of transform blocks, the coding is ended.
In step S053, since the basic block has been divided into transform blocks of 16 pixels x 16 pixels, a code "1" is allocated to the division flag. Then, processing in the blocks of 16 pixels x 16 pixels is carried out.
In step S054, determination is made as to whether the transform blocks of 16 pixels x 16 pixels targeted in an order from a left upper side to right lower side have been divided into transform blocks of 8 pixels x 8 pixels. The processing proceeds to step S056 when divided (YES in step S054), and to step S055 when not divided (NO in step S054).
In step S055, since the transform block of 16 pixels x 16 pixels has not been divided into transform blocks of 8 pixels x 8 pixels, a code "0" is allocated to the division flag. Then, the processing proceeds to step S061.
In step S056, since the transform block of 16 pixels x 16 pixels has been divided into transform blocks of 8 pixels x 8 pixels, a code "1" is allocated to the division flag, and the processing proceeds to step S057.
In step S057, determination is made as to whether the transform blocks of 8 pixels x 8 pixels targeted in an order from a left upper side to right lower side have been divided into transform blocks of 4 pixels x 4 pixels. The processing proceeds to step S059 when the transform blocks have been divided (YES in step S057), and to step S058 when the transform blocks have not been divided (NO in step S057).
In step S058, since the transform block of 8 pixels x 8 pixels has not been divided into transform blocks of 4 pixels x 4 pixels, a code "0" is allocated to the division flag. Then, the processing proceeds to step S060.
In step S059, since the transform block of 8 pixels x 8 pixels has been divided into transform blocks of 4 pixels x 4 pixels, a code "1" is allocated to the division flag, and the processing proceeds to step S060.
In step S060, determination is made as to whether encoding of the division flags has ended for all the transform blocks of 8 pixels x 8 pixels in the processed transform block of 16 pixels x 16 pixels. The processing proceeds to step S061 when ended (YES in step S060). When not ended (NO in step S060), the processing returns to step S057 to process next transform blocks of 8 pixels x 8 pixels.
In step S061, determination is made as to whether encoding of the division flags has ended for all the transform blocks of 16 pixels x 16 pixels in the basic block of 32 pixels x 32 pixels. The division flag encoding is ended when the encoding has ended (YES in step S061). When not ended (NO in step S061), the processing returns to step S054 to process next transform blocks of 16 pixels x 16 pixels.
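The flag generation of Fig. 11 mirrors the decoding flow of Fig. 19. A recursive sketch follows; is_divided is an assumed query into the division result produced by the transform block size/division determination unit 4, and all names are illustrative.

    def encode_division_flags(is_divided, x=0, y=0, size=32, flags=None):
        # Append the division flags of one basic block to `flags`.
        # is_divided(x, y, size) tells whether the block at (x, y) of
        # the given size was divided; 4 x 4 blocks carry no flag.
        if flags is None:
            flags = []
        if size == 4:
            return flags
        if not is_divided(x, y, size):
            flags.append(0)
            return flags
        flags.append(1)
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                encode_division_flags(is_divided, x + dx, y + dy, half, flags)
        return flags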
Thus, the coded data of the quantization parameter and the quantization coefficients from the entropy coding unit 11, the prediction mode from the prediction unit 8, and the motion vector from the motion vector coding unit 20, and the division flag encoded by the block division information coding unit 21 are input to the multiplexing unit 12. The multiplexing unit 12 lines up the coded data in a predetermined order to output it as a bit stream from the terminal 13.
The bit stream is input to the rate control unit 14, and a new quantization parameter is generated by referring to the quantization parameter set by the quantization parameter setting unit 6. For example, a range permitted to be changed from the quantization parameter set by the quantization parameter setting unit 6 is set to achieve control within the range.
A simple coding flow is described referring to the drawings. Fig. 9 is a flowchart illustrating overall coding processing. In step S001, to start the processing, a quantization parameter determining image quality is set.
In step S002, a maximum size of a transform block to be used is determined by using the set quantization parameter. For example, the set quantization parameter QP is compared with a threshold value Th. When the set quantization parameter QP is larger than the threshold value Th, the maximum size of the transform block is set to the size of the basic block. When the set quantization parameter QP is equal to or less than the threshold value Th, the maximum size of the transform block is limited to the block size at which transform coefficients are not forcibly substituted with 0.
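A minimal sketch of this decision is shown below. The threshold value, the basic block size of 32 pixels x 32 pixels, and the limited size of 8 pixels x 8 pixels are assumptions taken from the sizes discussed elsewhere in this description; the actual values are a matter of encoder design.

```python
def max_transform_block_size(qp, th=28, basic_block_size=32, limited_size=8):
    # th, basic_block_size and limited_size are illustrative assumptions.
    if qp > th:
        # Coarse quantization already discards detail, so the large transform
        # whose high-frequency coefficients are forced to 0 does little harm.
        return basic_block_size
    # With a small QP the forced zeros would be visible, so cap the size.
    return limited_size
```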
In step S003, image data of one frame is read. In step S004, the read one-frame image data is divided into basic blocks (in the present embodiment, blocks of 32 pixels x 32 pixels). In step S005, one basic block of data is read, costs for block division at each size are calculated, and a division method of the transform blocks is determined so that the total cost is minimized.
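The following sketch illustrates one possible form of the decision in step S005, assuming a recursive quadtree-style comparison: a block is kept whole when its cost is lower than the summed cost of its four quadrants. The cost function shown is only a stand-in for whatever rate-distortion cost the encoder actually computes.

```python
def toy_cost(block):
    # Stand-in cost: sum of absolute residual values plus a small fixed
    # per-block overhead that favours fewer, larger blocks (assumption).
    return sum(abs(v) for row in block for v in row) + 16

def split_into_four(block):
    # Split a square 2-D list of samples into its four quadrants.
    half = len(block) // 2
    return [[row[c:c + half] for row in block[r:r + half]]
            for r in (0, half) for c in (0, half)]

def choose_division(block, min_size=4, cost=toy_cost):
    whole = cost(block)
    if len(block) <= min_size:
        return whole, {'split': False}
    total, children = 0, []
    for quadrant in split_into_four(block):
        c, tree = choose_division(quadrant, min_size, cost)
        total += c
        children.append(tree)
    if total < whole:
        return total, {'split': True, 'children': children}
    return whole, {'split': False}
```

Called on one 32 x 32 basic block of residual samples, choose_division() returns the division tree with the minimum total cost.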
In step S006, information of the determined block division is coded. In step S007, the divided transform blocks are sequentially read, and intra prediction or motion compensation for a moving image is carried out to calculate a prediction value.
In step S008, the calculated prediction value of the transform block is compared with the input to calculate a prediction error. The prediction error is subjected to orthogonal transform of the corresponding size. After the transform, quantization is executed with the quantization parameter, and the quantization result is coded.
In step S009, inverse quantization and inverse orthogonal transform are performed on the quantization result to reproduce the prediction error. The prediction error and the prediction value calculated in step S007 are added together to generate a reproduced image, and the reproduced image is held.
In step S010, determination is made as to whether the processing from steps S007 to S009 has ended for all the transform blocks in the basic block. The processing proceeds to step S011 when the processing has ended (YES in step S010). When it has not ended (NO in step S010), steps S007 to S009 are executed targeting a next transform block.
In step S011, the coded quantization parameter, the coded data of the quantization coefficients, and other coded data are output. In step S012, determination is made as to whether the processing from steps S005 to S011 has ended for all the basic blocks in the frame. The processing proceeds to step S013 when the processing has ended (YES in step S012). When the processing has not ended (NO in step S012), steps S005 to S011 are executed targeting a next basic block.
In step S013, determination is made as to whether there is a next frame of image data. When there is a next frame (YES in step S013), the processing is executed again from step S003. The processing ends when there is no next frame (NO in step S013).
Thus, a maximum transform block size is determined during division to limit the use of a large transform block in which coefficients are forcibly set to 0. As a result, image quality deterioration can be efficiently suppressed. Further, the control is performed based on the quantization parameter, which governs the generation of the quantization coefficients, so that good image quality control can be performed.
With a large transform block, the calculation is complex, the circuit size is large, and the power consumption is high. Thus, by limiting the maximum transform block size, the calculation amount can be reduced and the power consumption can be suppressed. According to the present exemplary embodiment, the decoding side can reproduce the image by executing normal decoding.
In the present exemplary embodiment, the basic block size and the block dividing method are not limited to those described above. For example, a larger basic block size of 64 pixels x 64 pixels can be employed. As a dividing method, for example, a rectangular shape of 8 pixels x 4 pixels can be employed.
According to the present exemplary embodiment, the maximum transform block size is determined based on the quantization parameter set by the quantization parameter setting unit 6, and the quantization parameter is set by the rate control unit 14 so as not to fluctuate greatly. However, the present invention is not limited to this. As illustrated in Fig. 7, a new quantization parameter determined by a rate control unit 22 can be input to a maximum transform block size determination unit 23, and the maximum transform block size can be determined per frame or per slice based on the input quantization parameter.
In the present exemplary embodiment, the configuration of the transform block size/division determination unit 4 is as illustrated in Fig. 2. However, the configuration is not limited to this. To make the cost calculation for each division more accurate, a configuration illustrated in Fig. 8 can be employed. This is a method of generating a prediction error at each block size before the cost calculation for that block size is made. Specifically, a terminal 80 receives image data of a reference frame necessary for motion prediction from the frame memory 18 illustrated in Fig. 1. A 32 x 32 prediction unit 81 carries out prediction by intra prediction or motion compensation by 32 pixels x 32 pixels. A 32 x 32 motion compensation unit 85 searches for a motion vector by 32 pixels x 32 pixels to calculate prediction data. A 16 x 16 prediction unit 82 and a 16 x 16 motion compensation unit 86 are arranged before the 16 x 16 transform unit 34. An 8 x 8 prediction unit 83 and an 8 x 8 motion compensation unit 87 are arranged before the 8 x 8 transform unit 35. A 4 x 4 prediction unit 84 and a 4 x 4 motion compensation unit 88 are arranged before the 4 x 4 transform unit 36. Further, by supplying the generated prediction error to the subsequent processing, the prediction unit 8 and the motion compensation unit 19 illustrated in Fig. 1 can be omitted.
As shown in Fig. 10, in the determination of the maximum transform block size, not only the quantization parameter but also a quantization matrix can be taken into consideration. In step S002, comparing the set quantization parameter QP with the threshold value Th has been described as an example. However, not only the quantization parameter QP but also the quantization matrices QM[0 to i] (where i is determined by the transform block size) can be taken into consideration. In Fig. 10, a quantization matrix setting unit 110 sets the quantization matrices QM[0 to i]. A coefficient Qsum is calculated by using the set quantization matrices QM[0 to i] and the quantization parameter QP, as given by the following equation.
[Equation omitted: calculation of the coefficient Qsum from the quantization matrices QM[0 to i] and the quantization parameter QP]
The coefficient Qsum is compared with a threshold value Thm. When the coefficient Qsum is larger than the threshold value Thm, the maximum size of the transform block is set to the size of the basic block. When the coefficient Qsum is equal to or less than the threshold value Thm, the maximum size of the transform block is limited to the block size at which transform coefficients are not forcibly substituted with 0.
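The sketch below is illustrative only: the actual Qsum formula is the one given by the equation above (not reproduced in this text). The assumed form simply scales the quantization parameter by the mean entry of the supplied quantization matrices; the decision shape is the same as the QP-only rule.

```python
def qsum(qp, quantization_matrices):
    # Assumed form: QP scaled by the mean matrix entry, normalized by 16,
    # which is the entry of a flat ("neutral") matrix. Not the embodiment's formula.
    entries = [v for qm in quantization_matrices for row in qm for v in row]
    return qp * (sum(entries) / len(entries)) / 16.0

def max_size_from_qsum(qp, qms, thm, basic_size=32, limited_size=8):
    # Large Qsum -> allow the basic block size; small Qsum -> cap at the size
    # whose coefficients are not forcibly substituted with 0.
    return basic_size if qsum(qp, qms) > thm else limited_size
```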
The present exemplary embodiment has been described by taking the example of coding a moving image using differences between frames. However, the coding is not limited to this. It is apparent that the present invention can also be used for coding a still image using only intra prediction.
The first exemplary embodiment is directed to limiting the sizes of the transform blocks by using the maximum transform block size. A second exemplary embodiment is directed to controlling a coding method of block division by using a maximum transform block size.
In Fig. 12, a maximum transform block size coding unit 107 codes a maximum transform block size determined by a maximum transform block size determination unit 7. A block division information coding unit 121 is different from the block division information coding unit 21 of the first exemplary embodiment in that a maximum block size is input from the maximum transform block size determination unit 7. A multiplexing unit 112 is also included.
Referring to a flowchart of Fig. 13, processing of the block division information coding unit 121 is described. In Fig. 13, steps similar to those illustrated in Fig. 11 are denoted by similar reference numerals, and description thereof is omitted.
In step S151, determination is made as to whether the maximum size of the transform block input from the maximum transform block size determination unit 7 is 16 pixels x 16 pixels. The processing proceeds to step S054 when the maximum size is 16 pixels x 16 pixels (YES in step S151). The processing proceeds to step S152 when it is not (NO in step S151).
In step S152, determination is made as to whether the maximum size of the transform block is 8 pixels x 8 pixels. The processing proceeds to step S057 when the maximum size is 8 pixels x 8 pixels (YES in step S152). The processing proceeds to step S051 when it is not (NO in step S152).
Step S051 is executed when the maximum transform block size is 32 pixels x 32 pixels. In this case, as in the case of the first exemplary embodiment, processing from steps S051 to S061 is carried out to code a division flag.
When the maximum transform block size is determined to be 16 pixels x 16 pixels (YES in step S151), the processing from steps S054 to S061 is executed to code the division flag. Thus, for the division flag, information of divisions with respect to the transform block of 16 pixels x 16 pixels (16 pixels x 16 pixels, 8 pixels x 8 pixels, and 4 pixels x 4 pixels) is coded.
When the maximum transform block size is determined to be 8 pixels x 8 pixels (YES in step S152), the processing from steps S057 to S061 is executed to code the division flag. Thus, for the division flag, only information of divisions with respect to the transform block of 8 pixels x 8 pixels (8 pixels x 8 pixels and 4 pixels x 4 pixels) is coded.
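A sketch of this size-dependent entry point is shown below (see the example flag lists that follow). It reuses the hypothetical block-division structure of the earlier sketch and assumes that, when a smaller maximum size is signalled, division down to that size is implicit and therefore no flag is emitted for the levels above it.

```python
def code_division_flags_limited(basic_block, max_size):
    bits = []
    if max_size == 32:                       # step S051 onward, as in Fig. 11
        if not basic_block['split16']:
            return [0]
        bits.append(1)
    for b16 in basic_block['blocks16']:      # steps S054 to S061
        if max_size == 8:                    # only 8x8 -> 4x4 flags remain
            for b8 in b16['blocks8']:
                bits.append(1 if b8['split4'] else 0)
            continue
        if not b16['split8']:
            bits.append(0)
            continue
        bits.append(1)
        for b8 in b16['blocks8']:
            bits.append(1 if b8['split4'] else 0)
    return bits
```

With max_size = 8, the example of Fig. 5 yields exactly the sixteen flags listed below, and the five higher-level flags listed after them are omitted.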
By the above operation, the block division illustrated in Fig. 5 is as follows when a maximum transform block size is 8 pixels x 8 pixels:
0 block AA of 8 pixels x 8 pixels is not divided
0 block AB of 8 pixels x 8 pixels is not divided
0 block AC of 8 pixels x 8 pixels is not divided
0 block AD of 8 pixels x 8 pixels is not divided
0 block BA of 8 pixels x 8 pixels is not divided
0 block BB of 8 pixels x 8 pixels is not divided
0 block BC of 8 pixels x 8 pixels is not divided
0 block BD of 8 pixels x 8 pixels is not divided
1 block CA of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
0 block CB of 8 pixels x 8 pixels is not divided
1 block CC of 8 pixels x 8 pixels is divided into 4 pixel x 4 pixel blocks
0 block CD of 8 pixels x 8 pixels is not divided
0 block DA of 8 pixels x 8 pixels is not divided
0 block DB of 8 pixels x 8 pixels is not divided
0 block DC of 8 pixels x 8 pixels is not divided
0 block DD of 8 pixels x 8 pixels is not divided
As compared with the coding of the division flag of the first exemplary embodiment illustrated in Fig. 5, the following codes can be omitted:
1 block of 32 pixels x 32 pixels is divided into 16 pixels x 16 pixels blocks
1 block A of 16 pixels x 16 pixels is divided into 8 pixels x 8 pixels blocks
1 block B of 16 pixels x 16 pixels is divided into 8 pixels x 8 pixels blocks
1 block C of 16 pixels x 16 pixels is divided into 8 pixels x 8 pixels blocks
1 block D of 16 pixels x 16 pixels is divided into 8 pixels x 8 pixels blocks
The maximum transform block size is coded by the maximum transform block size coding unit 107. There is no specific limitation on the coding method. For example, an index "0" is allocated when the maximum transform block size is 32 pixels x 32 pixels, an index "1" when it is 16 pixels x 16 pixels, and an index "2" when it is 8 pixels x 8 pixels. Thus, the indexes can be expressed by 2-bit codes.
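A minimal sketch of this index mapping follows; the 2-bit fixed-length representation is only one of the possibilities left open by the description above.

```python
SIZE_TO_INDEX = {32: 0, 16: 1, 8: 2}
INDEX_TO_SIZE = {v: k for k, v in SIZE_TO_INDEX.items()}

def code_max_size(size):
    return format(SIZE_TO_INDEX[size], '02b')   # e.g. 16 -> "01"

def decode_max_size(bits):
    return INDEX_TO_SIZE[int(bits, 2)]
```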
For the coded maximum block size and the division flag, the multiplexing unit 112 lines up coded data in a predetermined order, and the coded data is output as a bit stream from the terminal 13.
By this method, codes indicating block divisions are determined according to the maximum transform block size during division. Thus, as in the case of the first exemplary embodiment, image quality deterioration can be efficiently suppressed with a small coding amount.
The maximum block size is determined by a quantization parameter. However, a determination method is not limited to this. Maximum block sizes can be individually set.
The first and second exemplary embodiments have been described using the example where the maximum transform block size is set either to the basic block size or to the block size at which coefficients are not forcibly set to 0. However, the maximum transform block size is not limited to these; an intermediate value, such as 16 pixels x 16 pixels, can also be set as described above.
According to the present exemplary embodiment, a maximum transform size is always set. However, the configuration is not limited to this, and a configuration illustrated in Fig. 14 can be employed. In Fig. 14, a terminal 151 receives an instruction from a user (not illustrated) as to whether to execute coding by using a maximum transform block size. A maximum transform block size determination unit 158 is different from the maximum transform block size determination unit 7 of the first exemplary embodiment in that determination of the maximum transform block size is controlled according to the instruction from the terminal 151. A maximum transform block size coding unit 157 is different from the abovementioned maximum transform block size coding unit 107 in that an input from the terminal 151 is received. A multiplexing unit 122 differs from the multiplexing unit 112 in that it receives coded data of a profile.
When an instruction to execute coding by using a maximum transform block size is given from the terminal 151, the maximum transform block size is determined, coded, and transmitted as described above. When no maximum transform block size is set from the terminal 151, the maximum transform block size determination unit 158 fixes the maximum transform block size to the basic block size of 32 pixels x 32 pixels. In this case, the maximum transform block size coding unit 157 does not operate, and the maximum transform block size is not coded. Thus, compatibility can be maintained with the conventional method.
Information regarding whether coding is executed by using the maximum transform block size can also be included in header information indicating attributes of the bit stream. For example, a supplemental enhancement information (SEI) message defined in the H.264 coding method, or a profile using the present invention, can be used. In Fig. 15, a profile setting unit 160 sets a profile (e.g., an extension profile) when a maximum transform block size is set from the terminal 151 to execute coding, and sets a basic profile when such coding is not executed. Codes for identifying these profiles are generated, input to the multiplexing unit 122, and embedded in the bit stream.
Thus, whether control based on the maximum block size is executed becomes clear at an early stage of the bit stream. If this is known at an early stage of the bit stream, power saving of the transform unit can be started early.
Fig. 16 is a block diagram illustrating a configuration of an image decoding apparatus according to a third exemplary embodiment of the present invention. The present exemplary embodiment is described by an example of decoding of the coded data generated in the first exemplary embodiment.
In Fig. 16, a terminal 201 receives the bit stream generated in the first exemplary embodiment. A separation unit 202 separates the bit stream into individual codes by processing inverse to that of the multiplexing unit 12 of the first exemplary embodiment illustrated in Fig. 1. A quantization parameter decoding unit 203 receives a code of a quantization parameter from the separation unit 202 and decodes it. A maximum transform block size setting unit 204 calculates a maximum transform block size from the decoded quantization parameter. An entropy decoding unit (frequency decoding unit) 205 decodes coded data of quantization coefficients. An inverse quantization unit (inverse frequency transform unit) 206 executes inverse quantization by using the quantization parameter decoded by the quantization parameter decoding unit 203. An inverse orthogonal transform unit 207 inversely transforms the coefficient data acquired by the inverse quantization to reproduce prediction error data. A prediction unit 208 decodes a prediction code separated by the separation unit 202, and reads a decoded image from a frame memory 213 to execute prediction. The prediction unit 208 functions in the same manner as the prediction unit 8 of the first exemplary embodiment illustrated in Fig. 1.
A motion vector decoding unit 209 decodes a motion vector code and reproduces a motion vector, which is input to a motion compensation unit 210. The motion compensation unit 210 executes motion compensation by reading from the frame memory 213 based on the reproduced motion vector. A prediction error addition unit 211 adds together the prediction data generated by the prediction unit 208 and the prediction error data reproduced by the inverse orthogonal transform unit 207 to generate a decoded image. A basic block synthesizing unit 212 synthesizes the reproduced image data of the transform blocks acquired by decoding into a basic block. The frame memory 213 stores the decoded image. A block division information decoding unit 214 decodes the data coded by the block division information coding unit 21 of the first exemplary embodiment illustrated in Fig. 1. A terminal 215 outputs the decoded image data to the outside.
In the above configuration, the bit stream is input through the terminal 201 to the separation unit 202. The separation unit 202 transmits the code regarding the quantization parameter to the quantization parameter decoding unit 203. The coded data of the quantized coefficients is transmitted to the entropy decoding unit 205. The coded data relating to the prediction mode and the prediction method is transmitted to the prediction unit 208. The coded data indicating the block division information is input to the block division information decoding unit 214.
The coded data of the quantization parameter is read and decoded first to reproduce the quantization parameter QP. As in the case of the maximum transform block size determination unit 7 of the first exemplary embodiment, when the value of the decoded quantization parameter QP is QL, the maximum transform block size setting unit 204 sets the maximum transform block size to 32 pixels x 32 pixels. When the value of the decoded quantization parameter QP is QS, the maximum transform block size setting unit 204 sets the maximum transform block size to 8 pixels x 8 pixels. The maximum transform block size is input to the entropy decoding unit 205, the inverse quantization unit 206, the inverse orthogonal transform unit 207, the motion compensation unit 210, and the basic block synthesizing unit 212.
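A brief sketch of this decoder-side setting follows, under the assumption that the decoder applies the same threshold rule as the encoder (QL and QS denote quantization parameter values above and below that threshold, respectively); no extra syntax is needed because the size is derived from the decoded QP itself.

```python
def set_max_transform_block_size(decoded_qp, th):
    # Same threshold rule as the encoder sketch; th is an assumed value shared
    # between the encoder and decoder, not a syntax element.
    return 32 if decoded_qp > th else 8
```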
The coded data of the quantized coefficients is input to the entropy decoding unit 205 and decoded to reproduce the quantization result. The quantization result is input to the inverse quantization unit 206, and inverse quantization is carried out by using the quantization parameter decoded by the quantization parameter decoding unit 203 to acquire coefficient data. The coefficient data is input to the inverse orthogonal transform unit 207. The inverse orthogonal transform unit 207 executes a transform similar to that of the inverse orthogonal transform unit of the first exemplary embodiment illustrated in Fig. 1 to reproduce a prediction error. The reproduced prediction error is input to the prediction error addition unit 211.
The inverse orthogonal transform unit 207 is described in detail referring to Fig. 17. In Fig. 17, a terminal 260 receives the division information of the transform blocks from the block division information decoding unit 214 illustrated in Fig. 16. A terminal 261 receives the maximum size of the transform block from the maximum transform block size setting unit 204 illustrated in Fig. 16. A controller 262 performs control, as in the case of the controller 50 illustrated in Fig. 2, to determine whether to operate a 32 x 32 inverse transform unit 265 and a 16 x 16 inverse transform unit 266 according to the input from the terminal 261.
A terminal 263 receives the coefficient data from the inverse quantization unit 206 illustrated in Fig. 16. Selectors 264 and 269 select an input destination or an output destination based on the transform block size given by the division information of the transform blocks input from the terminal 260.
A 32 x 32 inverse transform unit 265 carries out inverse DCT on the input transform blocks of 32 pixels x 32 pixels to acquire image data of 32 x 32. A 16 x 16 inverse transform unit 266 carries out inverse DCT on the input transform blocks of 16 pixels x 16 pixels to acquire image data of 16 x 16. An 8 x 8 inverse transform unit 267 carries out inverse DCT on the input transform blocks of 8 pixels x 8 pixels to acquire image data of 8 x 8. A 4 x 4 inverse transform unit 268 carries out inverse DCT on the input transform blocks of 4 pixels x 4 pixels to acquire image data of 4 x 4.
A terminal 270 outputs the prediction error data acquired by each inverse transform. An operation of the inverse orthogonal transform in this configuration is described. Before the processing starts, the maximum size of the transform block is input through the terminal 261 to the controller 262. When the maximum size of the transform block is 32, the controller 262 sets the 32 x 32 inverse transform unit 265 and the 16 x 16 inverse transform unit 266 in operational states. When the maximum size of the transform block is 8, the controller 262 sets the 32 x 32 inverse transform unit 265 and the 16 x 16 inverse transform unit 266 in stopped states. In this case, power supplied to the corresponding hardware is cut off.
The division information of the transform blocks is input through the terminal 260 to the selectors 264 and 269. When the input transform block size is 32, the selector 264 sets the 32 x 32 inverse transform unit 265 as its output destination while the selector 269 sets the 32 x 32 inverse transform unit 265 as its input destination. Similarly, when the input transform block size is 16, the selector 264 sets the 16 x 16 inverse transform unit 266 as its output destination while the selector 269 sets the 16 x 16 inverse transform unit 266 as its input destination. When the input transform block size is 8, the selector 264 sets the 8 x 8 inverse transform unit 267 as its output destination while the selector 269 sets the 8 x 8 inverse transform unit 267 as its input destination. When the input transform block size is 4, the selector 264 sets the 4 x 4 inverse transform unit 268 as its output destination while the selector 269 sets the 4 x 4 inverse transform unit 268 as its input destination.
The selector 264 inputs the coefficient data of the transform blocks input from the terminal 263 to the inverse transform unit of the appropriate size. The selector 269 outputs the prediction error of the transform block acquired by the inverse transform at each inverse transform unit from the terminal 270.
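A small sketch of this gating and routing is shown below: inverse transform units larger than the signalled maximum size are never enabled (in hardware, their power would be cut), and each coefficient block is routed to the unit matching its own size. The inverse_dct mapping is a placeholder for the size-specific inverse transform implementations and is not part of the embodiment.

```python
class InverseTransformStage:
    def __init__(self, max_transform_size, inverse_dct):
        # inverse_dct: {block_size: callable} placeholder transforms.
        self.inverse_dct = inverse_dct
        self.enabled = {s for s in (32, 16, 8, 4) if s <= max_transform_size}

    def run(self, coeff_block, block_size):
        # Selector 264 routes the block in; selector 269 takes the result out.
        if block_size not in self.enabled:
            raise ValueError("block size exceeds the signalled maximum")
        return self.inverse_dct[block_size](coeff_block)
```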
Referring back to Fig. 16, the prediction unit 208 receives and decodes the codes of the prediction method and the prediction mode. In the case of intra prediction, pixels used for prediction are selected from the frame memory 213 according to the prediction mode to reproduce the prediction data. In the case of motion compensation, the motion vector decoding unit 209 decodes the coded data of the motion vector. Based on this motion vector, the motion compensation unit 210 generates prediction data from the frame memory 213.
On the other hand, the block division information decoding unit 214 decodes the division flag indicating block divisions separated by the separation unit 202 to acquire the division status in the basic block. The prediction data is input to the prediction error addition unit 211 and added to the prediction error to acquire reproduced images. The basic block synthesizing unit 212 synthesizes the reproduced images into a basic block based on the division status in the basic block. Then, the reproduced images are stored by basic blocks in the frame memory 213. The reproduced images are output via the terminal 215.
Fig. 18 is a flowchart illustrating image decoding in the image decoding apparatus according to the third exemplary embodiment. In step S201, a value of a quantization parameter QP of a first block of a frame is decoded.
In step S202, when the value of the decoded quantization parameter QP is QL, its maximum transform block size is set to 32 pixels x 32 pixels. When the value of the decoded quantization parameter QP is QS, its maximum transform block size is set to 8 pixels x 8 pixels.
In step S203, block division information of the basic block is decoded by a method described below to acquire sizes and positions of the transform blocks. In step S204, prediction data is generated by intra prediction or motion compensation.
In step S205, coded data of coefficients is decoded to acquire a quantization result of a prediction error.
In step S206, the acquired quantization result of the prediction error is inversely quantized, and inversely transformed to reproduce the prediction error. The reproduced prediction error is added to the prediction data generated in step S204 to restore image data.
In step S207, the restored image data of the transform blocks is synthesized into basic blocks based on the block division information acquired in step S203.
In step S208, determination is made as to whether processing has ended for all the transform blocks in the basic block. The processing proceeds to step S209 when decoding has ended for all the transform blocks (YES in step S208). The processing proceeds to step S204 to process a next transform block when decoding has not ended (NO in step S208).
In step S209, determination is made as to whether the processing has ended for all the basic blocks in the frame. The processing proceeds to step S210 when decoding has ended for all the basic blocks (YES in step S209). The processing proceeds to step S203 to process a next basic block when decoding has not ended (NO in step S209).
In step S210, decoded image data of one frame is output. In step S211, determination is made as to whether there is a frame to be decoded next. The image decoding is ended when the decoding has ended for all the frames (YES in step S211). The processing proceeds to step S201 to process a next frame when the decoding has not ended for all the frames (NO in step S211).
Fig. 19 is a flowchart illustrating a decoding flow of the division flag. In step S251, determination is made as to whether an input division flag is "0" or "1". The processing proceeds to step S252 when it is "0". The processing proceeds to step S253 when it is not "0".
In step S252, the basic block is not divided. Transform blocks are processed by 32 pixels x 32 pixels and the decoding of the division flag is ended.
In step S253, since the division flag is "1", it is determined that the basic block has been divided into transform blocks of 16 pixels x 16 pixels.
In step S254, determination is made as to whether a division flag input next is "0" or "1". The processing proceeds to step S255 when it is "0". The processing proceeds to step S256 when it is not "0".
In step S255, the transform blocks of 16 pixels x 16 pixels are not divided. The transform blocks are processed by 16 pixels x 16 pixels, and the processing proceeds to step S261.
In step S256, since the division flag is "1", it is determined that the transform blocks have been divided into transform blocks of 8 pixels x 8 pixels. In step S257, determination is made as to whether a division flag input next is "0" or "1". The processing proceeds to step S258 when it is "0". The processing proceeds to step S259 when it is not "0".
In step S258, the transform blocks of 8 pixels x 8 pixels are not divided. The transform blocks are processed by 8 pixels x 8 pixels, and the processing proceeds to step S260. In step S259, since the division flag is "1", it is determined that the transform blocks have been divided into transform blocks of 4 pixels x 4 pixels. In step S260, determination is made as to whether decoding has ended for all the transform blocks of 8 pixels x 8 pixels in the transform block of 16 pixels x 16 pixels. The processing proceeds to step S261 when the decoding has ended (YES in step S260). The processing proceeds to step S257 to process the next transform block of 8 pixels x 8 pixels when the decoding has not ended (NO in step S260).
In step S261, determination is made as to whether decoding has ended for all the transform blocks of 16 pixels x 16 pixels in the basic block. The decoding of the division flag is ended when the decoding has ended (YES in step S261). The processing proceeds to step S254 to process the next transform block of 16 pixels x 16 pixels when the decoding has not ended (NO in step S261). With this configuration and operation, for the bit stream generated in the first exemplary embodiment, the calculation amount and power consumption can be reduced on the decoding side as well by setting the maximum transform block size based on the quantization parameter.
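The following sketch mirrors the encoder-side sketch given earlier and illustrates the decoding flow of Fig. 19: the division flags are consumed in coding order and the nested division structure of one basic block is rebuilt. The fixed count of four sub-blocks per level and the dictionary layout are illustrative assumptions.

```python
def decode_division_flags(read_bit):
    """read_bit() returns the next division flag (0 or 1) from the bit stream."""
    if read_bit() == 0:                    # steps S251/S252
        return {'split16': False}
    basic = {'split16': True, 'blocks16': []}
    for _ in range(4):                     # four 16x16 blocks (S254 to S261)
        if read_bit() == 0:                # step S255
            basic['blocks16'].append({'split8': False})
            continue
        b16 = {'split8': True, 'blocks8': []}
        for _ in range(4):                 # four 8x8 blocks (S257 to S260)
            b16['blocks8'].append({'split4': read_bit() == 1})
        basic['blocks16'].append(b16)
    return basic
```

For example, the flag sequence coded for Fig. 5 could be consumed as bits = iter([1, 1, 0, ...]); decode_division_flags(lambda: next(bits)).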
Moreover, in the inverse quantization unit 206, the inverse orthogonal transform unit 207, and the motion compensation unit 210, hardware resources such as a memory can be effectively utilized by adaptively securing buffer sizes. The motion compensation unit 210 acquires the maximum transform block size to adaptively secure buffers when reading out the data from the frame memory 213.
Fig. 20 is a block diagram illustrating a configuration of an image decoding apparatus according to a fourth exemplary embodiment of the present invention. According to the present exemplary embodiment, as an example, the coded data generated in the second exemplary embodiment is decoded.
In Fig. 20, blocks having functions similar to those of the third exemplary embodiment illustrated in Fig. 16 are denoted by similar reference numerals, and description thereof is omitted.
A separation unit 302 separates the data into codes by processing inverse to that of the multiplexing unit 112 of the second exemplary embodiment illustrated in Fig. 12. A maximum transform block size decoding unit 304 decodes the code indicating the maximum transform block size separated by the separation unit 302. A block division information decoding unit 314 acquires information of the maximum transform block size from the maximum transform block size decoding unit 304 to decode the block division information.
An image decoding operation in the above configuration is described. A bit stream is input through the terminal 201 to the separation unit 302. The separation unit 302 receives header information, the code of the quantization parameter, the code of the maximum transform block size, the code of the coefficient data, the code of the motion vector, and the code of the block division information, and separates these codes to output them to the subsequent stages. The code of the maximum transform block size is input to the maximum transform block size decoding unit 304 and decoded. In the second exemplary embodiment, the size is coded as an index. This code is decoded to acquire information indicating whether the maximum transform block size is 32 pixels x 32 pixels, 16 pixels x 16 pixels, or 8 pixels x 8 pixels. The acquired information of the maximum transform block size is input to each related portion and to the block division information decoding unit 314.
The block division information decoding unit 314 receives the maximum transform block size from the maximum transform block size decoding unit 304, and the division flag indicating the block division information from the separation unit 302.
Fig. 21 is a flowchart illustrating detailed processing. In Fig. 21, steps having functions similar to those illustrated in Fig. 19 are denoted by similar reference numerals, and description thereof is omitted.
In step S301, determination is made as to whether a maximum transform block size is 16 pixels x 16 pixels. The processing proceeds to step S254 when it is 16 pixels x 16 pixels (YES in step S301), and to step S302 when it is not 16 pixels x 16 pixels (NO in step S301).
In step S302, determination is made as to whether a maximum transform block size is 8 pixels x 8 pixels. The processing proceeds to step S257 when it is 8 pixels x 8 pixels (YES in step S302), and to step S251 when it is not 8 pixels x 8 pixels (NO in step S302).
With this operation, the processing from steps S251 to S261 is executed when the maximum transform block size is 32 pixels x 32 pixels. The processing from steps S254 to S261 is executed when the maximum transform block size is 16 pixels x 16 pixels. The processing from steps S257 to S261 is executed when the maximum transform block size is 8 pixels x 8 pixels.
With the above configuration and operation, for the bit stream generated in the second exemplary embodiment, the calculation amount and power consumption can be reduced on the decoding side as well by setting the maximum transform block size based on the quantization parameter. In other words, since the maximum transform block size is coded, effects similar to those of the first exemplary embodiment can be acquired with even shorter codes.
The second exemplary embodiment is directed to the method for including the information regarding whether to code the data by using the maximum transform block size in the header information indicating the attributes of the bit stream. This code can also be decoded by a configuration illustrated in Fig. 22. Specifically, a separation unit 402 separates the code for identifying a profile embedded in the bit stream and inputs it to a profile decoding unit 453. The profile decoding unit 453 decodes the profile code, which indicates either a profile (e.g., an extension profile) that uses the maximum transform block size code or a basic profile that does not. In the case of the extension profile, a maximum transform block size decoding unit 404 is operated to acquire the maximum transform block size. Otherwise, the maximum transform block size is fixed at 32 and output, whereby similar effects are acquired.
In the exemplary embodiments, each processing unit is hardware. However, the processing carried out at each processing unit can be realized by a computer program.
Fig. 23 is a block diagram illustrating a hardware configuration example of a computer applicable to the image processing apparatus according to each of the exemplary embodiments.
A central processing unit (CPU) 2301 controls the computer overall by using a computer program or data stored in a random access memory (RAM) 2302 or a read-only memory (ROM) 2303, and executes each of the abovementioned processes carried out by the image processing apparatus according to each exemplary embodiment. In other words, the CPU 2301 functions as each processing unit.
The RAM 2302 has an area for temporarily storing a computer program or data loaded from an external storage device 2306, or data acquired from the outside via an interface (I/F) 2307. The RAM 2302 also has a work area used when the CPU 2301 executes various processes. In other words, for example, the RAM 2302 can be allocated to a frame memory, or can appropriately provide various other areas.
The ROM 2303 stores setting data of the computer or a boot program. An operation unit 2304 includes a keyboard and a mouse. A user of the computer can input various instructions to the CPU 2301 by operating the operation unit 2304. A display unit 2305 displays a processing result of the CPU 2301. The display unit 2305 includes, for example, a hold type display such as a liquid crystal display or an impulse type display such as a field emission type display.
The external storage device 2306 is a large-capacity information storage device represented by a hard disk drive. The external storage device 2306 stores an operating system (OS) or the computer program for causing the CPU 2301 to execute the function of each unit. The external storage device 2306 can store each image data as a processing target.
The computer program or the data stored in the external storage device 2306 is appropriately loaded, under control of the CPU 2301, to the RAM 2302 to become a processing target of the CPU 2301. A network such as a local area network (LAN) or the Internet, and other devices such as a projector or a display can be connected to the I/F 2307. The computer can acquire or transmit various pieces of information via the I/F 2307. A bus 2308 interconnects the units.
In the above configuration, the operations of the flowcharts are controlled mainly by the CPU 2301.
Other Embodiments
An object of the present invention can also be achieved by supplying a recording medium that records the codes of a computer program realizing the abovementioned functions to a system, and by the system reading and executing the codes of the computer program. In this case, the codes of the computer program read from the storage medium themselves realize the functions of the exemplary embodiments, and the storage medium storing the codes of the computer program falls within the present invention. The case where, based on instructions of the codes of the computer program, the OS operating on the computer executes some or all of the actual processes to realize the functions is also within the present invention.
The functions can also be achieved as follows. The codes of the computer program read from the storage medium are written in a memory included in a function extension card inserted into the computer or a function extension unit connected to the computer. Based on instructions of the codes of the computer program, a CPU included in the function extension card or the function extension unit executes some or all of actual processes to achieve the functions. This is also within the present invention.
When the present invention is applied to the storage medium, the storage medium stores the codes of the computer program corresponding to the flowcharts.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all modifications, equivalent structures, and functions.
This application claims priority from Japanese Patent Application No. 2011-004645 filed January 13, 2011, which is hereby incorporated by reference herein in its entirety.

Claims (10)

  1. An image coding apparatus comprising:
    a setting unit configured to set a coding control parameter;
    a first transform unit configured to transform a frequency of pixel data of a block having a first size, and execute transform by substituting transform coefficients with predetermined values except a part of the transform coefficients;
    a second transform unit configured to transform a frequency of pixel data of a block having a second size smaller than the first size;
    a block size determination unit configured to limit use of the first transform unit based on the coding control parameter; and
    a coding unit configured to control and code one of outputs of the first transform unit and the second transform unit based on a determined block size.
  2. The image coding apparatus according to claim 1, wherein the block size determination unit compares a value of the coding control parameter with a threshold value, uses only the block of the second size when the value of the coding control parameter is smaller than the threshold value, and determines use of the blocks of the first and second sizes when otherwise.
  3. The image coding apparatus according to claim 1, wherein the coding unit selects a block for coding based on a result of the block size determination unit.
  4. The image coding apparatus according to claim 1, wherein the coding control parameter is a value calculated from a quantization step.
  5. The image coding apparatus according to claim 1, wherein information regarding whether to control the block size based on the coding control parameter is coded.
  6. An image coding method for controlling an image coding apparatus, comprising:
    setting a coding control parameter;
    executing first transform to transform a frequency of pixel data of a block having a first size, and executing transform by substituting transform coefficients with predetermined values except a part of the transform coefficients;
    executing second transform to transform a frequency of pixel data of a block having a second size smaller than the first size;
    determining a block size to limit use of the first transform based on the coding control parameter; and
    controlling and coding one of outputs of the first transform and the second transform based on a determined block size.
  7. An image decoding apparatus comprising:
    a decoding unit configured to decode coded data indicating a quantization operation to restore a coding control parameter;
    a frequency decoding unit configured to decode coded data indicating coefficients in one of a block and a redivided block within the block;
    a first inverse transform unit configured to inversely transform a frequency of data of a block having a first size to generate pixel data;
    a second inverse transform unit configured to inversely transform a frequency of data of a block having a second size smaller than the first size to generate pixel data;
    a block size restoration unit configured to acquire a coded block size based on the coding control parameter; and
    a decoding unit configured to control one of inputs to the first inverse transform unit and the second inverse transform unit based on a result of the frequency decoding unit and the acquired block size to restore image data.
  8. An image decoding method for controlling an image decoding apparatus comprising:
    decoding coded data indicating a quantization operation to restore a coding control parameter;
    decoding coded data indicating coefficients in one of a block and a redivided block within the block;
    executing first inverse transform to inversely transform a frequency of data of a block having a first size to generate pixel data;
    executing second inverse transform to inversely transform a frequency of data of a block having a second size smaller than the first size to generate pixel data;
    executing block size restoration to acquire a coded block size based on the coding control parameter; and
    controlling one of inputs to the first inverse transform and the second inverse transform based on a result of the frequency decoding and the acquired block size to restore image data.
  9. A program read and executed by a computer to cause the computer to function as the image coding apparatus according to claim 1.
  10. A program read and executed by a computer to cause the computer to function as the image decoding apparatus according to claim 7.
PCT/JP2012/000154 2011-01-13 2012-01-12 Image encoding apparatus, image encoding method, program, image decoding apparatus, image decoding method, and program WO2012096184A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011004645A JP2012147290A (en) 2011-01-13 2011-01-13 Image coding apparatus, image coding method, program, image decoding apparatus, image decoding method, and program
JP2011-004645 2011-01-13

Publications (1)

Publication Number Publication Date
WO2012096184A1 true WO2012096184A1 (en) 2012-07-19

Family

ID=46507089

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/000154 WO2012096184A1 (en) 2011-01-13 2012-01-12 Image encoding apparatus, image encoding method, program, image decoding apparatus, image decoding method, and program

Country Status (2)

Country Link
JP (1) JP2012147290A (en)
WO (1) WO2012096184A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111316641A (en) * 2018-05-03 2020-06-19 Lg电子株式会社 Method and apparatus for decoding image using transform according to block size in image encoding system

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6337380B2 (en) * 2013-07-31 2018-06-06 サン パテント トラスト Image encoding method and image encoding apparatus
JP6176044B2 (en) * 2013-10-07 2017-08-09 日本電気株式会社 Block structure determination circuit and information compression circuit
JP6248783B2 (en) * 2014-04-24 2017-12-20 富士通株式会社 Moving picture coding apparatus, moving picture coding method, and moving picture coding computer program
US20210006836A1 (en) * 2018-03-30 2021-01-07 Sony Corporation Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04211575A (en) * 1990-04-27 1992-08-03 Ricoh Co Ltd Orthogonal convertion operation device
JPH11146403A (en) * 1997-11-12 1999-05-28 Matsushita Electric Ind Co Ltd Device and method for video signal coding and video signal coding program storage medium
WO2006028088A1 (en) * 2004-09-08 2006-03-16 Matsushita Electric Industrial Co., Ltd. Motion image encoding method and motion image decoding method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04211575A (en) * 1990-04-27 1992-08-03 Ricoh Co Ltd Orthogonal convertion operation device
JPH11146403A (en) * 1997-11-12 1999-05-28 Matsushita Electric Ind Co Ltd Device and method for video signal coding and video signal coding program storage medium
WO2006028088A1 (en) * 2004-09-08 2006-03-16 Matsushita Electric Industrial Co., Ltd. Motion image encoding method and motion image decoding method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111316641A (en) * 2018-05-03 2020-06-19 Lg电子株式会社 Method and apparatus for decoding image using transform according to block size in image encoding system
US11206403B2 (en) 2018-05-03 2021-12-21 Lg Electronics Inc. Method and apparatus for decoding image by using transform according to block size in image coding system
US11647200B2 (en) 2018-05-03 2023-05-09 Lg Electronics Inc. Method and apparatus for decoding image by using transform according to block size in image coding system

Also Published As

Publication number Publication date
JP2012147290A (en) 2012-08-02

Similar Documents

Publication Publication Date Title
EP4114003B1 (en) Image coding apparatus, method for coding image, program therefor, image decoding apparatus, method for decoding image, and program therefor
KR101217400B1 (en) Image encoding device, image decoding device, image encoding method and image decoding method
US10075725B2 (en) Device and method for image encoding and decoding
US9621900B1 (en) Motion-based adaptive quantization
WO2012096184A1 (en) Image encoding apparatus, image encoding method, program, image decoding apparatus, image decoding method, and program
JP6502739B2 (en) Image coding apparatus, image processing apparatus, image coding method
US10003803B1 (en) Motion-based adaptive quantization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12733872

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12733872

Country of ref document: EP

Kind code of ref document: A1