WO2011121715A1 - Image decoding method

Info

Publication number: WO2011121715A1
Application number: PCT/JP2010/055640
Authority: WO
Inventors: 竹島　秀則; 浅野　渉
Original assignee: 株式会社東芝
Priority date: 2010-03-30
Filing date: 2010-03-30
Publication date: 2011-10-06
Also published as: CN102484716A; JPWO2011121843A1; WO2011121843A1; KR20120043014A

Abstract

An image decoding method comprising: a first step of inputting coded data that includes (1) higher-order coded pattern data indicating whether a prediction residual signal is present within an upper block consisting of the 1st to Nth (N > 1) lower blocks, (2) lower-order coded pattern data indicating whether a prediction residual signal is present in each of the 1st to (N-1)th lower blocks, and (3) the prediction residual signal for each lower block; a second step of setting a predetermined value as the lower-order coded pattern data of the Nth lower block when the higher-order coded pattern data indicates that a prediction residual signal is present within the upper block and the lower-order coded pattern data of the 1st to (N-1)th lower blocks form a specific combination; and a third step of obtaining a decoded image from the prediction residual signal in the coded data for those of the 1st to Nth lower blocks in which, according to the lower-order coded pattern data, a prediction residual signal is present.

Description

Image decoding method

The present invention relates to an image decoding method, for example a method for decoding an image stored on a disk or for receiving and reproducing an image delivered by broadcasting or streaming.

H.264 (see, for example, Non-Patent Document 1) is known as one video coding technology. H.264 is a standard that defines a decoding method. In H.264, the residual (prediction error) of intra-frame or inter-frame prediction is transformed by a predetermined orthogonal transform, and the transform coefficients are quantized and encoded. Depending on the encoding mode, the orthogonal transform is applied to either 4 × 4 pixel blocks or 8 × 8 pixel blocks. Quantization often makes the transform coefficients zero. Therefore, to reduce the code amount, H.264 encodes a 1-bit flag indicating whether all transform coefficients are zero, and skips the encoding of the transform coefficients when they are all zero. A flag value of 0 indicates that all coefficients are zero, and a value of 1 indicates that at least one coefficient is non-zero.

In H.264, the flag for the transform coefficients of a 4 × 4 pixel block is called the CBF (coded block flag), and the flag for a block of 8 × 8 pixels or larger is called the CBP (coded block pattern). Depending on the encoding mode, the CBF indicates either whether all transform coefficients are zero or whether all transform coefficients other than the first one (the DC component) are zero. Four 4 × 4 pixel blocks are treated as the subdivisions of one 8 × 8 pixel block, and the flag (CBP) for the 8 × 8 pixel block is set to 0 when the transform coefficients of all four 4 × 4 pixel blocks are 0. A CBP is also encoded for each 16 × 16 pixel block; it represents whether the transform coefficients are all zero for the four 8 × 8 pixel luminance blocks and the two chrominance blocks corresponding to the 16 × 16 pixel block. Thus the information indicating whether the transform coefficients are 0 (CBP and CBF) is encoded hierarchically.
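The hierarchy described above can be illustrated with a minimal Python sketch. It is not taken from the H.264 specification: the function names `coded_block_flag` and `luma_flags`, and the assumption that the input is already grouped as four 8 × 8 blocks of four 4 × 4 coefficient lists each, are illustrative choices. The mode in which the DC component is excluded is not modeled.

```python
from typing import List, Sequence, Tuple


def coded_block_flag(coeffs: Sequence[int]) -> int:
    """Return 1 if a quantized 4x4 block has any non-zero coefficient, else 0."""
    return int(any(c != 0 for c in coeffs))


def luma_flags(
    blocks_8x8: Sequence[Sequence[Sequence[int]]],
) -> Tuple[List[int], List[List[int]]]:
    """Derive the per-8x8 CBP bits and per-4x4 CBFs for one 16x16 luma block.

    blocks_8x8[k][j] is assumed to hold the 16 quantized coefficients of the
    j-th 4x4 block inside the k-th 8x8 block (hypothetical input layout).
    """
    cbf = [[coded_block_flag(block) for block in group] for group in blocks_8x8]
    # An 8x8 CBP bit is 0 only when all four of its 4x4 CBFs are 0.
    cbp = [int(any(group)) for group in cbf]
    return cbp, cbf


zero = [0] * 16
nonzero = [0] * 15 + [3]
cbp, cbf = luma_flags([[zero] * 4, [zero, nonzero, zero, zero], [zero] * 4, [nonzero] * 4])
print(cbp)  # [0, 1, 0, 1]
```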

In some encoding modes, the CBFs are always 0 when the CBP is 0. Even in these modes, H.264 can encode the combination in which the CBP is 1 and all CBFs are 0. Assigning a code to such a combination of CBP and CBF is redundant, and eliminating this redundancy can improve the compression rate.
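The size of this redundancy can be checked by enumeration. The following Python sketch is only an illustration; the helper name `valid_lower_patterns` is hypothetical. It lists the lower-flag patterns that can actually occur when the upper flag is 1, assuming the upper flag is 1 exactly when at least one lower flag is 1.

```python
from itertools import product


def valid_lower_patterns(n):
    """Lower-flag patterns that can occur when the upper flag is 1."""
    return [p for p in product((0, 1), repeat=n) if any(p)]


for n in (2, 4):
    total = 2 ** n
    valid = len(valid_lower_patterns(n))
    print(f"N={n}: {total} patterns, {valid} can occur with upper flag 1")
# N=2: 4 patterns, 3 can occur with upper flag 1
# N=4: 16 patterns, 15 can occur with upper flag 1
```

In both cases exactly one pattern, the all-zero one, never needs a code.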

Even when CBF and CBP are not encoded as in H.264, the same redundancy arises whenever CBP is encoded hierarchically. In addition, in H.264 inter prediction, mode information indicating whether skip prediction is used is encoded with 1 bit. In a future extension of H.264, the same redundancy would arise if this skip flag were encoded hierarchically. The redundancy in hierarchical skip flag encoding can be resolved by the same means that the present invention uses for the CBP and CBF redundancy in H.264.

An object of the present invention is to provide an image decoding method for decoding data compressed by encoding means that eliminates the redundancy in the encoding of hierarchical CBP, CBF, and skip flags.

In order to solve the above-described problem, the image decoding method of the present invention comprises: a step of inputting a bitstream containing entropy-encoded image data that includes prediction information in units of blocks, coded pattern data indicating the presence or absence of a prediction residual signal in units of blocks, and prediction residual signals in units of blocks; a step of decoding the bitstream to obtain higher-order coded pattern data when the information included in the bitstream is higher-order coded pattern data; a step of, when the information included in the bitstream is lower-order coded pattern data, that is, coded pattern data belonging to the higher-order coded pattern data, setting a predetermined value as the lower-order coded pattern data if the already acquired higher-order coded pattern data and lower-order coded pattern data form a specific combination, and decoding the bitstream to obtain the next lower-order coded pattern data if they do not; a step of determining, from the coded pattern data, whether a prediction residual signal exists in units of blocks; a step of decoding the bitstream to obtain prediction information in units of blocks when the information included in the bitstream is prediction information in units of blocks; a step of decoding the bitstream to obtain a prediction residual signal in units of blocks when the information included in the bitstream is a prediction residual signal in units of blocks; and a step of decoding an image signal using the prediction information in units of blocks and the prediction residual signal in units of blocks.

According to the present invention, it is possible to provide an image decoding method for decoding data compressed by encoding means that eliminates the redundancy in the encoding of hierarchical CBP, CBF, and skip flags.

FIG. 1 is a diagram showing a pixel block to be encoded or decoded and the direction of the encoding process.
FIG. 2 is a diagram showing the hierarchical CBP of the present embodiment (when the lower-order data is 4 bits).
FIG. 3 is a diagram showing the simple hierarchical CBP corresponding to FIG. 2.
FIG. 4 is a diagram showing the hierarchical CBP of the present embodiment (when the lower-order data is 2 bits).
FIG. 5 is a flowchart showing an example of the operation of CBP encoding in units of blocks in the present embodiment.
FIG. 6 is a block diagram of an encoding apparatus that performs the operation of FIG. 5.
FIG. 7 is a flowchart showing an example of the operation of CBP decoding in units of blocks in the present embodiment.
FIG. 8 is a block diagram of a decoding apparatus that performs the operation of FIG. 7.
FIG. 9 is a diagram showing an example of the method for decoding the hierarchical CBP of the present embodiment.
FIG. 10 is a diagram showing another example of FIG. 9.
FIG. 11 is a diagram showing an example of the decoding method in which the conditional branch of FIG. 9 is reversed.
FIG. 12 is a diagram showing an example of the method for decoding the CBF of a 4 × 4 pixel block when the CBP corresponding to the 8 × 8 pixel block is non-zero.
FIG. 13 is a diagram showing an example of the method for decoding a hierarchical CBP in which the upper block is 64 × 64 pixels, the lower block is 32 × 32 pixels, and the lower-order data is 4 bits.
FIG. 14 is a diagram showing another example of FIG. 13.
FIG. 15 is a diagram showing an example of syntax for obtaining cbp for 4 × 4, 8 × 8, 16 × 8, 8 × 16, and 16 × 16 transforms when a 1-bit luminance CBP for a 16 × 16 pixel block is given.
FIG. 16 is a diagram showing a simple hierarchical skip flag.
FIG. 17 is a diagram showing the hierarchical skip flag of the present embodiment corresponding to the simple hierarchical skip flag.
FIG. 18 is a diagram showing an example of the operation of decoding the skip flag in units of blocks in the present embodiment.
FIG. 19 is a diagram showing an example of syntax that implements the skip flag decoding operation.
FIG. 20 is a flowchart showing an example of the operation of hierarchical skip flag encoding in the present embodiment.

Hereinafter, an image decoding method, an image encoding method, and an apparatus according to embodiments of the present invention will be described in detail with reference to the drawings. In the following embodiments, parts given the same number perform the same operation, and repeated description is omitted.
First, encoding means for generating data that can be decoded by H.264 is briefly described. Such encoding means, for example as shown in FIG. 1, can generate H.264-decodable data by combining intra-frame prediction (intra prediction), inter-frame prediction (inter prediction), an integer-precision orthogonal transform that approximates the discrete cosine transform (DCT) and its inverse transform, quantization and inverse quantization of the coefficients, and entropy coding by context-based adaptive binary arithmetic coding (CABAC) or context-based adaptive variable-length coding (CAVLC).

In the present embodiment, the encoder assigns no code to the redundant combination in which the upper CBP is 1 and all the lower CBPs (CBFs in the case of 4 × 4 pixels) are 0, and the image decoding method of the present embodiment decodes data compressed in this way. Likewise, the encoder avoids assigning a code to the redundant combination in which the upper skip flag is 0 and all the lower skip flags are 1, and the image decoding method of the present embodiment decodes such compressed data as well.

(Reduce redundancy of hierarchical CBP, encoder)
Hereinafter, information indicating the presence or absence of a residual signal is referred to as coded pattern data. In the H.264 standard, coded pattern data that gathers the flags for a plurality of blocks of 8 × 8 or 16 × 16 pixels corresponds to CBP, and the flag for a single 4 × 4 pixel block corresponds to CBF. The coded pattern data corresponding to one upper block is called upper_flag, and the i-th lower-order coded pattern data belonging to that upper block is called lower_flags[i]. For example, for the CBP of an 8 × 8 pixel block and the CBFs of 4 × 4 pixel blocks of the luminance signal in Non-Patent Document 1, upper_flag corresponds to one bit of CodedBlockPatternLuma, and lower_flags[i] corresponds to the coded_block_flag for the residual signal of the i-th 4 × 4 pixel block.

In the decoding processes defined by many standards, such as the MPEG-1/2/4 compressed video decoding standards and H.264, an image is divided into blocks and decoded block by block. The following description therefore covers an example of encoding in units of blocks and an example of decoding in units of blocks. The entire picture can be processed by repeating the per-block processing for each of the divided blocks.

The embodiment of the present invention decodes a bitstream generated by means that, when coded pattern data is encoded hierarchically, improves the compression ratio by reducing the code amount for a specific pattern. FIG. 2 shows a concrete example of the specific pattern (when the lower-order data is 4 bits), and FIG. 3 shows the corresponding example of simple hierarchical coded pattern data (also 4 bits; the scheme of FIG. 3 is the one used in H.264 of Non-Patent Document 1).

In FIG. 3, when the lower-order coded pattern data (lower_flags) is 0001, the encoded data is 0001. In the embodiment of the present invention shown in FIG. 2, by contrast, when the lower-order coded pattern data is 0001, the last bit is not encoded and the encoded data is 000. At decoding, once it is known that the higher-order coded pattern data is 1 and that the first 3 bits of the 4-bit lower-order coded pattern data are 000, the last bit is known to be 1. In other words, in this case one bit need not be encoded by the encoder and one bit need not be decoded by the decoder.

In FIG. 2 the lower-order coded pattern data is 4 bits, but it need not be 4 bits; for example, it may be 2 bits as shown in FIG. 4. In the 2-bit case, once it is known that the higher-order coded pattern data is 1 and that the first bit of the 2-bit lower-order coded pattern data is 0, the last bit is known to be 1. In this case as well, therefore, one bit sometimes need not be encoded or decoded.
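The difference between the simple scheme and the scheme of the embodiment can be tabulated with a short Python sketch. The function names `simple_code` and `reduced_code` are hypothetical, the upper flag is assumed to be 1, and the bits are shown as plain strings before any entropy coding.

```python
from itertools import product


def simple_code(lower_flags):
    """Simple hierarchical scheme: every lower flag is written."""
    return "".join(str(b) for b in lower_flags)


def reduced_code(lower_flags):
    """Scheme of the embodiment: the last flag is omitted when it is implied."""
    if not any(lower_flags):
        raise ValueError("all-zero lower flags cannot occur when the upper flag is 1")
    if not any(lower_flags[:-1]):
        # All earlier flags are 0, so the last flag must be 1 and is not written.
        return "".join(str(b) for b in lower_flags[:-1])
    return "".join(str(b) for b in lower_flags)


for n in (4, 2):
    for flags in product((0, 1), repeat=n):
        if any(flags):
            print(simple_code(flags), "->", reduced_code(flags))
```

Only the patterns 0001 (4-bit case) and 01 (2-bit case) become one bit shorter; every other pattern is written unchanged.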

Next, FIG. 5 is a flowchart of the process of encoding CBP in units of blocks, and FIG. 6 shows an example of an apparatus used for the encoding. The coded pattern data can be encoded, for example, by the following method. The description below also covers Step 3 and Step 11, which encode data other than the coded pattern data.

Step 1. (S501) The image input unit 601 reads an image block. The block read here is typically the same size as, or larger than (for example, 16 × 16 pixels), the block corresponding to the higher-order coded pattern data in Step 5.

Step 2. (S502) The arithmetic unit 602 reads from the program memory 605 the prediction mode to be used for encoding the block read in S501. The prediction mode may be determined in advance, or the best mode may be selected by performing S503 and the subsequent steps for each of a plurality of prediction modes. Many other ways of setting the prediction mode are possible.

Step 3. The arithmetic unit 602 encodes the prediction mode, and the encoded data output unit 604 outputs it to the bit stream.

Step 4. (S503) The arithmetic unit 602 performs block prediction using the prediction mode and data that has already been decoded, and obtains a prediction residual that is the difference between the input image block and the predicted image block.

Step 5. (S504) The arithmetic unit 602 applies a block transform (for example, an orthogonal transform using a KLT basis generated in advance from training data, or a DCT basis) and quantization to the prediction residual. The blocks to be transformed are assumed to be smaller (for example, 4 × 4 pixels) than the block in Step 1. Next, the arithmetic unit 602 computes, for each block, information representing whether the quantized transform coefficients include a non-zero coefficient. Typically this is a 1-bit flag per block that is 1 when the block contains a non-zero coefficient and 0 when all coefficients in the block are 0. This information is referred to as lower-order coded pattern data (lower-order CBP or CBF). The arithmetic unit 602 then obtains higher-order coded pattern data (upper CBP), which is coded pattern data for a larger block and indicates whether a set of lower-order coded pattern data (for example, four items) is all zero. Typically the higher-order coded pattern data is a 1-bit flag per block that is 0 if the lower-order coded pattern data are all 0 and 1 if any of them is non-zero. The block corresponding to the higher-order coded pattern data is assumed to be larger (for example, 8 × 8 pixels) than the block corresponding to the lower-order coded pattern data.

Step 6. (S505) The arithmetic unit 602 encodes the higher-order coded pattern data, and the encoded data output unit 604 outputs it to the bit stream.

Step 7. (S506) If the higher-order coded pattern data is 0, the arithmetic unit 602 skips Steps 8 to 10; otherwise it executes Step 8.

Step 8. (S507) Steps 8 to 10 encode the lower-order coded pattern data one bit at a time. In this step the following conditional branch is executed: if the lower-order coded pattern data already encoded forms a specific pattern (typically, the bit about to be encoded is the last bit of the lower-order coded pattern data and all bits encoded so far are 0), the arithmetic unit 602 skips Step 9, which outputs the bit to the bit stream; otherwise it executes Step 9.

Step 9. (S508) The arithmetic unit 602 encodes the lower-order coded pattern data bit in question, and the encoded data output unit 604 outputs it to the bit stream.

Step 10. (S509) If all the lower-order coded pattern data has been processed, the arithmetic unit 602 proceeds to Step 11; otherwise it returns to Step 8. The number of iterations of the loop from S507 to S509 is determined by the number of bits.

Step 11. The encoded data output unit 604 encodes the quantized transform coefficient for a block having non-zero coded pattern data, and outputs the result to a bit stream.

Note that step 3 may be prior to S501 and S502. The encoder can improve the compression rate by, for example, repeatedly trying steps 2 to 11 while switching the prediction mode to be encoded and then selecting a prediction mode with good encoding efficiency. However, in order to output a decodable bitstream, such repetition is not essential, and the prediction mode may be determined based on another criterion.
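Steps 5 to 10 above can be sketched in Python as follows. This is a simplified illustration, not the flowchart of FIG. 5 itself: the function name `encode_coded_pattern` is hypothetical, the bit stream is modeled as a plain list of bits, and the entropy coding of each bit is omitted.

```python
def encode_coded_pattern(lower_flags, bitstream):
    """Append the upper flag and the non-implied lower flags to `bitstream`.

    `lower_flags` is the lower-order coded pattern data of one upper block.
    Entropy coding is omitted (an assumption made for brevity).
    """
    upper_flag = int(any(lower_flags))         # Step 5: derive the upper flag
    bitstream.append(upper_flag)               # Step 6 (S505)
    if upper_flag == 0:                        # Step 7 (S506)
        return upper_flag
    last = len(lower_flags) - 1
    for i, flag in enumerate(lower_flags):     # loop S507 to S509
        # Step 8 (S507): the last bit is implied when all earlier bits are 0.
        implied = i == last and not any(lower_flags[:i])
        if not implied:
            bitstream.append(flag)             # Step 9 (S508)
    return upper_flag


bits = []
encode_coded_pattern([0, 0, 0, 1], bits)
print(bits)  # [1, 0, 0, 0]
```

The printed list holds the upper flag followed by the three lower flags that are actually written; the fourth lower flag is left for the decoder to infer.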

Next, the encoding apparatus will be described with reference to FIG. 6.
The encoding apparatus according to the present embodiment includes an image input unit 601, an arithmetic unit 602, a data memory 603, an encoded data output unit 604, and a program memory 605.

The image input unit 601 reads an image block. The arithmetic unit 602 encodes the image block read by the image input unit 601 with reference to the data memory 603 and the program memory 605. The data memory 603 is a temporary storage device, and temporarily stores, for example, a frame before the currently processed frame. The data memory 603 is a RAM, for example. The program memory 605 stores a program for encoding. The program memory 605 is, for example, a ROM or a RAM. The encoded data output unit 604 outputs the encoded data to a bit stream.

(Reduce redundancy of hierarchical CBP, decoder)
FIG. 7 shows an example of a flowchart of a method for decoding the bit stream generated by the above method (decoding of CBP in units of blocks), and FIG. 8 shows an example of an apparatus for executing the decoding. The decoding of the coded pattern data can be performed by, for example, the following steps 1 to 10.

Step 1. (S701) The encoded data input unit 801 inputs a bit stream as input data. The bit stream includes prediction information in units of blocks, coded pattern data indicating the presence / absence of a prediction residual signal in units of blocks, and image data obtained by entropy encoding the prediction residual signal in units of blocks.

Step 2. (S702) The arithmetic unit 802 determines whether the data to be acquired next is higher-order coded pattern data (for example, higher-order CBP). If so, the process proceeds to Step 3; otherwise it jumps to Step 4.

Step 3. (S703) The arithmetic unit 802 acquires the higher-order coded pattern data by entropy decoding, and the process jumps to Step 9.

Step 4. (S704) The arithmetic unit 802 determines whether the data to be acquired next is lower-order coded pattern data (for example, lower-order CBP or CBF), that is, coded pattern data belonging to the higher-order coded pattern data. If so, the process jumps to Step 6; otherwise it proceeds to Step 5.

Step 5. (S705) The arithmetic unit 802 acquires the data to be acquired next (data other than coded pattern data), and the process jumps to Step 9.

Step 6. (S706) The arithmetic unit 802 determines whether the lower-order coded pattern data acquired so far forms a predetermined pattern. If so, the process proceeds to Step 7; otherwise it jumps to Step 8. In the example of FIG. 2, where the lower-order coded pattern data is 4 bits, the predetermined pattern is the state in which 3 bits have already been acquired, the data to be acquired next is the 4th bit, and the 3 bits already acquired are 000.

Step 7. (S707) The arithmetic unit 802 sets a predetermined fixed value as the lower-order coded pattern data to be acquired, and the process then jumps to Step 9. In the example of FIG. 2, 1 is set as the 4th bit of the coded pattern data.

Step 8. (S708) The arithmetic unit 802 acquires the lower-order coded pattern data to be acquired by entropy decoding.

Step 9. (S709) The arithmetic unit 802 determines whether acquisition of the block information has been completed. If so, the process proceeds to Step 10; otherwise it returns to Step 2. The block information consists of the decoded prediction information in units of blocks and the prediction residual signals in units of blocks. That is, although not shown explicitly in this flow, there is a step of decoding the bit stream to obtain prediction information in units of blocks when the information included in the bit stream is such prediction information, and a step of decoding the bit stream to obtain a prediction residual signal in units of blocks when the information included in the bit stream is such a prediction residual signal.

Step 10. (S710) The arithmetic unit 802 restores the block image signal based on the acquired block information, and the image output unit 804 outputs the restored image.
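The handling of coded pattern data in the steps above (S703 and S706 to S708) can be sketched in Python as follows. This is an illustration under simplifying assumptions: the function name `decode_coded_pattern` is hypothetical, the input is a plain sequence of already entropy-decoded bits, other syntax elements are not interleaved, and the lower flags are taken to be all 0 when the upper flag is 0.

```python
def decode_coded_pattern(bits, num_lower_blocks=4):
    """Return (upper_flag, lower_flags) read from an iterable of bits."""
    it = iter(bits)
    upper_flag = next(it)                        # Step 3 (S703)
    if upper_flag == 0:
        # No lower flags are present in the stream; all are taken to be 0.
        return upper_flag, [0] * num_lower_blocks
    lower_flags = []
    for i in range(num_lower_blocks):
        is_last = i == num_lower_blocks - 1
        if is_last and not any(lower_flags):     # Step 6 (S706)
            lower_flags.append(1)                # Step 7 (S707): fixed value
        else:
            lower_flags.append(next(it))         # Step 8 (S708)
    return upper_flag, lower_flags


print(decode_coded_pattern([1, 0, 0, 0]))        # (1, [0, 0, 0, 1])
print(decode_coded_pattern([1, 0], 2))           # (1, [0, 1])
```

Making the number of lower blocks a parameter lets the same loop cover both the 4-bit and the 2-bit cases.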

Next, the decoding apparatus will be described with reference to FIG. 8.
The decoding apparatus according to the present embodiment includes an encoded data input unit 801, an arithmetic unit 802, a data memory 803, an image output unit 804, and a program memory 805.

The encoded data input unit 801 inputs a bit stream as input data. The arithmetic unit 802 decodes the blocks included in the bit stream input by the encoded data input unit 801, referring to the data memory 803 and the program memory 805 and determining the block boundaries as it goes. The data memory 803 is a temporary storage device, and temporarily stores, for example, the bit stream input by the encoded data input unit 801. The program memory 805 stores a program for decoding, for example a program corresponding to the pseudo program shown in FIG. 9 or FIG. 10.

Next, FIG. 9 shows an example of the syntax of the decoding method when the lower-order coded pattern data is a 4-bit hierarchical CBP, and FIG. 10 shows an example of the syntax when the lower-order coded pattern data is a num_lower_blocks-bit hierarchical CBP. In FIGS. 9 and 10, a line marked ae(v) indicates that 1-bit information is acquired by entropy decoding (for example, CABAC), and a line without that mark indicates that the line is simply executed. “(upper_flag is already decoded)” indicates that the higher-order coded pattern data has already been acquired as upper_flag. “(Other decoding processes, optional)” indicates that other syntax processing is executed there if it is necessary. Note that the conditional branch that determines whether the coded pattern data forms the specific combination can be reversed. FIG. 11 shows an example in which the conditional branch of FIG. 9 is reversed. The same applies to the other figures.

(CBP and CBF)
In H.264, coded_block_flag, which is the coded pattern data corresponding to a 4 × 4 pixel block, is decoded together with the transform coefficients in the following order: the first coded_block_flag; the first block's transform coefficients if the first coded_block_flag is non-zero; the second coded_block_flag; the second block's transform coefficients if the second coded_block_flag is non-zero; and so on. In this case, if the CBP corresponding to an 8 × 8 pixel block is regarded as the higher-order coded pattern data and the CBF corresponding to each 4 × 4 pixel block constituting the 8 × 8 pixel block as the lower-order coded pattern data, the syntax can be expressed as shown in FIG. 12. FIG. 12 shows an example of a method for decoding the CBF of a 4 × 4 pixel block when the CBP corresponding to the 8 × 8 pixel block is non-zero.

In FIG. 12, the argument blockIndex is the index of the 4 × 4 pixel block being processed within the 8 × 8 pixel block that contains it, with the first block numbered 0, and “(decode residual coefficients coeffLevel[i])” denotes decoding of the transform coefficients. Before the first 4 × 4 pixel block is decoded, nonzero_coded_block_flag_found is set to 0. Since an 8 × 8 pixel block consists of four 4 × 4 pixel blocks, when blockIndex is 3 and all the other coded_block_flag values are 0, coded_block_flag can be set to 1 without decoding. Otherwise coded_block_flag must be decoded. Although not shown in FIG. 12, some prediction modes require coded_block_flag always to be decoded, so whether this is the case must be determined before the function of FIG. 12 is called. An example of such a prediction mode is Intra16×16 prediction of H.264. In this mode the CBP for the luminance blocks, a 4-bit CBP for the four 8 × 8 blocks constituting the 16 × 16 pixel block, can take only two values: all bits 0 or all bits 1. In this case, unless a code is provided for the combination in which the upper CBP is 1 and all lower CBFs are 0, coded pattern data in which all CBFs are 0 in some 8 × 8 pixel block within the 16 × 16 pixels cannot be represented. Therefore coded_block_flag must always be decoded in Intra16×16 prediction of H.264.
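The per-block CBF decoding described above can be sketched in Python as follows. It is a hedged illustration rather than the syntax of FIG. 12: the function name `decode_cbf`, the `read_bit` callback standing in for entropy decoding, the `state` dictionary, and the `always_decode` argument (modeling modes such as Intra16×16) are all assumptions. The identifier `nonzero_coded_block_flag_found` is taken from the text.

```python
def decode_cbf(read_bit, block_index, state, always_decode=False, num_blocks=4):
    """Decode coded_block_flag for one 4x4 block inside an 8x8 block whose CBP is 1.

    `state["nonzero_coded_block_flag_found"]` must be set to 0 before the
    first 4x4 block of each 8x8 block is processed.
    """
    is_last = block_index == num_blocks - 1
    if always_decode or not is_last or state["nonzero_coded_block_flag_found"]:
        coded_block_flag = read_bit()
    else:
        # Last block and every earlier flag was 0: the flag is implied to be 1.
        coded_block_flag = 1
    if coded_block_flag:
        state["nonzero_coded_block_flag_found"] = 1
        # (decode residual coefficients coeffLevel[i]) would be performed here.
    return coded_block_flag


stream = iter([0, 0, 0])
state = {"nonzero_coded_block_flag_found": 0}
flags = [decode_cbf(lambda: next(stream), i, state) for i in range(4)]
print(flags)  # [0, 0, 0, 1]
```

Because the flags and the coefficients are interleaved in the stream, the sketch keeps a running flag instead of collecting all four flags first.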

(Upper CBP and Lower CBP)
In H.264, the size of a block (macroblock) that has higher-order coded pattern data is 16 × 16 pixels. For better encoding efficiency, however, an extension that enlarges the macroblock to, for example, 32 × 32 pixels or 64 × 64 pixels is conceivable. Such a block is called an extended macroblock. A 32 × 32 pixel extended macroblock consists of four 16 × 16 pixel blocks if only the luminance blocks are considered, and of 12 blocks if the Cb and Cr chrominance blocks are counted in addition to the four luminance blocks. CBP is often 0, and it is desirable to avoid encoding 12 flags that are all 0. Thus, as a macroblock extension, it is conceivable to encode coded pattern data indicating whether the CBPs constituting the 32 × 32 pixel extended macroblock are all 0 or include a CBP with one or more non-zero bits. In this embodiment, the coded pattern data for a 32 × 32 pixel extended macroblock is called coded_block_pattern_32. coded_block_pattern_32 is 1-bit flag information. If a level 2 extended macroblock of 64 × 64 pixels is considered as a still larger extended macroblock, it is conceivable to encode whether the coded_block_pattern_32 values of the 32 × 32 pixel blocks constituting the level 2 extended macroblock are all 0 or include one or more non-zero bits. In this embodiment, the coded pattern data for a 64 × 64 pixel extended macroblock is called coded_block_pattern_64, and the four 32 × 32 pixel blocks constituting the 64 × 64 pixel block are indexed so that their coded pattern data is called coded_block_pattern_32[0] to coded_block_pattern_32[3].

FIG. 13 shows an example of the flow of processing for decoding coded_block_pattern_64 and coded_block_pattern_32, and FIG. 14 shows another example. Both figures show an example of a method of decoding a hierarchical CBP in which the size of the upper block is 64 × 64 pixels, the size of the lower block is 32 × 32 pixels, and the lower block is 4 bits.

“(Other decoding processes, optional)” indicates that other syntax processing is executed there if it is necessary. FIG. 13 illustrates an example of decoding data in which coded_block_pattern_64 and coded_block_pattern_32 are encoded. In FIG. 13, coded_block_pattern_64 is decoded first, and when it is not 0, each element of coded_block_pattern_32 is decoded. At this time, coded_block_pattern_32[3] is decoded if at least one of coded_block_pattern_32[0] to coded_block_pattern_32[2] is non-zero, but when coded_block_pattern_32[0] to coded_block_pattern_32[2] are all 0, coded_block_pattern_32[3] is not decoded and is set to 1. FIG. 14 shows an example in which the decoding of coded_block_pattern_64 and of coded_block_pattern_32 in FIG. 13 is expressed as separate functions. In FIG. 14, macroblock_cluster_residual_64x64() is a function for decoding the residual of a 64 × 64 pixel level 2 extended macroblock, and macroblock_cluster_residual_32x32() is a function for decoding the residual of a 32 × 32 pixel extended macroblock. Unlike FIG. 13, in FIG. 14 the decoding of coded_block_pattern_64 is performed by macroblock_cluster_residual_64x64() and the decoding of each element of coded_block_pattern_32 is performed by macroblock_cluster_residual_32x32(). FIG. 14 is the same as FIG. 13 in that, when coded_block_pattern_32[0] to coded_block_pattern_32[2] are all 0, coded_block_pattern_32[3] is not decoded and is set to the fixed value 1.
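The split into two functions can be sketched in Python as follows. The function names are those mentioned in the text, but their signatures and bodies here are assumptions made for illustration: `read_bit` stands in for entropy decoding of one flag, and the actual residual decoding is reduced to a comment.

```python
def macroblock_cluster_residual_32x32(read_bit, index, previous_flags):
    """Return coded_block_pattern_32[index] for one 32x32 extended macroblock."""
    if index == 3 and not any(previous_flags):
        flag = 1            # implied: this flag is not present in the bit stream
    else:
        flag = read_bit()
    # (Other decoding processes, optional) for this 32x32 block would follow.
    return flag


def macroblock_cluster_residual_64x64(read_bit):
    """Return (coded_block_pattern_64, coded_block_pattern_32[0..3])."""
    coded_block_pattern_64 = read_bit()
    if coded_block_pattern_64 == 0:
        return coded_block_pattern_64, [0, 0, 0, 0]
    coded_block_pattern_32 = []
    for i in range(4):
        coded_block_pattern_32.append(
            macroblock_cluster_residual_32x32(read_bit, i, coded_block_pattern_32)
        )
    return coded_block_pattern_64, coded_block_pattern_32


stream = iter([1, 0, 0, 0])
print(macroblock_cluster_residual_64x64(lambda: next(stream)))  # (1, [0, 0, 0, 1])
```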

(Specific example of hierarchical CBP)
As an example of syntax, the following describes, with reference to FIG. 15, a case in which a 1-bit luminance CBP is encoded and decoded for a 16 × 16 pixel block and, when that luminance CBP is 1, the presence or absence of transform coefficients in each block belonging to the 16 × 16 pixel block is encoded and decoded as a CBP. FIG. 15 shows an example of syntax for obtaining cbp for 4 × 4, 8 × 8, 16 × 8, 8 × 16, and 16 × 16 transforms when a 1-bit luminance CBP for a 16 × 16 pixel block is given.

Suppose that there are five transform sizes: 4 × 4, 8 × 8, 16 × 8, 8 × 16, and 16 × 16. The transform type “cur_transform_type” takes three values: 0 for 4 × 4, 1 for 8 × 8, and 2 for any of 16 × 8, 8 × 16, and 16 × 16. When “cur_transform_type” is 2, the transform size is 16 × 8 if the prediction block size is 16 × 8, 8 × 16 if it is 8 × 16, and 16 × 16 otherwise. For the 4 × 4 and 8 × 8 transform sizes, a 4-bit CBP for the four 8 × 8 blocks is required. For the 16 × 8 and 8 × 16 transform sizes, a 2-bit CBP is required for the two 16 × 8 blocks or the two 8 × 16 blocks, respectively. For the 16 × 16 transform size, a 1-bit CBP is required.

In this case, as shown in FIG. 15, the present invention can be applied by performing the reading corresponding to FIG. 2 when cur_transform_type is 0 or 1, performing the reading corresponding to FIG. 4 when cur_transform_type is 2 and the transform size is 16 × 8 or 8 × 16, and using the given CBP as it is when cur_transform_type is 2 and the transform size is 16 × 16. In FIG. 15, “cbp_luma_1bit” represents the 1-bit luminance CBP, “MbPartWidth(mb_type)” and “MbPartHeight(mb_type)” represent the prediction block size, and cbp represents the 4 bits of CBP corresponding to the four 8 × 8 luminance blocks into which the 16 × 16 luminance block is divided. In the syntax of FIG. 15, cbp is set to 4 bits in accordance with the 8 × 8 blocks even in the 16 × 8, 8 × 16, and 16 × 16 cases. In these cases, however, cbp need not be aligned to the 8 × 8 blocks and may instead be read as 2-bit, 2-bit, and 1-bit values, respectively.
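The selection among the 4-bit, 2-bit, and 1-bit readings can be sketched in Python as follows. This is not the syntax of FIG. 15. The function names, the `read_bit` callback standing in for entropy decoding, and in particular the way a 2-bit CBP is expanded onto the four 8 × 8 blocks (assumed here to be ordered top-left, top-right, bottom-left, bottom-right) are assumptions for illustration only.

```python
def read_flags(read_bit, count):
    """Read `count` lower flags, inferring the last one when all others are 0."""
    flags = []
    for i in range(count):
        if i == count - 1 and not any(flags):
            flags.append(1)              # implied, not present in the bit stream
        else:
            flags.append(read_bit())
    return flags


def read_luma_cbp(read_bit, cbp_luma_1bit, cur_transform_type, part_width, part_height):
    """Return a 4-bit cbp (one flag per 8x8 luma block) for a 16x16 block."""
    if cbp_luma_1bit == 0:
        return [0, 0, 0, 0]
    if cur_transform_type in (0, 1):                 # 4x4 or 8x8 transform
        return read_flags(read_bit, 4)               # 4-bit reading
    if (part_width, part_height) == (16, 8):         # two 16x8 blocks
        top, bottom = read_flags(read_bit, 2)        # 2-bit reading
        return [top, top, bottom, bottom]
    if (part_width, part_height) == (8, 16):         # two 8x16 blocks
        left, right = read_flags(read_bit, 2)
        return [left, right, left, right]
    return [1, 1, 1, 1]                              # 16x16: given CBP used as is


stream = iter([0])
print(read_luma_cbp(lambda: next(stream), 1, 2, 16, 8))  # [0, 0, 1, 1]
```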

(Direct mode)
In many inter-frame coding modes of H.264, two pieces of information are encoded and decoded: a motion prediction error relative to the motion prediction obtained from already decoded information, and a prediction residual between the block and the image block obtained by the prediction. However, two special modes are also provided: a direct mode, which encodes and decodes only the prediction residual without encoding and decoding a motion prediction error, and a skip mode, which uses only the motion prediction obtained from already decoded information. In H.264, the direct mode is provided only for data called a B slice, but since a direct mode with the same definition can be realized for data called a P slice, the slice type is not particularly limited in this embodiment.

Comparing the skip mode and the direct mode, the skip mode can be regarded as the direct mode with no prediction residual. In H.264, it is possible to encode a direct mode in which no prediction residual exists. If such a case is instead always encoded in the skip mode, and the decoder interprets the direct mode as always having a prediction residual, the direct mode without a prediction residual turns out to be a redundant mode. Therefore, when decoding the hierarchical coded block data in an extended macroblock, a fixed value (1) representing the presence of a prediction residual can be set, without entropy decoding, only for the coded block data of the same size as the block coded in the direct mode. In other words, when the prediction information in units of blocks indicates the direct mode, a predetermined value is set without decoding the higher-order code pattern data from the bitstream. For the lower-level blocks within the direct-mode block (for example, the four 32 × 32 pixel extended macroblocks constituting the block when a 64 × 64 pixel level-2 extended macroblock is coded in the direct mode), the presence or absence of a prediction residual is unknown, so their coded block data must be encoded and decoded.
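The direct-mode rule can be illustrated with a minimal sketch. The function name decode_upper_cbp, the string value "direct", and the iterator of already entropy-decoded bits are hypothetical and are used here only for illustration.

```python
def decode_upper_cbp(bits, prediction_mode):
    """Return the coded block data for a block of the direct-mode size."""
    if prediction_mode == "direct":
        # A direct-mode block without a residual would have been coded as
        # skip, so the flag is fixed to 1 and no bit is consumed.
        return 1
    # Other modes: the flag is taken from the bitstream as usual.
    return next(bits)
```

Only the flag at the size of the direct-mode block is fixed in this sketch. Flags for the lower-level blocks would still be read from the stream.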

(Skip flag)
The explanation so far has described means for reducing waste in the coded block data. With the same means, redundant information can also be reduced for the skip flag, which indicates the skip mode. The skip flag is a flag that is 1 when the block is skipped and 0 when it is not. When two or more sizes are provided as the sizes of blocks that can be skip-predicted, the skip flag must be encoded hierarchically. A simple implementation is shown in FIG. 16 (an example of a simple hierarchical skip flag). This is a table obtained by logically inverting the examples of CBP and CBF described so far, and it is redundant for the same reason that the coded block data is redundant. Therefore, as shown in FIG. 17 (an example of a hierarchical skip flag according to the present embodiment), when four flags are encoded and decoded as the lower skip flags, if the upper skip flag is 0 and the first, second, and third lower skip flags are all 1, a fixed value of 0 can be used for the fourth lower skip flag without encoding and decoding it.
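The redundancy can be checked by enumeration, as in the sketch below. It assumes that a state in which all lower blocks are skipped is always signalled by an upper skip flag of 1, which is the reading suggested by the analogy with CBP. The function names valid_lower_patterns and flags_to_code are hypothetical.

```python
from itertools import product


def valid_lower_patterns(n=4):
    """Lower skip flag patterns that can occur when the upper skip flag is 0."""
    # The all-ones pattern would mean the whole upper block is skipped,
    # which is signalled by an upper skip flag of 1 instead.
    return [p for p in product((0, 1), repeat=n) if not all(p)]


def flags_to_code(pattern):
    """Return the lower skip flags that actually need to be coded."""
    if all(pattern[:-1]):
        # First n-1 flags are all 1, so the last flag can only be 0.
        return pattern[:-1]
    return pattern
```

For four lower blocks this leaves 15 possible patterns. Only the pattern (1, 1, 1, 0) is shortened, to three flags, and every pattern still maps to a distinct code.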

FIG. 20 shows the flow of a method for encoding a hierarchical skip flag. Encoding is performed in the following flow.

Step 1. (S2001) The image input unit 601 reads an image block.

Step 2. (S2002) The arithmetic unit 602 reads from the program memory 605 the prediction mode to be encoded for the block read in S2001. The prediction mode includes information on a skip flag (Skip). The prediction mode may be determined in advance, or the best mode may be selected by performing step S2003 and the subsequent steps in each of a plurality of prediction modes. There are many other variations for setting the prediction mode.

Step 3. (S2003) The arithmetic unit 602 encodes the upper skip flag, and the encoded data output unit 604 outputs it to the bit stream.

Step 4. (S2004) This step performs the following conditional branch. If the upper skip flag is 1, the arithmetic unit 602 skips steps 5 to 9. Otherwise, step 5 is executed.

Step 5. (S2005) The arithmetic unit 602 encodes an upper prediction mode other than the skip flag, and outputs the encoded prediction mode from the encoded data output unit 604 to a bit stream.

Step 6. (S2006) Steps 6 to 8 are the processing for encoding one lower skip flag. This step performs the following conditional branch. If the lower skip flags that have already been encoded form the specific pattern data (typically, the flag to be encoded is the last lower skip flag and all of the lower skip flags already encoded are 1), the arithmetic unit 602 skips step 7, which outputs the flag to the bit stream; otherwise, step 7 is executed.

Step 7. (S2007) The arithmetic unit 602 encodes the low-order skip flag to be encoded, and outputs it from the encoded data output unit 604 to the bit stream.

Step 8. (S2008) The arithmetic unit 602 encodes a lower prediction mode other than the skip flag, and outputs it from the encoded data output unit 604 to a bit stream. Also, if there is residual information, the arithmetic unit 602 encodes the residual information and outputs it from the encoded data output unit 604 to a bit stream.

Step 9. (S2009) If all the lower blocks (including the lower skip flags) have been encoded, the arithmetic unit 602 ends the encoding of the block. Otherwise, the process returns to step 6 to encode the next lower block. The number of iterations of the loop from S2006 to S2009 depends on the number of lower blocks belonging to the upper block. For example, if the upper block is divided into four lower blocks, the loop is executed exactly four times.
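The flag-related part of steps 3 to 9 can be sketched as follows. This is a simplified illustration rather than the flow of FIG. 20 itself: the function name encode_hierarchical_skip is hypothetical, the output is a plain list of flag values instead of an entropy-coded bit stream, and the prediction modes and residual information of steps 5 and 8 are omitted.

```python
def encode_hierarchical_skip(upper_skip, lower_skips):
    """Return the skip flags written for one upper block."""
    if len(lower_skips) < 2:
        raise ValueError("an upper block needs at least two lower blocks")
    out = [upper_skip]                      # S2003: upper skip flag
    if upper_skip == 1:                     # S2004: whole block skipped
        return out
    # S2005: the upper prediction mode would be written here (omitted).
    last = len(lower_skips) - 1
    for i, flag in enumerate(lower_skips):  # loop S2006 to S2009
        implied = i == last and all(f == 1 for f in lower_skips[:i])
        if implied:                         # S2006: specific pattern found
            if flag != 0:
                raise ValueError("all lower blocks skipped: use upper_skip=1")
        else:
            out.append(flag)                # S2007: write lower skip flag
        # S2008: lower prediction mode and residual would follow (omitted).
    return out
```

For lower skip flags 1, 1, 1, 0 under an upper skip flag of 0, this writes four flags (0, 1, 1, 1) instead of five.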

The encoder can, for example, improve the compression rate by trying steps 2 to 9 repeatedly while switching the prediction mode to be encoded and then selecting the prediction mode with the best encoding efficiency. However, such repetition is not essential for outputting a decodable bitstream, and the prediction mode may be determined based on another criterion.

The flow of decoding the hierarchical skip flag is almost the same as in FIG. 7. FIG. 18 shows an example of the decoding flow of the block-by-block skip flag corresponding to FIG. The flow in FIG. 18 differs from FIG. 7 in that the CBP in S702 to S708 is replaced with a skip flag in S1802 to S1808 (indicated as skip in FIG. 18). The corresponding syntax is, for example, as shown in FIG. 19, which is an example of flag reduction using this embodiment for hierarchical skip flags.
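The decoding side can be sketched in the same simplified way. This is not the flow of FIG. 18 or the syntax of FIG. 19. The function name decode_hierarchical_skip is hypothetical, the input is an iterator of already entropy-decoded flag values, prediction modes and residuals are omitted, and returning all-ones lower flags when the upper skip flag is 1 is an assumption made for convenience.

```python
def decode_hierarchical_skip(bits, n=4):
    """Return (upper_skip, lower_skips) for an upper block with n lower blocks."""
    upper_skip = next(bits)
    if upper_skip == 1:
        # The whole upper block is skipped, so no lower flags are coded.
        return upper_skip, [1] * n
    lower_skips = []
    for i in range(n):
        if i == n - 1 and all(f == 1 for f in lower_skips):
            # Upper flag is 0 and the first n-1 lower flags are all 1:
            # the last lower flag is fixed to 0 without reading a bit.
            lower_skips.append(0)
        else:
            lower_skips.append(next(bits))
    return upper_skip, lower_skips
```

Decoding the flags 0, 1, 1, 1 therefore yields lower skip flags 1, 1, 1, 0, with the last value set rather than read.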

(Effects of the invention)
As described above, by eliminating the redundancy in hierarchical CBP, CBF, and skip flags, this embodiment can provide encoding means with a high compression ratio and decoding means for the data compressed by that encoding means. Note that the means and methods of the present embodiment eliminate existing redundancy and do not add new encoding modes or flags. Therefore, if encoding means is used that differs from the conventional encoding method only in that the flags are encoded by the method described in this embodiment, the amount of code can be expected to decrease in some cases and never to increase.

(Execution on a computer and modifications)
The instructions shown in the processing procedures of the above-described embodiment can be executed based on a program, that is, software. By storing this program in advance and reading it, a general-purpose computer system can obtain the same effects as those of the image decoding method and the image encoding method of the above-described embodiment. The instructions described in the above-described embodiment are recorded, as a program that can be executed by a computer, on a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD±R, DVD±RW, etc.), a semiconductor memory, or a similar recording medium. As long as the recording medium is readable by a computer or an embedded system, its storage format may take any form. If the computer reads the program from the recording medium and causes the CPU to execute the instructions described in the program, operation equivalent to the image decoding method and the image encoding method of the above-described embodiment can be realized. Of course, the computer may also acquire or read the program through a network.
In addition, based on the instructions of the program installed in the computer or embedded system from the storage medium, the OS (operating system) running on the computer, database management software, or MW (middleware) such as network software may execute part of each process for realizing this embodiment.
Furthermore, the storage medium in the present invention is not limited to a medium independent of a computer or an embedded system, but also includes a storage medium in which a program transmitted via a LAN or the Internet is downloaded and stored or temporarily stored.
Also, the number of storage media is not limited to one. The case where the processing in the present embodiment is executed from a plurality of media is also included in the storage medium of the present invention, and the media may have any configuration.

The computer or the embedded system in the present invention executes each process in the present embodiment based on a program stored in a storage medium, and may have any configuration, such as a single device like a personal computer or a microcomputer, or a system in which a plurality of devices are connected via a network.
Further, the computer in the embodiment of the present invention is not limited to a personal computer. It is a general term for equipment and devices capable of realizing the functions of the embodiment of the present invention by a program, including an arithmetic processing device or a microcomputer included in an information processing device.

Note that the present invention is not limited to the above-described embodiment as it is, and can be embodied in the implementation stage by modifying the constituent elements without departing from the scope of the invention. In addition, various inventions can be formed by appropriately combining a plurality of the constituent elements disclosed in the embodiment. For example, some constituent elements may be deleted from all the constituent elements shown in the embodiment. Furthermore, constituent elements from different embodiments may be combined as appropriate.

601 ... Image input unit, 602, 802 ... Arithmetic unit, 603, 803 ... Data memory, 604 ... Encoded data output unit, 605, 805 ... Program memory, 801 ... Encoded data input unit, 804 ... Image output unit.

Claims

An image decoding method comprising:
A first step of inputting encoded data including (1) upper code pattern data indicating whether a prediction residual signal exists in an upper block including first to Nth (N > 1) lower blocks, (2) lower code pattern data indicating whether a prediction residual signal exists in each of the first to (N−1)th lower blocks, and (3) prediction residual signals in units of the lower blocks;
A second step of setting a predetermined value as the lower code pattern data of the Nth lower block when the value of the upper code pattern data is a value indicating that a prediction residual signal exists in the upper block and the values of the lower code pattern data for the first to (N−1)th lower blocks are a specific combination; and
A third step of obtaining a decoded image from the prediction residual signals in the encoded data for those blocks, among the first to Nth lower blocks, in which a prediction residual signal exists according to the lower code pattern data.
The lower block is a block of 32 pixels both vertically and horizontally,
The upper block includes two or four of the lower blocks,
The upper code pattern data and the lower code pattern data are flag information that takes 1 when a prediction residual signal exists and 0 when a prediction residual signal does not exist. Image decoding method.
The image decoding method according to claim 1, wherein
The encoded data further includes prediction information in units of the upper block, and when the prediction information is a value indicating a direct mode, the upper code pattern data is set to a predetermined value.
The image decoding method according to claim 1, wherein:
The lower block is a block of 4 pixels both vertically and horizontally,
The upper block is a block of 8 pixels both vertically and horizontally that includes four of the lower blocks,
The upper code pattern data and the lower code pattern data are flag information that takes 1 when a prediction residual signal exists and 0 when a prediction residual signal does not exist, and
In the second step, when the values of the lower code pattern data for the first to third lower blocks are all 0, the value of the lower code pattern data for the fourth lower block is set to 1.
An image decoding method comprising:
A first step of inputting encoded data including (1) upper skip data indicating whether an upper block including first to Nth (N > 1) lower blocks is a skipped block, (2) lower skip data indicating whether each of the first to (N−1)th lower blocks is a skipped block, and (3) prediction residual signals in units of the lower blocks;
A second step of setting a predetermined value as the value of the lower skip data of the Nth lower block when the value of the upper skip data is a value indicating that the upper block is not a skipped block and the values of the lower skip data for the first to (N−1)th lower blocks are a specific combination; and
A third step of obtaining a decoded image from the prediction residual signals in the encoded data for those blocks, among the first to Nth lower blocks, that are not skipped according to the lower skip data.
The image decoding method according to claim 5, wherein:
The upper skip data and the lower skip data are flag information that takes 1 when indicating a skipped block and 0 when indicating a block that is not skipped, and
In the second step, when the values of the lower skip data for the first to (N−1)th lower blocks are all 1, the value of the lower skip data of the Nth lower block is set to 0.