WO2011033853A1 - Moving picture decoding method and moving picture encoding method

Moving picture decoding method and moving picture encoding method

Info

Publication number
WO2011033853A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
prediction
frequency conversion
inverse
block unit
Prior art date
Application number
PCT/JP2010/062180
Other languages
English (en)
Japanese (ja)
Inventor
昌史 高橋
村上 智一
山口 宗明
浩朗 伊藤
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所 filed Critical 株式会社日立製作所
Priority to JP2011531838A priority Critical patent/JP5363581B2/ja
Publication of WO2011033853A1 publication Critical patent/WO2011033853A1/fr

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention relates to a moving picture coding technique for coding a moving picture and a moving picture decoding technique for decoding a moving picture.
  • MPEG (Moving Picture Experts Group) and other encoding standards have been formulated as techniques for recording and transmitting large volumes of moving picture information as digital data. These standards predict the encoding target image in units of blocks using already-encoded image information and encode the difference (prediction difference) from the original image, thereby removing the redundancy of the moving image and reducing the code amount.
  • inter-screen prediction that refers to an image different from the target image enables high-precision prediction by searching the reference image for a block having a high correlation with the encoding target block.
  • The prediction difference is encoded by applying a frequency transform, for example the discrete cosine transform (DCT), in order to concentrate the numerical values, and quantizing the coefficient values after the transform. Since the prediction difference also has strong local correlation, the frequency transform is likewise performed in units of blocks obtained by finely dividing the image.
  • Patent Document 1 has a problem in that the prediction accuracy decreases because the block used for prediction is enlarged. When the prediction accuracy is lowered, noise that is noticeable to human eyes is generated, and subjective image quality is lowered.
  • the present invention has been made in view of the above problems, and an object thereof is to reduce the code amount and improve the subjective image quality.
  • the moving picture decoding method performs the following processing.
  • Input an encoded stream.
  • a variable length decoding process is performed on the input encoded stream.
  • a prediction difference is generated by performing an inverse quantization process and an inverse frequency transform process on the data subjected to the variable length decoding process on a first block basis.
  • Prediction processing is performed in units of second blocks.
  • a decoded image is generated based on the generated prediction difference and the result of the prediction process.
  • the first block unit is a larger block unit than the second block unit.
  • the moving picture decoding method performs the following processing.
  • Input an encoded stream.
  • a variable length decoding process is performed on the input encoded stream.
  • the data subjected to the variable length decoding process is subjected to an inverse quantization process and an inverse frequency conversion process in the first block unit or the second block unit.
  • Prediction processing is performed in units of the second block.
  • a decoded image is generated based on the generated prediction difference and the result of the prediction process.
  • One block in the first block unit is a block unit obtained by integrating a block group composed of a plurality of blocks in the second block unit into one block.
  • When an intra-frame prediction block is included in the block group, the inverse quantization process and the inverse frequency transform process are performed in that block group in units of the second block.
  • When an intra-frame prediction block is not included in the block group, the inverse quantization process and the inverse frequency transform process are performed in that block group in units of the first block.
  • the moving image encoding method performs the following processing.
  • a prediction difference is generated by performing a prediction process on the input image.
  • the generated prediction difference is subjected to frequency conversion processing and quantization processing to generate quantized data.
  • Variable length coding is performed on the generated quantized data to generate an encoded stream.
  • The block unit size in which the frequency conversion process and the quantization process are performed on the blocks included in a block group, the block group comprising a plurality of blocks, is changed based on whether or not an intra prediction block is included in that block group.
  • FIGS. 3 to 7 are conceptual explanatory diagrams relating to the H.264/AVC encoding method.
  • FIG. 8 is a conceptual explanatory diagram regarding the encoding method of this embodiment.
  • Embodiment 1: The first embodiment of the present invention will be described in comparison with the processing in H.264/AVC.
  • H. H.264 / AVC predicts an encoding target image using image information that has been encoded, and encodes a prediction difference from the original image, thereby reducing the redundancy of the moving image and increasing the code amount. Reduced.
  • prediction is performed in units of blocks obtained by finely dividing an image.
  • The encoding process is performed on the target image 305 in units of macroblocks 302 composed of 16×16 pixels, following the raster scan order (arrow 301).
  • the target image 305 includes an already encoded area 306 and an uncoded area 307. Prediction is roughly classified into intra-screen prediction and inter-screen prediction.
  • FIG. 4 shows the operation of inter-screen prediction in H.264/AVC.
  • A decoded image of an already-encoded image included in the same video 401 as the encoding target image 403 is used as a reference image 402, and a block (predicted image) 405 having a high correlation with the target block 404 in the target image is searched for in the reference image 402.
  • a motion vector 406 expressed as a difference between the coordinate values of both blocks is encoded as side information necessary for prediction.
  • the reverse procedure described above may be performed at the time of decoding, and the decoded image can be acquired by adding the decoded prediction difference to the block (predicted image) 405 in the reference image.
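The reconstruction step just described (adding the decoded prediction difference to the predicted block fetched at the motion-vector offset) can be sketched as follows. This is an illustrative sketch with numpy arrays; the function and parameter names are hypothetical, not from the patent.

```python
import numpy as np

def reconstruct_block(reference, top, left, motion_vector, prediction_difference):
    """Decode one block: fetch the predicted block (cf. block 405) from the
    reference image at the position offset by the motion vector, then add
    the decoded prediction difference to obtain the decoded block."""
    dy, dx = motion_vector
    h, w = prediction_difference.shape
    predicted = reference[top + dy : top + dy + h, left + dx : left + dx + w]
    return predicted + prediction_difference
```

In a real decoder the motion vector itself is decoded from the side information in the stream; here it is simply passed in.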
  • H. H.264 / AVC can perform the above prediction by dividing a macroblock into smaller blocks.
  • FIG. 5 shows the macroblock division patterns allowed when inter-screen prediction is performed. That is, H.264/AVC can select, for each macroblock 502 in the target image 501, the optimal pattern from among predetermined division patterns (macroblock division patterns) 503 ranging from 4×4 to 16×16 pixels.
  • Information indicating which division pattern is used for each macroblock is encoded on a macroblock basis.
  • the prediction difference generated by the prediction process is decomposed into frequency components by DCT (Discrete Cosine Transformation), which is one of frequency conversion methods, and the coefficient value is encoded.
  • FIG. 6 conceptually shows how the prediction difference is decomposed into frequency components by DCT.
  • DCT is a frequency transform method in which an input signal is expressed as a weighted sum of basis signals 603 and their coefficient values.
  • the coefficient value 602 is often biased toward low frequency components, so variable length coding can be performed efficiently.
  • In H.264/AVC the DCT block size is 4×4 pixels: the macroblock 702 of the prediction difference 701 is divided into small blocks of 4×4 pixels (703 in FIG. 7).
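The decomposition into frequency components and the bias of the coefficients toward low frequencies can be sketched with a small orthonormal DCT-II implementation. This is illustrative only; `dct_matrix` and `dct2` are hypothetical helper names, and the patent's DCT is the standard transform, not this particular code.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix: row u, column x holds
    sqrt(2/n)*cos(pi*(2x+1)*u/(2n)), with the DC row scaled to sqrt(1/n)."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(block):
    """Separable 2-D DCT of a square block: C @ X @ C.T."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T
```

For a flat (low-detail) block, all the energy lands in the single DC coefficient, which is exactly why a gently varying prediction difference compresses well.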
  • H. H.264 / AVC achieves high performance by adaptively dividing and encoding an image into fine blocks.
  • H. Since H.264 / AVC uses macroblocks as a basic unit of encoding processing, it has not been possible to handle blocks larger in size than macroblocks or blocks that straddle a plurality of macroblocks. This restriction on the block shape was one of the factors that hindered the improvement of compression efficiency.
  • Patent Document 1 makes it possible to change the size of a macroblock and adaptively changes the upper limit value according to the feature amount of an already-encoded area. According to this method, it is possible to enlarge the macroblock according to the property of the image, and in particular, it is possible to increase the compression efficiency of high definition video.
  • this method has a problem that the prediction accuracy is lowered because the block for performing the prediction is enlarged.
  • the prediction accuracy is lowered, noise that is noticeable to human eyes is generated, and subjective image quality is lowered.
  • In this embodiment, inter-screen prediction is performed in small blocks, for example in units of macroblocks, while the block size to which the frequency transform of the prediction difference is applied (this embodiment uses DCT as an example) is made expandable.
  • FIG. 8 shows the prediction difference 801.
  • In a region where prediction accuracy is high, the prediction difference has a gentle distribution with many low-frequency components, so even if DCT is applied in a large block unit, image quality degradation is small. Therefore, for a region 802 with high prediction accuracy, the prediction differences of a plurality of adjacent blocks are integrated into a large block and DCT is performed on it, so that the code amount of the DCT coefficients can be greatly reduced.
  • processing described in the present embodiment will be described as being applied to a frame (P slice or B slice in H.264 / AVC) that can perform inter-frame coding.
  • The processing described in the following embodiment may or may not be applied to a frame in which all areas in the screen are intra-coded (an I slice in H.264/AVC).
  • In the following, an example will be described in which the picture contains only inter macroblocks (macroblocks that perform inter-screen coding) and no intra macroblocks (macroblocks that perform intra-screen coding).
  • FIG. 9 shows an example of a block size used for DCT in the present embodiment.
  • The prediction difference 901 is generated, for example, by the same means as in H.264/AVC (FIG. 5): block division is performed in units of 16×16-pixel macroblocks, inter-screen prediction is performed for each macroblock, and the prediction differences for one screen are integrated.
  • In the example of FIG. 9, 16 adjacent macroblocks are integrated into a block group 902 of 64×64 pixels.
  • The size of the block group 902 is not limited to 64×64 pixels, and may be any size such as 32×32 or 128×128 as long as a plurality of macroblocks 903 are integrated.
  • One preferable method is to prepare a number of sizes, such as 8×8, 16×16, 32×32, and 64×64 pixels, in advance as division patterns 903 of the block group 902, and to perform DCT by selecting the optimum pattern from among them.
  • a code table as shown in FIG. 10 is used, and information indicating which pattern is selected is encoded for each block group.
  • the overall code amount can be reduced by assigning a short code length to a frequently selected pattern.
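The idea of assigning short codewords to frequently selected patterns can be sketched with a toy variable-length code table. The pattern names and code lengths below are hypothetical stand-ins; the actual table of FIG. 10 is not reproduced in this text.

```python
# Hypothetical variable-length code table for block-group division patterns:
# the most frequently selected pattern gets the shortest codeword.
# (This is a valid prefix code: no codeword is a prefix of another.)
division_pattern_codes = {
    "64x64": "1",      # assumed most frequent  -> 1 bit
    "32x32": "01",
    "16x16": "001",
    "8x8":   "0001",   # assumed least frequent -> 4 bits
}

def encode_patterns(patterns):
    """Concatenate the codewords for a sequence of per-block-group choices."""
    return "".join(division_pattern_codes[p] for p in patterns)
```

If 64×64 dominates the selections, the average cost per block group approaches one bit, which is the source of the overall code-amount reduction described above.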
  • For the selection of the block pattern, it is effective, for example, to use the cost function shown in Equation 1 and determine that the division pattern minimizing it is optimal.
  • Here, Dist represents the sum of errors between the original image and the decoded image, Rate represents the sum of the code amount of the DCT coefficients and the code amount of the block division pattern, and Weight represents the weighting coefficient.
  • The trade-off between image quality and code amount can be controlled by adjusting the value of Weight. For example, if the code amount is to be significantly reduced even at some cost in image quality, the value of Weight may be set higher so that the contribution of the code amount to the cost value increases.
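Equation 1 itself is not reproduced in this text, but from the description (a distortion term plus a weighted rate term) it can be sketched as follows. The linear form `Dist + Weight * Rate` is an assumption consistent with the surrounding discussion, and the helper names are hypothetical.

```python
def rd_cost(dist, rate, weight):
    """Sketch of the cost function described for Equation 1:
    dist   - sum of errors between original and decoded image,
    rate   - code amount of DCT coefficients plus division pattern,
    weight - trade-off coefficient (larger -> rate matters more)."""
    return dist + weight * rate

def best_division_pattern(candidates, weight):
    """Pick the division pattern minimizing the cost.
    `candidates` maps pattern name -> (dist, rate). Illustrative only."""
    return min(candidates, key=lambda p: rd_cost(*candidates[p], weight))
```

Note how raising `weight` flips the decision toward the cheaper-to-code pattern, matching the behavior described in the text.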
  • FIG. 11 shows a prediction difference encoding procedure for each block.
  • DCT is performed on the prediction difference 1101 of the target block to obtain a DCT coefficient 1102.
  • the DCT coefficient 1102 is quantized to reduce the number of elements to be encoded.
  • a large quantization step is applied to the high frequency components of the DCT coefficients.
  • the quantization step weight 1103 is represented as Q.
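The coarser quantization of high-frequency coefficients can be sketched as follows. The linearly growing step and the `base_step` parameter are assumptions for illustration; the actual weight matrix Q of FIG. 11 is not specified in this text.

```python
import numpy as np

def quantize(coefficients, base_step):
    """Quantize DCT coefficients with a step that grows toward the
    high-frequency (bottom-right) corner: step = base_step * (1 + row + col).
    This stands in for the quantization step weight Q (1103)."""
    n = coefficients.shape[0]
    rows, cols = np.indices((n, n))
    steps = base_step * (1 + rows + cols)
    return np.round(coefficients / steps).astype(int)
```

The effect is that equal-magnitude coefficients survive quantization at low frequencies but are driven toward zero at high frequencies, reducing the number of elements to be encoded.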
  • The quantized DCT coefficients 1104 are expanded into one dimension by scanning the two-dimensional block in a zigzag direction from the low-frequency component to the high-frequency component (1105), and variable-length coding (VLC) is applied to generate a codeword (1106).
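The zigzag expansion (1105) can be sketched as follows. This is an illustrative diagonal scan for an n×n block, not the exact H.264/AVC scan tables.

```python
import numpy as np

def zigzag_scan(block):
    """Scan a 2-D coefficient block in zigzag order, from the DC /
    low-frequency corner toward the high-frequency corner, producing
    the 1-D sequence that is then variable-length coded.
    Coordinates are grouped by anti-diagonal (r + c); the traversal
    direction alternates on successive diagonals."""
    n = block.shape[0]
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r, c] for r, c in order]
```

Because quantized high-frequency coefficients are mostly zero, this ordering tends to end in a long run of zeros, which variable-length coding exploits.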
  • the above processing is repeated for all blocks obtained by dividing the block group.
  • the processing order for each block group may be any, but an example is shown in FIG.
  • a block group is processed in the order of raster scanning.
  • the block group 1201 located at the upper left corner of the screen is processed, and then the block group 1202 adjacent to the right side of the block group 1201 is processed. Thereafter, the processing further proceeds on the block group 1203 and the block group 1204 adjacent to the right side.
  • the block group 1205 adjacent below the block group 1201 is processed. The above processing is performed until the lower right corner of the screen is reached.
  • the processing order of the macroblocks included in the same block group may be any, but it is effective to process along the zigzag direction 1210, for example.
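The processing order of FIG. 12 can be sketched as nested loops: block groups in raster order, then macroblocks within each group. For simplicity the sketch visits macroblocks within a group in raster order rather than the zigzag order 1210 suggested above; all names and parameters are hypothetical.

```python
def processing_order(groups_per_row, groups_per_col, mbs_per_group_side):
    """Enumerate (group_row, group_col, mb_row, mb_col) tuples:
    block groups are visited in raster order across the screen,
    and macroblocks are visited within each block group."""
    order = []
    for gr in range(groups_per_col):          # top row of groups first
        for gc in range(groups_per_row):      # left to right
            for mr in range(mbs_per_group_side):
                for mc in range(mbs_per_group_side):
                    order.append((gr, gc, mr, mc))
    return order
```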
  • a memory for temporarily storing the prediction differences is required.
  • the area stored in the memory at once is called an “access group”.
  • prediction and DCT are performed for each access group.
  • prediction processing is performed on all macroblocks in the screen, and sequentially stored in the memory.
  • the prediction difference for one screen stored in the memory is divided into blocks and subjected to DCT.
  • the access group may be set in any range. For example, as shown in FIG. 13, if one access group is constituted by a block group for one line, encoding can be performed efficiently.
  • prediction and DCT are performed on the access group 1311 configured by the block group 1301 to block group 1304 located on the top line of the screen, and then the block group 1305 to block group located on the next line. Prediction and DCT are performed on the access group 1312 configured by 1308. If this is continued until the bottom line of the screen is reached, the encoding process for one frame is completed.
  • FIG. 14 shows a configuration example (one block group) of the encoded stream in the present embodiment.
  • 16 macroblocks serving as basic units for prediction processing exist in the block group.
  • First, a prediction mode 1401 is encoded, which represents the combination of the prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, etc.) and the block division pattern for the first macroblock (macroblock 1); then the motion vector of each block is encoded as side information 1402 necessary for prediction.
  • Next, the prediction mode 1403 for the second macroblock (macroblock 2) and the motion vector 1404 in each block obtained by dividing that macroblock are encoded. This is repeated for all macroblocks included in the corresponding block group.
  • the block division pattern 1405 when performing DCT on the prediction difference of the corresponding block group and the DCT coefficient 1406 of each block are encoded.
  • The block size for performing DCT may be set to a fixed value such as 64×64; in this case, encoding of the block group division pattern 1405 is unnecessary.
  • FIG. 1 shows an example of a moving picture encoding apparatus according to this embodiment.
  • The moving image encoding apparatus includes an input image memory 102 that holds an input original image 101, a block division unit 103 that divides the image in the input image memory 102 into blocks, an intra-screen prediction unit 104 that performs intra-screen prediction in units of blocks, a motion search unit 105, an inter-screen prediction unit 106, and a prediction method/block determination unit 107.
  • The moving image encoding apparatus further includes a subtraction unit 108 for generating a prediction difference, a prediction difference memory 109, a DCT unit 110 that performs frequency transform on the prediction difference, a frequency transform block determination unit 116 that determines the block shape for frequency transform matching the nature of the prediction difference, a quantization processing unit 111 that quantizes the coefficient values after frequency transform, a variable length coding unit 112 that performs coding according to the occurrence probability of symbols, an inverse quantization processing unit 113, an inverse DCT unit 114, an addition unit 115, and a reference image memory 117 for holding decoded images for use in later prediction.
  • the input image memory 102 holds one image from the original images 101 as an encoding target image.
  • the block dividing unit 103 divides the image data into blocks of an appropriate size and sends them to the intra-screen prediction unit 104, motion search unit 105, inter-screen prediction unit 106, and subtraction unit 108.
  • the motion search unit 105 calculates the motion amount of the corresponding block using the decoded image stored in the reference image memory 117 and sends the motion vector to the inter-screen prediction unit 106.
  • the intra-screen prediction unit 104 and the inter-screen prediction unit 106 execute the intra-screen prediction process and the inter-screen prediction process in units of several types of blocks.
  • the prediction method / block determination unit 107 selects an optimal prediction method and block shape (macroblock division pattern).
  • the subtraction unit 108 generates a prediction difference by an optimal prediction encoding unit using the original image and the prediction result, and sends the prediction difference to the prediction difference memory 109.
  • the prediction difference memory 109 sends the prediction difference to the DCT unit 110 when the prediction difference for one access group is stored.
  • The DCT unit 110 and the quantization processing unit 111 divide the prediction difference into blocks of several candidate sizes and respectively perform frequency transform (such as DCT) and quantization processing.
  • The quantized frequency transform coefficients are sent to the variable length coding processing unit 112 and the inverse quantization processing unit 113.
  • The inverse quantization processing unit 113 and the inverse DCT unit 114 respectively perform inverse quantization and inverse frequency transform (for example, IDCT (Inverse DCT)) on the quantized frequency transform coefficients, and send the obtained prediction difference to the addition unit 115.
  • IDCT Inverse DCT
  • the adding unit 115 generates a decoded image.
  • The reference image memory 117 stores the decoded image.
  • The frequency transform block determination unit 116 determines the optimal block shape (block group division pattern) for performing frequency transform and sends that information to the variable length coding unit 112.
  • The variable-length coding processing unit 112 variable-length-encodes, based on symbol occurrence probabilities, the optimal block shape information for prediction and frequency transform (macroblock and block group division patterns), the frequency transform coefficients (prediction difference information) based on the optimal block shape, and the side information necessary for prediction processing at decoding time (for example, the prediction direction for intra-screen prediction and the motion vector for inter-screen prediction), and generates an encoded stream.
  • FIG. 2 shows an example of a moving picture decoding apparatus according to the present embodiment.
  • The moving picture decoding apparatus includes a variable length decoding unit 202 that decodes various information by performing the reverse procedure of variable length coding on the encoded stream 201 generated by the moving picture encoding apparatus shown in FIG. 1,
  • an inverse quantization processing unit 203 and an inverse DCT unit 204 for decoding the prediction difference information,
  • a prediction difference memory 205 for storing the prediction difference for one access group,
  • and an inter-screen prediction unit 206 that performs inter-screen prediction, an intra-screen prediction unit 207 that performs intra-screen prediction, an addition unit 208 for acquiring a decoded image, and a reference image memory 209 for temporarily storing decoded images.
  • the variable length decoding unit 202 performs variable length decoding on the encoded stream 201, and acquires block shape information when performing prediction and frequency conversion, prediction difference information, and side information necessary for prediction processing at the time of decoding.
  • The block shape information for frequency transform (block group division pattern) and the prediction difference information are sent to the inverse quantization processing unit 203, while the block shape information for prediction (macroblock division pattern) and the side information necessary for the prediction process at decoding time are sent to the inter-screen prediction unit 206 or the intra-screen prediction unit 207.
  • The inverse quantization processing unit 203 and the inverse DCT unit 204 respectively perform inverse quantization and inverse frequency transform (such as inverse DCT) on the prediction difference information in the designated block shape (block group division pattern).
  • The inter-screen prediction unit 206 or the intra-screen prediction unit 207 executes the prediction process in the designated block shape (macroblock division pattern), referring to the reference image memory 209, based on the information sent from the variable length decoding unit 202.
  • the adding unit 208 generates a decoded image from the prediction processing result and the prediction difference for one access group stored in the prediction difference memory 205, and stores the decoded image in the reference image memory 209.
  • FIG. 15 shows an encoding processing procedure for one frame in the present embodiment.
  • The following processing is performed for all regions existing in the frame to be encoded (1501). That is, for all macroblocks in the corresponding access group (1502), prediction is executed with every available prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, etc.) and block shape (macroblock division pattern) (1503), and the prediction difference is calculated.
  • an optimal combination is selected from the results of prediction by all prediction methods and block shapes (1504), information on the combination is encoded, and the prediction difference is stored in the memory.
  • The term "optimal" here means a case where both the prediction difference and the code amount are small. For the evaluation, a cost function combining SAD (Sum of Absolute Differences) and the estimated code amount R (Rate) with a weighting constant is effective; since the appropriate weight differs depending on the prediction method (intra-screen prediction / inter-screen prediction), quantization parameters, and so on, it is effective to use different values accordingly. It is desirable to calculate the estimated code amount in consideration of not only the prediction difference information but also the code amount of block shape information, motion vectors, and the like.
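The mode-selection cost just described (SAD plus a weighted estimated code amount) can be sketched as follows. The linear form and the parameter names are assumptions for illustration, since the formula itself is not reproduced in this text.

```python
import numpy as np

def mode_cost(original, predicted, estimated_bits, weight):
    """Sketch of the prediction-mode cost described in the text:
    SAD (Sum of Absolute Differences between the original block and
    the predicted block) plus a weighting constant times the
    estimated code amount R."""
    sad = int(np.abs(original.astype(np.int64) - predicted.astype(np.int64)).sum())
    return sad + weight * estimated_bits
```

In practice `estimated_bits` would include not only the prediction difference information but also the block shape information and motion vector, as the text recommends.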
  • Subsequently, for the prediction difference of the corresponding access group stored in the memory, DCT (1506), quantization (1507), and variable length coding (1508) are performed for every available block shape (block group division pattern) of each block group.
  • Then, the quantized DCT coefficients are subjected to inverse quantization (1509) and inverse DCT (1510) to decode the prediction difference information; further, using Equation 1, an optimal block shape (block group division pattern) is selected (1511) and its shape information is encoded.
  • the shape information can be selected using, for example, an RD-Optimization method that determines an optimal encoding mode from the relationship between image quality distortion and code amount.
  • the RD-Optimization method is a well-known technique and will not be described in detail here. For details, see Reference 1, for example (Reference 1: G. Sullivan and T. Wiegand: "Rate-Distortion Optimization for Video Compression", IEEE Signal Processing Magazine, vol.15, no.6, pp .74-90, 1998.).
  • a decoded image is obtained by adding the decoded prediction difference and the predicted image (1512), and stored in the reference image memory.
  • FIG. 16 shows a decoding process procedure of one frame in the present embodiment.
  • The following processing is performed for all access groups in one frame (1601). That is, all block groups in the access group (1602) are subjected to variable length decoding processing (1603), and inverse quantization processing (1604) and inverse DCT (1605) are performed in the designated block shape (block group division pattern) to decode the prediction difference, which is stored in the memory.
  • Next, based on the variable-length-decoded prediction method and block shape (macroblock division pattern), prediction is performed (1607), and a decoded image is obtained by adding the prediction difference stored in the memory (1608).
  • When the above has been performed for all access groups, decoding of one frame of the image is completed (1609).
  • In the present embodiment, DCT is cited as an example of frequency transform, but any orthogonal transform that removes inter-pixel correlation may be used, such as DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loève Transform); alternatively, the prediction difference itself may be encoded without any frequency transform.
  • The variable length coding method is not particularly limited. Further, this embodiment may be used in combination with other methods.
  • As described above, in encoding and decoding of a frame composed of inter macroblocks, a moving picture encoding apparatus, encoding method, moving picture decoding apparatus, and decoding method can be realized that reduce the code amount while better maintaining subjective image quality, by performing frequency transform in units of blocks larger than the macroblock used for prediction.
  • Embodiment 2: In Embodiment 1, an example was described in which, in a frame where inter-picture coding can be performed (a P slice or B slice in H.264/AVC), all areas are inter-coded; that is, the picture contains only inter macroblocks (macroblocks that perform inter-screen coding) and no intra macroblocks (macroblocks that perform intra-screen coding).
  • In the present embodiment, a case will be described in which, in a frame where inter-screen coding can be performed (a P slice or B slice in H.264/AVC), encoding can mix inter macroblocks (macroblocks that perform inter-screen coding) and intra macroblocks (macroblocks that perform intra-screen coding).
  • prediction and DCT are performed on a macroblock basis for a block group including at least one intra macroblock.
  • In this case, the DCT process is performed in units of blocks obtained by dividing the macroblock, so the block size is equal to or smaller than the macroblock. Therefore, 32×32, 64×64, and the like in FIG. 9 cannot be used as the block size (DCT macroblock division pattern) in the DCT processing.
  • the division pattern is encoded using a code table shown in FIG.
  • FIG. 17 conceptually shows an example of an encoding method for each block group. This example shows a case where the access group matches the block group.
  • In the encoding process, first, prediction is performed for all macroblocks included in the block group 1701 located at the upper left end of the image; if all macroblocks are inter macroblocks, the division pattern of the block group is determined and DCT is performed by the same means as in Embodiment 1 (FIG. 9). Subsequently, the same processing is performed for the block group 1702 adjacent to the right of the block group 1701 if it contains no intra macroblock.
  • the same processing is performed for the block group 1703 and the block group 1704, and when the right end of the screen is reached, the block group 1705 adjacent to the lower side of the block group 1701 is processed.
  • If some macroblocks in the block group 1706 are intra macroblocks, prediction and DCT are performed on that block group in units of macroblocks.
  • In this way, the unit for performing DCT is switched between the macroblock and the block group depending on whether or not the corresponding block group includes an intra macroblock.
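The switching rule above can be sketched in a few lines. The function name, the mode strings, and the return labels are hypothetical; only the decision rule comes from the text.

```python
def dct_unit_for_block_group(macroblock_modes):
    """Embodiment 2 rule (sketch): if the block group contains at least
    one intra macroblock, DCT falls back to per-macroblock block sizes
    (at most the macroblock size); otherwise the whole block group can
    be transformed in one large unit (e.g. 64x64)."""
    return "macroblock" if "intra" in macroblock_modes else "block_group"
```

A decoder applies the same rule, which is why the presence of intra macroblocks determines the inverse transform unit without extra signaling of the large block sizes.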
  • FIG. 19 shows a configuration example of an encoded stream for a block group in which one or more intra macroblocks exist in the present embodiment.
  • First, a prediction mode 1901 is encoded, expressed as a combination of the prediction method (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, etc.) and the block division pattern.
  • Next, as side information 1902 necessary for prediction, a motion vector is encoded in the case of an inter macroblock, and information on the prediction direction is encoded in the case of an intra macroblock.
  • Then, the macroblock division pattern 1903 used when DCT is performed on that macroblock and the DCT coefficients 1904 of each block are encoded.
  • The above processing is performed for all macroblocks included in one block group.
  • Note that the block size for performing DCT may be set to a fixed value such as 8×8; in this case, encoding of a division pattern for each macroblock is unnecessary.
  • The configuration of the encoded stream for a block group having no intra macroblock is the same as in the first embodiment (FIG. 14).
  • FIG. 20 shows an encoding processing procedure for a block group in which one or more intra macroblocks exist in this embodiment. This process is performed for all macroblocks included in the corresponding block group (2001): prediction is performed for all available prediction methods (forward inter-screen prediction, backward inter-screen prediction, bidirectional inter-screen prediction, intra-screen prediction, etc.) and block shapes (macroblock division patterns) (2002), and a prediction difference is calculated.
  • Next, a suitable combination is selected from the prediction results of all prediction methods and block shapes (2003), and information on the selected combination is encoded.
  • The term “suitable” here means that both the prediction difference and the code amount are small; for this evaluation it is effective to use the cost function expressed by Formula 2 or another formula.
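Formula 2 is not reproduced in this excerpt; a commonly used cost function of this kind is the Lagrangian rate-distortion cost J = D + λR, which trades prediction difference against code amount. The sketch below uses made-up numbers and names purely for illustration.

```python
# Illustrative Lagrangian rate-distortion cost, a common instance of the
# kind of cost function described above (Formula 2 itself is not
# reproduced in this excerpt). All numbers are made up for the example.

def rd_cost(distortion, bits, lam):
    """J = D + lambda * R: balances prediction difference D against code amount R."""
    return distortion + lam * bits

candidates = [
    ('intra', 120.0, 40),          # small prediction difference, many bits
    ('inter_forward', 150.0, 12),  # larger prediction difference, few bits
]
best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam=2.0))
print(best[0])  # inter_forward: 150 + 2*12 = 174 beats 120 + 2*40 = 200
```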
  • Next, DCT (2004), quantization (2005), and variable length coding (2006) are performed on the prediction difference of that macroblock in each available block shape (macroblock division pattern for DCT).
  • Further, the quantized DCT coefficients are subjected to inverse quantization (2007) and inverse DCT (2008) to decode the prediction difference information, an optimal block shape (macroblock division pattern for DCT) is selected using Equation 1 (2009), and the selected shape information is encoded.
  • The selection of the shape information may use the method of Reference Document 1 described above or another method.
  • Finally, a decoded image is obtained by adding the decoded prediction difference to the predicted image (2010) and is stored in the reference image memory.
  • This completes the encoding of the corresponding block group. Note that the encoding processing procedure for a block group in which no intra macroblock exists is the same as in the first embodiment (FIG. 15).
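The round trip of steps 2004–2010 can be illustrated with a deliberately simplified toy: an identity "transform" and a scalar quantizer, so the arithmetic is easy to follow by hand. The function names, step size, and sample values are assumptions for illustration, not the patent's implementation.

```python
# Toy round trip for one macroblock, mirroring steps 2004-2010 above with
# an identity "transform" and a scalar quantizer. Names and numbers are
# illustrative assumptions only.

def quantize(coeffs, qstep):
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qstep):
    return [lv * qstep for lv in levels]

def encode_macroblock(original, prediction, qstep):
    residual = [o - p for o, p in zip(original, prediction)]  # prediction difference
    levels = quantize(residual, qstep)                        # 2004-2005
    decoded_residual = dequantize(levels, qstep)              # 2007-2008
    decoded = [p + r for p, r in zip(prediction, decoded_residual)]  # 2010
    return levels, decoded

levels, decoded = encode_macroblock([11, 12, 14, 8], [9, 12, 10, 10], qstep=2)
print(levels)   # [1, 0, 2, -1]
print(decoded)  # [11, 12, 14, 8] -- the residual survives this round trip exactly
```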
  • FIG. 21 shows a decoding processing procedure in a block group in which one or more intra macroblocks exist in this embodiment.
  • First, all macroblocks included in the block group (2101) are subjected to variable length decoding (2102), and inverse quantization (2103) and inverse DCT (2104) are performed in the designated block shape to decode the prediction difference, which is stored in memory.
  • Next, prediction (2105) is performed based on the decoded prediction method and block shape (macroblock division pattern for DCT), the prediction difference stored in memory is added, and a decoded image is acquired (2106).
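The decode path of steps 2103–2106 can be sketched with the same simplified conventions as the encoding toy above (identity inverse "transform", scalar dequantizer); the names and values are illustrative assumptions.

```python
# Minimal sketch of the decode path (steps 2103-2106): dequantize and
# inverse-"transform" (identity here) in the signalled block shape, then
# add the prediction. Illustrative assumptions only, not the patent's code.

def decode_macroblock(levels, prediction, qstep):
    residual = [lv * qstep for lv in levels]              # 2103-2104
    return [p + r for p, r in zip(prediction, residual)]  # 2105-2106

print(decode_macroblock([1, 0, 2, -1], [9, 12, 10, 10], qstep=2))
# [11, 12, 14, 8]
```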
  • The moving picture encoding apparatus and moving picture decoding apparatus that perform the processing of the second embodiment can be realized with the components of the moving picture encoding apparatus and moving picture decoding apparatus according to the first embodiment shown in FIGS. 1 and 2; therefore, the description of the configuration itself is omitted.
  • In the above, DCT is cited as an example of the frequency transform, but other transforms such as DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), and KLT (Karhunen-Loève Transform) may be used instead.
  • Alternatively, the prediction difference itself may be encoded without frequency conversion.
  • Variable length coding is also not strictly required. Further, this embodiment may be used in combination with another method.
  • As described above, the moving picture coding apparatus and coding method and the moving picture decoding apparatus and decoding method according to Embodiment 2 perform, when encoding and decoding a frame in which inter macroblocks and intra macroblocks are mixed, frequency conversion in units of blocks larger than the macroblocks used for prediction, thereby further reducing the code amount while more appropriately maintaining subjective image quality.
  • The present invention can be applied to encoding/decoding of moving images, and in particular to encoding/decoding in units of blocks.

Abstract

According to the invention, a prediction difference is generated by performing prediction processing on an input image; quantized data are generated by performing frequency conversion processing and quantization processing on the generated prediction difference; an encoded stream is generated by variable-length coding of the generated quantized data; and, in the frequency conversion processing and quantization processing, the size of the block unit in which the frequency conversion processing and quantization processing are performed is changed for the blocks included in a block group comprising a predetermined number of multiple blocks, based on whether or not an intra-picture prediction block is included in the block group. As a result, the code amount is reduced and the subjective image quality is more appropriately improved.
PCT/JP2010/062180 2009-09-16 2010-07-20 Moving picture decoding method and moving picture encoding method WO2011033853A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011531838A JP5363581B2 (ja) 2009-09-16 2010-07-20 Moving picture decoding method and moving picture encoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009214011 2009-09-16
JP2009-214011 2009-09-16

Publications (1)

Publication Number Publication Date
WO2011033853A1 true WO2011033853A1 (fr) 2011-03-24

Family

ID=43758466

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/062180 WO2011033853A1 (fr) 2009-09-16 2010-07-20 Moving picture decoding method and moving picture encoding method

Country Status (2)

Country Link
JP (7) JP5363581B2 (fr)
WO (1) WO2011033853A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015211269A (ja) * 2014-04-24 2015-11-24 Fujitsu Limited Moving picture encoding device, moving picture encoding method, and computer program for moving picture encoding
US10721468B2 (en) 2016-09-12 2020-07-21 Nec Corporation Intra-prediction mode determination method, intra-prediction mode determination device, and storage medium for storing intra-prediction mode determination program

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
WO2011033853A1 (fr) * 2009-09-16 2011-03-24 Hitachi, Ltd. Moving picture decoding method and moving picture encoding method

Citations (4)

Publication number Priority date Publication date Assignee Title
JP2003250161A (ja) * 2001-12-19 2003-09-05 Matsushita Electric Ind Co Ltd Encoding device and decoding device
JP2003533141A (ja) * 2000-05-10 2003-11-05 Robert Bosch GmbH Transform coding method for moving image sequences
WO2007034918A1 (fr) * 2005-09-26 2007-03-29 Mitsubishi Electric Corporation Dynamic image encoding and decoding device
JP2009005413A (ja) * 2008-09-30 2009-01-08 Toshiba Corp Image encoding device

Also Published As

Publication number Publication date
JP5882416B2 (ja) 2016-03-09
JP5363581B2 (ja) 2013-12-11
JP2020005294A (ja) 2020-01-09
JP2014207713A (ja) 2014-10-30
JP6837110B2 (ja) 2021-03-03
JP2016067062A (ja) 2016-04-28
JP6088080B2 (ja) 2017-03-01
JP6585776B2 (ja) 2019-10-02
JP2018164299A (ja) 2018-10-18
JPWO2011033853A1 (ja) 2013-02-07
JP5611432B2 (ja) 2014-10-22
JP6360214B2 (ja) 2018-07-18
JP2014007759A (ja) 2014-01-16
JP2017103810A (ja) 2017-06-08

Similar Documents

Publication Publication Date Title
US10687075B2 (en) Sub-block transform coding of prediction residuals
JP6084730B2 (ja) Video decoding device
US20070098067A1 (en) Method and apparatus for video encoding/decoding
JP2009094828A (ja) Image encoding device and image encoding method, and image decoding device and image decoding method
WO2013108684A1 (fr) Video image decoding device, video image encoding device, video image decoding method, and video image encoding method
JP6837110B2 (ja) Moving picture decoding method
JP5887012B2 (ja) Image decoding method
JP2009049969A (ja) Moving picture encoding device and method, and moving picture decoding device and method
JP5171658B2 (ja) Image encoding device
JP5891333B2 (ja) Image decoding method
JP5887013B2 (ja) Image decoding method
JP5422681B2 (ja) Image decoding method
JP5690898B2 (ja) Image decoding method
JP5790722B2 (ja) Moving picture encoding method
JP2016129391A (ja) Image decoding method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10816966

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2011531838

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10816966

Country of ref document: EP

Kind code of ref document: A1