WO2011048904A1 - Image encoding device, image decoding device, and data structure of encoded data - Google Patents
Image encoding device, image decoding device, and data structure of encoded data
- Publication number
- WO2011048904A1 (PCT/JP2010/066248)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- unit
- prediction
- target
- rectangular
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
Definitions
- The present invention relates to an image encoding device and an image encoding method for generating encoded data by encoding an image.
- The present invention also relates to an image decoding device and an image decoding method for generating an image by decoding encoded data generated by such an image encoding device.
- A moving picture encoding device is used to transmit or record moving pictures efficiently.
- As a moving picture encoding method used in such a device, for example, H.264/AVC described in Non-Patent Document 1 can be cited.
- In such a method, the image to be encoded is divided into a plurality of blocks and then encoded.
- A technique is employed in which a predicted image is generated with reference to an already decoded region in the same frame as the target block, and the difference image between the predicted image and the target block is encoded.
- Patent Document 1 discloses an image predictive coding device in which an inverted-L-shaped target adjacent region adjacent to the prediction target region and a search region are set, a predicted adjacent region is found by searching the search region for the region whose sum of absolute differences with respect to the target adjacent region is smallest, and a region adjacent to the predicted adjacent region is set as the texture signal for the prediction target region, thereby generating a prediction signal for the prediction target region.
- However, since the technique of Patent Document 1 searches for a region similar to the inverted-L-shaped target adjacent region by scanning the search region two-dimensionally, there is a problem that the amount of computation required for the search is large and generation of the predicted image is slow.
- Moreover, the technique of Patent Document 1 sets the prediction target region to a square. Therefore, when the image to be encoded contains an edge whose curvature changes, for example, an appropriate predicted adjacent region cannot be found, and there is a problem that coding efficiency decreases.
- The present invention has been made in view of the above problems, and its object is to realize an image encoding device with high coding efficiency, even when the image to be encoded contains an edge whose curvature changes, while reducing the amount of computation required for the search.
- In order to solve the above problems, an image encoding device according to the present invention is an image encoding device that encodes, block by block, a target image divided into a plurality of blocks, and comprises the following means.
- Quantization means that quantizes, for each quantization unit consisting of one or more rectangular regions selected from a plurality of rectangular regions whose long sides adjoin one another, a prediction residual obtained by subtracting a predicted image from the target image on a target block divided into those rectangular regions.
- Inverse quantization means that generates a decoded image on a target quantization unit by adding the inversely quantized prediction residual to the predicted image.
- Predicted image generation means that generates the predicted image on the target block for each rectangular region. This means uses, as a template, the rectangular region on the decoded image facing the long side of the target rectangular region, and generates the predicted image on the target rectangular region by searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region that faces the long side of the template opposite to its side facing the target rectangular region, for the region having the highest correlation with the template.
- According to the above configuration, the rectangular region on the decoded image facing the long side of the target rectangular region is used as a template, and the predicted image on the target rectangular region can be generated by searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to its side facing the target rectangular region, for the region having the highest correlation with the template.
- That is, the region having the highest correlation with the template is found by scanning one-dimensionally the regions on the decoded image obtained by translating the rectangular region in the long-side direction. Therefore, according to the above image encoding device, the amount of computation required for the search can be reduced compared with two-dimensional scanning as in the technique of Patent Document 1, so that the predicted image can be generated at high speed.
- Moreover, since the above image encoding device performs the search for each rectangular region, it can generate an accurate predicted image even when the target image contains an edge whose curvature changes. That is, coding efficiency is high even when the target image contains such an edge.
- Similarly, the data structure of encoded data according to the present invention is a data structure of encoded data obtained by encoding, block by block, a target image divided into a plurality of blocks.
- It includes encoded data generated by quantizing, for each quantization unit consisting of one or more rectangular regions selected from a plurality of rectangular regions whose long sides adjoin one another, a prediction residual obtained by subtracting the predicted image from the target image on a target block divided into those rectangular regions, the predicted image on the target block being generated for each rectangular region.
- With encoded data of the above structure, a decoding device can perform decoding processing based on the predicted image of each rectangular region and the quantized prediction residual of each quantization unit. Therefore, according to the above configuration, a data structure of encoded data that can be decoded efficiently can be realized.
- Likewise, an image decoding device according to the present invention is an image decoding device that generates, block by block, a decoded image divided into a plurality of blocks, and comprises the following means.
- Inverse quantization means that generates a decoded image on a target block divided into a plurality of rectangular regions whose long sides adjoin one another, for each quantization unit consisting of one or more rectangular regions selected from those rectangular regions, by inversely quantizing the quantization values and adding the resulting prediction residual to a predicted image.
- Predicted image generation means that generates the predicted image for each rectangular region. This means uses, as a template, the rectangular region on the decoded image facing the long side of the target rectangular region, and generates the predicted image on the target rectangular region by searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region that faces the long side of the template opposite to its side facing the target rectangular region, for the region having the highest correlation with the template.
- According to the above configuration, the rectangular region on the decoded image facing the long side of the target rectangular region is used as a template, and the predicted image on the target rectangular region can be generated by searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to its side facing the target rectangular region, for the region having the highest correlation with the template.
- That is, the region having the highest correlation with the template is found by scanning one-dimensionally the regions on the decoded image obtained by translating the rectangular region in the long-side direction. Therefore, the amount of computation required for the search can be reduced compared with two-dimensional scanning as in the technique of Patent Document 1, so that the predicted image can be generated at high speed.
- Moreover, since the above image decoding device performs the search for each rectangular region, it can generate an accurate predicted image even when the target image contains an edge whose curvature changes.
- As described above, according to the present invention, a predicted image can be generated at higher speed than when it is generated by a two-dimensional search. Furthermore, even when the image to be encoded contains an edge or the like whose curvature changes, encoding with high coding efficiency can be performed.
- FIG. 3 is a diagram for explaining the operation of the TM prediction unit shown in FIG. 2; (a) shows the relationship between the prediction target region, the template, and the search region, and (b) shows the relationship between the search region and the search candidates.
- FIG. 4 is a diagram showing the relationship between a target macroblock and prediction units:
- (a) shows the case where the size of the prediction unit is 16 pixels × 1 pixel,
- (b) shows the case where the size of the prediction unit is 1 pixel × 16 pixels,
- (c) shows the case where the size of the prediction unit is 4 pixels × 1 pixel, and
- (d) shows the case where the size of the prediction unit is 1 pixel × 4 pixels.
- FIG. 6 is a diagram for explaining the effect of the MB encoding unit: (a) shows a case where there are two curved edges running from top to bottom in the target macroblock, with the curvature of each edge increasing toward the bottom of the MB; (b) shows the case where the region is predicted by direction prediction; (c) shows the case where intra TM prediction is performed with a square region as the prediction unit; and (d) shows the case where encoding is performed by the MB encoding unit shown in FIG. 1 with 16 × 1 pixels as the prediction unit.
- FIG. 7 is a block diagram showing the configuration of an image encoding device provided with the MB encoding unit shown in FIG. 1.
- FIG. 8 is a block diagram showing the configuration of an MB decoding unit according to the first embodiment of the present invention.
- FIG. 17 is a block diagram showing the configuration of a TM prediction unit constituting the MB encoding unit shown in FIG.
- FIG. 17 is a block diagram showing the configuration of an MB decoding unit constituting an image decoding device according to the third embodiment of the present invention.
- FIG. 17 is a diagram for explaining the operation
- Embodiment 1. An image encoding device 100 and an image decoding device 150, which constitute the first embodiment of the image encoding device and image decoding device according to the present invention, will be described with reference to FIGS. 1 to 10.
- In the drawings, elements having the same function are denoted by the same reference numerals, and their description is not repeated.
- In the following, an image encoding device / image decoding device is assumed that divides an image into a plurality of macroblocks (hereinafter "MBs") and performs encoding / decoding processing for each MB in raster-scan order. Each MB is further divided into a plurality of prediction units, and a predicted image is generated for each prediction unit. The prediction unit being processed at a given time is called the prediction target region.
- First, the TM prediction unit 105, which is a component common to the image encoding device 100 and the image decoding device 150, will be described with reference to FIGS. 2 and 3.
- FIG. 2 is a block diagram showing the configuration of the TM prediction unit 105.
- the TM prediction unit 105 includes a search area setting unit 101, a template setting unit 102, a template comparison unit 103, and a predicted image generation unit 104.
- The TM prediction unit 105 executes template matching based on prediction unit information #106 (described later) and a decoded image #109 recorded in a frame memory 109 (described later), and generates and outputs a predicted image #105 based on the result of the template matching.
- the prediction unit information # 106 includes information indicating the shape and the position of the region to be predicted, as described later.
- In the following, it is assumed that the prediction unit information #106 indicates, as the position of the prediction target region, the coordinates of the pixel at its upper-left corner (relative to the pixel at the upper-left corner of the input image). The width of the prediction target region is denoted puw, its height puh, and its position, that is, the coordinates of the pixel at its upper-left corner, (pux, puy). The units of puw, puh, pux, and puy are all pixels. The preferable prediction unit size in this embodiment is described later; the description of the TM prediction unit 105 does not assume any particular size.
- the template setting unit 102 sets a template corresponding to the prediction target area based on the input prediction unit information # 106, and outputs template information # 102 that is information about the template.
- Specifically, when the prediction target region is a horizontally long rectangle, the template setting unit 102 sets as the template the region adjacent to the upper side of the prediction target region whose width equals that of the prediction target region and whose height is one pixel.
- That is, when puw ≥ puh, a region of puw × 1 pixels located at (pux, puy − 1) is set as the template.
- When the prediction target region is a vertically long rectangle, the region adjacent to the left side of the prediction target region, with height equal to that of the prediction target region and width of one pixel (that is, a 1 × puh region located at (pux − 1, puy)), is set as the template.
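The template-setting rule above can be sketched as follows. This is a minimal illustrative sketch, not code from the patent; the function name and the (x, y, w, h) return convention are assumptions made here.

```python
def set_template(pux, puy, puw, puh):
    """Return (x, y, w, h) of the template for a prediction unit.

    Rule described above: for a horizontally long unit (puw >= puh)
    the template is the one-pixel-high row directly above the unit;
    for a vertically long unit it is the one-pixel-wide column to its
    left. Coordinates are in pixels, relative to the image origin.
    """
    if puw >= puh:
        return (pux, puy - 1, puw, 1)   # puw x 1 row at (pux, puy - 1)
    else:
        return (pux - 1, puy, 1, puh)   # 1 x puh column at (pux - 1, puy)
```

For example, a 16 × 1 prediction unit at (4, 8) gets the 16 × 1 template at (4, 7), while a 1 × 16 unit at the same position gets the 1 × 16 template at (3, 8).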
- (a) of FIG. 3 shows a prediction target region and the corresponding template in the case where the size of the prediction target region is 4 × 1 pixels.
- The search region shown in (a) of FIG. 3 is described later.
- The search area setting unit 101 sets a search region corresponding to the prediction target region based on the input prediction unit information #106 and the template information #102, and outputs search area information #101, which is information on the search region.
- Specifically, when the prediction target region is a horizontally long rectangle, the search area setting unit 101 sets as the search region a region located at (−α, −2) in relative coordinates (in pixels) with respect to the prediction target region, with width (width of the prediction target region + 2α) pixels and height 1 pixel.
- That is, when puw ≥ puh, a region of (puw + 2α) × 1 pixels located at (pux − α, puy − 2) is set as the search region.
- When the prediction target region is a vertically long rectangle, a region located at (−2, −α) in relative coordinates with respect to the prediction target region, with height (height of the prediction target region + 2α) pixels and width 1 pixel, is set as the search region.
- That is, when puw < puh, a region of 1 × (puh + 2α) pixels located at (pux − 2, puy − α) is set as the search region.
- The search region in the case where the size of the prediction target region is 4 × 1 pixels is shown in (a) of FIG. 3.
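The search-region rule can be sketched in the same style as the template rule. Again a hedged illustrative sketch; names and the (x, y, w, h) convention are assumptions.

```python
def set_search_area(pux, puy, puw, puh, alpha):
    """Return (x, y, w, h) of the search region for a prediction unit.

    For a horizontally long unit (puw >= puh): a (puw + 2*alpha) x 1
    row at (pux - alpha, puy - 2), i.e. one pixel above the template.
    For a vertically long unit: a 1 x (puh + 2*alpha) column at
    (pux - 2, puy - alpha). `alpha` widens the search on each side.
    """
    if puw >= puh:
        return (pux - alpha, puy - 2, puw + 2 * alpha, 1)
    else:
        return (pux - 2, puy - alpha, 1, puh + 2 * alpha)
```

With α = 2, the 4 × 1 unit at (4, 8) of the figure gets an 8 × 1 search region at (2, 6).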
- The template comparison unit 103 derives and outputs a predicted image generation parameter #103 by executing template matching based on the template information #102, the search area information #101, and the decoded image #109 recorded in the frame memory 109 described later.
- the predicted image generation parameter # 103 is information indicating the position of a region that approximates the region to be predicted. For example, among partial areas in the search area, the position (position relative to the template) of the partial area that most closely approximates the decoded image on the template can be used as the predicted image generation parameter # 103. In that case, the predicted image generation parameter # 103 can be derived by the following procedures S1 to S3.
- (Step S1) First, the template comparison unit 103 generates a list of search candidates.
- Here, a search candidate is a partial region within the search region congruent with the template.
- Each search candidate can be specified by a search index assigned within the search region. For example, as shown in (b) of FIG. 3, when the size of the template is 4 × 1 pixels and the size of the search region is 8 × 1 pixels, five search candidates are set, specified by the offset value spos (0, 1, 2, 3, 4) from the left end of the search region. In this case, the offset value can be used as the search index.
- (Step S2) Next, the template comparison unit 103 calculates, for each search candidate, an evaluation value indicating the dissimilarity between the decoded image on the template and the decoded image on the search candidate.
- Examples of the evaluation value used here include SAD (Sum of Absolute Differences) and SSD (Sum of Squared Differences).
- (Step S3) Finally, the template comparison unit 103 specifies the search candidate that minimizes the dissimilarity calculated in step S2 (that is, the search candidate that most closely approximates the template), calculates the relative position of the specified search candidate with respect to the template, and outputs a predicted image generation parameter #103 indicating the calculated position.
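Steps S1 to S3 for a horizontally long unit can be sketched as a one-dimensional SAD search. This is an illustrative sketch under the assumptions that the search row is the row one pixel above the template row (as in the search-region definition above) and that the whole search region lies inside the decoded area; function and parameter names are hypothetical.

```python
def match_template(decoded, tmpl_x, tmpl_y, tmpl_w, alpha):
    """1-D template matching (steps S1-S3) for a horizontal unit.

    `decoded` is a 2-D list of pixel values. Candidates are the
    tmpl_w-wide windows in the row one pixel above the template,
    offset by spos = 0 .. 2*alpha from the search region's left end.
    Returns the horizontal displacement dx of the best (minimum-SAD)
    candidate relative to the template.
    """
    template = decoded[tmpl_y][tmpl_x:tmpl_x + tmpl_w]
    best_dx, best_sad = 0, float("inf")
    for spos in range(2 * alpha + 1):                  # S1: candidates
        cx = tmpl_x - alpha + spos
        cand = decoded[tmpl_y - 1][cx:cx + tmpl_w]
        sad = sum(abs(a - b) for a, b in zip(template, cand))  # S2: SAD
        if sad < best_sad:                             # S3: keep minimum
            best_sad, best_dx = sad, spos - alpha
    return best_dx
```

For instance, if the row above the template contains the same edge pattern shifted one pixel to the left, the search returns dx = −1.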
- Note that the search candidates may be set by other methods. For example, by reducing the number of search candidates, the predicted image generation parameter #103 can be derived with less processing, at some cost in accuracy.
- Conversely, the positions of the search candidates within the search region can be set in units smaller than one pixel, for example 0.5-pixel or 0.25-pixel units.
- In that case, interpolated values obtained by applying an interpolation filter to the pixel values of the decoded image at integer positions are used as the pixel values of the decoded image at the search candidate positions.
- This allows the search candidate positions to be adjusted finely, and template matching to be performed on more candidates. Although the amount of processing increases, the probability of detecting a search candidate that more closely approximates the decoded image on the template can be raised.
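The sub-pixel case can be illustrated with simple linear interpolation along the row. The patent does not specify the interpolation filter, so the two-tap linear filter below is a stand-in assumption.

```python
def pixel_at(row, x):
    """Pixel value of a decoded-image row at a possibly fractional x.

    For half- or quarter-pel candidate positions, the value is
    linearly interpolated between the two neighbouring integer-position
    pixels; a stand-in for the interpolation filter mentioned above.
    Assumes x >= 0 and x <= len(row) - 1.
    """
    x0 = int(x)
    frac = x - x0
    if frac == 0:
        return row[x0]
    return row[x0] * (1 - frac) + row[x0 + 1] * frac
```

With this helper, the SAD of step S2 can be evaluated at candidate offsets such as spos = 1.5 or 2.25, at the cost of extra computation per candidate.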
- Based on the predicted image generation parameter #103 derived by the template comparison unit 103 and the decoded image #109 recorded in the frame memory 109, the predicted image generation unit 104 generates and outputs the predicted image #105 corresponding to the prediction target region.
- Specifically, each pixel value P(pux + i, puy + j) of the predicted image (where 0 ≤ i < puw and 0 ≤ j < puh) is derived by the following equation, in which (dx, dy) denotes the displacement indicated by the predicted image generation parameter #103: P(pux + i, puy + j) = Ir(pux + i + dx, puy + j + dy).
- Here, Ir(x, y) denotes the pixel value of pixel (x, y) of the decoded image.
- When x or y is fractional, an interpolated pixel value, generated by applying an interpolation filter to the pixel values of the nearby decoded image, is used.
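For integer displacements, the generation step amounts to copying a displaced decoded region, which can be sketched as below. The equation form P(pux+i, puy+j) = Ir(pux+i+dx, puy+j+dy) is a reconstruction from the surrounding description (for a horizontally long unit, dy = −1 and dx is the result of the template search), so treat this as an assumed reading rather than the patent's literal formula.

```python
def generate_prediction(decoded, pux, puy, puw, puh, dx, dy):
    """Sketch: predicted image as the decoded region displaced by (dx, dy).

    Implements P(pux+i, puy+j) = Ir(pux+i+dx, puy+j+dy) for integer
    displacements, where (dx, dy) is the displacement carried by the
    predicted image generation parameter #103.
    """
    return [[decoded[puy + j + dy][pux + i + dx] for i in range(puw)]
            for j in range(puh)]
```

For a 2 × 1 unit at (1, 1) with (dx, dy) = (1, −1), the prediction is the row above shifted one pixel to the right.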
- As described above, the TM prediction unit 105 generates and outputs the predicted image #105 corresponding to the prediction target region based on the input prediction unit information #106 and the decoded image #109 recorded in the frame memory 109.
- The MB encoding unit 110 encodes the input image corresponding to each MB to generate encoded data corresponding to that MB, and is used by the image encoding device 100 as described later.
- FIG. 1 is a block diagram showing the configuration of the MB coding unit 110.
- the MB coding unit 110 includes a TM prediction unit 105, a prediction unit division unit 106, a prediction residual coding unit 107, a decoded image generation unit 108, and a frame memory 109.
- Hereinafter, the MB that is the processing target of the MB encoding unit 110 at a given time is referred to as the "processing target MB".
- the prediction unit division unit 106 divides the processing target macroblock into predetermined units (hereinafter, referred to as “prediction units”), and outputs prediction unit information # 106, which is information on each prediction unit.
- Prediction unit information # 106 includes information on the position and size of each prediction unit.
- In the following, the size of the macroblock to be processed is assumed to be 16 × 16 pixels, but the present invention is not limited to this and can be applied to macroblocks of other common sizes.
- Examples of division into prediction units by the prediction unit division unit 106 are shown in (a) to (d) of FIG. 4.
- (a) of FIG. 4 shows the case where the size of the prediction unit is 16 pixels × 1 pixel,
- (b) shows the case where the size of the prediction unit is 1 pixel × 16 pixels,
- (c) shows the case where the size of the prediction unit is 4 pixels × 1 pixel, and
- (d) shows the case where the size of the prediction unit is 1 pixel × 4 pixels.
- As shown in (a) of FIG. 4, when the size of the prediction unit is 16 × 1 pixels, the prediction unit division unit 106 divides the processing target macroblock, by division lines extending in the horizontal direction, into 16 vertically stacked prediction units. As shown in (b) of FIG. 4, when the size of the prediction unit is 1 × 16 pixels, it divides the macroblock, by division lines extending in the vertical direction, into 16 horizontally arranged prediction units. As shown in (c) of FIG. 4, when the size of the prediction unit is 4 × 1 pixels, it divides the macroblock into 16 units vertically and 4 horizontally, 64 prediction units in total.
- Likewise, as shown in (d) of FIG. 4, when the size of the prediction unit is 1 × 4 pixels, it divides the macroblock into 4 units vertically and 16 horizontally, 64 prediction units in total.
- Further, a prediction unit index is assigned to each prediction unit. As shown in (a) to (d) of FIG. 4, the prediction unit index is a non-negative integer assigned in raster-scan order within the macroblock.
- The prediction unit information #106 is output sequentially in ascending order of the prediction unit index.
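The division rule can be sketched as follows (an illustrative sketch; the function name and tuple convention are assumptions):

```python
def divide_into_prediction_units(mb_x, mb_y, puw, puh, mb_size=16):
    """Divide a 16x16 macroblock into puw x puh prediction units.

    Returns a list of (pux, puy, puw, puh) tuples in raster-scan
    order; the list index doubles as the prediction unit index.
    """
    units = []
    for puy in range(mb_y, mb_y + mb_size, puh):
        for pux in range(mb_x, mb_x + mb_size, puw):
            units.append((pux, puy, puw, puh))
    return units
```

A 16 × 1 division yields 16 units, one per row; 4 × 1 and 1 × 4 divisions each yield 64 units, matching cases (a), (c) and (d) of FIG. 4.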
- The prediction residual coding unit 107 generates and outputs encoded data #110 and a decoding residual #107 based on the predicted image #105, the prediction unit information #106, and the input image #113 corresponding to each input prediction unit.
- the encoded data # 110 and the decoding residual # 107 are generated by the following procedures S11 to S15.
- (Step S11) First, the prediction residual coding unit 107 identifies the prediction target region based on the input prediction unit information #106 and generates the difference image between the input image #113 and the predicted image #105 on the prediction target region, that is, the prediction residual.
- (Step S12) Next, the prediction residual coding unit 107 applies, for each quantization unit of the same size as the prediction unit, a frequency transform of that same size (for example, a 16 × 1 DCT (Discrete Cosine Transform) for a 16 × 1-pixel prediction unit) to generate transform coefficients of the prediction residual.
- Here, frequency transform refers to an orthogonal transform that converts the spatial-domain representation of an image into a frequency-domain representation.
- (Step S13) Subsequently, the prediction residual coding unit 107 quantizes the transform coefficients generated in step S12 to generate quantized transform coefficients.
- (Step S14) The prediction residual coding unit 107 then generates a variable-length code by applying a variable-length coding method such as CABAC or CAVLC to the quantized transform coefficients generated in step S13.
- The variable-length code is output as the encoded data #110.
- (Step S15) Finally, the prediction residual coding unit 107 applies inverse quantization to the quantized transform coefficients generated in step S13 and then applies the inverse of the frequency transform applied in step S12 (inverse frequency transform), thereby generating and outputting the decoding residual #107.
- The present invention is not limited to the above procedure.
- For example, the frequency transform in step S12 may be omitted, and the prediction residual may be quantized directly in step S13.
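Taking the variant just mentioned, in which the frequency transform of step S12 is omitted and the residual is quantized directly, the residual path of steps S11, S13 and S15 can be sketched as below (uniform scalar quantization with step `qstep` is an assumption here; the entropy-coding step S14 is left out). The `reconstruct` helper mirrors the decoded image generation that follows.

```python
def encode_residual(input_block, pred_block, qstep):
    """Sketch of steps S11, S13, S15 with the transform omitted.

    Returns the quantized levels (what S14 would feed to CABAC/CAVLC)
    and the decoding residual of S15.
    """
    residual = [o - p for o, p in zip(input_block, pred_block)]  # S11
    levels = [round(r / qstep) for r in residual]                # S13
    dec_residual = [l * qstep for l in levels]                   # S15
    return levels, dec_residual

def reconstruct(pred_block, dec_residual):
    """Decoded image generation: prediction plus decoded residual."""
    return [p + r for p, r in zip(pred_block, dec_residual)]
```

Note that the reconstruction uses the quantized residual, not the original one, so encoder and decoder stay in sync at the cost of quantization error.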
- the decoded image generation unit 108 generates the decoded image # 108 by adding the predicted image # 105 to the input decoded residual # 107, and outputs the generated decoded image # 108.
- the frame memory 109 records the input decoded image # 108. At the time of encoding a specific MB, decoded images corresponding to all the MBs preceding the MB in raster scan order are recorded in the frame memory 109.
- FIG. 5 is a flowchart showing a procedure of encoding the input image # 113 corresponding to the processing target MB in the MB encoding unit 110 to generate encoded data # 110.
- (Step S21) First, the input image #113 corresponding to the processing target MB, input to the MB encoding unit 110, is fed to the prediction unit division unit 106 and the prediction residual coding unit 107.
- (Step S22) In the prediction unit division unit 106, the input image #113 is divided into N prediction units of a predetermined size, and each prediction unit is assigned a prediction unit index (puid) taking an integer value in the range 0 to N − 1.
- The prediction unit information #106 corresponding to the prediction target region is input from the prediction unit division unit 106 to the TM prediction unit 105 and the prediction residual coding unit 107.
- (Step S23) The TM prediction unit 105 performs template matching on the decoded image #109 recorded in the frame memory 109 based on the prediction unit information #106 input in step S22, generates the predicted image #105 corresponding to the prediction target region based on the result, and outputs it to the prediction residual coding unit 107 and the decoded image generation unit 108.
- (Step S24) The prediction residual coding unit 107 generates and outputs the encoded data #110 corresponding to the prediction target region based on the predicted image #105 generated in step S23, the prediction unit information #106 generated in step S22, and the input image #113.
- (Step S25) Further, the prediction residual coding unit 107 generates, from the same inputs, the decoding residual #107 corresponding to the prediction target region and outputs it to the decoded image generation unit 108.
- The decoded image generation unit 108 generates the decoded image #108 corresponding to the prediction target region based on the input decoding residual #107 and the predicted image #105 input in step S23, and the decoded image #108 is recorded in the frame memory 109.
- (Step S26) If generation of the decoded image #108 for all prediction units in the processing target MB is complete, the process ends; otherwise, the process returns to step S22.
- By the above procedure, the MB encoding unit 110 can generate and output the encoded data #110 corresponding to the processing target MB from the input image #113 corresponding to that MB.
- In the above description, the prediction unit is assumed to be 16 × 1 pixels, but the same effect can be obtained with other prediction units whose width or height is one pixel (for example, 8 × 1, 4 × 1, 1 × 16, 1 × 8, or 1 × 4 pixels).
- The same effect can also be obtained with prediction units whose height is very small compared with their width (for example, 8 × 2 or 16 × 2 pixels) or whose width is very small compared with their height (for example, 2 × 8 or 2 × 16 pixels).
- When such a region is predicted by the prediction method called direction prediction, a direction such as that of the broken line shown in (b) of FIG. 6 is assumed, and a predicted image can be generated by extrapolating the pixels adjacent to the upper side of the MB in that direction. However, with direction prediction, the curved edge can be approximated accurately in the upper part of the MB, where the curvature is small, but there is a problem that accuracy deteriorates in the lower part of the MB.
- (d) of FIG. 6 shows the encoding process when encoding is performed by the MB encoding unit 110 with 16 × 1 pixels as the prediction unit.
- First, the 16 × 1-pixel prediction unit located at the top of the MB is set as the prediction target region, a 16 × 1-pixel template is set one pixel above it, and a search region is set one pixel above the template.
- Next, the prediction target region is moved to the prediction unit one pixel below, and the predicted image #105 and the decoded image #108 are generated in the same way. The prediction target region is then moved downward one pixel at a time, and the generation of the predicted image #105 and the decoded image #108 is repeated.
- In other words, regardless of which prediction unit is set as the prediction target area, the predicted image #105 is always generated by detecting the displacement between the edge position in the prediction target area and the edge position one pixel above it.
- By using the MB encoding unit 110, a highly accurate predicted image #105 can be generated even in a region containing a curved edge whose curvature changes within the MB, so such a region can be encoded with high coding efficiency.
- For a straight edge, the slope of the line can be detected from the displacement of the edge position between the prediction target area and the area one pixel above it, and a predicted image can be generated accordingly. Therefore, by using the MB encoding unit 110, regions containing straight edges of various slopes can also be encoded with high coding efficiency.
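As a concrete illustration, the per-row matching described for FIG. 6 can be sketched as follows. This is a minimal sketch under assumptions not fixed by the text: 8-bit samples, a SAD matching criterion, and a hypothetical `search_range` parameter bounding the 1-D scan.

```python
import numpy as np

def predict_row_tm(decoded, y, x0, width, search_range=4):
    """Predict the (width x 1) row at (y, x0) by 1-D template matching.

    Template: the decoded row one pixel above the target row.
    Search area: the row one pixel above the template, scanned only
    horizontally (in the long-side direction of the prediction unit).
    """
    template = decoded[y - 1, x0:x0 + width].astype(np.int64)
    best_d, best_sad = 0, None
    for d in range(-search_range, search_range + 1):
        cand = decoded[y - 2, x0 + d:x0 + d + width].astype(np.int64)
        sad = int(np.abs(template - cand).sum())
        if best_sad is None or sad < best_sad:
            best_sad, best_d = sad, d
    # The edge shifted by best_d from row y-2 to row y-1; assuming the
    # same shift continues from row y-1 to row y, the target row is the
    # template row read at the shifted position.
    return decoded[y - 1, x0 + best_d:x0 + best_d + width]
```

Because the scan is one-dimensional, the displacement is re-estimated for every 16×1 unit, which is why a curved edge whose per-row shift varies can still be tracked.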
- the prediction unit division unit 106 has been described as sequentially outputting the prediction unit information # 106 in the ascending order of the prediction unit index, but the present invention is not limited to this. That is, the output order of prediction unit information # 106 in prediction unit division section 106 may not necessarily be in ascending order of prediction unit index.
- However, when the prediction unit is a horizontally long rectangle, it is preferable that prediction units closer to the upper side of the MB be processed first, and when the prediction unit is a vertically long rectangle, it is preferable that prediction units closer to the left side of the MB be processed first.
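The division of an MB into prediction units and the index/processing order described above can be sketched as follows. A 16×16 MB is assumed; the `split_mb` helper and the dict representation of prediction unit information are illustrative assumptions, not the apparatus's actual data structures.

```python
# Raster order naturally yields the preferred order: horizontally long
# units come out top-to-bottom, vertically long units left-to-right.
def split_mb(unit_w, unit_h, mb_size=16):
    units = []
    puid = 0
    for y in range(0, mb_size, unit_h):
        for x in range(0, mb_size, unit_w):
            units.append({"puid": puid, "x": x, "y": y,
                          "w": unit_w, "h": unit_h})
            puid += 1
    return units
```

For 16×1 units the list runs top-to-bottom; for 1×16 units it runs left-to-right, matching the preferred processing orders.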
- FIG. 7 is a block diagram showing the configuration of the image coding apparatus 100 according to the present invention.
- the image coding apparatus 100 includes an MB coding unit 110, a header information determination unit 111, a header information coding unit 112, an MB setting unit 113, and a variable-length code multiplexing unit 114.
- the image coding apparatus 100 receives an input image # 100.
- the image coding apparatus 100 performs coding processing of an input image # 100 and outputs coded data # 180.
- the header information determination unit 111 determines header information based on the input image # 100.
- the determined header information is output as header information # 111.
- the header information # 111 includes the image size of the input image # 100.
- the header information # 111 is input to the MB setting unit 113 and to the header information encoding unit 112.
- the header information encoding unit 112 encodes the header information # 111 and outputs encoded header information # 112.
- the encoded header information # 112 is input to the variable-length code multiplexing unit 114.
- the MB setting unit 113 divides the input image # 100 into a plurality of macro blocks based on the input image # 100 and the header information # 111.
- the MB setting unit 113 inputs the input image # 100 to the MB encoding unit 110 for each macroblock.
- the MB coding unit 110 codes the input image # 113 for one macroblock input sequentially, and generates MB coded data # 110.
- the generated MB encoded data # 110 is input to the variable-length code multiplexing unit 114.
- variable-length code multiplexing unit 114 multiplexes the encoded header information # 112 and the MB encoded data # 110 to generate and output encoded data # 180.
- the encoded data #110 generated by the MB encoding unit 110 (that is, the encoded data #110 before the encoded header information #112 is multiplexed with it) is referred to herein as "MB encoded data".
- the MB decoding unit 153 that receives the encoded data # 110 in MB units generated by the MB encoding unit 110 and outputs a decoded image # 190 in MB units will be described with reference to FIGS.
- FIG. 8 is a block diagram showing the configuration of the MB decoding unit 153.
- the MB decoding unit 153 includes a TM prediction unit 105, a decoded image generation unit 108, a frame memory 109, a prediction unit setting unit 151, and a prediction residual decoding unit 152.
- the prediction unit setting unit 151 is activated when the encoded data #110 in MB units is input, and sequentially outputs, in a predetermined order, prediction unit information #151 indicating the position and size of each prediction unit in the MB.
- the same method as the division method applied in the prediction unit division unit 106 (see FIG. 1) in the MB encoding unit 110 can be applied as the division method of MB into prediction units.
- the order of outputting prediction unit information # 151 can be the same as the order applied by the prediction unit division unit 106.
- the prediction residual decoding unit 152 applies variable-length code decoding to the input encoded data #110 in MB units, and generates transform coefficients corresponding to the prediction unit indicated by the input prediction unit information #151. Subsequently, the decoding residual #152 is generated and output by applying an inverse DCT (the inverse transform of the DCT) of the same size as the prediction unit indicated by prediction unit information #151 to the generated transform coefficients.
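As a numerical illustration of the residual path, the following sketch builds an orthonormal DCT-II basis matrix and applies the forward transform and its inverse to a 16×1 residual. Quantization and the variable-length code are omitted for simplicity, and the `dct_matrix` helper is an illustrative assumption.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] *= np.sqrt(1.0 / n)
    c[1:, :] *= np.sqrt(2.0 / n)
    return c

# Encoder side: forward DCT of a 16x1 residual. Decoder side: the inverse
# DCT of the same size as the prediction unit -- the transpose, since the
# matrix is orthonormal -- recovers the residual.
C16 = dct_matrix(16)
residual = np.arange(16, dtype=float) - 8.0
coeffs = C16 @ residual        # transform coefficients
recon = C16.T @ coeffs         # decoding residual (before any rounding)
```

The point of "the same size as the prediction unit" is that a 16×1 unit takes a length-16 one-dimensional transform rather than the 8×8 block transform of conventional coding.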
- FIG. 9 is a flowchart showing the procedure of the decoding process in the MB decoding unit 153.
- Step S31 First, the encoded data # 110 corresponding to the process target MB input to the MB decoding unit 153 is input to the prediction unit setting unit 151 and the prediction residual decoding unit 152.
- the prediction unit setting unit 151 divides the processing target MB into N prediction units of a predetermined size, and assigns a prediction unit index (puid) taking an integer value in the range of 1 to N to each prediction unit.
- Step S33 Subsequently, the TM prediction unit 105 performs template matching on the decoded image # 109 recorded in the frame memory 109 based on the prediction unit information # 151 generated in step S32. Then, a predicted image # 105 corresponding to the region to be predicted is generated based on the result of template matching, and is output to the decoded image generation unit 108.
- Step S34 Subsequently, the prediction residual decoding unit 152 generates the decoding residual #152 corresponding to the prediction target area based on the prediction unit information #151 generated in step S32 and the encoded data #110, and outputs it to the decoded image generation unit 108.
- Step S35 The decoded image generation unit 108 generates the decoded image #190 corresponding to the prediction target area based on the predicted image #105 input in step S33 and the decoded residual #152 generated in step S34.
- the decoded image # 190 is output to the outside of the MB decoding unit 153 and is also recorded in the frame memory 109.
- Step S36 If generation of the decoded image # 190 corresponding to all prediction units in the processing object MB is completed, the process ends; otherwise, the process proceeds to step S32.
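The decoding loop of steps S32 to S36 can be sketched as follows. This is a control-flow sketch only; `tm_predict` and `decode_residual` are hypothetical stand-ins for the TM prediction unit 105 and the prediction residual decoding unit 152.

```python
# Each prediction unit is predicted from already-decoded pixels, then the
# decoded residual is added to give the decoded image for that unit.
def decode_mb(prediction_units, tm_predict, decode_residual):
    decoded = {}
    for puid, unit in enumerate(prediction_units):
        pred = tm_predict(unit, decoded)       # predicted image (step S33)
        decoded[puid] = pred + decode_residual(unit)  # steps S34-S35
    return decoded
```

Note that `tm_predict` receives the partially decoded result, reflecting that each unit's template lies in already-decoded pixels above or to the left of it.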
- the MB decoding unit 153 can generate a decoded image # 190 corresponding to the same MB from the encoded data # 110 corresponding to the processing target MB.
- the image decoding device 150 receives the encoded data # 180 generated by the image encoding device 100 described above, and generates and outputs a decoded image # 190.
- FIG. 10 is a block diagram showing the configuration of the image decoding apparatus 150.
- the image decoding apparatus 150 includes an MB decoding unit 153, a variable length code demultiplexing unit 154, a header information decoding unit 155, and an MB setting unit 156.
- the encoded data # 180 input to the image decoding apparatus 150 is input to the variable-length code demultiplexing unit 154.
- the variable-length code demultiplexing unit 154 demultiplexes the input encoded data #180 into header encoded data #154a, which is the encoded data concerning the header information, and MB encoded data #154b, which is the encoded data concerning the macroblocks. The header encoded data #154a is output to the header information decoding unit 155, and the MB encoded data #154b is output to the MB setting unit 156.
- the header information decoding unit 155 decodes the header information # 155 from the header encoded data # 154a.
- the header information # 155 is information including the size of the input image.
- the MB setting unit 156 separates the MB encoded data #154b into encoded data #156 corresponding to each MB based on the input header information #155, and sequentially outputs it to the MB decoding unit 153.
- the MB decoding unit 153 sequentially decodes the encoded data # 156 corresponding to each input MB to generate and output a decoded image # 190 corresponding to each MB.
- With the above, the generation of the decoded image #190 corresponding to the encoded data input to the image decoding device 150 is complete.
- As described above, the image coding apparatus 100 is an image coding apparatus that codes, for each block (MB), a target image divided into a plurality of blocks, and includes: quantization means (prediction residual coding unit 107) that quantizes, for each quantization unit consisting of one or more rectangular regions (prediction units), the prediction residual obtained by subtracting a predicted image from the target image on a target block (target MB) divided into a plurality of rectangular regions (prediction units) whose long sides are adjacent to one another; inverse quantization means (prediction residual coding unit 107) that generates a decoded image on each of the quantization units by adding the predicted image to the prediction residual obtained by inverse quantization of the quantized values generated by the quantization means; and predicted image generation means that generates the predicted image for each rectangular region (prediction unit), using as a template a rectangular region on the decoded image facing the long side of the target rectangular region, and searching for the region having the highest correlation with the template by one-dimensionally scanning the regions on the decoded image obtained by translating that rectangular region in the long side direction.
- Therefore, according to the above image coding apparatus, the amount of computation required for the search can be reduced compared with the two-dimensional scanning of the technique described in Patent Document 1, and the predicted image can be generated at high speed.
- In addition, since the above image coding apparatus performs the search for each of the rectangular regions, it can generate a predicted image accurately even when the target image includes an edge whose curvature changes, compared with the technique described in Patent Document 1. That is, the coding efficiency is high even when the target image includes an edge whose curvature changes.
- Further, as described above, the image decoding apparatus 150 is an image decoding apparatus that generates, for each block, a decoded image divided into a plurality of blocks (MB), and includes: inverse quantization means (prediction residual decoding unit 152) that generates a decoded image on a target block (target MB) divided into a plurality of rectangular regions (prediction units) whose long sides are adjacent to one another, for each quantization unit consisting of one or more rectangular regions, by adding a predicted image to the prediction residual obtained by inverse quantization of the quantized values; and predicted image generation means that generates the predicted image for each rectangular region, using as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and generating the predicted image on the target rectangular region by searching for the region having the highest correlation with the template among the regions on the decoded image obtained by translating, in the long side direction, the rectangular region facing the long side of the template on the side opposite to the side facing the target rectangular region.
- Therefore, according to the above image decoding apparatus, the amount of computation required for the search can be reduced, and the predicted image can be generated at high speed.
- In addition, since the above image decoding apparatus performs the search for each of the rectangular regions, it can accurately generate a predicted image even when the target image includes an edge whose curvature changes.
- FIG. 11 is a block diagram showing the configuration of the image coding apparatus 300 according to the present embodiment.
- the image coding apparatus 300 includes a header information determination unit 111, a header information coding unit 112, an MB setting unit 113, a variable-length code multiplexing unit 114, and an MB coding unit 205.
- the header information determining unit 111, the header information encoding unit 112, the MB setting unit 113, and the variable length code multiplexing unit 114 have already been described, and therefore the MB encoding unit 205 will be described below.
- the MB encoding unit 205 included in the image coding apparatus 300 shown in FIG. 11 will be described with reference to FIG.
- the MB encoding unit 205 is for generating and outputting the encoded data # 205 based on the input image # 113 corresponding to the processing target MB output from the MB setting unit 113.
- FIG. 12 is a block diagram showing the configuration of MB coding section 205.
- the MB coding unit 205 includes a prediction unit structure comparison unit 201, a prediction unit division unit 202, a TM prediction unit 105, a prediction residual coding unit 107, a decoded image generation unit 108, a frame memory 109, a side information coding unit 203, and an MB coded data multiplexing unit 204.
- the TM prediction unit 105, the prediction residual coding unit 107, the decoded image generation unit 108, and the frame memory 109 have already been described, so the prediction unit structure comparison unit 201, the prediction unit division unit 202, the side information coding unit 203, and the MB encoded data multiplexing unit 204 will be described below.
- the prediction unit structure comparison unit 201 analyzes the input image # 113 corresponding to the processing target MB, and selects a prediction unit suitable for the MB from among prediction units included in a predetermined prediction unit set. Also, the prediction unit structure comparison unit 201 outputs prediction unit structure information # 201, which is information indicating the structure of the selected prediction unit.
- Although the case where the predetermined prediction unit set includes a 16×1 pixel prediction unit and a 1×16 pixel prediction unit is described here as an example, the present invention is not limited to this; various other combinations of prediction units can be used as the prediction unit set.
- the predetermined prediction unit set includes a vertically long rectangular prediction unit and a horizontally long rectangular prediction unit.
- the prediction unit structure comparison unit 201 selects a prediction unit with high encoding efficiency from among the prediction units contained in the prediction unit set.
- the prediction unit structure comparison unit 201 selects a prediction unit according to the result of a rate-distortion decision. That is, for each prediction unit included in the predetermined prediction unit set, the prediction unit structure comparison unit 201 encodes the input image #113 of the processing target MB using that prediction unit, calculates the resulting code amount R and distortion D (the SSD between the input image of the processing target and the decoded image), and computes the evaluation value RD = R + λD. It then selects one of the prediction units according to the result; more specifically, the prediction unit for which the evaluation value RD is smaller is selected.
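A minimal sketch of this rate-distortion decision, assuming the evaluation value RD = R + λD; `encode_with_unit` is a hypothetical stand-in that returns the code amount R (bits) and distortion D (SSD) obtained when the MB is encoded with a given prediction-unit shape, and `lam` is the assumed Lagrange multiplier.

```python
# Try every candidate prediction-unit shape and keep the one with the
# smallest evaluation value RD = R + lam * D.
def select_prediction_unit(unit_set, encode_with_unit, lam):
    best_unit, best_rd = None, float("inf")
    for unit in unit_set:
        rate, distortion = encode_with_unit(unit)
        rd = rate + lam * distortion
        if rd < best_rd:
            best_unit, best_rd = unit, rd
    return best_unit
```

The multiplier `lam` sets the trade-off: a small value favors low distortion, a large value favors low rate.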
- prediction unit structure information # 201 indicating the structure of the selected prediction unit is output to the prediction unit division unit 202 and the side information coding unit 203.
- the prediction unit structure comparison unit 201 may select a prediction unit by analyzing the directionality of an edge, or may select a prediction unit by another method.
- the prediction unit dividing unit 202 divides the input image #113 corresponding to the processing target MB into the prediction units determined by the prediction unit structure information #201, and outputs prediction unit information #106, which is information on each prediction unit.
- the prediction unit information # 106 includes information on the position and size of each prediction unit as described above.
- a prediction unit index is assigned to each prediction unit as already described using (a) to (d) of FIG.
- Side information encoding section 203 generates side information # 203 based on prediction unit structure information # 201.
- the generated side information # 203 is output to the MB encoded data multiplexing unit 204.
- For example, when the prediction unit indicated by prediction unit structure information #201 is 16×1 pixels, the side information coding unit 203 generates bit string 0 as side information #203, and when it is 1×16 pixels, it generates bit string 1.
- The side information encoding unit 203 generates bit string 00 as side information #203 when the prediction unit indicated by prediction unit structure information #201 is 16×1 pixels; bit string 10 when it is 1×16 pixels; bit string 01 when it is 8×1 pixels; and bit string 11 when it is 1×8 pixels.
- In the above bit strings, the upper digit indicates the long side direction of the prediction unit, and the lower digit indicates the size of the prediction unit.
- With this assignment, the same symbol is likely to occur consecutively in the digit indicating the direction information, so more efficient encoding processing can be performed.
- In addition, by generating the bit string so as to exploit the bias in the symbol occurrence probability, the number of bits of the side information can be reduced.
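The bit-string assignment above can be written as a small lookup table. The `(width, height)` tuple keys are an illustrative stand-in for prediction unit structure information #201.

```python
# First (upper) digit: long-side direction; second (lower) digit: size.
SIDE_INFO_BITS = {
    (16, 1): "00",  # horizontally long, long side 16
    (1, 16): "10",  # vertically long,   long side 16
    (8, 1):  "01",  # horizontally long, long side 8
    (1, 8):  "11",  # vertically long,   long side 8
}

def encode_side_info(unit):
    return SIDE_INFO_BITS[unit]

def decode_side_info(bits):
    return {v: k for k, v in SIDE_INFO_BITS.items()}[bits]
```

Keeping the direction in a fixed digit position is what lets an entropy coder exploit runs of the same direction symbol across consecutive MBs.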
- The MB encoded data multiplexing unit 204 generates and outputs encoded data #205 based on the encoded data #110 output from the prediction residual encoding unit 107 and the side information #203 output from the side information encoding unit 203.
- FIG. 13 shows a bit stream structure of coded data # 205.
- the encoded data # 205 includes side information # 203 indicating which prediction unit is selected in the prediction unit set, and encoded data # 110.
- In this way, the prediction unit most suitable for the local characteristics of the input image #113, that is, the prediction unit with the highest coding efficiency, is selected, and the input image #113 can be encoded using that prediction unit, so the coding efficiency is improved.
- In addition, since the prediction unit set includes both vertically long and horizontally long prediction units, input images #113 with various characteristics can be encoded efficiently.
- the image decoding apparatus 350 receives the encoded data # 181, and generates and outputs a decoded image # 254.
- FIG. 14 is a block diagram showing the configuration of the image decoding apparatus 350.
- the image decoding apparatus 350 includes a variable-length code demultiplexing unit 154, a header information decoding unit 155, an MB setting unit 156, and an MB decoding unit 254.
- Since the variable-length code demultiplexing unit 154, the header information decoding unit 155, and the MB setting unit 156 have already been described, the MB decoding unit 254 will be described below.
- the MB decoding unit 254 provided in the image decoding apparatus 350 shown in FIG. 14 will be described with reference to FIG. 15. The MB decoding unit 254 sequentially decodes the encoded data #156 corresponding to each MB output from the MB setting unit 156, and generates and outputs a decoded image #254 corresponding to each MB.
- FIG. 15 is a block diagram showing the configuration of the MB decoding unit 254.
- the MB decoding unit 254 includes an MB encoded data demultiplexing unit 251, a side information decoding unit 253, a prediction unit setting unit 252, a prediction residual decoding unit 152, a TM prediction unit 105, a decoded image generation unit 108, and a frame memory 109.
- The MB encoded data demultiplexing unit 251, the side information decoding unit 253, and the prediction unit setting unit 252 are described below.
- the MB encoded data demultiplexing unit 251 demultiplexes the encoded data # 156 into side information # 251 b and encoded data # 251 a in MB units.
- Side information # 251 b is output to side information decoding section 253, and encoded data # 251 a in MB units is output to prediction unit setting section 252 and prediction residual decoding section 152.
- the side information # 251 b is information corresponding to the above-described side information # 203.
- the side information decoding unit 253 decodes the side information # 251 b to generate prediction unit structure information # 253.
- the prediction unit structure information # 253 is information corresponding to the prediction unit structure information # 201.
- the prediction unit setting unit 252 generates prediction unit information #252 indicating the position and size of each prediction unit in the MB based on the encoded data #251a in MB units and the prediction unit structure information #253, and outputs it sequentially in a predetermined order.
- According to the MB decoding unit 254 configured as described above, decoding can be performed using the optimal prediction unit indicated by the side information, that is, the prediction unit with the highest encoding efficiency, which has the effect of improving the coding efficiency.
- As described above, in the image coding apparatus 300, each of the plurality of blocks (MBs) is divided by the division means (prediction unit structure comparison unit 201) into a plurality of rectangular regions (prediction units) whose long sides are adjacent to one another, and the long side direction of the plurality of rectangular regions is switched for each block.
- Further, the image encoding device 300 comprises flag encoding means (side information encoding unit 203) for encoding, for each of the plurality of blocks (MB), a flag (prediction unit structure information #201) indicating the long side direction of the plurality of rectangular regions (prediction units).
- the MB coding unit 309 included in the image coding apparatus according to the present embodiment will be described with reference to FIG. 16.
- the image coding apparatus according to the present embodiment includes an MB coding unit 309 in place of the MB coding unit 110 in the image coding apparatus 100 described above.
- FIG. 16 is a block diagram showing the configuration of the MB coding unit 309 provided in the image coding apparatus according to the third embodiment.
- the MB coding unit 309 includes a quantization unit division unit 306, a prediction unit division unit 106, a TM prediction unit 305, a frame memory 109, a prediction residual coding unit 307, and a decoded image generation unit 308.
- MB coding section 309 receives input image # 113 in MB units and outputs coded data # 309.
- the quantization unit division unit 306 divides the input image # 113 into a plurality of quantization units.
- the size of the quantization unit is larger than the size of the prediction unit.
- information on the size of each quantization unit is output as quantization unit information # 306.
- the TM prediction unit 305 receives the quantization unit information # 306 output from the quantization unit division unit 306 and the prediction unit information # 106 output from the prediction unit division unit 106, and outputs a prediction image # 305.
- the details of the TM prediction unit 305 will be described later with reference to the drawings.
- the prediction residual coding unit 307 generates and outputs encoded data #309 and the decoding residual #307 based on the prediction image #305, quantization unit information #306, prediction unit information #106, and input image #113.
- the decoded image generation unit 308 generates the decoded image # 308 by adding the predicted image # 305 to the input decoded residual # 307, and outputs the generated decoded image # 308.
- the output decoded image # 308 is stored in the frame memory 109.
- FIG. 17 is a block diagram showing the configuration of the TM prediction unit 305.
- the TM prediction unit 305 includes a search area setting unit 301, a template setting unit 302, a template comparison unit 303, and a predicted image generation unit 304.
- the template setting unit 302 sets a template corresponding to the prediction target area based on the quantization unit information # 306 and the prediction unit information # 106, and outputs template information # 302 which is information about the template.
- the search area setting unit 301 sets a search area corresponding to the prediction target area based on the quantization unit information #306 and the prediction unit information #106, and outputs search area information #301, which is information about the search area.
- FIG. 19 is a diagram for explaining the operations of the template setting unit 302 and the search area setting unit 301. As shown in FIG. 19, the template is set outside the quantization unit including the prediction target area.
- the template is selected from the regions closest to the prediction target region among the regions existing outside the transform region including the prediction target region.
- the search area is set to an area separated from the template by the same distance as the distance between the area to be predicted and the template. Further, when the distance is large, it is preferable to widen the range of the search area accordingly.
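A sketch of this placement rule for a horizontally long prediction unit, working in row coordinates. The `base_range` scaling of the search range is an assumption; the text only suggests widening the range as the distance grows.

```python
# The template is the row nearest the prediction target among those
# outside the quantization unit; the search row sits the same distance
# again above the template.
def set_template_and_search(target_y, qunit_top, base_range=2):
    template_y = qunit_top - 1           # nearest row outside the q-unit
    dist = target_y - template_y         # target-to-template distance
    search_y = template_y - dist         # equally far above the template
    search_range = base_range * dist     # wider range for larger distance
    return template_y, search_y, search_range
```

Placing the template outside the quantization unit is what removes the dependency on pixels inside the same unit, enabling the parallelism discussed later.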
- the template comparison unit 303 derives and outputs the predicted image generation parameter #303 by executing template matching based on the template information #302, the search area information #301, and the decoded image #109. More specifically, the template comparison unit 303 finds, in the search area indicated by the search area information #301, the search candidate that most closely approximates the template indicated by the template information #302, and calculates the relative position (displacement) of the search candidate with respect to the template. This displacement is output as the predicted image generation parameter #303.
- the predicted image generation unit 304 generates and outputs the predicted image #305 corresponding to the prediction target area based on the predicted image generation parameter #303 derived by the template comparison unit 303 and the decoded image #109 recorded in the frame memory 109. Specifically, the predicted image generation unit 304 assigns to each pixel of the prediction target area the pixel of the decoded image at the position shifted by the displacement indicated by the predicted image generation parameter #303.
- the MB decoding unit 353 included in the image decoding apparatus according to the present embodiment will be described with reference to FIG. 18.
- the image decoding apparatus according to the present embodiment includes an MB decoding unit 353 instead of the MB decoding unit 153 in the image decoding apparatus 150 described above.
- the MB decoding unit 353 receives the encoded data # 156 and generates and outputs a decoded image # 254.
- FIG. 18 is a block diagram showing a configuration of MB decoding unit 353.
- the MB decoding unit 353 includes a TM prediction unit 305, a decoded image generation unit 308, a frame memory 109, a prediction unit setting unit 151, a quantization unit setting unit 351, and a prediction residual decoding unit 352.
- the quantization unit setting unit 351 sequentially outputs quantization unit information # 351 indicating the position and size of the quantization unit in the MB in a predetermined order.
- the prediction residual decoding unit 352 applies variable-length code decoding to the input encoded data #156 to generate transform coefficients. Subsequently, the decoding residual #352 is generated and output by applying an inverse DCT (the inverse transform of the DCT) of the same size as the quantization unit indicated by quantization unit information #351 to the generated transform coefficients.
- the decoded image generation unit 308 in the MB decoding unit 353 generates a decoded image # 254 by adding the predicted image # 305 to the input decoded residual # 352, and outputs the generated decoded image # 254.
- FIG. 20 is a flowchart showing the procedure by which the MB encoding unit 309 encodes the input image #113 corresponding to the processing target MB to generate encoded data #309.
- Step S41 First, the input image #113 corresponding to the processing target MB input to the MB encoding unit 309 is supplied to the quantization unit division unit 306, the prediction unit division unit 106, and the prediction residual coding unit 307.
- Step S42 Subsequently, the quantization unit division unit 306 divides the input image #113 into M quantization units of a predetermined size, and assigns to each quantization unit a quantization unit index (tuid) taking an integer value in the range of 0 to M−1.
- Step S43 Subsequently, the input image #113 is divided into N prediction units of a predetermined size, and each prediction unit is assigned a prediction unit index (puid) taking an integer value in the range of 0 to N−1.
- Step S44 the TM prediction unit 305 performs template matching on the decoded image # 109 recorded in the frame memory 109 based on the prediction unit information # 106 and the quantization unit information # 306. Further, based on the result, a predicted image # 305 corresponding to the region to be predicted is generated. The predicted image # 305 is output to the prediction residual coding unit 307 and the decoded image generation unit 308.
- Step S45 the MB encoding unit 309 determines whether or not the predicted image # 305 has been generated for all prediction units in the conversion target area.
- Step S47 The decoded image generation unit 308 that receives the decoded residual # 307 generated in step S46 generates a decoded image # 308 of the conversion target area.
- Step S48 When the conversion target area includes a prediction unit for which the predicted image #305 has not been generated (No in step S45), or when the decoded image #308 has been generated in step S47, the MB encoding unit 309 determines whether all prediction units in the processing target MB have been decoded. If all prediction units in the processing target MB have been decoded, the encoding process of the processing target MB ends (Yes in step S48); if the processing target MB contains prediction units that have not been decoded (No in step S48), the process returns to step S43.
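The encoding loop of steps S41 to S48 can be sketched as follows, showing why the prediction units inside one quantization unit can be predicted in parallel. `predict` and `code_residual` are hypothetical stand-ins for the TM prediction unit 305 and the prediction residual coding unit 307.

```python
# All prediction units inside one quantization unit are predicted before
# that unit is transformed and quantized, so those predictions have no
# mutual dependency and may run in parallel.
def encode_mb(quant_units, predict, code_residual):
    coded = []
    for qunit in quant_units:       # each qunit: list of prediction units
        preds = [predict(pu) for pu in qunit]      # parallelizable (step S44)
        coded.append(code_residual(qunit, preds))  # steps S46-S47
    return coded
```

The list comprehension over `qunit` is the parallelizable step: no prediction in it reads a decoded pixel produced by another prediction in the same quantization unit.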
- the MB encoding unit 309 can generate and output encoded data # 309 corresponding to the same MB from the input image # 113 corresponding to the processing target MB.
- the size of the quantization unit is the same as or larger than the size of the prediction unit. That is, the number M of quantization units is equal to or less than the number N of prediction units, and each quantization unit includes one or more prediction units.
- In the MB encoding unit 309, frequency transform and quantization can be performed on each quantization unit as a whole, so the correlation in the short-side direction of the prediction units is removed and the coding efficiency is improved.
- Frequency transform and quantization can also be performed for each quantization unit containing a plurality of prediction units. In that case, the predicted images #305 for the plurality of prediction units contained in the same quantization unit can be generated in parallel, which increases processing speed. Such parallel processing also has the effect of reducing the processing load.
- The quantization unit is made up of two or more rectangular regions (prediction units), and the predicted image generation unit (TM prediction unit 305) takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- the quantization unit is composed of two or more rectangular regions, that is, two or more prediction units. Furthermore, the prediction image of each prediction unit included in the quantization unit can be generated without referring to the decoded image on the same quantization unit. That is, according to the above configuration, a plurality of prediction units included in each quantization unit can be processed in parallel. Therefore, according to the above configuration, the processing time of the encoding process can be reduced.
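The shifted-template search just described can be sketched in code. The following is an illustrative reconstruction, not the patent's reference implementation: the function names, the use of SSD as the dissimilarity measure, and the representation of each prediction unit as a 1-pixel-high row are all assumptions. The parameter `gap` is the distance between the target row and its template; the candidate rows are taken the same distance above the template, so every row of a quantization unit can be predicted from already decoded rows only (with `gap = 1` this reduces to the adjacent-template case of a single-region quantization unit).

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length pixel rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_row(decoded, y, x0, width, gap, max_shift):
    """Predict the 1-pixel-high prediction unit at row y, columns [x0, x0+width).

    The template is the decoded row `gap` pixels above the target row; the
    candidates are windows of the row another `gap` pixels above that,
    shifted horizontally by -max_shift..max_shift (a one-dimensional scan).
    Only rows above the quantization unit are read, so all rows of one
    quantization unit can be predicted in parallel.  Candidate windows are
    assumed to stay inside the image.
    """
    template = decoded[y - gap][x0:x0 + width]
    best_shift, best_cost = 0, None
    for s in range(-max_shift, max_shift + 1):
        cand = decoded[y - 2 * gap][x0 + s:x0 + s + width]
        cost = ssd(template, cand)
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = s, cost
    # The candidate-to-template shift is extrapolated: the prediction is the
    # template row sampled at the best-matching horizontal displacement.
    return decoded[y - gap][x0 + best_shift:x0 + best_shift + width]
```

On an image with a diagonal edge that moves one pixel per row, the search detects the edge motion between the candidate row and the template and extrapolates it to the target row.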
- It is preferable that the quantization unit includes two or more rectangular regions whose long sides face each other, that the quantization means (prediction residual encoding unit 107) performs a frequency transform in the quantization unit, that the inverse quantization means (prediction residual encoding unit 107) performs, in the quantization unit, an inverse frequency transform that is the inverse of the frequency transform, and that the predicted image generation unit (TM prediction unit 305) takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- the quantization unit includes two or more rectangular regions whose long sides face each other, that is, two or more prediction units whose long sides face each other. Furthermore, the prediction image of each prediction unit included in the quantization unit can be generated without referring to the decoded image on the same quantization unit. As a result, since frequency conversion can be applied in quantization units, correlation in the short side direction of prediction units is removed, and coding efficiency is further improved.
- The TM prediction unit 105 of the first embodiment may set, as the prediction value of each pixel of the prediction unit, the DC value of the region one pixel above the prediction unit and of the same shape as the prediction unit (hereinafter referred to as "flat prediction").
- The SSD (sum of squared differences) is used as the measure of dissimilarity.
- Step SA4: when flat prediction is selected, the value of each pixel in the region to be predicted is set to the DC value of the template.
- The above flat prediction makes it possible to reduce the amount of encoding processing while maintaining high coding efficiency.
- The above flat prediction is effective in MBs where edge portions and flat portions are mixed, in particular for encoding the flat portions of such MBs.
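Flat prediction itself is a one-line computation; the sketch below is an illustration only (the function name and the list-based pixel representation are assumptions): every pixel of the prediction unit is set to the DC value, i.e. the mean, of the decoded pixels on the template.

```python
def flat_prediction(template, unit_size):
    """Flat prediction: fill the prediction unit with the DC value (mean)
    of the decoded pixels on the template."""
    dc = sum(template) / len(template)
    return [dc] * unit_size
```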
- The image coding apparatus according to the present invention may divide the processing target MB into a plurality of subblocks and select, for each subblock, whether to use the prediction based on template matching described above or directional prediction such as intra prediction in H.264/AVC.
- The image coding apparatus may also apply a lossless transform of a predetermined length in the short-side direction of the prediction unit before encoding the quantized transform coefficients.
- In that case, the image coding apparatus may perform encoding according to the following procedure.
- (Step SA21) First, derive the quantized transform coefficients of all prediction units. (Step SA22) Next, apply a lossless transform to each group of transform coefficients made up of the quantized transform coefficients corresponding to the same frequency component in each prediction unit within the processing target MB.
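Step SA22 can be illustrated as follows: the quantized coefficients of the prediction units in an MB are regrouped so that coefficients of the same frequency component sit in one group (a transpose), and a reversible transform is applied per group. The names are hypothetical, and DPCM is only one possible choice of lossless transform; the patent does not fix a particular one.

```python
def group_by_frequency(coeffs_per_unit):
    """coeffs_per_unit[u][k] is the k-th frequency coefficient of prediction
    unit u.  Returns one group per frequency component, each holding that
    component from every unit in the MB (i.e. the transpose)."""
    return [list(group) for group in zip(*coeffs_per_unit)]

def dpcm(group):
    """One possible reversible (lossless) transform over a coefficient
    group: keep the first value, then store successive differences."""
    return [group[0]] + [b - a for a, b in zip(group, group[1:])]
```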
- the image decoding apparatus may switch the prediction unit and the prediction method based on the characteristics of the decoded image on the template.
- The image decoding apparatus determines whether an edge is present on the template, using an index such as the variance of the pixel values of the decoded image on the template.
- If an edge is present, a prediction unit of 4×1 pixels may be selected and the DCT performed at that size; if no edge is present, a prediction unit of 16×1 pixels may be selected and the DCT performed at that size.
- In flat regions, coding efficiency can be improved by performing the DCT over a wide range, so adopting the configuration described above makes it possible to select an appropriate prediction unit without increasing side information.
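The variance-based switch can be sketched as below. The threshold value and the function names are assumptions; the text only says that an index such as the pixel-value variance of the decoded template decides between narrow 4×1 units (edge present) and wide 16×1 units (flat region).

```python
def variance(pixels):
    """Population variance of the pixel values of the decoded template."""
    m = sum(pixels) / len(pixels)
    return sum((p - m) ** 2 for p in pixels) / len(pixels)

def choose_unit_width(template_pixels, threshold=100.0):
    """High variance suggests an edge: choose 4x1 prediction units (and a
    4-point DCT); low variance suggests a flat region: choose 16x1 units.
    The threshold is an assumed tuning parameter."""
    return 4 if variance(template_pixels) > threshold else 16
```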
- the image decoding apparatus may use the above-mentioned flat prediction when the variance of the pixel values of each pixel of the decoded image on the template is small.
- A flat portion has a small variance of pixel values. Moreover, for flat portions, the amount of encoding processing can be reduced by using the flat prediction described above.
- An image coding apparatus comprising: quantization means for quantizing a prediction residual, obtained by subtracting a predicted image from the target image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each successive quantization unit made up of one or more rectangular regions; inverse quantization means for generating the decoded image on the target block for each quantization unit; and predicted image generation means for generating the predicted image on the target block for each rectangular region, the predicted image generation means taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- The quantization unit is made up of a single rectangular region, and the predicted image generation means takes as a template the rectangular region on the decoded image adjacent to the long side of the target rectangular region, and searches, among the regions on the decoded image obtained by translating the rectangular region adjacent to the long side of the template in the long-side direction, for the region having the highest correlation with the template.
- the image coding apparatus according to claim 1.
- The quantization unit is made up of two or more rectangular regions, and the predicted image generation unit takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- the image coding apparatus according to claim 1.
- the above quantization means quantizes in parallel each prediction residual obtained from the two or more rectangular regions.
- Each of the plurality of rectangular regions is a rectangular region having a width in the short side direction of one pixel.
- The image coding apparatus according to any one of items 1 to 5 above, comprising dividing means for dividing each of the plurality of blocks into a plurality of rectangular regions whose long sides are adjacent to each other, the dividing means switching the long-side direction of the plurality of rectangular regions for each block.
- the division means switches the long side direction of the plurality of rectangular regions for each block in accordance with the coding efficiency.
- the apparatus further comprises: flag encoding means for encoding a flag indicating the long side direction of the plurality of rectangular regions for each of the plurality of blocks.
- The predicted image is generated for each rectangular region, and the predicted image on each rectangular region is generated by taking as a template the rectangular region on the decoded image facing the long side of that rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing that rectangular region, for the region having the highest correlation with the template; the data structure of the encoded data is characterized by this.
- An image decoding apparatus comprising: inverse quantization means for generating the decoded image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each successive quantization unit made up of one or more rectangular regions, the inverse quantization means generating the decoded image on a target quantization unit by adding the prediction residual obtained by inversely quantizing the quantization values to the predicted image; and predicted image generation means for generating the predicted image for each rectangular region, the predicted image generation means taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- An image coding method for encoding, for each block, a target image divided into a plurality of blocks, comprising: a quantization step of quantizing, for each successive quantization unit made up of one or more rectangular regions, the target image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, the quantization step quantizing a prediction residual obtained by subtracting a predicted image from the target image on the target quantization unit; an inverse quantization step of generating the decoded image on the target block for each quantization unit, the inverse quantization step generating the decoded image on the target quantization unit by adding the prediction residual, obtained by inversely quantizing the quantization values generated in the quantization step, to the predicted image; and a predicted image generation step of generating the predicted image for each rectangular region, the predicted image generation step taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template, thereby generating the predicted image on the target rectangular region.
- An image decoding method comprising: an inverse quantization step of generating the decoded image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each successive quantization unit made up of one or more rectangular regions; and a predicted image generation step of generating the predicted image on the target rectangular region; the image decoding method being characterized by the foregoing.
- An image coding apparatus according to the present invention is an image coding apparatus for encoding, for each block, a target image divided into a plurality of blocks, comprising: quantization means for quantizing, for each successive quantization unit made up of one or more rectangular regions, a prediction residual obtained by subtracting a predicted image from the target image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other; inverse quantization means for generating the decoded image on the target block for each quantization unit, the inverse quantization means generating the decoded image on the target quantization unit by adding the prediction residual, obtained by inversely quantizing the quantization values generated by the quantization means, to the predicted image; and predicted image generation means for generating the predicted image on the target block for each rectangular region, the predicted image generation means taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- According to the above configuration, the predicted image on the target rectangular region can be generated by taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- According to the above image coding apparatus, the region having the highest correlation with the template is searched by scanning one-dimensionally the regions on the decoded image obtained by translating the rectangular region in the long-side direction. Therefore, the amount of computation during the search can be reduced compared with two-dimensional scanning as in the technique described in Patent Document 1, so the predicted image can be generated at high speed.
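The computational advantage is easy to quantify: over a search range of ±max_shift, the one-dimensional scan along the long side visits a linear number of candidates, while a two-dimensional scan over the same range visits a quadratic number. A toy illustration (the function names are mine, not from the patent):

```python
def candidates_1d(max_shift):
    # One-dimensional scan along the long-side direction.
    return 2 * max_shift + 1

def candidates_2d(max_shift):
    # Two-dimensional scan over the same range, as in the technique of
    # Patent Document 1.
    return (2 * max_shift + 1) ** 2
```

For max_shift = 8 this is 17 versus 289 candidate positions.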
- Since the above image coding apparatus performs the search for each rectangular region, it can accurately generate a predicted image even when the target image contains an edge whose curvature changes. That is, the coding efficiency is high even when the target image contains an edge whose curvature changes.
- It is preferable that the quantization unit is made up of a single rectangular region, and that the predicted image generation means takes as a template the rectangular region on the decoded image adjacent to the long side of the target rectangular region, and searches, among the regions on the decoded image obtained by translating the rectangular region adjacent to the long side of the template in the long-side direction, for the region having the highest correlation with the template.
- According to the above configuration, the rectangular region on the decoded image adjacent to the long side of the target rectangular region is used as the template, and the region having the highest correlation with the template can be searched among the regions on the decoded image obtained by translating the rectangular region adjacent to the long side of the template in the long-side direction. A predicted image can therefore be generated by detecting the displacement between the position of an edge in the region to be predicted and the position of that edge in the adjacent region. That is, even when the edge is a curve, a predicted image can be generated by detecting the curvature of the curve, which further improves the coding efficiency.
- It is preferable that the quantization unit is made up of two or more rectangular regions, and that the predicted image generation means takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- the quantization unit is composed of two or more rectangular regions, that is, two or more prediction units. Furthermore, the prediction image of each prediction unit included in the quantization unit can be generated without referring to the decoded image on the same quantization unit. That is, according to the above configuration, a plurality of prediction units included in each quantization unit can be processed in parallel. Therefore, according to the above configuration, it is possible to further reduce the processing time of the encoding process.
- It is preferable that the quantization unit includes two or more rectangular regions whose long sides face each other, that the quantization means performs a frequency transform in the quantization unit, that the inverse quantization means performs, in the quantization unit, an inverse frequency transform that is the inverse of the frequency transform, and that the predicted image generation means takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- the quantization unit includes two or more rectangular regions whose long sides face each other, that is, two or more prediction units whose long sides face each other. Furthermore, the prediction image of each prediction unit included in the quantization unit can be generated without referring to the decoded image on the same quantization unit. As a result, since frequency conversion can be applied on a quantization unit basis, the correlation in the short side direction of the prediction unit is removed, and the encoding efficiency is further improved.
- each of the plurality of rectangular regions is a rectangular region having a width in the short side direction of one pixel.
- It is preferable that the image coding apparatus includes dividing means for dividing each of the plurality of blocks into a plurality of rectangular regions whose long sides are adjacent to each other, the dividing means switching the long-side direction of the plurality of rectangular regions for each block.
- According to the above configuration, since dividing means for switching the long-side direction of the plurality of rectangular regions is provided for each block, a predicted image can be generated using rectangular regions whose long-side direction is optimal for the local characteristics of the image to be encoded, so the coding efficiency can be further improved.
- It is preferable that the image coding apparatus further comprises flag encoding means for encoding, for each of the plurality of blocks, a flag indicating the long-side direction of the plurality of rectangular regions.
- It is preferable that the image decoding apparatus correspondingly comprises flag decoding means for decoding, for each of the plurality of blocks, the flag indicating the long-side direction of the plurality of rectangular regions.
- The data structure of encoded data according to the present invention is a data structure of encoded data obtained by encoding, for each block, a target image divided into a plurality of blocks, the data structure including encoded data generated by quantizing, for each successive quantization unit made up of one or more rectangular regions, a prediction residual obtained by subtracting a predicted image from the target image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, wherein the predicted image on the target block is generated for each rectangular region, and the predicted image on each rectangular region is generated by taking as a template the rectangular region on the decoded image facing the long side of that rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing that rectangular region, for the region having the highest correlation with the template.
- the decoding device can perform the decoding process based on the predicted image of each rectangular area and the quantized prediction residual of each quantization unit. Therefore, according to the above configuration, it is possible to realize the data structure of encoded data with high decoding efficiency.
- An image decoding apparatus according to the present invention is an image decoding apparatus for generating, for each block, a decoded image divided into a plurality of blocks, comprising: inverse quantization means for generating the decoded image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each successive quantization unit made up of one or more rectangular regions, the inverse quantization means generating the decoded image on a target quantization unit by adding the prediction residual obtained by inversely quantizing the quantization values to the predicted image; and predicted image generation means for generating the predicted image for each rectangular region, the predicted image generation means generating the predicted image on the target rectangular region by taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- According to the above configuration, the predicted image on the target rectangular region can be generated by taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- In the above image decoding apparatus, the region having the highest correlation with the template is searched by scanning one-dimensionally the regions on the decoded image obtained by translating the rectangular region in the long-side direction. Therefore, the amount of computation during the search can be reduced compared with two-dimensional scanning as in the technique described in Patent Document 1, so the predicted image can be generated at high speed.
- Since the image decoding apparatus described above performs the search for each rectangular region, it can accurately generate a predicted image even when the target image contains an edge whose curvature changes.
- the present invention can be suitably applied to an image coding apparatus for coding an image and an image decoding apparatus for decoding coded image data.
- Image coding apparatus, 105 TM prediction unit (predicted image generation means), 106 Prediction unit division unit, 107 Prediction residual encoding unit (quantization means, inverse quantization means), 108 Decoded image generation unit, 109 Frame memory, 110 MB encoding unit, 150 Image decoding apparatus, 152 Prediction residual decoding unit (inverse quantization means), 203 Side information encoding unit (flag encoding means)
Abstract
Description
The image coding apparatus 100 and the image decoding apparatus 150, which form the first embodiment of the image coding apparatus and the image decoding apparatus according to the present invention, are described below with reference to FIGS. 1 to 10. In the description of the drawings, elements having the same function are given the same reference numerals and their description is omitted.
First, the TM prediction unit 105, a component common to the image coding apparatus 100 and the image decoding apparatus 150, is described with reference to FIGS. 2 and 3.
The template setting unit 102 sets a template corresponding to the region to be predicted based on the input prediction unit information #106, and outputs template information #102, which is information about that template.
The search region setting unit 101 sets a search region corresponding to the region to be predicted based on the input prediction unit information #106 and the template information #102, and outputs search region information #101, which is information about that search region.
The template comparison unit 103 derives and outputs prediction image generation parameters #103 by performing template matching based on the template information #102, the search region information #101, and the decoded image #109 recorded in the frame memory 109 described later. The prediction image generation parameters #103 are information representing the position of a region that approximates the region to be predicted. For example, the position (relative to the template) of the partial region within the search region that most accurately approximates the decoded image on the template can be used as the prediction image generation parameters #103. In that case, the prediction image generation parameters #103 can be derived by the following steps S1 to S3.
The above description showed one example of how the template comparison unit 103 sets search candidates, but search candidates may be set in other ways. For example, reducing the number of search candidates lowers the accuracy but allows the prediction image generation parameters #103 to be derived with less processing. Conversely, the positions of the search candidates within the search region may be set in units smaller than one pixel, for example in units of 0.5 or 0.25 pixels. In this case, interpolated values obtained by applying an interpolation filter to the pixel values of the decoded image at integer positions are used as the pixel values of the decoded image at the search candidates. The positions of the search candidates can then be adjusted more finely, and template matching can be performed on a larger number of search candidates. Although the amount of processing increases, this raises the probability of finding a search candidate that more accurately approximates the decoded image on the template.
The predicted image generation unit 104 generates and outputs the predicted image #105 corresponding to the region to be predicted, based on the prediction image generation parameters #103 derived by the template comparison unit 103 and the decoded image #109 recorded in the frame memory 109.
Here, Ir(x, y) represents the pixel value of the pixel (x, y) of the decoded image. When x or y is fractional, an interpolated pixel value generated by applying an interpolation filter to the pixel values of neighboring pixels of the decoded image is used.
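When x or y is fractional, Ir(x, y) is obtained by interpolation. The text only requires some interpolation filter; the sketch below uses bilinear interpolation of the four neighbouring integer-position pixels as an assumed, minimal choice, with hypothetical names.

```python
import math

def interp_pixel(img, x, y):
    """Decoded pixel value at a possibly fractional position (x, y),
    by bilinear interpolation; img[y][x] holds integer-position values.
    Assumes (x, y) is at least one pixel away from the right/bottom edge."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x0 + 1]
    bottom = (1 - fx) * img[y0 + 1][x0] + fx * img[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom
```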
Next, the MB encoding unit 110, which includes the TM prediction unit 105 as a component, is described with reference to FIGS. 1 and 4 to 6. The MB encoding unit 110 encodes the input image corresponding to each MB to generate the encoded data corresponding to that MB, and is used in the image coding apparatus 100 as described later.
The prediction unit division unit 106 divides the processing target macroblock into predetermined units (hereinafter referred to as "prediction units"), and outputs prediction unit information #106, which is information about each prediction unit. The prediction unit information #106 includes information about the position and size of each prediction unit.
The prediction residual encoding unit 107 generates and outputs the encoded data #110 and the decoded residual #107 based on the predicted image #105 corresponding to each input prediction unit, the prediction unit information #106, and the input image #113. The encoded data #110 and the decoded residual #107 are generated by the following steps S11 to S15.
The decoded image generation unit 108 generates and outputs the decoded image #108 by adding the predicted image #105 to the input decoded residual #107.
The input decoded image #108 is recorded in the frame memory 109. At the time a particular MB is encoded, the decoded images corresponding to all MBs preceding that MB in raster scan order are recorded in the frame memory 109.
The encoding process in the MB encoding unit 110 described above is explained below with reference to FIG. 5. FIG. 5 is a flowchart showing the procedure by which the MB encoding unit 110 encodes the input image #113 corresponding to the processing target MB to generate the encoded data #110.
When the MB encoding unit 110 encodes the input image #113 in MB units, the predicted image #105 can be generated with high prediction accuracy for regions containing curved edges or straight edges in various directions. This effect is described in detail below with reference to FIG. 6.
In the above description, the prediction unit division unit 106 was described as outputting the prediction unit information #106 sequentially in ascending order of the prediction unit index, but the present invention is not limited to this. That is, the output order of the prediction unit information #106 in the prediction unit division unit 106 need not be ascending order of the prediction unit index.
Next, the image coding apparatus 100, which includes the MB encoding unit 110 as a component, is described with reference to FIG. 7. FIG. 7 is a block diagram showing the configuration of the image coding apparatus 100 according to the present invention.
Next, the MB decoding unit 153, which receives the MB-unit encoded data #110 generated by the MB encoding unit 110 and outputs the MB-unit decoded image #190, is described with reference to FIGS. 8 and 9.
The procedure by which the MB decoding unit 153 described above decodes the encoded data #110 corresponding to a particular MB to generate the decoded image #190 is explained with reference to FIG. 9. FIG. 9 is a flowchart showing the procedure of the decoding process in the MB decoding unit 153.
Next, the image decoding apparatus 150, which includes the MB decoding unit 153 described above as a component, is described with reference to FIG. 10. The image decoding apparatus 150 receives the encoded data #180 generated by the image coding apparatus 100 described above, and generates and outputs the decoded image #190.
As described above, the image coding apparatus 100 according to the present invention is an image coding apparatus that encodes, for each block (MB), a target image divided into a plurality of blocks, and comprises: quantization means (prediction residual encoding unit 107) for quantizing, for each successive quantization unit made up of one or more rectangular regions (prediction units), a prediction residual obtained by subtracting a predicted image from the target image on a target block (target MB) divided into a plurality of rectangular regions (prediction units) whose long sides are adjacent to each other; inverse quantization means (prediction residual encoding unit 107) for generating the decoded image on the target block (target MB) for each quantization unit, the inverse quantization means generating the decoded image on the target quantization unit by adding the prediction residual, obtained by inversely quantizing the quantization values generated by the quantization means, to the predicted image; and predicted image generation means (TM prediction unit 105) for generating the predicted image for each rectangular region (prediction unit), the predicted image generation means generating the predicted image on the target rectangular region by taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
In the following, the image coding apparatus 300 and the image decoding apparatus 350, which form the second embodiment of the image coding apparatus and the image decoding apparatus according to the present invention, are described with reference to FIGS. 11 to 15. Parts identical to configurations already described are given the same reference numerals and their description is omitted.
The image coding apparatus 300 according to the present embodiment is described below with reference to FIGS. 11 to 13.
The MB encoding unit 205 included in the image coding apparatus 300 shown in FIG. 11 is described with reference to FIG. 12. The MB encoding unit 205 generates and outputs the encoded data #205 based on the input image #113 corresponding to the processing target MB, which is output from the MB setting unit 113.
The prediction unit structure comparison unit 201 analyzes the input image #113 corresponding to the processing target MB and selects, from among the prediction units included in a predetermined prediction unit set, a prediction unit suited to that MB. The prediction unit structure comparison unit 201 also outputs prediction unit structure information #201, which is information indicating the structure of the selected prediction unit.
The prediction unit division unit 202 divides the input image #113 corresponding to the processing target MB into the predetermined prediction units defined by the prediction unit structure information #201. The prediction unit division unit also outputs prediction unit information #106, which is information about each prediction unit. As described above, the prediction unit information #106 includes information about the position and size of each prediction unit.
The side information encoding unit 203 generates side information #203 based on the prediction unit structure information #201. The generated side information #203 is output to the MB encoded data multiplexing unit 204.
The MB encoded data multiplexing unit 204 generates and outputs the encoded data #205 based on the encoded data #110 output from the prediction residual encoding unit 107 and the side information #203 output from the side information encoding unit 203.
Using the MB encoding unit 205 described above improves the coding efficiency when encoding the input image #113 in MB units.
Next, the image decoding apparatus 350 according to the present embodiment is described with reference to FIGS. 14 and 15. The image decoding apparatus 350 receives the encoded data #181, and generates and outputs the decoded image #254.
The MB decoding unit 254 included in the image decoding apparatus 350 shown in FIG. 14 is described with reference to FIG. 15. The MB decoding unit 254 sequentially decodes the encoded data #156 corresponding to individual MBs, output from the MB setting unit 156, thereby generating and outputting the decoded image #254 corresponding to each MB.
The MB encoded data demultiplexing unit 251 demultiplexes the encoded data #156 into the side information #251b and the MB-unit encoded data #251a. The side information #251b is output to the side information decoding unit 253, and the MB-unit encoded data #251a is output to the prediction unit setting unit 252 and the prediction residual decoding unit 152. The side information #251b corresponds to the side information #203 described above.
The side information decoding unit 253 decodes the side information #251b and generates the prediction unit structure information #253. The prediction unit structure information #253 corresponds to the prediction unit structure information #201.
The prediction unit setting unit 252 generates prediction unit information #252, which indicates the position and size of the prediction units within the MB, based on the MB-unit encoded data #251a and the prediction unit structure information #253, and outputs it sequentially in a predetermined order.
By using the MB decoding unit 254 configured as described above, the decoding process can be performed using the optimal prediction unit contained in the side information, that is, the prediction unit with the highest coding efficiency, which has the effect of improving decoding efficiency.
In the following, the third embodiment of the image coding apparatus and the image decoding apparatus according to the present invention is described with reference to FIGS. 16 to 20. Parts identical to configurations already described are given the same reference numerals and their description is omitted.
The MB encoding unit 309 included in the image coding apparatus according to the present embodiment is described with reference to FIG. 16. The image coding apparatus according to the present embodiment includes the MB encoding unit 309 in place of the MB encoding unit 110 of the image coding apparatus 100 described above.
Next, the TM prediction unit 305 included in the MB encoding unit 309 shown in FIG. 16 is described in more detail with reference to FIGS. 17 and 19.
The MB decoding unit 353 included in the image decoding apparatus according to the present embodiment is described with reference to FIG. 18. The image decoding apparatus according to the present embodiment includes the MB decoding unit 353 in place of the MB decoding unit 153 of the image decoding apparatus 150 described above. The MB decoding unit 353 receives the encoded data #156, and generates and outputs the decoded image #254.
The encoding process in the MB encoding unit 309 described above is explained below with reference to FIG. 20. FIG. 20 is a flowchart showing the procedure by which the MB encoding unit 309 encodes the input image #113 corresponding to the processing target MB to generate the encoded data #110.
Using the MB encoding unit 309 described above improves the efficiency of encoding the input image #113 in MB units. Using the MB encoding unit 309 also reduces the processing time of the encoding process.
As described above, in the present embodiment, the quantization unit is made up of two or more rectangular regions (prediction units), and the predicted image generation means (TM prediction unit 305) takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
<Supplementary Note 1>
The present invention is not limited to the embodiments described above.
(Step SA1) First, compute the difference ΔDC between the DC value of the decoded image on the template and the DC value of the decoded image on the search candidate;
(Step SA2) using ΔDC, compute the evaluation index Ev by Ev = ΔDC × ΔDC × (number of pixels in the template);
(Step SA3) if the evaluation index Ev is smaller than all of the dissimilarities (SSDs) computed in the template comparison unit 103, select flat prediction.
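Steps SA1 to SA3 can be sketched as follows. This is an illustrative reading: the text does not fully specify which search candidate ΔDC is computed against, so the candidate is taken as a parameter, and all names are hypothetical.

```python
def dc(pixels):
    """DC value (mean) of a pixel region."""
    return sum(pixels) / len(pixels)

def flat_prediction_selected(template, candidate, all_ssds):
    """SA1: dDC between the template DC and the search-candidate DC.
    SA2: evaluation index Ev = dDC * dDC * (number of template pixels).
    SA3: select flat prediction if Ev is smaller than every dissimilarity
    (SSD) computed by the template comparison unit 103."""
    d_dc = dc(template) - dc(candidate)
    ev = d_dc * d_dc * len(template)
    return all(ev < s for s in all_ssds)
```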
The image coding apparatus according to the present invention may also divide the processing target MB into a plurality of subblocks and select, for each subblock, whether to use the prediction based on template matching described above or directional prediction such as intra prediction in H.264/AVC.
The image coding apparatus according to the present invention may also apply a lossless transform of a predetermined length in the short-side direction of the prediction unit before encoding the quantized transform coefficients.
(Step SA22) Next, a lossless transform is applied to each group of transform coefficients made up of the quantized transform coefficients corresponding to the same frequency component in each prediction unit within the processing target MB.
The image decoding apparatus according to the present invention may also switch the prediction unit and the prediction method based on the characteristics of the decoded image on the template.
For example, the present invention can also be expressed as follows.
105 TM prediction unit (predicted image generation means)
106 Prediction unit division unit
107 Prediction residual encoding unit (quantization means, inverse quantization means)
108 Decoded image generation unit
109 Frame memory
110 MB encoding unit
150 Image decoding apparatus
152 Prediction residual decoding unit (inverse quantization means)
203 Side information encoding unit (flag encoding means)
Claims (9)
- An image coding apparatus for encoding, for each block, a target image divided into a plurality of blocks, comprising:
quantization means for quantizing a prediction residual, obtained by subtracting a predicted image from the target image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each quantization unit made up of one or more rectangular regions selected from the plurality of rectangular regions;
inverse quantization means for generating the decoded image on the target block for each quantization unit, the inverse quantization means generating the decoded image on a target quantization unit by adding the prediction residual, obtained by inversely quantizing the quantization values generated by the quantization means, to the predicted image; and
predicted image generation means for generating the predicted image on the target block for each rectangular region, the predicted image generation means generating the predicted image on a target rectangular region by taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
- The image coding apparatus according to claim 1, wherein the quantization unit is made up of a single rectangular region, and
the predicted image generation means takes as a template the rectangular region on the decoded image adjacent to the long side of the target rectangular region, and searches, among the regions on the decoded image obtained by translating the rectangular region adjacent to the long side of the template in the long-side direction, for the region having the highest correlation with the template.
- The image coding apparatus according to claim 1, wherein the quantization unit is made up of two or more rectangular regions, and
the predicted image generation means takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- The image coding apparatus according to claim 1, wherein the quantization unit includes two or more rectangular regions whose long sides face each other,
the quantization means performs a frequency transform in the quantization unit,
the inverse quantization means performs, in the quantization unit, an inverse frequency transform that is the inverse of the frequency transform, and
the predicted image generation means takes as a template, among the rectangular regions on the decoded image facing the long side of the target rectangular region, the rectangular region closest to the target rectangular region, and searches, among the regions on the decoded image obtained by translating in the long-side direction a rectangular region that faces the long side of the template opposite to the side facing the target rectangular region and whose distance from the template is equal to the distance between the target rectangular region and the template, for the region having the highest correlation with the template.
- The image coding apparatus according to any one of claims 1 to 4, wherein each of the plurality of rectangular regions is a rectangular region whose width in the short-side direction is one pixel.
- The image coding apparatus according to any one of claims 1 to 5, further comprising dividing means for dividing each of the plurality of blocks into a plurality of rectangular regions whose long sides are adjacent to each other, the dividing means switching the long-side direction of the plurality of rectangular regions for each block.
- The image coding apparatus according to any one of claims 1 to 6, further comprising flag encoding means for encoding, for each of the plurality of blocks, a flag indicating the long-side direction of the plurality of rectangular regions.
- A data structure of encoded data obtained by encoding, for each block, a target image divided into a plurality of blocks,
the data structure including encoded data generated by quantizing a prediction residual, obtained by subtracting a predicted image from the target image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each quantization unit made up of one or more rectangular regions selected from the plurality of rectangular regions,
wherein the predicted image on the target block is generated for each rectangular region, and the predicted image on each rectangular region is generated by taking as a template the rectangular region on the decoded image facing the long side of that rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing that rectangular region, for the region having the highest correlation with the template.
- An image decoding apparatus for generating, for each block, a decoded image divided into a plurality of blocks, comprising:
inverse quantization means for generating the decoded image on a target block divided into a plurality of rectangular regions whose long sides are adjacent to each other, for each quantization unit made up of one or more rectangular regions selected from the plurality of rectangular regions, the inverse quantization means generating the decoded image on a target quantization unit by adding the prediction residual, obtained by inversely quantizing the quantization values, to the predicted image; and
predicted image generation means for generating the predicted image for each rectangular region, the predicted image generation means generating the predicted image on the target rectangular region by taking as a template the rectangular region on the decoded image facing the long side of the target rectangular region, and searching, among the regions on the decoded image obtained by translating in the long-side direction the rectangular region facing the long side of the template opposite to the side facing the target rectangular region, for the region having the highest correlation with the template.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP10824755.2A EP2493196A4 (en) | 2009-10-20 | 2010-09-17 | IMAGE ENCODING APPARATUS, IMAGE DECODING APPARATUS, AND CODED DATA STRUCTURE |
CN201080046955.0A CN102577391A (zh) | 2009-10-20 | 2010-09-17 | 图像编码装置、图像解码装置、以及编码数据的数据结构 |
US13/502,703 US20120219232A1 (en) | 2009-10-20 | 2010-09-17 | Image encoding apparatus, image decoding apparatus, and data structure of encoded data |
JP2011537184A JPWO2011048904A1 (ja) | 2009-10-20 | 2010-09-17 | 画像符号化装置、画像復号装置、および、符号化データのデータ構造 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-241339 | 2009-10-20 | ||
JP2009241339 | 2009-10-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011048904A1 true WO2011048904A1 (ja) | 2011-04-28 |
Family
ID=43900144
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/066248 WO2011048904A1 (ja) | 2009-10-20 | 2010-09-17 | 画像符号化装置、画像復号装置、および、符号化データのデータ構造 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20120219232A1 (ja) |
EP (1) | EP2493196A4 (ja) |
JP (1) | JPWO2011048904A1 (ja) |
CN (1) | CN102577391A (ja) |
WO (1) | WO2011048904A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018512810A (ja) * | 2015-03-27 | 2018-05-17 | Qualcomm Incorporated | Derivation of sub-block motion information in video coding |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201501511A (zh) * | 2013-06-25 | 2015-01-01 | Hon Hai Prec Ind Co Ltd | Intra prediction method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007043651A (ja) * | 2005-07-05 | 2007-02-15 | Ntt Docomo Inc | Moving picture encoding device, moving picture encoding method, moving picture encoding program, moving picture decoding device, moving picture decoding method, and moving picture decoding program |
JP2007300380A (ja) | 2006-04-28 | 2007-11-15 | Ntt Docomo Inc | Image predictive encoding device, image predictive encoding method, image predictive encoding program, image predictive decoding device, image predictive decoding method, and image predictive decoding program |
WO2008102805A1 (ja) * | 2007-02-23 | 2008-08-28 | Nippon Telegraph And Telephone Corporation | Video encoding and decoding method, apparatuses therefor, programs therefor, and recording media storing the programs |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5446271A (en) * | 1993-08-06 | 1995-08-29 | Spectra-Physics Scanning Systems, Inc. | Omnidirectional scanning method and apparatus |
US5761686A (en) * | 1996-06-27 | 1998-06-02 | Xerox Corporation | Embedding encoded information in an iconic version of a text image |
CN1156169C (zh) * | 1997-05-07 | 2004-06-30 | Siemens AG | Method and apparatus for encoding a digitized image |
JP2897761B2 (ja) * | 1997-07-18 | 1999-05-31 | Nec Corp | Block matching arithmetic unit and machine-readable recording medium storing a program |
US6654419B1 (en) * | 2000-04-28 | 2003-11-25 | Sun Microsystems, Inc. | Block-based, adaptive, lossless video coder |
US6711211B1 (en) * | 2000-05-08 | 2004-03-23 | Nokia Mobile Phones Ltd. | Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder |
EP1404136B1 (en) * | 2001-06-29 | 2018-04-04 | NTT DoCoMo, Inc. | Image encoder, image decoder, image encoding method, and image decoding method |
RU2008146977A (ru) | 2006-04-28 | 2010-06-10 | NTT DoCoMo, Inc. (JP) | Image predictive encoding device, image predictive encoding method, image predictive encoding program, image predictive decoding device, image predictive decoding method, and image predictive decoding program |
KR101712351B1 (ko) * | 2009-06-26 | 2017-03-06 | SK Telecom Co., Ltd. | Video encoding/decoding apparatus and method using multi-dimensional integer transform |
-
2010
- 2010-09-17 JP JP2011537184A patent/JPWO2011048904A1/ja active Pending
- 2010-09-17 CN CN201080046955.0A patent/CN102577391A/zh active Pending
- 2010-09-17 US US13/502,703 patent/US20120219232A1/en not_active Abandoned
- 2010-09-17 EP EP10824755.2A patent/EP2493196A4/en not_active Withdrawn
- 2010-09-17 WO PCT/JP2010/066248 patent/WO2011048904A1/ja active Application Filing
Non-Patent Citations (4)
Title |
---|
HONGBO ZHU: "Improved intra 4x4 DC prediction mode, ITU - Telecommunications Standardization Sector STUDY GROUP 16 Question 6", VCEG-AG15, VIDEO CODING EXPERTS GROUP (VCEG) 33RD MEETING, - 20 October 2007 (2007-10-20), SHENZHEN, CHINA, pages 1 - 3, XP030003619 * |
See also references of EP2493196A4 * |
THIOW KENG TAN ET AL.: "Intra Prediction by Template Matching", IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2006), - October 2006 (2006-10-01), pages 1693 - 1696, XP008155299 * |
YUNG-LYUL LEE ET AL.: "Improved lossless intra coding for H.264/MPEG-4 AVC", IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 15, no. 9, September 2006 (2006-09-01), pages 2610 - 2615, XP008155305 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018512810A (ja) * | 2015-03-27 | 2018-05-17 | Qualcomm Incorporated | Derivation of sub-block motion information in video coding |
JP2018513611A (ja) * | 2015-03-27 | 2018-05-24 | Qualcomm Incorporated | Motion vector derivation in video coding |
US10958927B2 (en) | 2015-03-27 | 2021-03-23 | Qualcomm Incorporated | Motion information derivation mode determination in video coding |
US11330284B2 (en) | 2015-03-27 | 2022-05-10 | Qualcomm Incorporated | Deriving motion information for sub-blocks in video coding |
Also Published As
Publication number | Publication date |
---|---|
US20120219232A1 (en) | 2012-08-30 |
EP2493196A1 (en) | 2012-08-29 |
JPWO2011048904A1 (ja) | 2013-03-07 |
CN102577391A (zh) | 2012-07-11 |
EP2493196A4 (en) | 2013-06-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11070802B2 (en) | Moving image coding device, moving image decoding device, moving image coding/decoding system, moving image coding method and moving image decoding method | |
RU2603542C2 (ru) | Method and apparatus for encoding video, and method and apparatus for decoding video | |
KR101951696B1 (ko) | Adaptive transform method based on intra prediction mode, and apparatus using the method | |
KR101457929B1 (ko) | Method for determining a candidate block in inter prediction, and apparatus using the method | |
WO2012087034A2 (ko) | Intra prediction method, and apparatus using the method | |
KR20140128904A (ko) | Intra prediction method and apparatus | |
KR101974952B1 (ko) | Method for encoding/decoding an intra prediction mode using two candidate intra prediction modes, and apparatus using the method | |
KR20170108367A (ko) | Method and apparatus for processing a video signal based on intra prediction | |
CN108810532B (zh) | Image decoding device | |
RU2721160C1 (ru) | Method and device for image decoding based on intra prediction in an image coding system | |
WO2011048904A1 (ja) | Image encoding apparatus, image decoding apparatus, and data structure of encoded data | |
KR102185952B1 (ko) | Method for encoding/decoding an intra prediction mode using two candidate intra prediction modes, and apparatus using the method | |
KR102069784B1 (ко) | Method for encoding/decoding an intra prediction mode using two candidate intra prediction modes, and apparatus using the method | |
KR102516137B1 (ко) | Method for encoding/decoding an intra prediction mode using two candidate intra prediction modes, and apparatus using the method | |
JP2012070249A (ja) | Image encoding method, image decoding method, image encoding device, image decoding device, and program | |
KR20200136862A (ко) | Method for encoding/decoding an intra prediction mode using two candidate intra prediction modes, and apparatus using the method | |
KR20220121747A (ко) | Video signal encoding/decoding method, and recording medium storing a data stream generated by the encoding method | |
KR102038818B1 (ко) | Intra prediction method, and apparatus using the method | |
KR101348566B1 (ко) | Method for determining a candidate block in inter prediction, and apparatus using the method | |
KR20140004825A (ко) | Method and apparatus for binarizing syntax elements for entropy encoding and entropy decoding | |
KR20130139810A (ко) | Method for determining a candidate block in inter prediction, and apparatus using the method | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201080046955.0 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 10824755 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2011537184 Country of ref document: JP Ref document number: 13502703 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2010824755 Country of ref document: EP |