WO2011068332A2 - Spatial prediction method and apparatus, image encoding method and apparatus, and image decoding method and apparatus using the same - Google Patents

Spatial prediction method and apparatus, image encoding method and apparatus, and image decoding method and apparatus using the same

Info

Publication number
WO2011068332A2
WO2011068332A2 (application PCT/KR2010/008389)
Authority
WO
WIPO (PCT)
Prior art keywords
mode
prediction
template matching
block
execution unit
Prior art date
Application number
PCT/KR2010/008389
Other languages
English (en)
Korean (ko)
Other versions
WO2011068332A3 (fr)
Inventor
김수년
임정연
최재훈
이규민
정제창
Original Assignee
에스케이텔레콤 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 에스케이텔레콤 주식회사 filed Critical 에스케이텔레콤 주식회사
Publication of WO2011068332A2 publication Critical patent/WO2011068332A2/fr
Publication of WO2011068332A3 publication Critical patent/WO2011068332A3/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/11Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/147Data rate or code amount at the encoder output according to rate distortion criteria
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • An embodiment of the present invention relates to a spatial prediction apparatus and prediction method, an image encoding apparatus and method using the same, and an image decoding apparatus and method. More specifically, by using a template matching method in addition to the directional intra prediction modes for prediction within the same frame of a video, the embodiment provides a spatial prediction apparatus and prediction method, and an image encoding apparatus and method and image decoding apparatus and method using the same, that can increase prediction efficiency and accuracy while minimizing the resulting increase in overhead.
  • The basic principle of data compression is to eliminate redundancy in the data. Data can be compressed by removing spatial redundancy, such as the same color or object repeating within an image; temporal redundancy, such as adjacent frames of a video that barely change, or the same note repeated in audio; and psychovisual redundancy, which exploits the insensitivity of human vision and perception to high frequencies.
  • H.264 is a digital video codec standard that has a very high data compression ratio, also called MPEG-4 Part 10 or Advanced Video Coding (AVC).
  • AVC Advanced Video Coding
  • This standard is the result of the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization / International Electrotechnical Commission (ISO / IEC) jointly forming a Joint Video Team and standardizing it.
  • VCEG Video Coding Experts Group
  • ITU-T International Telecommunication Union Telecommunication Standardization Sector
  • ISO / IEC International Electrotechnical Commission
  • Temporal prediction predicts the current block 112 of the current frame 110 by referring to a reference block 122 of another frame 120 that is adjacent in time. That is, in inter prediction of the current block 112 of the current frame 110, the temporally adjacent reference frame 120 is searched, and the reference block 122 most similar to the current block 112 is found in the reference frame 120.
  • the reference block 122 is a block that can best predict the current block 112, and the block having the smallest sum of absolute difference (SAD) with the current block 112 may be the reference block 122.
  • the reference block 122 becomes a prediction block of the current block 112, and generates a residual block by subtracting the reference block 122 from the current block 112.
  • the generated residual block is encoded and inserted into the bitstream.
  • The relative difference between the position of the current block 112 in the current frame 110 and the position of the reference block 122 in the reference frame 120 is called the motion vector 130, and the motion vector 130 is encoded together with the residual block.
  • Temporal prediction is also referred to as inter-frame prediction or, simply, inter prediction.
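For background only (this is not part of the patent text), the minimal sketch below illustrates the kind of SAD-based block matching described above; the frame contents, block size, and search range are illustrative assumptions.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def motion_search(cur_frame, ref_frame, bx, by, n=16, search=8):
    """Full search: find the n x n block in ref_frame, within +/-search pixels
    of (bx, by), that minimizes the SAD with the current block.
    Returns the motion vector (dx, dy) and the best SAD."""
    cur_block = cur_frame[by:by + n, bx:bx + n]
    h, w = ref_frame.shape
    best = (0, 0, float("inf"))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > w or y + n > h:
                continue                      # candidate falls outside the frame
            cost = sad(cur_block, ref_frame[y:y + n, x:x + n])
            if cost < best[2]:
                best = (dx, dy, cost)
    return best

# Toy usage with a synthetically shifted frame (illustrative only).
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))   # rows shifted down 2, columns right 3
print(motion_search(cur, ref, bx=16, by=16))     # best match at (dx, dy) = (-3, -2), SAD 0
```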
  • Spatial prediction obtains the predicted pixel values of a target block from the reconstructed pixel values of reference blocks adjacent to the target block within a single frame; it is also called directional intra prediction (hereinafter simply intra prediction).
  • The H.264 standard specifies encoding and decoding using intra prediction.
  • Intra prediction predicts the values of a current subblock by copying the adjacent pixels above and to the left of the subblock along a predetermined direction, and encodes only the difference.
  • The prediction block for the current block is generated from other blocks that precede it in coding order.
  • The difference between the current block and the prediction block is then coded.
  • For each block, a video encoder conforming to H.264 selects the prediction mode that minimizes the difference between the current block and the prediction block.
  • For intra prediction of 4 x 4 luma blocks and 8 x 8 luma blocks, the H.264 standard defines the nine prediction modes illustrated in FIG. 2, according to the positions of the adjacent pixels and the prediction directions used to generate the predicted pixel values.
  • Depending on the prediction direction, the nine prediction modes are the vertical prediction mode (prediction mode 0), horizontal prediction mode (prediction mode 1), DC prediction mode (prediction mode 2), Diagonal_Down_Left prediction mode (prediction mode 3), Diagonal_Down_Right prediction mode (prediction mode 4), Vertical_Right prediction mode (prediction mode 5), Horizontal_Down prediction mode (prediction mode 6), Vertical_Left prediction mode (prediction mode 7), and Horizontal_Up prediction mode (prediction mode 8).
  • the DC prediction mode uses an average value of eight adjacent pixels.
  • Prediction mode 3, for example, predicts diagonally toward the lower left, as described below.
  • Four prediction modes, similar to the 16 x 16 luma modes described below, are used for intra prediction of 8 x 8 chroma blocks.
  • FIG. 3 shows an example of labeling for explaining the nine prediction modes of FIG. 2.
  • A prediction block (the region containing pixels a to p) for the current block is generated using the previously decoded samples A to M. If E, F, G, and H have not yet been decoded, they can be generated virtually by copying D to their positions (see the sketch below).
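As a small illustration of the substitution just described, the sketch below pads unavailable upper-right samples E to H by copying D; representing the reference samples as a dictionary keyed by their labels is an assumption made for readability.

```python
def pad_upper_right(ref):
    """ref: dict mapping labels 'A'..'M' to reconstructed sample values,
    where 'E'..'H' may be missing (not yet decoded).
    Missing upper-right samples are replaced by copies of D."""
    padded = dict(ref)
    for label in ("E", "F", "G", "H"):
        if label not in padded:
            padded[label] = padded["D"]   # virtual samples copied from D
    return padded

# Example: only A..D and I..M are available.
ref = {k: v for k, v in zip("ABCDIJKLM", [52, 54, 55, 57, 60, 61, 63, 64, 58])}
print(pad_upper_right(ref))   # E..H all become 57 (the value of D)
```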
  • FIG. 4 is a diagram for describing nine prediction modes of FIG. 2 using FIG. 3.
  • In prediction mode 0 (vertical), every vertical line of the prediction block takes a single predicted value: each pixel of the prediction block is predicted from the nearest pixel of the reference block located above it. The reconstructed value of the adjacent pixel A is set as the predicted value of the first-column pixels a, e, i, and m.
  • The second-column pixels b, f, j, and n are predicted from the reconstructed value of the adjacent pixel B.
  • The third-column pixels c, g, k, and o are predicted from the reconstructed value of the adjacent pixel C.
  • The fourth-column pixels d, h, l, and p are predicted from the reconstructed value of the adjacent pixel D.
  • A prediction block is thus generated in which the predicted values of the four columns are the pixel values of A, B, C, and D.
  • In prediction mode 1 (horizontal), every horizontal line of the prediction block takes a single predicted value: each pixel of the prediction block is predicted from the nearest pixel of the reference block located to its left. The reconstructed value of the adjacent pixel I is set as the predicted value of the first-row pixels a, b, c, and d.
  • The second-row pixels e, f, g, and h are predicted from the reconstructed value of the adjacent pixel J.
  • The third-row pixels i, j, k, and l are predicted from the reconstructed value of the adjacent pixel K.
  • The fourth-row pixels m, n, o, and p are predicted from the reconstructed value of the adjacent pixel L.
  • A prediction block is thus generated in which the predicted values of the four rows are the pixel values of I, J, K, and L.
  • In prediction mode 2 (DC), all pixels of the prediction block are set to the average of the pixel values of the upper pixels A, B, C, and D and the left pixels I, J, K, and L, as sketched below.
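A minimal sketch of the three modes just described (vertical, horizontal, and DC) for a 4 x 4 block follows; passing the reference samples A to D and I to L as plain arrays is an assumption made for readability.

```python
import numpy as np

def predict_4x4(mode, top, left):
    """top  = reconstructed samples A, B, C, D above the block
       left = reconstructed samples I, J, K, L to its left
       mode = 0 (vertical), 1 (horizontal) or 2 (DC)."""
    top, left = np.asarray(top, dtype=np.int32), np.asarray(left, dtype=np.int32)
    if mode == 0:                       # vertical: copy each upper sample down its column
        return np.tile(top, (4, 1))
    if mode == 1:                       # horizontal: copy each left sample across its row
        return np.tile(left.reshape(4, 1), (1, 4))
    if mode == 2:                       # DC: rounded average of the eight adjacent samples
        dc = (top.sum() + left.sum() + 4) >> 3
        return np.full((4, 4), dc, dtype=np.int32)
    raise ValueError("only modes 0, 1 and 2 are sketched here")

print(predict_4x4(2, top=[52, 54, 55, 57], left=[60, 61, 63, 64]))   # all pixels = 58
```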
  • In prediction mode 3, the pixels of the prediction block are interpolated toward the lower left at a 45° angle between the lower-left and the upper-right, and in prediction mode 4 they are extrapolated toward the lower right at a 45° angle.
  • In prediction mode 6, the pixels of the prediction block are extrapolated toward the lower right at an angle of about 26.6° from the horizontal; in prediction mode 7, they are extrapolated toward the lower left at an angle of about 26.6° from the vertical; and in prediction mode 8, they are interpolated upward at an angle of about 26.6° from the horizontal.
  • In these modes, the pixels of the prediction block are generated from a weighted average of the previously decoded reference pixels A to M.
  • For example, the pixel d located at the top right of the prediction block can be estimated as in Equation 1, where round() is a function that rounds to the nearest integer.
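Equation 1 itself is not reproduced in this text. As an illustration of the weighted-average rule, the sketch below uses the three-tap average round(B/4 + C/2 + D/4), which is the example commonly given for the H.264 diagonal down-right mode; it is an assumption here and may differ from the exact form of Equation 1.

```python
def weighted_pred(b, c, d):
    """Three-tap weighted average used by the diagonal H.264 intra modes,
    written with integer arithmetic: round(B/4 + C/2 + D/4)."""
    return (b + 2 * c + d + 2) >> 2    # the +2 rounds to the nearest integer

# For reference samples B=54, C=55, D=57 the predicted value is 55.
print(weighted_pred(54, 55, 57))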
  • The 16 x 16 prediction model for the luminance component includes four modes: prediction mode 0, prediction mode 1, prediction mode 2, and prediction mode 3.
  • In prediction mode 0, the pixels of the prediction block are extrapolated from the upper pixels, and in prediction mode 1 they are extrapolated from the left pixels.
  • In prediction mode 2, the pixels of the prediction block are calculated as the average of the upper pixels and the left pixels.
  • In prediction mode 3, a linear "plane" function fitted to the upper and left pixels is used; this mode is well suited to areas where the luminance changes smoothly.
  • In all of these modes, the pixel values of the prediction block are generated from the pixels adjacent to the block to be encoded, along the direction corresponding to each mode.
  • When the image has a clear directionality, the existing directional modes may be sufficient; when it does not, the pixel values of the prediction block cannot be predicted accurately and the encoding efficiency deteriorates.
  • In that case the gain of entropy coding is not properly realized because of the inaccurate intra prediction, so the bit rate is unnecessarily increased.
  • One embodiment of the present invention addresses the above problem: an object of the present invention is to provide a spatial prediction apparatus and prediction method, and an image encoding apparatus and method and an image decoding apparatus and method using the same, which increase prediction efficiency and accuracy by using a template matching mode in addition to the directional intra prediction modes for prediction within the same frame, while minimizing the resulting increase in overhead.
  • To this end, an image encoding apparatus according to an embodiment of the present invention includes: a spatial prediction execution unit which performs prediction on a target block using a template matching mode together with the directional intra prediction modes and selects, among them, the mode with the lowest cost based on rate-distortion; and an integer transform execution unit which performs an integer transform on the residual signal of the image predicted by the template matching mode when the template matching mode is selected by the spatial prediction execution unit.
  • The integer transform execution unit may perform the integer transform using the integer transform defined in the H.264 standard, described below.
  • The spatial prediction execution unit may select the lowest-cost mode by the following equation (Equation 2): C = E + λB,
  • where C is the cost,
  • E is the difference between the original signal and the signal reconstructed by decoding the encoded bits,
  • B is the number of bits required for the coding, and
  • λ is a Lagrangian coefficient that adjusts the relative weighting of E and B.
  • The image encoding apparatus may further include an MDDT execution unit that executes a Mode Dependent Directional Transform (MDDT) in consideration of the directionality when any one of the nine directional intra prediction modes is selected by the spatial prediction execution unit.
  • MDDT Mode Dependent Directional Transform
  • the MDDT execution unit transforms the residual signal of the predicted image according to a transform function corresponding to a selected mode among the preset transform functions corresponding to the directional intra prediction mode.
  • According to another embodiment, a spatial prediction apparatus includes: an intra prediction execution unit which performs prediction on the target block using the directional intra prediction modes;
  • a template prediction execution unit which performs prediction on the target block using the template matching mode;
  • a mode selection unit which selects the mode having the lowest cost, based on rate-distortion, among the prediction modes executed by the intra prediction execution unit and the template matching mode executed by the template prediction execution unit;
  • and a residual signal calculator which calculates the residual signal between the prediction block and the target block according to the selected mode.
  • According to another embodiment, an image decoding apparatus includes: a mode type determination unit which determines the mode type of the current block from an input bitstream encoded by spatial predictive encoding; a template matching execution unit which, when the mode type determination unit determines that the mode type of the current block is the template matching mode, divides the current block into N x N blocks and performs template matching on each divided N x N block; and an inverse integer transform execution unit which performs an inverse integer transform on the residual signal between the prediction block obtained by template matching and the target block.
  • When the mode type determination unit determines that the mode type of the current block is a directional intra prediction mode, the image decoding apparatus may further include an inverse MDDT execution unit which executes the inverse MDDT in consideration of the directionality.
  • the template matching execution unit may divide the current block into 2 x 2 block units and then perform template matching on each 2 x 2 block.
  • According to another embodiment, an image encoding method includes: performing prediction on a target block using a template matching mode together with the directional intra prediction modes; selecting the mode having the lowest cost among the executed modes; calculating the residual signal between the prediction block generated by the selected mode and the target block; and, when the selected mode is the template matching mode, performing an integer transform on the calculated residual signal, or, when the selected mode is a directional prediction mode, executing the MDDT on the calculated residual signal.
  • the image encoding method may further include selecting a transform function corresponding to the prediction mode among preset transformation functions when the mode selected by the selecting step is a directional prediction mode.
  • the MDDT execution step preferably executes the MDDT according to the selected conversion function.
  • According to another embodiment, a spatial prediction method includes: performing prediction on the target block using a template matching mode together with the directional intra prediction modes; selecting the mode having the lowest cost among the executed modes; and calculating the residual signal between the prediction block generated by the selected mode and the target block.
  • According to another embodiment, an image decoding method includes: determining the mode type of the current block from an input bitstream encoded by spatial predictive encoding; if the mode type of the current block is determined to be the template matching mode, dividing the current block into N x N blocks and performing template matching on each divided N x N block; and performing an inverse integer transform on the residual signal between the prediction block obtained by template matching and the target block.
  • the image decoding method may further include executing the inverse MDDT in consideration of the directionality if it is determined that the mode type of the current block is the directional intra prediction mode.
  • FIG. 1 is a diagram illustrating general inter prediction.
  • FIG. 2 is a diagram illustrating directionality of the intra prediction mode.
  • FIG. 3 is a diagram illustrating an example of labeling for explaining an intra prediction mode of FIG. 2.
  • FIG. 4 is a diagram illustrating each of the intra prediction modes of FIG. 2.
  • FIG. 5A is a diagram illustrating prediction mode 0 of the intra prediction modes of FIG. 2, and FIG. 5B is a diagram illustrating prediction mode 1 of the intra prediction modes of FIG. 2.
  • FIG. 6 is a diagram schematically illustrating an image encoding apparatus according to an embodiment of the present invention.
  • FIG. 7 is a diagram illustrating template matching used in an embodiment of the present invention.
  • FIG. 8 is a diagram showing an example of the structure of a macroblock composed of four 8x8 partitions.
  • FIG. 9 is a diagram showing an example of the structure of a macroblock consisting of 16 4x4 partitions.
  • FIG. 10 is a diagram illustrating a zigzag scan of the transform coefficients of a 4x4 partition.
  • FIG. 11 is a flowchart illustrating a spatial prediction method according to an embodiment of the present invention.
  • FIG. 12 is a flowchart illustrating a video encoding method according to another embodiment of the present invention.
  • FIG. 13 is a diagram illustrating an example of a structure of a bitstream generated by the video encoding apparatus of FIG. 6.
  • FIG. 14 is a diagram illustrating an image decoding apparatus according to an embodiment of the present invention.
  • FIG. 15 is a flowchart illustrating an image decoding method by the image decoding apparatus of FIG. 14.
  • Referring to FIG. 6, the image encoding apparatus 600 includes a spatial prediction execution unit 610, an integer transform execution unit 620, and a Mode Dependent Directional Transform (MDDT) execution unit 630.
  • The image encoding apparatus may further include a difference calculator, a quantizer, an inverse quantizer, a motion estimator, a motion compensator, and so on in addition to the illustrated components, but components not directly related to this embodiment of the present invention are omitted to simplify the description.
  • The spatial prediction execution unit 610 performs prediction on the target block using the template matching mode together with the directional intra prediction modes within the same frame, and selects the lowest-cost mode based on rate-distortion.
  • The spatial prediction execution unit 610 may be implemented as a single component of the image encoding apparatus 600 or, as illustrated, may be composed of an intra prediction execution unit 612, a template prediction execution unit 614, a mode selection unit 616, and a residual signal calculator 618.
  • the intra prediction execution unit 612 performs the prediction on the target block by using the directional intra prediction mode. That is, the intra prediction execution unit 612 predicts pixel values according to each prediction mode from neighboring pixels of the target block in the same frame as shown in FIG. 4.
  • the template prediction execution unit 614 executes the prediction on the target block using the template matching mode.
  • In template matching, the predicted value for a pixel p of the current frame can be determined by comparing the values N(p) of its neighboring pixels with candidate neighborhoods in the already reconstructed region of the same frame.
  • The set of neighboring pixel values N(p) used for the comparison is referred to as the template of the pixel p.
  • Referring to FIG. 7, the search area 700 is a search region adjacent to the 4 x 4 target block whose pixel values are to be predicted.
  • The search area 700 is x pixels wide and y pixels high and consists of previously reconstructed pixels; the not-yet-reconstructed portion shown at the lower right is excluded.
  • The 4 x 4 target block 710 is further divided into 2 x 2 target subblocks 720, and template matching is performed in units of each target subblock.
  • the pixels in the same frame and adjacent to the target subblock 720 become the template 730.
  • Template matching calculates the SAD between the corresponding pixels of the template 730 (the inverted L-shape in the drawing) and each group of pixels of the same shape in the search area 700, and takes the region with the smallest SAD as the candidate neighboring region 740.
  • the candidate subblock 750 in contact with the candidate neighboring region 740 is determined as a texture signal for the target subblock 720.
  • Template matching has been described here using a 4 x 4 block as an example to simplify the explanation, but it is not limited thereto and can be applied to blocks of various sizes (a small sketch of the search is given below).
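A minimal sketch of the template matching step just described follows; the frame layout, template thickness (an inverted L one pixel wide), and the restriction of the search to rows strictly above the target are simplifying assumptions rather than the exact configuration of FIG. 7.

```python
import numpy as np

def template_sad(frame, ty, tx, cy, cx, n=2):
    """SAD between the inverted-L templates (one row above and one column to the
    left of an n x n block) at the target position (ty, tx) and a candidate
    position (cy, cx) within the same frame."""
    f = frame.astype(np.int32)
    top  = np.abs(f[ty - 1, tx - 1:tx + n] - f[cy - 1, cx - 1:cx + n]).sum()
    left = np.abs(f[ty:ty + n, tx - 1]     - f[cy:cy + n, cx - 1]).sum()
    return int(top + left)

def template_match(frame, ty, tx, n=2, search=8):
    """Return the candidate n x n block whose template best matches the target
    template; this block is then used as the prediction (texture) for the
    target subblock."""
    best = None
    # Only candidates strictly above the target row are searched here, so every
    # candidate block lies in the already reconstructed area (simplification).
    for cy in range(max(1, ty - search), ty - n + 1):
        for cx in range(max(1, tx - search), min(tx + search, frame.shape[1] - n) + 1):
            cost = template_sad(frame, ty, tx, cy, cx, n)
            if best is None or cost < best[0]:
                best = (cost, cy, cx)
    _, cy, cx = best
    return frame[cy:cy + n, cx:cx + n].copy()     # predicted 2 x 2 subblock

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, (16, 16), dtype=np.uint8)
print(template_match(frame, ty=8, tx=8))
```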
  • The mode selector 616 selects the mode with the lowest cost, based on rate-distortion, among the prediction modes executed by the intra prediction execution unit 612 and the template matching mode executed by the template prediction execution unit 614. That is, if the target block is 4 x 4, the mode selector 616 selects the lowest-cost mode among the nine directional intra prediction modes executed by the intra prediction execution unit 612 according to the H.264 standard and the template matching mode executed by the template prediction execution unit 614.
  • The cost C may be defined on a rate-distortion basis as in Equation 2, C = E + λB, but it is not limited thereto and can be defined in various ways.
  • In Equation 2, E denotes the difference between the original signal and the signal reconstructed by decoding the encoded bits, and B denotes the number of bits required for the coding.
  • λ is a Lagrangian coefficient, that is, a coefficient that adjusts the relative weighting of E and B.
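The mode decision of Equation 2 can be sketched as below; the candidate list, the numeric distortion and bit values, and the value of λ are illustrative assumptions, since the text only specifies the general form C = E + λB.

```python
def rd_cost(distortion, bits, lam):
    """Rate-distortion cost C = E + lambda * B (Equation 2)."""
    return distortion + lam * bits

def select_mode(candidates, lam=0.85):
    """candidates: list of (mode_name, distortion E, bit count B).
    Returns the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))

# Illustrative numbers only: nine directional modes plus the template matching mode.
candidates = [(f"intra_mode_{m}", 120 + 7 * m, 18) for m in range(9)]
candidates.append(("template_matching", 95, 9))   # smaller residual, fewer mode bits
print(select_mode(candidates))                    # -> ('template_matching', 95, 9)
```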
  • the residual signal calculator 618 calculates a residual signal between the prediction block and the target block according to the mode selected by the mode selector 616.
  • When the template matching mode is selected by the spatial prediction execution unit 610, that is, when the mode selection unit 616 determines that the template matching mode executed by the template prediction execution unit 614 has a lower cost than the nine directional intra prediction modes executed by the intra prediction execution unit 612, the integer transform execution unit 620 performs an integer transform on the residual signal of the image predicted in the template matching mode.
  • Unlike the directional prediction modes, no adaptive transform (described later) is defined for the template matching mode, so a prediction block obtained by template matching may use the integer transform defined in the H.264 standard.
  • The H.264 standard adopts an integer transform, which eliminates at the root the mismatch that can occur between encoder and decoder due to insufficient arithmetic precision during the transform operation.
  • The Discrete Cosine Transform (DCT) used in earlier video and still-image standards is defined with floating-point operations, so the result of the transform operation could vary between individual implementations.
  • In H.264 the transform is defined using only integer and bit-shift operations, which removes, at the standardization stage, any possibility of arithmetic error in digital systems.
  • The macroblock defined in the H.264 standard is a set of pixels of size 16 x 16, as shown in FIG. 8.
  • The macroblock of FIG. 8 is composed of four 8 x 8 partitions with indices 0 to 3; when the transform coefficients of the four 8 x 8 partitions are encoded, they are encoded in that order, from 0 to 3.
  • The H.264 standard determines the coded block pattern for luma (CBPY) based on the presence or absence of nonzero transform coefficients in each 8 x 8 partition.
  • One macroblock is composed of sixteen 4 x 4 partitions. As described for FIG. 8, one macroblock is divided into four 8 x 8 partitions that are processed in a specified order; likewise, each 8 x 8 partition is divided into four 4 x 4 partitions that are processed in a specified order.
  • This configuration is shown in FIG. 9.
  • The drawing shows that the DC components of the sixteen 4 x 4 partitions can be collected to form an additional 4 x 4 partition.
  • The darker area at the upper left of each 4 x 4 partition conceptually indicates the DC coefficient among the transform coefficients of that partition; collecting these DC coefficients forms a separate 4 x 4 partition.
  • The 4 x 4 integer transform is the transform used to compress the residual signal of a 4 x 4 partition in the intra and inter modes. All transforms in H.264 can be implemented with additions and bit-shift operations only, because every element of the transform basis is 1 or 2, that is, a power of two.
  • The basic 4 x 4 integer transform generates the transform coefficients on which a zigzag scan of the 4 x 4 partition is performed, as shown in FIG. 10.
  • The 4 x 4 DCT of a block X can be written as Y = A X A^T (Equation 3), where the elements of A are built from a = 1/2, b = sqrt(1/2)·cos(π/8), and c = sqrt(1/2)·cos(3π/8). By factoring Equation 3, Equation 4 is obtained: Y = (C X C^T) ⊗ E, where the matrix C contains only the values ±1 and ±d, with d = c/b ≈ 0.414.
  • In Equation 4, E is a scaling-factor matrix, and ⊗ denotes multiplying each value of (C X C^T) by the value at the same position in the E matrix.
  • Approximating d by 1/2 and folding the change into the scaling matrix yields Equation 5, the forward integer transform used in H.264: Y = (Cf X Cf^T) ⊗ Ef, which can be computed as a product of matrices with Cf = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]].
  • Because the first and last matrices of this product, Cf and Cf^T, contain only the integer values ±1 and ±2, it can be computed with additions, subtractions, and shift operations alone. This property is called 'multiplication-free' and can be exploited very efficiently in a reference encoder.
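A minimal sketch of the 4 x 4 forward core transform W = Cf X Cf^T follows (quantization and the scaling matrix Ef are omitted); the second function shows the same result computed with additions, subtractions, and shifts only, matching the 'multiplication-free' property described above.

```python
import numpy as np

CF = np.array([[1,  1,  1,  1],
               [2,  1, -1, -2],
               [1, -1, -1,  1],
               [1, -2,  2, -1]], dtype=np.int32)

def forward_core_transform(x):
    """W = Cf * X * Cf^T for a 4x4 residual block X (scaling omitted)."""
    x = np.asarray(x, dtype=np.int32)
    return CF @ x @ CF.T

def forward_core_transform_shift_only(x):
    """Same result, but the factor 2 is implemented as a left shift."""
    x = np.asarray(x, dtype=np.int32)

    def transform_rows(m):
        out = np.empty_like(m)
        for i, (p0, p1, p2, p3) in enumerate(m):
            s03, d03 = p0 + p3, p0 - p3          # butterflies
            s12, d12 = p1 + p2, p1 - p2
            out[i] = (s03 + s12,
                      (d03 << 1) + d12,          # 2*d03 + d12
                      s03 - s12,
                      d03 - (d12 << 1))          # d03 - 2*d12
        return out

    # Apply the 1-D transform to the rows, then to the columns.
    return transform_rows(transform_rows(x).T).T

x = np.arange(16, dtype=np.int32).reshape(4, 4) - 8    # toy residual block
assert np.array_equal(forward_core_transform(x), forward_core_transform_shift_only(x))
print(forward_core_transform(x))
```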
  • When any one of the nine directional intra prediction modes is selected by the spatial prediction execution unit 610, that is, when one of the directional intra prediction modes executed by the intra prediction execution unit 612 is selected as having the lowest cost, the MDDT execution unit 630 executes the MDDT in consideration of the directionality.
  • MDDT Mode Dependent Directional Transform
  • KLT Karhunen Loeve Transform
  • This technique compresses the energy of the error block in the frequency domain. Since MDDT applies transform coding according to the direction of the intra prediction method, characteristics of quantized transform coefficients generated after quantization may also appear in different forms according to the direction. In order to encode these coefficients more efficiently, adaptive scanning may be used.
  • For the MDDT, a set of transform functions classified according to the directional prediction modes may be defined; such a set can be organized as shown in Table 1 below.
  • In Table 1, f_xy denotes the x-th transform function corresponding to the y-th prediction mode.
  • Table 1 describes that N + 1 functions are allocated to each prediction mode, but the present invention is not limited thereto, and the number of functions of each prediction mode may not be the same.
  • mode 0 may have N + 1 assigned transform functions
  • mode 1 may have N assigned transform functions
  • mode 2 may have N ⁇ 1 assigned transform functions.
  • The MDDT execution unit 630 transforms the residual signal of the predicted image using the preset transform function corresponding to the directional intra prediction mode selected by the mode selection unit 616 of the spatial prediction execution unit 610.
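The mode-to-transform mapping of Table 1 can be represented as a simple lookup, as sketched below. The table contents are purely hypothetical: the actual trained MDDT transform matrices f_xy are not given in the text, so a DCT matrix and the identity are used as stand-ins.

```python
import numpy as np

def dct_matrix(n=4):
    """Orthonormal DCT-II matrix, used here only as a stand-in basis."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * k * (2 * i + 1) / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)
    return m

DCT = dct_matrix()
IDENT = np.eye(4)

# Hypothetical Table 1: each directional prediction mode y maps to a list of
# candidate transform pairs f_x,y = (column transform, row transform).
MDDT_TABLE = {
    0: [(DCT, IDENT)],      # vertical mode: transform along columns only (assumption)
    1: [(IDENT, DCT)],      # horizontal mode: transform along rows only (assumption)
    2: [(DCT, DCT)],        # DC mode: separable 2-D DCT as a placeholder
}

def mddt(residual, mode, index=0):
    """Apply the index-th transform pair assigned to the given prediction mode."""
    col_t, row_t = MDDT_TABLE.get(mode, [(DCT, DCT)])[index]
    return col_t @ np.asarray(residual, dtype=float) @ row_t.T

res = np.arange(16, dtype=float).reshape(4, 4) - 7.5
print(np.round(mddt(res, mode=0), 2))
```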
  • FIG. 11 is a flowchart illustrating a spatial prediction method according to another embodiment of the present invention.
  • the spatial prediction execution unit 610 of FIG. 6 executes prediction on a target block using a template matching mode together with a directional intra prediction mode (S1101).
  • the spatial prediction execution unit 610 compares the costs of the directional intra prediction mode and the template matching mode, and selects the mode having the lowest cost as an optimal mode (S1103).
  • When the optimal mode has been selected from the directional intra prediction modes and the template matching mode as described above, the residual signal between the prediction block generated by the selected mode and the target block is calculated (S1105).
  • Since steps S1201 to S1205 calculate the residual signal using the same spatial prediction method as in FIG. 11, their detailed description is omitted.
  • If the selected mode is the template matching mode, the integer transform execution unit 620 performs an integer transform on the residual signal between the prediction block generated by the template prediction execution unit 614 and the target block (S1209).
  • Otherwise, the MDDT execution unit 630 selects, from the preset transform functions, the transform function corresponding to the selected prediction mode (S1211) and transforms the residual signal between the prediction block generated by the intra prediction execution unit 612 and the target block using the selected transform function (S1213).
  • FIG. 13 is a diagram illustrating an example of a structure of a bitstream generated by the video encoding apparatus 600 of FIG. 6.
  • bitstreams are encoded in slice units.
  • The bitstream includes a slice header 1310 and slice data 1320, and the slice data 1320 includes a plurality of macroblock data fields (MBs) 1321 to 1324.
  • Macroblock data 1323 may include an mb_type field 1330, an mb_pred field 1335, and a texture data field 1339.
  • a value indicating the type of macroblock is recorded in the mb_type field 1330. That is, it indicates whether the current macroblock is an intra macroblock or an inter macroblock.
  • In the mb_pred field 1335, a detailed prediction mode according to the macroblock type is recorded.
  • In the case of an intra macroblock, information on the prediction mode selected for intra prediction is recorded; in the case of an inter macroblock, the reference frame number and motion vector are recorded for each macroblock partition.
  • In the case of the template matching mode, only a bit indicating it may be recorded and the remaining information omitted, so as to notify the decoder that the current mode is the template matching mode. For example, when the mode of the current block is the template matching mode, bit 1 is transmitted and the remaining mode information is omitted; otherwise, bit 0 is transmitted followed by the remaining mode information (a small sketch of this signalling rule is given below).
  • In the case of intra prediction, the mb_pred field 1335 is divided into a plurality of block information fields 1342 to 1344, and each block information field is divided into a main_mode field 1345, which records the value of the main mode described above, and a sub_mode field 1346, which records the value of the sub-mode described above.
  • The encoded residual image, that is, the texture data, is recorded in the texture data field 1339.
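The following sketch illustrates the signalling rule described above for the mb_pred information; the bit-writer class and the 4-bit fixed-length code for the remaining intra mode information are assumptions for illustration, since the text does not fix the entropy coding of that information.

```python
class BitWriter:
    """Collects individual bits; a stand-in for a real entropy coder."""
    def __init__(self):
        self.bits = []

    def put_bit(self, b):
        self.bits.append(b & 1)

    def put_uint(self, value, n_bits):
        for shift in range(n_bits - 1, -1, -1):
            self.put_bit((value >> shift) & 1)

def write_block_mode(writer, is_template_mode, intra_mode=None):
    """Template matching mode: a single flag bit '1' and nothing else.
    Directional intra mode: flag bit '0' followed by the mode number
    (written here as a 4-bit fixed-length code, an illustrative choice)."""
    if is_template_mode:
        writer.put_bit(1)
    else:
        writer.put_bit(0)
        writer.put_uint(intra_mode, 4)

w = BitWriter()
write_block_mode(w, is_template_mode=True)                 # -> 1
write_block_mode(w, is_template_mode=False, intra_mode=5)  # -> 0 followed by 0101
print(w.bits)   # [1, 0, 0, 1, 0, 1]
```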
  • Referring to FIG. 14, an image decoding apparatus 1400 may include a mode type determination unit 1410, a template matching execution unit 1420, an inverse integer transform execution unit 1430, and an inverse MDDT execution unit 1440.
  • The mode type determination unit 1410 determines the mode type of the current block from an input bitstream encoded by spatial predictive encoding. That is, it reads the mode type of the current block from a bitstream structured as shown in FIG. 13. For example, when bit 1, indicating that the mode type of the current block is the template matching mode, is recorded in the input bitstream, it recognizes that the bitstream is encoded in the template matching mode and prepares the corresponding decoding. When bit 0, indicating that the mode type of the current block is a directional intra prediction mode, is recorded in the input bitstream, it prepares decoding of the directional intra prediction block with reference to the information recorded in the sub-mode of the bitstream.
  • When the mode type of the current block is determined to be the template matching mode, the template matching execution unit 1420 divides the current block into N x N blocks and performs template matching on each divided N x N block. Preferably, the template matching execution unit 1420 divides the current block into 2 x 2 blocks and performs template matching on each divided 2 x 2 block.
  • As shown in FIG. 7, template matching calculates the SAD between the corresponding pixels of the template 730 (the inverted L-shape in the figure) and each group of pixels of the same shape in the search area 700, and takes the region with the smallest SAD as the candidate neighboring region 740.
  • the candidate subblock 750 in contact with the candidate neighboring region 740 is determined as a texture signal for the target subblock 720.
  • the inverse integer transform execution unit 1430 executes an inverse integer transform on the residual signal between the prediction block matched by the template matching unit 1420 and the target block.
  • The inverse integer transform execution unit 1430 may perform the inverse integer transform on the residual signal by inverting Equation 5; a sketch of the forward/inverse pair is given below.
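The sketch below pairs the Equation 5 forward transform with its inverse in floating point; unlike the bit-exact H.264 pipeline, the scaling matrices Ef and Ei are written out explicitly (built from a = 1/2 and b = sqrt(2/5)) instead of being folded into quantization, so the pair reconstructs the block exactly when no quantization is applied.

```python
import numpy as np

a, b = 0.5, np.sqrt(2.0 / 5.0)

CF = np.array([[1, 1, 1, 1],
               [2, 1, -1, -2],
               [1, -1, -1, 1],
               [1, -2, 2, -1]], dtype=float)
CI = np.array([[1, 1, 1, 1],
               [1, 0.5, -0.5, -1],
               [1, -1, -1, 1],
               [0.5, -1, 1, -0.5]])

# Scaling matrices: Ef for the forward transform, Ei for the inverse.
vf = np.array([a, b / 2, a, b / 2])
vi = np.array([a, b, a, b])
EF = np.outer(vf, vf)
EI = np.outer(vi, vi)

def forward(x):
    """Y = (Cf X Cf^T) element-wise Ef  (Equation 5 style forward transform)."""
    return (CF @ x @ CF.T) * EF

def inverse(y):
    """X = Ci^T (Y element-wise Ei) Ci  (matching inverse transform)."""
    return CI.T @ (y * EI) @ CI

x = np.arange(16, dtype=float).reshape(4, 4) - 8
assert np.allclose(inverse(forward(x)), x)   # exact reconstruction without quantization
print(np.round(inverse(forward(x))))
```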
  • When the mode type determination unit 1410 determines that the current block of the input bitstream is coded in a directional intra prediction mode, the inverse MDDT execution unit 1440 executes the inverse MDDT in consideration of the directionality. That is, the direction is determined with reference to the directional information recorded in the sub-mode of the bitstream, and the corresponding inverse MDDT is executed. For example, if the set of transform functions is as shown in Table 1, N + 1 transform functions are assigned to each prediction mode according to its direction, and the inverse MDDT can be executed based on the assigned transform function and the directional information recorded in the bitstream. The number of transform functions allocated to each prediction mode may differ depending on the direction.
  • FIG. 15 is a flowchart illustrating an image decoding method by the image decoding apparatus of FIG. 14.
  • the mode type determination unit 1410 determines the mode type of the current block from the input bitstream encoded and input by spatial predictive coding (S1501). That is, based on the structure of the bitstream as shown in FIG. 13, it is determined whether the mode type of the current block is a template matching mode or a directional intra prediction mode.
  • the mode type determination unit 1410 has been described as determining the mode type for the spatial predictive coding, but the present invention is not limited thereto. However, since temporal prediction coding is beyond the subject matter of the present invention, detailed description thereof is omitted.
  • If it is determined that the mode type of the current block is the template matching mode, the template matching execution unit 1420 divides the current block into N x N blocks and performs template matching on each divided block. In this case, as shown in FIG. 7, the template matching execution unit 1420 preferably divides the 4 x 4 target block into 2 x 2 target subblocks (S1505) and performs template matching on each target subblock (S1507).
  • Template matching calculates the SAD between the corresponding pixels of the template 730 (the inverted L-shape in the drawing) and each group of pixels of the same shape in the search area 700, and takes the region with the smallest SAD as the candidate neighboring region 740.
  • the candidate subblock 750 in contact with the candidate neighboring region 740 is determined as a texture signal for the target subblock 720.
  • Template matching has been described using 4 x 4 blocks as an example to facilitate explanation, but is not limited thereto, and template matching is possible for various blocks.
  • the block obtained by template matching becomes a result of intra prediction, and the inverse integer transform unit 1430 performs inverse quantization and inverse integer transformation on the residual signal between the generated prediction block and the target block (S1509).
  • the result obtained through inverse quantization and inverse integer transformation is added to the template matching result to form a reconstructed image.
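As a final small illustration of this step, the sketch below adds the inverse-transformed residual to the template matching prediction and clips the result to the 8-bit pixel range; the array shapes and values are illustrative.

```python
import numpy as np

def reconstruct(prediction, residual, bit_depth=8):
    """Reconstructed block = prediction (from template matching or intra
    prediction) + decoded residual, clipped to the valid pixel range."""
    max_val = (1 << bit_depth) - 1
    rec = prediction.astype(np.int32) + residual.astype(np.int32)
    return np.clip(rec, 0, max_val).astype(np.uint8)

pred = np.full((2, 2), 200, dtype=np.uint8)               # 2x2 template matching prediction
res  = np.array([[60, -5], [3, -220]], dtype=np.int32)    # decoded residual
print(reconstruct(pred, res))   # [[255 195] [203   0]] after clipping
```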
  • If the mode type is a directional intra prediction mode, the inverse MDDT execution unit 1440 determines the direction of the directional intra prediction mode based on the structure of the input bitstream, and performs inverse quantization and the inverse MDDT in consideration of that direction (S1511).
  • the set of transform functions may be set as shown in Table 1, and the inverse MDDT may be executed based on the assigned transform function according to the direction of each prediction mode.
  • As described above, an embodiment of the present invention is applicable to intra prediction apparatuses and to the field of image encoding and decoding, and is very useful in that, compared to the H.264 standard, it increases the accuracy of intra prediction and reduces the bit rate without greatly increasing the overhead of the generated bitstream.

Abstract

The present invention relates to a spatial prediction method and apparatus, an image encoding method and apparatus, and an image decoding method and apparatus using the spatial prediction method and apparatus. The image encoding apparatus, according to embodiments of the present invention, comprises: a spatial prediction execution unit which predicts a target block using a directional intra prediction mode and a template matching mode, and selects the lowest-cost mode on a rate-distortion basis; and an integer transform execution unit which performs an integer transform on the residual signals of the images predicted by the template matching mode when that mode is selected by the spatial prediction execution unit. The present invention improves the accuracy and efficiency of intra prediction and minimizes the increase in overhead in video encoding.
PCT/KR2010/008389 2009-12-04 2010-11-25 Spatial prediction method and apparatus, image encoding method and apparatus, and image decoding method and apparatus using the same WO2011068332A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2009-0119570 2009-12-04
KR1020090119570A KR101601854B1 (ko) 2009-12-04 2009-12-04 Spatial prediction apparatus and prediction method thereof, image encoding apparatus and method using the same, and image decoding apparatus and method

Publications (2)

Publication Number Publication Date
WO2011068332A2 true WO2011068332A2 (fr) 2011-06-09
WO2011068332A3 WO2011068332A3 (fr) 2011-09-15

Family

ID=44115403

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2010/008389 WO2011068332A2 (fr) 2009-12-04 2010-11-25 Spatial prediction method and apparatus, image encoding method and apparatus, and image decoding method and apparatus using the same

Country Status (2)

Country Link
KR (1) KR101601854B1 (fr)
WO (1) WO2011068332A2 (fr)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130003856A1 (en) * 2011-07-01 2013-01-03 Samsung Electronics Co. Ltd. Mode-dependent transforms for residual coding with low latency
KR101596085B1 (ko) * 2012-12-18 2016-02-19 한양대학교 산학협력단 Apparatus and method for image encoding/decoding using adaptive intra prediction
KR101911587B1 (ko) * 2015-08-03 2018-10-24 한양대학교 산학협력단 Apparatus and method for image encoding/decoding using adaptive intra prediction
WO2017047897A1 (fr) * 2015-09-15 2017-03-23 디지털인사이트주식회사 Method and apparatus for quantization or HDR masking
US11234003B2 (en) 2016-07-26 2022-01-25 Lg Electronics Inc. Method and apparatus for intra-prediction in image coding system


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090003443A1 (en) * 2007-06-26 2009-01-01 Nokia Corporation Priority-based template matching intra prediction video and image coding

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050067083A (ko) * 2003-12-26 2005-06-30 가부시키가이샤 엔티티 도코모 Image encoding device, image encoding method, image encoding program, image decoding device, image decoding method, and image decoding program
KR20080019294A (ko) * 2005-07-05 2008-03-03 가부시키가이샤 엔티티 도코모 Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, and video decoding program
KR20090008418A (ko) * 2006-04-28 2009-01-21 가부시키가이샤 엔티티 도코모 Image predictive encoding device, image predictive encoding method, image predictive encoding program, image predictive decoding device, image predictive decoding method, and image predictive decoding program

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017201141A1 (fr) * 2016-05-17 2017-11-23 Arris Enterprises Llc Template matching for JVET intra prediction
US9948930B2 (en) 2016-05-17 2018-04-17 Arris Enterprises Llc Template matching for JVET intra prediction
US10375389B2 (en) 2016-05-17 2019-08-06 Arris Enterprises Llc Template matching for JVET intra prediction
US10554971B2 (en) 2016-05-17 2020-02-04 Arris Enterprises Llc Template matching for JVET intra prediction
US11310494B2 (en) 2016-05-17 2022-04-19 Arris Enterprises Llc Template matching for JVET intra prediction
US11659168B2 (en) 2016-05-17 2023-05-23 Arris Enterprises Llc Template matching for JVET intra prediction
US11936856B2 (en) 2016-05-17 2024-03-19 Arris Enterprises Llc Template matching for JVET intra prediction

Also Published As

Publication number Publication date
WO2011068332A3 (fr) 2011-09-15
KR101601854B1 (ko) 2016-03-10
KR20110062748A (ko) 2011-06-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10834747

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 14-09-2012)

122 Ep: pct application non-entry in european phase

Ref document number: 10834747

Country of ref document: EP

Kind code of ref document: A2