WO2011080925A1 - Image encoding apparatus and method (画像符号化装置および方法)
- Publication number: WO2011080925A1
- Application number: PCT/JP2010/007592
- Authority: WO, WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/152—Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
Definitions
- moving images are converted to H.264 images.
- the present invention relates to an image encoding apparatus and method for encoding according to the H.264 standard.
- FIG. 12A is a diagram illustrating the prediction modes of a 4×4 pixel block.
- FIG. 12B is a diagram illustrating the prediction modes of a 16×16 pixel block.
- Nine prediction directions, including average value prediction, are defined for the 4×4 pixel block of the luminance signal.
- As shown in FIG. 12B, four prediction directions are defined for the 16×16 pixel block of the luminance signal and the 8×8 pixel block of the color difference signal, and one direction is selected for each prediction block.
- The amount of prediction direction information required for one macroblock increases or decreases depending on the prediction block size. When the prediction block size decreases, the number of prediction blocks in the macroblock increases, and thus the amount of information increases.
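The relationship between prediction block size and the number of prediction-direction entries per macroblock can be sketched as follows. This is a minimal illustration only: the function name `prediction_blocks_per_mb` is hypothetical, and it counts entries, not the actual number of bits the entropy coder spends on them.

```python
def prediction_blocks_per_mb(block_size, mb_size=16):
    """Number of intra-prediction blocks in one macroblock.

    Each prediction block carries its own prediction direction, so this is
    also the number of direction entries that must be signalled per MB.
    """
    if mb_size % block_size != 0:
        raise ValueError("block size must divide the macroblock size")
    per_side = mb_size // block_size
    return per_side * per_side


for size in (16, 4):
    print(f"{size}x{size} prediction: {prediction_blocks_per_mb(size)} direction entries per MB")
```

With 16×16 prediction a single direction covers the whole macroblock, while 4×4 prediction needs one direction for each of the sixteen sub-blocks.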
- the generated code amount is controlled by controlling the quantization width.
- In order to rapidly suppress the generated code amount, simply increasing the quantization width is sometimes insufficient. It is then also necessary to reduce information other than the image signal (hereinafter referred to as “overhead”).
- Patent Document 1 discloses a method of suppressing the amount of code by reducing the overhead due to the intra prediction encoding.
- the input image has a checkered pattern of white and black for each pixel
- the overhead is reduced but the prediction is not performed at all.
- the difference component becomes very large, and the code amount cannot be suppressed.
- If the quantization width is increased in order to suppress the code amount, significant image quality degradation results.
- An object of the present invention is to provide an image encoding apparatus and method capable of selecting an intra prediction mode.
- The image encoding apparatus performs intra-frame predictive coding on an encoding target macroblock in an input image in units of intra-prediction blocks having a plurality of sizes.
- a feature amount calculation unit that calculates statistical information of the pixel value based on a pixel value of a pixel belonging to the encoding target macroblock in the input image, and based on the calculated statistical information
- the intra-prediction block size is determined according to a predetermined criterion so that the smaller the degree of change in the pixel value in a predetermined direction in the encoding target macroblock, the larger the intra-prediction block size is selected.
- The present invention can be realized not only as an image encoding device, but also as a mobile information terminal or a broadcasting device, and as a method whose steps correspond to the processing units constituting the image encoding device.
- These programs, information, data, and signals may be distributed via a communication network such as the Internet.
- FIG. 1 is a block diagram showing the configuration of the image coding apparatus according to the first embodiment.
- FIG. 2 is a block diagram showing a detailed configuration of the in-plane prediction block size determination unit shown in FIG.
- FIG. 3 is a flowchart for explaining an example of processing for determining the in-plane prediction block size in the first embodiment.
- FIG. 4 is a flowchart for explaining another example of the process of determining the in-plane prediction block size in the first embodiment.
- FIG. 5 is a diagram illustrating changes in pixel values representing edges and gradation.
- FIG. 6 is a conceptual diagram illustrating a method for calculating a difference between pixel values between adjacent pixels in the horizontal direction and the vertical direction.
- FIG. 7 is a flowchart for explaining processing for calculating the threshold value of the luminance dispersion value using the threshold value 0 to the threshold value 3 and the quantization parameter QP.
- FIG. 8 is a diagram showing, for the second embodiment, an example of an image for which the generated code amount can be reduced by performing in-plane prediction in units of 4×4 pixel blocks, because it contains many 4×4 pixel blocks whose luminance variance is smaller than that of the 16×16 pixel macroblock.
- FIG. 9 is a diagram illustrating luminance that is a residual of each pixel when the image in the macroblock illustrated in FIG. 8 is scanned in the horizontal direction at a position of 16 pixels from the top of the macroblock.
- FIG. 10 is a flowchart for explaining an example of processing for determining the in-plane prediction block size in the second embodiment.
- FIG. 11 is a diagram for explaining control of the generated code amount in the buffer simulation of the decoder according to the third embodiment.
- FIG. It is a figure which shows the prediction mode of the in-plane prediction encoding system prescribed
- FIG. 1 is a block diagram showing the configuration of the image coding apparatus according to the first embodiment.
- the image encoding device 100 includes a block feature amount calculation unit 102, an in-plane prediction block size determination unit 103, an encoding unit 105, and a rate control unit 106.
- The encoding unit 105 includes a subtractor 1051, an in-plane prediction unit 1052, an in-plane prediction direction determination unit 1053, a T/Q (Transformation and Quantization) unit 1054, an IQ/IT (Inverse Quantization and Inverse Transformation) unit 1055, an adder 1056, a DBF unit 1057, a frame memory 1058, a peripheral pixel memory 1059, and an entropy encoding unit 1050.
- The image encoding apparatus 100 calculates a block feature amount of the input image 101 acquired from the outside. Using the calculated block feature amount and a control parameter 104 set in an external register or memory by an external input, it determines the block size with which the input image 101 is to be predicted in-plane, performs in-plane prediction on the input image 101 with the determined block size, and outputs a stream 107 obtained by further encoding the result.
- the block feature amount is statistical information of pixel values, and is, for example, a dispersion value of luminance values, an average value, an adjacent pixel difference value sum, an adjacent pixel difference absolute value sum, and a dynamic range.
- Since the main point of the present invention is the processing in in-plane prediction, description of processing units not related to in-plane prediction, for example a processing unit that performs inter-plane prediction, is omitted.
- The in-plane prediction block size of the luminance signal is either 16×16 pixels or 4×4 pixels.
- As the method for determining the in-plane prediction mode, a method that determines the prediction direction after determining the block size is assumed. That is, the 8×8 pixel block size of the luminance signal used in the H.264 High profile is not used.
- the description regarding the encoding process using prediction between screens is omitted.
- An input image 101 corresponding to one screen is sequentially divided into 16×16 pixel rectangular regions (macroblocks, hereinafter referred to as “MB”) in the horizontal direction from the upper left to the lower right of the image. Encoding processing is performed in units of the divided MBs.
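The raster-order division into macroblocks described above might look like the following sketch. It assumes, for simplicity, that the image is a list of rows of luminance values whose width and height are multiples of 16; the function names are hypothetical, and padding of partial macroblocks is not handled.

```python
def macroblock_origins(width, height, mb_size=16):
    """Yield the (x, y) top-left corner of each macroblock in raster order:
    left to right within a row of MBs, then row by row from top to bottom."""
    for y in range(0, height, mb_size):
        for x in range(0, width, mb_size):
            yield x, y


def extract_mb(image, x, y, mb_size=16):
    """Cut one mb_size x mb_size block out of an image stored as a list of rows."""
    return [row[x:x + mb_size] for row in image[y:y + mb_size]]


# Usage: a 32x32 test image whose pixel value encodes its position.
image = [[x + 100 * y for x in range(32)] for y in range(32)]
for x, y in macroblock_origins(32, 32):
    mb = extract_mb(image, x, y)
    print((x, y), len(mb), len(mb[0]))
```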
- the block feature amount calculation unit 102 calculates a feature of the encoding target MB, that is, a block feature amount indicating a tendency of change in pixel value in the MB.
- As the block feature amount, the block feature amount calculation unit 102 calculates and outputs, for example, the average luminance value, the variance value, the dynamic range, and the adjacent pixel difference absolute value sum, in units of MBs (16×16 pixels) or in units of orthogonal transform blocks (4×4 pixels).
- the block feature value calculation unit 102 calculates the block feature value in MB units.
- the average value a ′ of the luminance values is obtained by the following (Equation 1).
- M is the number of pixels in the horizontal direction in the block
- N is the number of pixels in the vertical direction in the block
- i is an integer that increases from 1 in steps of 1 up to N
- j is an integer that increases from 1 in steps of 1 up to M
- a(i, j) is the luminance value of the pixel in row i and column j.
- the block feature quantity calculation unit 102 calculates the average value a ′ of the luminance values of each block by performing the calculation of (Equation 1) for each block.
- Since evaluating such an arithmetic expression on a computer is a well-known technique, description of the specific calculation process is omitted.
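(Equation 1) itself is not reproduced in this text. From the definitions of M, N, i, j and a(i, j) given for it, it is presumably the arithmetic mean over the block, which can be written as:

```latex
a' = \frac{1}{M N} \sum_{i=1}^{N} \sum_{j=1}^{M} a(i, j)
```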
- The luminance variance value S² is obtained by the following (Equation 2).
- M is the number of pixels in the horizontal direction in the block
- N is the number of pixels in the vertical direction in the block
- a′ is the average value of the luminance values
- a(i, j) is the luminance value of the pixel in row i and column j in the block
- i is an integer that increases from 1 in steps of 1 up to N, and j is an integer that increases from 1 in steps of 1 up to M.
- The block feature amount calculation unit 102 performs this calculation for each block to obtain the variance value S² of the luminance values of each block.
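(Equation 2) is likewise missing from this text. Based on the symbols defined for it, a plausible reconstruction is the population variance shown below. The normalization by MN (and not MN − 1) is an assumption.

```latex
S^2 = \frac{1}{M N} \sum_{i=1}^{N} \sum_{j=1}^{M} \bigl( a(i, j) - a' \bigr)^2
```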
- the dynamic range is obtained by the width from the minimum value to the maximum value of the luminance value or the ratio (dB) between the minimum value and the maximum value of the luminance value.
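As a minimal sketch of these block features, the following computes the mean, variance and both forms of dynamic range for a block stored as a list of rows. The function name, the dictionary keys, the population-variance normalization and the 20·log10 convention for the dB ratio are all assumptions made for illustration; the text does not fix them.

```python
import math


def block_features(block):
    """Mean, variance and dynamic range of the luminance values in a block."""
    pixels = [v for row in block for v in row]
    n = len(pixels)
    mean = sum(pixels) / n
    # Population variance (divide by the pixel count); an assumed normalization.
    variance = sum((v - mean) ** 2 for v in pixels) / n
    lo, hi = min(pixels), max(pixels)
    return {
        "mean": mean,
        "variance": variance,
        "range_width": hi - lo,  # width from minimum to maximum
        # Ratio in dB, assuming an amplitude (20*log10) convention.
        "range_db": 20 * math.log10(hi / lo) if lo > 0 else None,
    }


print(block_features([[16, 16], [235, 235]]))
```

A perfectly flat block gives a variance and range width of 0, which is the case the uniformity test in the block size decision looks for.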
- The adjacent pixel difference absolute value sum a_h of luminance values in the horizontal direction (row direction) is obtained by the following (Equation 3), and the adjacent pixel difference absolute value sum a_v of luminance values in the vertical direction (column direction) is obtained by the following (Equation 4).
- K represents the number of pixels on one side of the block
- a(i, j) represents the luminance value of the pixel located in row i and column j in the block.
- In (Equation 3), i is an integer that increases from 1 in steps of 1 up to K, and j is an integer that increases from 1 in steps of 1 up to (K − 1).
- In (Equation 4), i is an integer that increases from 1 in steps of 1 up to (K − 1), and j is an integer that increases from 1 in steps of 1 up to K.
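(Equation 3) and (Equation 4) are not reproduced in this text. From the index ranges above and the adjacent pixel differences a(i, j) − a(i, j + 1) and a(i, j) − a(i + 1, j) used later in the description, they can presumably be written as:

```latex
a_h = \sum_{i=1}^{K} \sum_{j=1}^{K-1} \bigl| a(i, j) - a(i, j+1) \bigr|
\qquad
a_v = \sum_{i=1}^{K-1} \sum_{j=1}^{K} \bigl| a(i, j) - a(i+1, j) \bigr|
```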
- The in-plane prediction block size determination unit 103 uses the block feature amount calculated by the block feature amount calculation unit 102, the control parameter 104 input from the outside, and the rate control information calculated by the rate control unit 106 described later to determine the in-plane prediction block size as either 4×4 or 16×16. Details will be described later.
- The encoding unit 105 performs the encoding process in accordance with the Baseline profile or Main profile of the H.264 standard.
- If the in-plane prediction block size is 16×16, the in-plane prediction direction determination unit 1053 selects one prediction direction per MB from the four prediction directions, mode 0 to mode 3, shown in FIG. 12B. If the in-plane prediction block size is 4×4, it selects one prediction direction for each 4×4 pixel block from the nine prediction directions, mode 0 to mode 8, shown in FIG. 12A. In the 4×4 case there are 16 4×4 pixel blocks per MB, and a prediction direction must be determined for each of them.
- Although a method for determining the prediction direction is not defined here, a general method is, for example, to select the prediction direction that minimizes the sum of absolute differences between the pixel values in the block and the predicted image.
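The "smallest sum of absolute differences" rule can be sketched as follows. This is an illustration, not the patent's implementation: the helper `simple_candidates` builds only three simplified candidate predictions (vertical, horizontal and a rounded-average DC) from hypothetical neighbouring pixels `top` and `left`, whereas the standard defines more modes with their own rules.

```python
def sad(block, predicted):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(b - p)
               for row_b, row_p in zip(block, predicted)
               for b, p in zip(row_b, row_p))


def simple_candidates(top, left):
    """Build simplified candidate predictions from neighbouring pixels.

    top  : pixels directly above the block (one per column)
    left : pixels directly left of the block (one per row)
    """
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)  # rounded average of neighbours
    return {
        "vertical": [list(top) for _ in range(n)],       # copy the row above downwards
        "horizontal": [[left[i]] * n for i in range(n)],  # copy the left column rightwards
        "dc": [[dc] * n for _ in range(n)],
    }


def select_prediction_direction(block, candidates):
    """Return (mode, cost) for the candidate with the smallest SAD."""
    costs = {mode: sad(block, pred) for mode, pred in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


top, left = [10, 20, 30, 40], [10, 10, 10, 10]
block = [list(top) for _ in range(4)]  # every row repeats the row above
print(select_prediction_direction(block, simple_candidates(top, left)))
```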
- The in-plane prediction unit 1052 generates a prediction image from the peripheral pixels of the target block, as shown in FIGS. 12A and 12B, using the in-plane prediction block size determined by the in-plane prediction block size determination unit 103 and the prediction direction determined by the in-plane prediction direction determination unit 1053. The generation method is described in detail in Non-Patent Document 1.
- the subtractor 1051 generates a difference image between the image of the encoding target MB and the prediction image generated by the in-plane prediction unit 1052.
- The T/Q unit 1054 performs an orthogonal transform on the difference image, performs quantization, and calculates the quantization coefficients.
- As the orthogonal transform, for example, the DCT (Discrete Cosine Transform) is used.
- The T/Q unit 1054 calculates the quantization coefficients by quantizing the orthogonal transform coefficients obtained by the orthogonal transform.
- The entropy encoding unit 1050 encodes the quantization coefficients calculated by the T/Q unit 1054 together with side information (also referred to as “additional information”) such as the quantization width (also referred to as the “quantization step”) used for quantization, the block size for in-plane prediction, and the prediction direction, and outputs a stream 107.
- The local decoding process, which reconstructs an image from the quantization coefficients calculated by the T/Q unit 1054, proceeds as follows.
- The IQ/IT unit 1055 performs inverse quantization and inverse orthogonal transform processing on the quantization coefficients to generate a reconstructed difference image.
- Corresponding to the DCT in the T/Q unit 1054, the IQ/IT unit 1055 performs the IDCT (Inverse Discrete Cosine Transform), which is the inverse of the DCT.
- The adder 1056 adds the reconstructed difference image generated by the IQ/IT unit 1055 to the predicted image generated by the in-plane prediction unit 1052, and generates a reconstructed image.
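The transform, quantization and local decoding path described above can be sketched end to end for a single 4×4 block. This is a simplified illustration under stated assumptions: it uses a floating-point orthonormal DCT-II and one uniform quantization step `qstep` for every coefficient, whereas a real H.264 encoder uses an integer approximation of the transform with its own scaling. All function names are hypothetical.

```python
import math

N = 4
# Orthonormal DCT-II basis matrix C, so that coefficients = C * X * C^T.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def transform_and_quantize(residual, qstep):
    """T/Q step: 2-D DCT of the difference image, then uniform quantization."""
    coeff = matmul(matmul(C, residual), transpose(C))
    return [[round(c / qstep) for c in row] for row in coeff]


def dequantize_and_inverse(levels, qstep):
    """IQ/IT step: rescale the quantized levels, then apply the inverse DCT."""
    coeff = [[level * qstep for level in row] for row in levels]
    return matmul(matmul(transpose(C), coeff), C)


def local_decode(block, predicted, qstep):
    """Subtract the prediction, code the residual, and rebuild the block."""
    residual = [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, predicted)]
    levels = transform_and_quantize(residual, qstep)
    rec_residual = dequantize_and_inverse(levels, qstep)
    # Adder: prediction plus reconstructed residual, clipped to the 8-bit range.
    recon = [[min(255, max(0, round(p + r))) for p, r in zip(rp, rr)]
             for rp, rr in zip(predicted, rec_residual)]
    return levels, recon


block = [[100] * N for _ in range(N)]
predicted = [[90] * N for _ in range(N)]
levels, recon = local_decode(block, predicted, qstep=4)
print(levels[0][0], recon[0])
```

A larger `qstep` zeroes more of the small, typically high-frequency, coefficients, which is the flattening effect the description attributes to a coarse quantization width.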
- the DBF unit 1057 performs a deblocking filter process on the reconstructed image to generate a reconstructed filter image.
- in-plane prediction is performed using peripheral pixels of a block for which in-plane prediction is performed, but it is defined that pixels before deblocking filter processing are used. Therefore, the peripheral pixel memory 1059 holds only pixels that can be used for in-plane prediction in the reconstructed image.
- the frame memory 1058 holds the reconstructed filter image generated by the DBF unit 1057 as a reference image when performing inter-screen prediction.
- The rate control unit 106 calculates rate control information, such as the average quantization width, the transition of the generated code amount, and the buffer occupancy status, from the encoding result of the encoding unit 105, and determines the target code amount, the quantization width, and so on for encoding the next input image.
- In the 4×4 in-plane prediction mode, in-plane prediction is performed in units of 4×4 pixel blocks, so prediction can be performed more finely than in units of 16×16 pixel blocks. Since one of nine prediction directions can be selected for each 4×4 pixel block, prediction performance improves and the residual components of the pixel values can be reduced. However, there are 16 4×4 pixel blocks per MB, and information indicating the prediction direction for each block must be embedded in the stream.
- In the 16×16 in-plane prediction mode, in-plane prediction is performed in units of 16×16 pixel blocks, so it suits MBs whose residual components stay small, such as MBs with uniform pixel values or MBs with gradation or horizontal or vertical edges.
- Here, a uniform pixel value means that there is little variation in the pixel values and the range of change is small, that is, the image is flat.
- Unlike 4×4 in-plane prediction, 16×16 in-plane prediction requires only one piece of prediction direction information per MB, which has the advantage that overhead can be reduced.
- As the quantization width becomes larger (coarser), the high-frequency components of the orthogonal transform coefficients are removed, that is, the pixel values of the difference image are flattened. Selecting the 16×16 in-plane prediction mode, which has less overhead, is then advantageous from the viewpoint of code amount suppression.
- For an image in which the pixel values in the MB form a checkered pattern of white and black, selecting the 4×4 in-plane prediction mode suppresses the code amount better than selecting the 16×16 in-plane prediction mode, because it allows a prediction mode that reduces the residual component to be chosen.
- FIG. 2 is a block diagram showing a detailed configuration of the in-plane prediction block size determination unit shown in FIG.
- the in-plane prediction block size determination unit 103 includes a block size determination unit 1031 and a parameter adjustment unit 1032.
- The block size determination unit 1031 determines the prediction block size by comparing a feature amount, for example the luminance variance value, with a threshold value.
- FIG. 3 is a flowchart for explaining an example of processing for determining the in-plane prediction block size in the present embodiment.
- FIG. 3 illustrates the simplest example, in which whether or not the luminance of the MB is uniform is determined by comparing the luminance variance value with a threshold value, and the prediction block size is determined according to the result.
- FIG. 4 is a flowchart for explaining another example of the process of determining the in-plane prediction block size in the present embodiment.
- the block size determination unit 1031 first determines whether the MB luminance is uniform (S401). Note that the determination method of the block size determination unit 1031 in S401 is the same as S301 in FIG.
- The block size determination unit 1031 compares the MB luminance variance value calculated by the block feature amount calculation unit 102 with a threshold value. If the luminance variance value is less than or equal to the threshold value, it judges that the MB luminance is uniform. If the luminance variance value exceeds the threshold value, it judges that the MB luminance is not uniform.
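The uniformity test can be sketched as a single comparison. In the sketch below the threshold value is a hypothetical tuning parameter, the variance uses a population normalization (an assumption), and `choose_block_size_simple` assumes that a non-uniform MB falls back to 4×4 in the simplest flow, which the text implies but does not spell out at this point.

```python
def luminance_variance(mb):
    """Population variance of the luminance values in a macroblock."""
    pixels = [v for row in mb for v in row]
    mean = sum(pixels) / len(pixels)
    return sum((v - mean) ** 2 for v in pixels) / len(pixels)


def is_uniform(mb, threshold):
    """Uniform when the variance is less than or equal to the threshold."""
    return luminance_variance(mb) <= threshold


def choose_block_size_simple(mb, threshold):
    """Simplest decision: 16x16 for uniform MBs, otherwise 4x4 (assumed)."""
    return 16 if is_uniform(mb, threshold) else 4


flat_mb = [[128] * 16 for _ in range(16)]
checker_mb = [[16 if (x + y) % 2 == 0 else 235 for x in range(16)] for y in range(16)]
print(choose_block_size_simple(flat_mb, threshold=100.0))
print(choose_block_size_simple(checker_mb, threshold=100.0))
```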
- If the block size determination unit 1031 determines that the luminance of the MB is uniform (Yes in S401), it selects a 16×16 pixel block as the prediction block size (S404). Otherwise (No in S401), it further determines whether the MB is a gradation (S402). The determination of gradation will be described in detail later.
- If the block size determination unit 1031 determines that the MB is a gradation (Yes in S402), it selects a 16×16 pixel block as the prediction block size (S404). Otherwise (No in S402), it further determines whether or not there is a horizontal or vertical edge in the MB (S403). The determination of the presence or absence of an edge will be described in detail later.
- If the block size determination unit 1031 determines that there is a horizontal or vertical edge in the MB (Yes in S403), it selects a 16×16 pixel block as the prediction block size (S404). Otherwise (No in S403), it selects a 4×4 pixel block as the prediction block size (S405).
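The S401 to S405 decision cascade can be summarized in a few lines. In this sketch the three tests are passed in as caller-supplied predicate functions (hypothetical parameters), so the cascade itself stays independent of how uniformity, gradation and edges are actually detected.

```python
def choose_block_size(mb, is_uniform, is_gradation, has_hv_edge):
    """Return 16 or 4 following the S401 -> S402 -> S403 order.

    Each predicate takes the macroblock and returns True or False.
    Later tests are only evaluated when the earlier ones answer No.
    """
    if is_uniform(mb):      # S401
        return 16           # S404
    if is_gradation(mb):    # S402
        return 16           # S404
    if has_hv_edge(mb):     # S403
        return 16           # S404
    return 4                # S405


# Usage with stand-in predicates that ignore the macroblock contents.
yes = lambda mb: True
no = lambda mb: False
print(choose_block_size(None, no, no, yes))
print(choose_block_size(None, no, no, no))
```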
- FIG. 5 is a diagram showing changes in pixel values representing edges and gradation.
- FIG. 5A is a diagram showing an example of an edge as a change in luminance, using a 3×3 pixel block as an example.
- The left side of FIG. 5A shows the luminance of each pixel in the block as a numerical value.
- It is assumed that luminance takes a value between 16 and 235, as with the 8-bit precision of the BT.709 standard.
- The right side of FIG. 5A represents the change in the luminance of each pixel in the block shown on the left side.
- the luminance value changes abruptly between an adjacent pixel and a pixel corresponding to the edge in a direction perpendicular to the edge direction.
- adjacent pixels have similar luminance values in the edge direction.
- The luminance value of the pixels in the second column changes abruptly relative to the pixels in the first column and is close to the maximum luminance value.
- The luminance values of the pixels in the second column are almost the same in the column direction. Therefore, it can be seen that the pixels in the second column correspond to an edge.
- FIG. 5A shows an example in which the edge running in the vertical direction has a width of one pixel, but even if the third column had the same values as the second column, the pixels in the second column would still correspond to the edge.
- Edge features can be detected using the adjacent pixel difference absolute value sums in the vertical and horizontal directions calculated by the block feature amount calculation unit 102. That is, as shown on the right of FIG. 5A, if there is an edge in the vertical direction, the adjacent pixel difference absolute value sum a_v of luminance values in the vertical direction (column direction) becomes a value close to 0, while the adjacent pixel difference absolute value sum a_h of luminance values in the horizontal direction (row direction) tends to be very large.
- Accordingly, whether or not there is an edge in the vertical direction can be determined by checking whether the vertical sum a_v is equal to or smaller than a predetermined threshold Th_v(a_v) and whether the horizontal sum a_h exceeds a predetermined threshold Th_v(a_h).
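A minimal sketch of this vertical-edge test follows. The threshold arguments `th_av` and `th_ah` stand in for the predetermined thresholds in the text, and the 3×3 luminance values are made up for illustration; they are not the values shown in FIG. 5A.

```python
def adjacent_abs_diff_sums(block):
    """Return (a_h, a_v): sums of absolute adjacent-pixel differences
    in the horizontal (row) and vertical (column) directions."""
    rows, cols = len(block), len(block[0])
    a_h = sum(abs(block[i][j] - block[i][j + 1])
              for i in range(rows) for j in range(cols - 1))
    a_v = sum(abs(block[i][j] - block[i + 1][j])
              for i in range(rows - 1) for j in range(cols))
    return a_h, a_v


def has_vertical_edge(block, th_av, th_ah):
    """Vertical edge: almost no change down the columns (a_v small)
    and a large change along the rows (a_h large)."""
    a_h, a_v = adjacent_abs_diff_sums(block)
    return a_v <= th_av and a_h > th_ah


# Hypothetical block with a bright one-pixel-wide column in the middle.
edge_block = [[16, 230, 18],
              [17, 231, 16],
              [16, 229, 17]]
print(adjacent_abs_diff_sums(edge_block))
print(has_vertical_edge(edge_block, th_av=10, th_ah=500))
```

A horizontal edge can be tested the same way with the roles of a_h and a_v swapped.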
- FIG. 5B is a diagram showing an example of gradation as a change in luminance, taking a 3×3 pixel block as an example.
- The left side of FIG. 5B shows the luminance of each pixel in the block as a numerical value.
- The right side of FIG. 5B represents the change in the luminance of each pixel in the block shown on the left side.
- the luminance value (which may be a color difference) tends to gradually increase or decrease gradually in one direction.
- It can be seen that the luminance value of each pixel hardly changes in the vertical direction and gradually increases toward the right in the horizontal direction.
- Such edge and gradation features can be detected using the adjacent pixel difference absolute value sums in the vertical and horizontal directions calculated by the block feature amount calculation unit 102, or using the intermediate values that the block feature amount calculation unit 102 obtains for each MB while calculating those sums.
- FIG. 6 is a conceptual diagram illustrating a method of calculating a difference in pixel values between adjacent pixels in the horizontal direction and the vertical direction.
- FIG. 6A shows that a difference in pixel value is calculated between adjacent pixels in the horizontal direction (row direction) indicated by an arrow, and FIG. 6B shows that a difference in pixel value is calculated between adjacent pixels in the vertical direction (column direction) indicated by an arrow.
- The adjacent pixel difference in the horizontal direction for luminance is calculated for each row in the block.
- The difference in luminance value between adjacent pixels in the horizontal direction is represented by {a(i, j) − a(i, j + 1)}.
- The adjacent pixel difference in the vertical direction is calculated for each column in the block.
- The difference in luminance value between adjacent pixels in the vertical direction is represented by {a(i, j) − a(i + 1, j)}.
- In a gradation, the difference in luminance value between adjacent pixels is uniformly close to 0 in one direction, and in the direction perpendicular to it, the magnitude of the difference tends to be a substantially constant value that is less than or equal to a threshold value. Therefore, the block size determination unit 1031 detects whether or not there is a direction in which the difference in luminance value between adjacent pixels calculated by the block feature amount calculation unit 102 is close to zero. That is, the block size determination unit 1031 detects whether or not the degree of change in pixel value between adjacent pixels is small in the vertical or horizontal direction.
- Furthermore, by determining whether the sign of the luminance value difference between adjacent pixels is constant in the direction perpendicular to the detected direction, or whether the luminance value difference between adjacent pixels lies within a predetermined width, that is, is equal to or less than the predetermined threshold value, it is possible to determine whether there is a gradation in the vertical direction or the horizontal direction.
- Here, as an example of gradation, a case is shown where the luminance value of each pixel in the block changes linearly in the horizontal direction, that is, with a constant slope, but gradation is not limited to this. The change may instead follow a quadratic curve or a cubic curve. Further, although gradation is determined here using the difference in luminance value between adjacent pixels, the degree of change in luminance value in each direction in the block may be calculated using the first derivative. Gradation may also be detected using a conventional technique in graphics processing.
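The gradation check described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names, the two thresholds `flat_th` and `step_th`, and the sample pixel values are assumptions, and the sketch requires both a constant sign and a bounded step size, which is stricter than the either/or wording of the text.

```python
from typing import List, Optional, Tuple

Block = List[List[int]]


def adjacent_diffs(block: Block) -> Tuple[List[int], List[int]]:
    """Return signed adjacent differences: a(i,j)-a(i,j+1) and a(i,j)-a(i+1,j)."""
    rows, cols = len(block), len(block[0])
    horiz = [block[i][j] - block[i][j + 1]
             for i in range(rows) for j in range(cols - 1)]
    vert = [block[i][j] - block[i + 1][j]
            for i in range(rows - 1) for j in range(cols)]
    return horiz, vert


def _is_gradation(flat: List[int], sloped: List[int],
                  flat_th: int, step_th: int) -> bool:
    # "flat" direction: every difference must be close to zero.
    flat_ok = all(abs(d) <= flat_th for d in flat)
    # Perpendicular direction: constant sign and a bounded step size.
    same_sign = all(d > 0 for d in sloped) or all(d < 0 for d in sloped)
    small_step = all(abs(d) <= step_th for d in sloped)
    return flat_ok and same_sign and small_step


def detect_gradation(block: Block, flat_th: int = 1,
                     step_th: int = 16) -> Optional[str]:
    """Return the direction along which luminance changes, or None."""
    horiz, vert = adjacent_diffs(block)
    if _is_gradation(vert, horiz, flat_th, step_th):
        return "horizontal"  # flat vertically, changing left to right
    if _is_gradation(horiz, vert, flat_th, step_th):
        return "vertical"    # flat horizontally, changing top to bottom
    return None


# Hypothetical 3x3 block: constant in each column, rising to the right.
print(detect_gradation([[10, 20, 30]] * 3))
```

A block with a sharp step, such as rows of `[10, 10, 200]`, is rejected because the differences along the row do not share one sign.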
- the prediction block size is determined based on whether the luminance of the MB is uniform, that is, whether the variation in the luminance value of the MB is equal to or less than the threshold value.
- The prediction block size can also be determined based on whether the MB has a horizontal or vertical gradation and whether there are horizontal or vertical edges in the MB.
- As described above, according to the image coding apparatus 100 of the first embodiment, it is apparent that the configuration consisting of the coding unit 105, the block feature amount calculation unit 102, and the in-plane prediction block size determination unit 103 illustrated in the figure can sufficiently solve the conventional problems of in-plane prediction in units of pixel blocks.
- When the generated code amount is large, in order to make it easier to select a 16×16 pixel block as the prediction block size, the threshold for determining whether or not the luminance of the MB is uniform is set in accordance with the quantization width in units of pictures.
- the parameter adjustment unit 1032 uses the rate control information to adjust the control parameter 104, which is a threshold used for the determination in S301 in FIG. 3 or S401 in FIG. 4, and generates a threshold for the block size determination unit 1031.
- How the threshold value is determined is described below.
- a list of threshold values corresponding to the quantization width is included in the control parameter 104 and is set in units of pictures.
- the parameter adjustment unit 1032 holds in advance a list of threshold values such as the following that fluctuate in association with the quantization width (QP) that is one of the rate control information.
- Threshold 0 to threshold 3 are set so that the threshold becomes larger as the quantization width QP becomes larger, that is, so that threshold 0 < threshold 1 < threshold 2 < threshold 3 is satisfied.
- As a result, the selection rate at which the block size determination unit 1031 selects the 16×16 pixel block can be increased.
- Because the upper limit of QP is 51, the selection rate of the 16×16 pixel block can be increased further by making the increase from threshold 2 to threshold 3 even larger as QP approaches this limit.
- FIG. 7 is a flowchart for explaining processing for calculating the threshold value of the luminance dispersion value using the threshold value 0 to the threshold value 3 and the quantization parameter QP.
- the parameter adjustment unit 1032 acquires the control parameter 104 from an external register or the like, and extracts a list of threshold values included in the control parameter 104 (S701).
- the parameter adjustment unit 1032 acquires the quantization parameter QP from the rate control unit 106 (S702).
- the setting is made (S704). That is, the value “0” is stored in the register for holding the threshold number for identifying the threshold determined in step S704.
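The threshold selection by quantization parameter can be sketched as follows. The QP range boundaries 33, 39, and 45 come from the threshold list given in this document; mapping QP of 33 or less to threshold 0, the concrete threshold values, and the function names are assumptions made for illustration. Returning the 4×4 size directly when the variance exceeds the threshold is also a simplification, since the first embodiment goes on to examine edges and gradation in that case.

```python
from bisect import bisect_left
from typing import Sequence

# Upper bounds of the QP ranges for threshold 0, 1, and 2.
# Threshold 3 applies above the last bound. Using threshold 0 for
# QP <= 33 is an assumption inferred from the listed ranges.
QP_UPPER_BOUNDS = [33, 39, 45]


def select_threshold_index(qp: int) -> int:
    """Map a quantization parameter to a threshold number 0..3."""
    # bisect_left keeps a QP equal to a bound in the lower range,
    # matching ranges of the form "33 < QP <= 39".
    return bisect_left(QP_UPPER_BOUNDS, qp)


def choose_block_size(variance: float, qp: int,
                      thresholds: Sequence[float]) -> int:
    """Return 16 when the MB variance is at or below the QP-dependent threshold."""
    th = thresholds[select_threshold_index(qp)]
    return 16 if variance <= th else 4


# Hypothetical thresholds satisfying threshold 0 < 1 < 2 < 3.
THRESHOLDS = [100.0, 200.0, 400.0, 800.0]
print(choose_block_size(300.0, 30, THRESHOLDS))
print(choose_block_size(300.0, 42, THRESHOLDS))
```

With the same variance of 300, the smaller QP selects the 4×4 size and the larger QP selects the 16×16 size, which is the behavior the threshold list is meant to produce.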
- In the above description, the control parameter 104 includes a list of threshold values that increase as the quantization width increases, but the present invention is not limited to this.
- The list of threshold values may instead be stored in an arbitrary memory, for example a recording medium or an external memory, in a format such as a look-up table that pairs each quantization width range with a threshold value calculated in advance.
- Alternatively, instead of storing a list of threshold values, a weighting coefficient corresponding to the quantization width may be determined in advance, and the threshold value may be calculated using a linear expression or another function with the coefficient corresponding to the quantization width.
- The threshold value is controlled so that it increases as the quantization width increases, but instead of changing the threshold value, the luminance dispersion value may be multiplied by a weighting coefficient.
- The reference for comparison between the threshold value and the variance value (that is, the list of threshold values) is changed as the quantization width increases, but the present invention is not limited to this. For example, the probability that a large prediction block size is selected may be determined in advance according to the variance of the pixel values and the quantization width, and a large prediction block size may be selected with the probability corresponding to the combination of the variance value and the quantization width.
- For example, when a large prediction block size is to be selected with a probability of 70% when QP is 40 or more and 45 or less, a natural number from 1 to 10 is generated at random, and control may be performed so that a large prediction block size is selected if the generated value is 1 or more and 7 or less, and a small prediction block size is selected if the value is 8 or more and 10 or less.
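The random-draw selection in the 70% example can be sketched as follows. The function name and the parameter `large_draws` are hypothetical; only the rule itself (draw a natural number from 1 to 10, choose the large size for 1 to 7) comes from the text.

```python
import random


def pick_block_size(large_draws: int, rng: random.Random) -> int:
    """Select 16 (large) or 4 (small) from a uniform draw in 1..10.

    large_draws is the number of draw values that map to the large
    block size, so large_draws=7 gives a 70% probability.
    """
    draw = rng.randint(1, 10)  # both end points are inclusive
    return 16 if draw <= large_draws else 4


rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = [pick_block_size(7, rng) for _ in range(10000)]
print(sizes.count(16) / len(sizes))  # close to 0.7
```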
- Alternatively, a standard may be set so that the larger the quantization width, the more easily a large prediction block size is selected even for the same variance value. For example, a table in which a prediction block size to be selected is associated with a luminance variance value in an encoding target macroblock is prepared for each predetermined quantization width. In this case, a table is prepared in which a larger prediction block size is assigned to a smaller variance value as the quantization width increases.
- The threshold value may also be changed according to the recording mode or the target code amount, for example. That is, in a recording mode in which the bit rate for recording the encoded data is low, the 16×16 pixel MB size may be made easier to select, thereby reducing the overhead and the generated code amount, and in a recording mode in which the recording bit rate is high, encoding accuracy (resolution) may be improved by making the 4×4 pixel block size easier to select. Likewise, when the target code amount is low, the 16×16 pixel MB size may be made easier to select to reduce the generated code amount, and when the target code amount is high, the 4×4 pixel block size may be made easier to select to raise encoding accuracy.
- In the above description, the prediction block size is determined by detecting both whether there is an edge in the horizontal or vertical direction and whether there is a gradation in the horizontal or vertical direction.
- However, the prediction block size may be determined by detecting only one of the edge and the gradation.
- The MB luminance dispersion value is given above as an example.
- A color difference dispersion value may be used instead, and the sum of absolute adjacent pixel differences of the color difference and the sign of the color difference may also be considered.
- The block size can be determined by judging whether the pixel values (luminance and color difference) are uniform, form a gradation, or contain an edge, as shown in the figure.
- In the first embodiment, a method of determining whether or not the MB luminance is uniform by comparing the luminance dispersion value of the MB with a threshold value, and determining the prediction block size according to the determination result, is given as an example, but the present invention is not limited to this.
- For example, when the MB includes a high-contrast partial image within a background consisting of a flat image, performing in-plane predictive coding with a small block size for each image may reduce the generated code amount more than performing in-plane predictive coding with the 16×16 pixel MB size.
- In the second embodiment, for such a case, the luminance variance value of the entire 16×16 pixel MB is compared with the luminance variance value of each 4×4 pixel block in the MB.
- The number of 4×4 pixel blocks whose variance value is smaller than the variance value of the entire MB is counted, and when the counted number is larger than a certain number, it is determined that the MB is to be in-plane predictive coded with the 4×4 pixel block size.
- A threshold value Th(n) regarding the number of blocks is included in the control parameter 104 and stored in an external memory. Further, the block feature amount calculation unit 102 calculates a block feature amount, here a luminance dispersion value, not only for the MB but also for all 4×4 pixel blocks in the MB.
- The block size determining unit 1031 compares the luminance dispersion value of the MB with the luminance dispersion value of each 4×4 pixel block in the MB, and counts the number of 4×4 pixel blocks whose luminance dispersion value is smaller than the luminance dispersion value of the MB.
- The counted number of 4×4 pixel blocks is compared with the threshold value Th(n) regarding the number of blocks read from the control parameter 104. If the number of 4×4 pixel blocks exceeds the threshold value Th(n), a 4×4 pixel block is selected as the prediction block size; otherwise, a 16×16 pixel block is selected as the prediction block size.
- FIG. 8 shows that when the number of 4×4 pixel blocks having a luminance variance value smaller than that of the 16×16 pixel MB is large, the amount of generated code is reduced by performing in-plane prediction in units of 4×4 pixel blocks.
- The 16×16 pixel MB is considerably larger than the 4×4 pixel block.
- The MB may include a high-contrast image such as a human head or face with a uniform image such as the sky as a background. In such a case, the dispersion value of the luminance of the MB tends to be large due to the influence of the high-contrast image contained in it.
- The luminance is uniform in both the background and the human head image, except in the blocks that include the boundary between the sky and the head image.
- For this reason, the number of 4×4 pixel blocks with a small luminance dispersion value is counted.
- The residual component can be reduced more accurately, and the generated code amount can be reduced, by performing in-plane prediction in units of 4×4 pixel blocks having a small luminance dispersion value rather than in units of MB.
- FIG. 9 is a diagram illustrating the luminance of each pixel on a horizontal line across the head image in the 16×16 pixel MB. The dotted line in the figure shows how, when an image including an edge as shown in FIG. 8 is intra predicted in the horizontal direction with a 16×16 pixel block size, the quantization noise of the edge portion spreads over the entire decoded image of the MB. In FIG. 9, the boundaries between the 4×4 pixel blocks, counted from the left, are indicated by vertical broken lines, and the luminance value of each pixel along the horizontal line is indicated by a solid line.
- The luminance of each pixel shows a constant high value (luminance representing sky blue) from the first pixel to the eleventh pixel in the horizontal direction from the left end.
- From the twelfth pixel, the luminance suddenly becomes a low value (luminance representing the black of the head) and stays constant up to the sixteenth pixel. Therefore, the luminance is constant in the first and second 4×4 pixel blocks from the left, so their variance is low. The third 4×4 pixel block from the left includes an edge, so its luminance variance is large. In the fourth 4×4 pixel block, the luminance is constant again, and the variance is low.
- The prediction residual can be kept low by performing in-plane prediction with a prediction block size of 16×16 pixels.
- The vertical prediction direction of FIG. 12B cannot simply be applied.
- The 4×4 pixel block in the fourth column from the left and the second from the top includes a horizontal edge.
- FIG. 10 is a flowchart for explaining an example of processing for determining the predicted block size in the second embodiment.
- the threshold Th (n) related to the number of blocks is included in the control parameter 104 and stored in the external memory in advance, and the parameter adjustment unit 1032 reads the threshold Th (n) from the external memory.
- the initial value of a register for counting the number of blocks is set to 0.
- the block feature amount calculation unit 102 calculates the variance value of the MB luminance (S901).
- The block size determination unit 1031 calculates a luminance variance value of one 4×4 pixel block in the MB (S902), and determines whether or not the calculated luminance variance value of the 4×4 pixel block is smaller than the dispersion value of the luminance of the MB calculated by the block feature amount calculation unit 102 (S903).
- The block size determination unit 1031 sequentially compares the MB luminance variance value with the luminance variance value of each 4×4 pixel block for all 4×4 pixel blocks in the MB, and counts the number of 4×4 pixel blocks having a smaller luminance dispersion value.
- The block size determination unit 1031 determines whether or not the counted number of blocks is equal to or less than the threshold value Th(n) acquired from the parameter adjustment unit 1032 (S905). If the counted number of blocks is equal to or smaller than the threshold Th(n) (Yes in S905), it is determined to perform in-plane prediction with a 16×16 pixel MB size (S906). If the counted number of blocks exceeds the threshold Th(n) (No in S905), it is determined to perform in-plane prediction with a 4×4 pixel block size (S907).
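The counting procedure of the flowchart can be sketched as follows. The helper names, the value used for Th(n), and the pixel values are assumptions for illustration. The test macroblock follows the luminance profile described for FIG. 9: eleven bright pixels followed by five dark pixels on every row.

```python
from typing import List, Sequence

MB = List[List[int]]


def variance(pixels: Sequence[int]) -> float:
    """Population variance of a sequence of pixel values."""
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def count_low_variance_blocks(mb: MB) -> int:
    """Count 4x4 blocks whose variance is smaller than that of the 16x16 MB."""
    mb_var = variance([p for row in mb for p in row])        # S901
    count = 0
    for by in range(0, 16, 4):
        for bx in range(0, 16, 4):
            block = [mb[y][x]
                     for y in range(by, by + 4)
                     for x in range(bx, bx + 4)]
            if variance(block) < mb_var:                     # S902, S903
                count += 1
    return count


def choose_block_size_by_count(mb: MB, th_n: int) -> int:
    """Return 16 if the count is at or below Th(n) (S905), otherwise 4."""
    return 16 if count_low_variance_blocks(mb) <= th_n else 4


# Hypothetical MB: bright background with a dark region on the right.
edge_mb = [[200] * 11 + [20] * 5 for _ in range(16)]
flat_mb = [[100] * 16 for _ in range(16)]
print(count_low_variance_blocks(edge_mb), choose_block_size_by_count(edge_mb, 8))
print(count_low_variance_blocks(flat_mb), choose_block_size_by_count(flat_mb, 8))
```

In the edge macroblock, every 4×4 block has a lower variance than the whole MB, including the blocks that straddle the edge, so the 4×4 size is chosen. In the flat macroblock no block is strictly lower, so the 16×16 size is kept.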
- When the 16×16 pixel block size is selected by the size determining unit, the feature amount calculating unit further calculates, for all the 4×4 pixel blocks included in the encoding target macroblock, a variance value based on pixel values of pixels belonging to each 4×4 pixel block, and the size determining unit compares the variance value calculated for the 16×16 pixel block with the variance values calculated for the 4×4 pixel blocks, and selectively switches between the 16×16 pixel block and the 4×4 pixel block based on the comparison result.
- Among all the 4×4 pixel blocks included in the encoding target macroblock, the number of 4×4 pixel blocks whose pixel value variance is smaller than the pixel value variance of the encoding target macroblock is counted. If the counted number of 4×4 pixel blocks is equal to or less than a predetermined number, the 16×16 pixel block size can be selected, and otherwise the 4×4 pixel block size can be selected.
- In the latter case, the intra prediction encoding can be performed with the 4×4 pixel prediction block size.
- In this way, the amount of generated code can be accurately reduced by intra prediction encoding, and even when the encoding target macroblock includes an image with strong contrast, noise due to quantization error can be prevented from spreading over the entire decoded image of the encoding target macroblock, so that a decoded image closer to the input image is obtained.
- An offset applied to the luminance dispersion value of the 4×4 pixel block may be adjusted in conjunction with the quantization parameter QP, similarly to the method described in the first embodiment. Specifically, in S902 in FIG. 10, the offset offset(n) adjusted in conjunction with the quantization parameter QP may be added to the luminance dispersion value of the 4×4 pixel block, and in S903 the luminance dispersion value of the 4×4 pixel block with offset(n) added may be compared with the luminance dispersion value of the MB.
- This makes it possible to raise the probability (frequency or ratio) of selecting the 16×16 pixel MB size when the generated code amount is large.
- The MB luminance dispersion value is given above as an example.
- A color difference dispersion value may be used instead, and the sum of absolute adjacent pixel differences of the color difference and the sign of the color difference may also be considered.
- The block size can be determined by judging whether the pixel values (luminance and color difference) are uniform, form a gradation, or contain an edge, as shown in the figure.
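- In the third embodiment, based on the value of a buffer simulation of a decoder model, control is performed to increase the multiplier applied to the threshold as the buffer occupancy of the encoded data on the decoder side approaches the underflow level.
- In this buffer simulation, for example, the code amount actually generated by CAVLC (Context-Adaptive Variable Length Coding) in the picture encoded immediately before is used.
- VBR (Variable Bit Rate) control is adopted for the data transfer control to the buffer by the rate control unit 106.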
- FIG. 11 is a diagram for explaining the control of the generated code amount in the decoder buffer simulation. More specifically, as shown in FIG. 11, the vertical axis indicates the amount of encoded data occupied by the buffer, and the horizontal axis indicates time.
- encoded data read from an external medium, network, or the like is accumulated at a constant bit rate.
- the decoder virtually instantly reads out the encoded data to be decoded, one picture at a time, as shown by the upward arrow in FIG. It should be noted that the reading of the encoded data from the buffer is not actually instantaneously read as shown in the figure, but is simulated as being virtually instantaneously performed.
- the third embodiment suppresses the generated code amount when the remaining amount of encoded data in the buffer reaches the control line so as not to cause underflow. Therefore, the threshold multiplier is increased to facilitate selection of the 16 ⁇ 16 pixel prediction block size.
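The decoder buffer simulation can be sketched as follows. This is a simplified model under stated assumptions: the function name, the bit counts, and the control line value are hypothetical, data arrives at a constant rate for one picture interval before each instantaneous removal, and the buffer has no upper capacity limit.

```python
from typing import List, Sequence, Tuple


def simulate_decoder_buffer(picture_bits: Sequence[float],
                            bitrate: float,
                            fps: float,
                            initial_bits: float,
                            control_line: float) -> List[Tuple[float, bool]]:
    """Return (occupancy, at_or_below_control_line) after each picture is removed."""
    occupancy = initial_bits
    arrival_per_picture = bitrate / fps
    history = []
    for bits in picture_bits:
        occupancy += arrival_per_picture  # constant-rate accumulation
        occupancy -= bits                 # instantaneous removal of one picture
        history.append((occupancy, occupancy <= control_line))
    return history


# Hypothetical run: the third picture is large and drains the buffer.
for occupancy, suppress in simulate_decoder_buffer(
        picture_bits=[100, 100, 400], bitrate=3000, fps=30,
        initial_bits=300, control_line=150):
    print(occupancy, suppress)
```

When the flag becomes true, an encoder following the third embodiment would raise the threshold multiplier so that the 16×16 pixel prediction block size is selected more readily for the next picture.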
- control is performed so that the 16 ⁇ 16 pixel prediction block size is easily selected as the remaining encoded data amount in the buffer decreases, and the threshold multiplier is increased as the remaining encoded data amount decreases.
- The control line may be divided into several stages, and control may be performed to increase the threshold value stage by stage as underflow is approached.
- a list indicating the correspondence between the remaining amount of encoded data in the buffer and the threshold value may be stored in the parameter adjustment unit 1032 as a lookup table or the like.
- A threshold value corresponding to the remaining amount of encoded data in the buffer at the time of in-plane prediction of the target MB may be read from the table and compared with the variance of the luminance values of the target MB, and the prediction block size of the target MB may be determined from the comparison result. Further, by combining the first and third embodiments, thresholds corresponding to both the remaining amount of encoded data in the buffer and the quantization width at that time may be determined in advance. By determining the prediction block size of the target MB using a threshold that is set to be larger as the remaining amount of encoded data in the buffer is smaller, a larger block size becomes easier to select as the remaining amount decreases, so that the generated code amount of the next picture can be suppressed and the remaining amount of encoded data in the buffer can be quickly returned to an appropriate amount.
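The lookup-table variant can be sketched as follows. The table entries, the use of a remaining fraction of the buffer rather than a bit count, and the function names are assumptions for illustration; the only property taken from the text is that the threshold grows as the remaining amount of encoded data shrinks.

```python
from typing import List, Tuple

# (lower bound of remaining fraction, threshold), highest range first.
# All values are hypothetical.
BUFFER_THRESHOLD_TABLE: List[Tuple[float, float]] = [
    (0.50, 100.0),
    (0.25, 200.0),
    (0.10, 400.0),
    (0.00, 800.0),
]


def threshold_for_buffer(remaining_fraction: float) -> float:
    """Return the variance threshold for the current buffer remaining amount."""
    for lower_bound, threshold in BUFFER_THRESHOLD_TABLE:
        if remaining_fraction >= lower_bound:
            return threshold
    return BUFFER_THRESHOLD_TABLE[-1][1]  # below every bound: largest threshold


def block_size_for_buffer(variance: float, remaining_fraction: float) -> int:
    """Return 16 when the variance is at or below the table threshold, else 4."""
    return 16 if variance <= threshold_for_buffer(remaining_fraction) else 4


print(block_size_for_buffer(300.0, 0.8))   # plenty of data left: small blocks
print(block_size_for_buffer(300.0, 0.05))  # near underflow: large blocks
```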
- the threshold value or the increase amount of the threshold value may be increased as the average code amount generated for each picture increases with respect to the target code amount in all the pictures encoded so far.
- Since the threshold or the amount of increase of the threshold increases as the generated code amount for each picture exceeds the target code amount, the 16×16 pixel block size is more easily selected as the prediction block size, which has the effect that the amount of generated code can be reduced with good timing.
- In the above description, the block size in the in-plane prediction is determined based on the actual generated code amount by CAVLC.
- However, the block size may be determined on the basis of the actual generated code amount by CABAC (Context-based Adaptive Binary Arithmetic Coding).
- The present invention does not need to determine the block size based on the actually generated code amount. Instead of the actual generated code amount, the prediction block size may be determined based on the amount of data generated at an intermediate stage of encoding, such as binarized data before arithmetic encoding.
- the generated code amount may be estimated from the binarized data, and the predicted block size may be determined based on the estimated code amount.
- the present invention can also be applied to a case where data transfer is performed by CBR control.
- In CBR control, since overflow must not occur, the prediction block size is adaptively selected so that overflow does not occur. Specifically, as the remaining code amount in the buffer approaches overflow, control is performed, for example, so that the 16×16 pixel prediction block size becomes less likely to be selected by decreasing the threshold value.
- The threshold may also be calculated by increasing the coefficient in the arithmetic expression for calculating the threshold value as the amount of generated code in a predetermined number of pictures encoded immediately before the encoding target picture increases with respect to the target code amount. The threshold value may be controlled using these methods alone or in combination. For example, an average of a threshold value set according to the quantization parameter QP and a threshold value set according to the remaining amount of encoded data in the buffer may be obtained, and the obtained average value may be used as the threshold value.
- In the above description, the threshold value is controlled based on the generated code amount so that the 16×16 pixel prediction block size is easily selected as the generated code amount increases, but the present invention is not limited to this.
- For example, a look-up table showing the correspondence between the variance of the luminance values of the target MB and the prediction block size to be selected for it may be prepared. That is, as the generated code amount in the simulation increases, a table is prepared that is set so that a prediction block size of 16×16 pixels is selected for a lower luminance variance value. Needless to say, the prediction block size corresponding to the variance of the luminance values may be selected with reference to the look-up table corresponding to the generated code amount.
- the first embodiment, the second embodiment, and the third embodiment may be implemented in any combination as long as they do not contradict each other. That is, the present invention is not limited to the above-described embodiments, and various improvements and modifications can be made without departing from the scope of the present invention.
- each functional block in the block diagrams is typically realized as an LSI that is an integrated circuit. These may be individually made into one chip, or may be made into one chip so as to include a part or all of them. For example, the functional blocks other than the memory may be integrated into one chip.
- The term LSI is used here, but it may be called IC, system LSI, super LSI, or ultra LSI depending on the degree of integration.
- the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor.
- An FPGA (Field Programmable Gate Array), or a reconfigurable processor that can reconfigure the connection and setting of circuit cells inside the LSI, may be used.
- only the means for storing the data to be encoded or decoded may be configured separately instead of being integrated into one chip.
- The image encoding apparatus and method according to the present invention can encode moving image data in the H.264 format.
- They are useful in broadcast devices, recording devices, portable information terminals, and the like, in which video data encoded in the H.264 standard Main profile or Baseline profile is broadcast, uploaded to a server device on a network, or recorded.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
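FIG. 1 is a block diagram showing the configuration of the image coding apparatus according to the first embodiment. As shown in FIG. 1, the image coding apparatus 100 includes a block feature amount calculation unit 102, an in-plane prediction block size determination unit 103, a coding unit 105, and a rate control unit 106. The coding unit 105 internally includes a subtractor 1051, an in-plane prediction unit 1052, an in-plane prediction direction determination unit 1053, a T/Q (Transformation and Quantization) unit 1054, an IQ/IT (Inverse Quantization and Inverse Transformation) unit 1055, an adder 1056, a DBF (Deblocking Filter) unit 1057, a frame memory 1058, a peripheral pixel memory 1059, and an entropy coding unit 1050. The image coding apparatus 100 calculates a block feature amount of an input image 101 acquired from outside, uses the calculated block feature amount and a control parameter 104 set in an external register or memory by external input to determine the block size with which the input image 101 is to be in-plane predicted, performs in-plane prediction on the input image 101 with the determined block size, and outputs a stream 107 obtained by further encoding. Here, the block feature amount is statistical information on pixel values, for example, the variance, the average, the sum of adjacent pixel differences, the sum of absolute adjacent pixel differences, and the dynamic range of the luminance values. Because the focus of the present invention is the processing in in-plane prediction, FIG. 1 omits the configuration of processing units unrelated to in-plane prediction, such as a processing unit that performs inter-plane prediction.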
33 < QP ≤ 39 ... threshold 1
39 < QP ≤ 45 ... threshold 2
45 < QP ... threshold 3
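As described above, by setting threshold 0 to threshold 3 in the list so that the threshold becomes larger as the quantization width QP becomes larger, that is, so that threshold 0 < threshold 1 < threshold 2 < threshold 3, the rate at which the block size determination unit 1031 selects the 16×16 pixel block can be increased. In addition, because the upper limit of QP is 51, the selection rate of the 16×16 pixel block can be increased further by making the increase from threshold 2 to threshold 3 even larger as QP approaches this limit.
In the first embodiment, a method of determining whether or not the luminance of the MB is uniform by comparing the luminance variance value of the MB with a threshold, and determining the prediction block size according to the determination result, was given as an example, but the present invention is not limited to this. For example, when the MB contains a high-contrast partial image within a background consisting of a flat image, performing in-plane predictive coding with a small block size for each image may reduce the generated code amount more than performing in-plane predictive coding with the 16×16 pixel MB size. In the second embodiment, for such a case, the luminance variance value of the entire 16×16 pixel MB is compared with the luminance variance value of each 4×4 pixel block in the MB, the number of 4×4 pixel blocks whose variance value is smaller than that of the entire 16×16 pixel MB is counted, and when the counted number is larger than a certain number, it is determined that the MB is to be in-plane predictive coded with the 4×4 pixel block size.
In the first embodiment described above, a method of controlling the size of the in-plane prediction block, which is the unit of in-plane predictive coding, according to the quantization parameter was described. Specifically, the first embodiment took as an example a scheme that controls the threshold in conjunction with the quantization parameter QP so that an in-plane prediction block of a large block size becomes easier to select as the quantization parameter increases, and the second embodiment took as an example a scheme that controls an offset applied to the luminance variance value of the 4×4 pixel block, but the present invention is not limited to these. In the third embodiment, based on the value of a buffer simulation of a decoder model, control is performed to increase the multiplier applied to the threshold as the buffer occupancy of encoded data on the decoder side approaches the underflow level. In this buffer simulation, for example, the code amount actually generated by CAVLC (Context-Adaptive Variable Length Coding) in the picture encoded immediately before is used. VBR (Variable Bit Rate) control is adopted for the data transfer control to the buffer by the rate control unit 106.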
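101 Input image
102 Block feature amount calculation unit
103 In-plane prediction block size determination unit
104 Control parameter
105 Coding unit
106 Rate control unit
107 Stream
1031 Block size determination unit
1032 Parameter adjustment unit
1050 Entropy coding unit
1051 Subtractor
1052 In-plane prediction unit
1053 In-plane prediction direction determination unit
1054 T/Q unit
1055 IQ/IT unit
1056 Adder
1057 DBF unit
1058 Frame memory
1059 Peripheral pixel memory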
Claims (9)
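- An image coding apparatus that performs in-plane predictive coding on an encoding target macroblock in an input image in units of in-plane prediction blocks having a plurality of sizes, the apparatus comprising:
a feature amount calculation unit that calculates statistical information on pixel values based on the pixel values of pixels belonging to the encoding target macroblock in the input image;
a size determination unit that determines, based on the calculated statistical information, the in-plane prediction block size in accordance with a predetermined criterion such that a larger in-plane prediction block size is selected as the degree of change in pixel value in a predetermined direction within the encoding target macroblock is smaller; and
a coding unit that performs in-plane predictive coding on the encoding target macroblock in units of in-plane prediction blocks of the determined size.
- The image coding apparatus according to claim 1, wherein the feature amount calculation unit calculates, as the statistical information, a variance value of the pixel values of the pixels belonging to the encoding target macroblock, and
the size determination unit selects a 16×16 pixel block size as the in-plane prediction block size when the variance value is judged to be small according to the criterion.
- The image coding apparatus according to claim 2, wherein the feature amount calculation unit further calculates, for each pixel belonging to the encoding target macroblock, an absolute difference from an adjacent pixel value based on the pixel values of the pixels belonging to the encoding target macroblock, and calculates the total of the calculated sums of absolute differences as the statistical information, and
the size determination unit selects, when the variance value is judged to be large according to the criterion, whether or not to set the in-plane prediction block size to the 16×16 pixel block size based on the total.
- The image coding apparatus according to claim 2 or 3, wherein the feature amount calculation unit calculates, for each pixel belonging to the encoding target macroblock, an adjacent difference value from an adjacent pixel value based on the pixel values of the pixels belonging to the encoding target macroblock, and
the size determination unit selects, when the variance value is judged to be large according to the criterion, whether or not to set the in-plane prediction block size to the 16×16 pixel block size based on the adjacent difference values.
- The image coding apparatus according to any one of claims 1 to 4, further comprising a rate control unit that generates rate control information for controlling the generated code amount of the output signal output from the coding unit, wherein
the size determination unit determines a degree of suppression of the generated code amount from the rate control information generated by the rate control unit and, in conjunction with the determined degree of suppression, controls the criterion so that a large in-plane prediction block size is more easily selected for the same statistical information.
- The image coding apparatus according to claim 5, wherein the size determination unit controls the criterion so that, for the same statistical information, a large in-plane prediction block size is more easily selected as the quantization width indicated by the rate control information generated by the rate control unit is larger.
- The image coding apparatus according to claim 5, wherein the size determination unit controls the criterion, in conjunction with the buffer occupancy of encoded data in a decode simulation buffer generated as the rate control information by the rate control unit, so that a large in-plane prediction block size is more easily selected for the same statistical information as the buffer occupancy approaches underflow.
- The image coding apparatus according to claim 1 or 2, wherein, when the 16×16 pixel block size is selected by the size determination unit, the feature amount calculation unit further calculates, for every 4×4 pixel block included in the encoding target macroblock, a variance value based on the pixel values of the pixels belonging to the 4×4 pixel block, and
the size determination unit compares the variance value calculated for the 16×16 pixel block with the variance value calculated for each 4×4 pixel block, and selectively switches between the 16×16 pixel block size and the 4×4 pixel block size based on the result of the comparison.
- An image coding method for performing in-plane predictive coding on an encoding target macroblock in an input image in units of in-plane prediction blocks having a plurality of sizes, the method comprising:
calculating statistical information on pixel values based on the pixel values of pixels belonging to the encoding target macroblock in the input image;
determining, based on the calculated statistical information, the in-plane prediction block size in accordance with a predetermined criterion such that a larger in-plane prediction block size is selected as the degree of change in pixel value in a predetermined direction within the encoding target macroblock is smaller; and
performing in-plane predictive coding on the encoding target macroblock in units of in-plane prediction blocks of the determined size.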
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011547337A JP5470405B2 (ja) | 2009-12-28 | 2010-12-28 | Image coding apparatus and method |
US13/148,111 US9369720B2 (en) | 2009-12-28 | 2010-12-28 | Image coding apparatus and image coding method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009298928 | 2009-12-28 | ||
JP2009-298928 | 2009-12-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011080925A1 true WO2011080925A1 (ja) | 2011-07-07 |
Family
ID=44226349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/007592 WO2011080925A1 (ja) | Image coding apparatus and method | 2009-12-28 | 2010-12-28 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9369720B2 (ja) |
JP (1) | JP5470405B2 (ja) |
WO (1) | WO2011080925A1 (ja) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013258501A (ja) * | 2012-06-11 | 2013-12-26 | Canon Inc | Image processing apparatus and image processing method |
JP2018093386A (ja) * | 2016-12-05 | 2018-06-14 | キヤノン株式会社 | Encoding apparatus and encoding method |
JP2019102861A (ja) * | 2017-11-29 | 2019-06-24 | 富士通株式会社 | Video encoding apparatus, video encoding method, and video encoding program |
WO2021117091A1 (ja) * | 2019-12-09 | 2021-06-17 | 日本電信電話株式会社 | Encoding method, encoding apparatus, and program |
WO2023005830A1 (zh) * | 2021-07-29 | 2023-02-02 | 维沃移动通信有限公司 | Predictive coding method, apparatus, and electronic device |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10897625B2 (en) * | 2009-11-20 | 2021-01-19 | Texas Instruments Incorporated | Block artifact suppression in video coding |
US8483272B2 (en) * | 2010-09-24 | 2013-07-09 | Intel Corporation | System and method for frame level bit rate control without pre-analysis |
JP2012169762A (ja) | 2011-02-10 | 2012-09-06 | Sony Corp | Image encoding apparatus, image decoding apparatus, and methods and programs therefor |
JP2012251785A (ja) * | 2011-05-31 | 2012-12-20 | Nuflare Technology Inc | Inspection apparatus and inspection method |
US9241167B2 (en) * | 2012-02-17 | 2016-01-19 | Microsoft Technology Licensing, Llc | Metadata assisted video decoding |
IN2013MU01146A (ja) * | 2013-03-26 | 2015-04-24 | Tektronix Inc | |
US10003792B2 (en) * | 2013-05-27 | 2018-06-19 | Microsoft Technology Licensing, Llc | Video encoder for images |
US20150016509A1 (en) * | 2013-07-09 | 2015-01-15 | Magnum Semiconductor, Inc. | Apparatuses and methods for adjusting a quantization parameter to improve subjective quality |
WO2015015681A1 (ja) * | 2013-07-31 | 2015-02-05 | Panasonic Intellectual Property Corporation of America | Image coding method and image coding apparatus |
KR102169610B1 (ko) * | 2013-08-21 | 2020-10-23 | Samsung Electronics Co., Ltd. | Method and apparatus for determining intra prediction mode |
US10136140B2 (en) | 2014-03-17 | 2018-11-20 | Microsoft Technology Licensing, Llc | Encoder-side decisions for screen content encoding |
JP6652068B2 (ja) * | 2015-01-19 | 2020-02-19 | NEC Corporation | Video encoding apparatus, video encoding method, and video encoding program |
US10924743B2 (en) | 2015-02-06 | 2021-02-16 | Microsoft Technology Licensing, Llc | Skipping evaluation stages during media encoding |
US10038917B2 (en) | 2015-06-12 | 2018-07-31 | Microsoft Technology Licensing, Llc | Search strategies for intra-picture prediction modes |
US10136132B2 (en) | 2015-07-21 | 2018-11-20 | Microsoft Technology Licensing, Llc | Adaptive skip or zero block detection combined with transform size decision |
US9955186B2 (en) * | 2016-01-11 | 2018-04-24 | Qualcomm Incorporated | Block size decision for video coding |
KR102287414B1 (ko) * | 2016-12-23 | 2021-08-06 | Huawei Technologies Co., Ltd. | Low-complexity mixed-domain collaborative in-loop filter for lossy video coding |
US10924741B2 (en) * | 2019-04-15 | 2021-02-16 | Novatek Microelectronics Corp. | Method of determining quantization parameters |
WO2021191499A1 (en) * | 2020-03-26 | 2021-09-30 | Nokia Technologies Oy | A method, an apparatus and a computer program product for video encoding and video decoding |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007201558A (ja) * | 2006-01-23 | 2007-08-09 | Matsushita Electric Ind Co Ltd | Video encoding apparatus and video encoding method |
JP2008022405A (ja) * | 2006-07-14 | 2008-01-31 | Sony Corp | Image processing apparatus and method, and program |
WO2008044658A1 (en) * | 2006-10-10 | 2008-04-17 | Nippon Telegraph And Telephone Corporation | Intra prediction encoding control method and device, its program, and storage medium containing program |
JP2009232324A (ja) * | 2008-03-25 | 2009-10-08 | Panasonic Corp | Image coding apparatus, image coding method, and image coding program |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05167998A (ja) * | 1991-12-16 | 1993-07-02 | Nippon Telegr & Teleph Corp <Ntt> | Image coding control processing method |
US6870884B1 (en) * | 1992-01-29 | 2005-03-22 | Mitsubishi Denki Kabushiki Kaisha | High-efficiency encoder and video information recording/reproducing apparatus |
EP1404136B1 (en) * | 2001-06-29 | 2018-04-04 | NTT DoCoMo, Inc. | Image encoder, image decoder, image encoding method, and image decoding method |
JP2003319391A (ja) * | 2002-04-26 | 2003-11-07 | Sony Corp | Encoding apparatus and method, decoding apparatus and method, recording medium, and program |
US9237347B2 (en) * | 2005-03-21 | 2016-01-12 | The Regents Of The University Of California | Systems and methods for video compression for low bit rate and low latency video communications |
US8000390B2 (en) * | 2006-04-28 | 2011-08-16 | Sharp Laboratories Of America, Inc. | Methods and systems for efficient prediction-mode selection |
US7653130B2 (en) * | 2006-12-27 | 2010-01-26 | General Instrument Corporation | Method and apparatus for bit rate reduction in video telephony |
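JP2008263529A (ja) * | 2007-04-13 | 2008-10-30 | Sony Corp | Encoding apparatus, encoding method, program for the encoding method, and recording medium storing the program for the encoding method |
KR100905059B1 (ko) * | 2007-08-16 | 2009-06-30 | Electronics and Telecommunications Research Institute | Method and apparatus for block mode decision using prediction of bit generation possibility in video coding |
KR100952340B1 (ko) | 2008-01-24 | 2010-04-09 | SK Telecom Co., Ltd. | Method and apparatus for determining coding mode using spatio-temporal complexity |
KR20090090152A (ko) * | 2008-02-20 | 2009-08-25 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding an image |
KR20090097688A (ko) * | 2008-03-12 | 2009-09-16 | Samsung Electronics Co., Ltd. | Method and apparatus for intra prediction encoding/decoding of an image |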
US9137545B2 (en) * | 2009-10-21 | 2015-09-15 | Sk Telecom Co., Ltd. | Image encoding and decoding apparatus and method |
2010
- 2010-12-28 US US13/148,111 patent/US9369720B2/en active Active
- 2010-12-28 WO PCT/JP2010/007592 patent/WO2011080925A1/ja active Application Filing
- 2010-12-28 JP JP2011547337A patent/JP5470405B2/ja active Active
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013258501A (ja) * | 2012-06-11 | 2013-12-26 | Canon Inc | Image processing apparatus and image processing method |
US9363432B2 (en) | 2012-06-11 | 2016-06-07 | Canon Kabushiki Kaisha | Image processing apparatus and image processing method |
JP2018093386A (ja) * | 2016-12-05 | 2018-06-14 | Canon Inc | Encoding apparatus and encoding method |
JP2019102861A (ja) * | 2017-11-29 | 2019-06-24 | Fujitsu Ltd | Video encoding apparatus, video encoding method, and video encoding program |
WO2021117091A1 (ja) * | 2019-12-09 | 2021-06-17 | Nippon Telegraph and Telephone Corporation | Encoding method, encoding apparatus, and program |
JPWO2021117091A1 (ja) * | 2019-12-09 | 2021-06-17 | ||
JP7364936B2 (ja) | 2019-12-09 | 2023-10-19 | Nippon Telegraph and Telephone Corporation | Encoding method, encoding apparatus, and program |
WO2023005830A1 (zh) * | 2021-07-29 | 2023-02-02 | Vivo Mobile Communication Co., Ltd. | Predictive coding method, apparatus, and electronic device |
Also Published As
Publication number | Publication date |
---|---|
US20110292998A1 (en) | 2011-12-01 |
JP5470405B2 (ja) | 2014-04-16 |
JPWO2011080925A1 (ja) | 2013-05-09 |
US9369720B2 (en) | 2016-06-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5470405B2 (ja) | Image coding apparatus and method | |
EP3158751B1 (en) | Encoder decisions based on results of hash-based block matching | |
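CN1960495B (zh) | Image encoding device, image encoding method, and integrated circuit device | |
JP5801032B2 (ja) | Method and apparatus for in-loop artifact removal filtering | |
KR100677552B1 (ko) | Loop filtering method and loop filter | |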
EP1675402A1 (en) | Optimisation of a quantisation matrix for image and video coding | |
US7388995B2 (en) | Quantization matrix adjusting method for avoiding underflow of data | |
US20160330468A1 (en) | Image encoding device, image decoding device, encoded stream conversion device, image encoding method, and image decoding method | |
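WO2007055158A1 (ja) | Video encoding method, video decoding method, and apparatus | |
JP4804107B2 (ja) | Image encoding apparatus, image encoding method, and program therefor | |
JPH04196976A (ja) | Image encoding apparatus | |
JP2006519565A (ja) | Video coding | |
KR101394209B1 (ko) | Method for intra prediction coding of an image | |
JP5133290B2 (ja) | Video encoding apparatus and decoding apparatus | |
KR20070110517A (ko) | Encoding apparatus and moving picture recording system including the encoding apparatus | |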
CA2798354A1 (en) | A video encoding bit rate control technique using a quantization statistic threshold to determine whether re-encoding of an encoding-order picture group is required | |
JP4532980B2 (ja) | Video encoding apparatus and method, computer program, and computer-readable storage medium | |
JP2007336468A (ja) | Re-encoding apparatus, re-encoding method, and program | |
US8687910B2 (en) | Image filtering method using pseudo-random number filter and apparatus thereof | |
JP5178616B2 (ja) | Scene change detection apparatus and video recording apparatus | |
JP4911625B2 (ja) | Image processing apparatus and imaging apparatus equipped with the same | |
EP1675405A1 (en) | Optimisation of a quantisation matrix for image and video coding | |
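KR20150096353A (ko) | Image encoding system, decoding system, and method for providing the same | |
JP4857243B2 (ja) | Image encoding apparatus, control method therefor, and computer program | |
JP3599909B2 (ja) | Video encoding apparatus |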
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | Wipo information: entry into national phase | Ref document number: 13148111; Country of ref document: US |
| WWE | Wipo information: entry into national phase | Ref document number: 2011547337; Country of ref document: JP |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10840784; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 10840784; Country of ref document: EP; Kind code of ref document: A1 |