WO2011155332A1 - Image decoding device, image encoding device, and method and program therefor - Google Patents
Image decoding device, image encoding device, and method and program therefor
- Publication number
- WO2011155332A1 (PCT/JP2011/061974, JP2011061974W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- data
- image data
- transform
- prediction
- Prior art date
Classifications
- H04N19/61—Transform coding in combination with predictive coding
- G06T9/004—Image coding: predictors, e.g. intraframe, interframe coding
- G06T9/007—Image coding: transform coding, e.g. discrete cosine transform
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
- H04N19/124—Quantisation
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region that is a block, e.g. a macroblock
- H04N19/184—Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
- H04N19/197—Adaptation specially adapted for the computation of encoding parameters, including determination of the initial value of an encoding parameter
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
- H04N19/46—Embedding additional information in the video signal during the compression process
Definitions
- The present invention relates to an image decoding device, an image encoding device, a method thereof, and a program. Specifically, it provides an image decoding device, an image encoding device, a method thereof, and a program capable of efficient decoding and encoding.
- Image information has come to be handled as digital data. To transmit and store this information efficiently, devices that compress the image data by orthogonal transform and motion compensation, exploiting redundancy unique to image information, in accordance with schemes such as MPEG have become widespread, both for information distribution at broadcasting stations and for information reception in ordinary households.
- MPEG2 (ISO/IEC 13818-2) is a standard that covers both interlaced and progressively scanned images as well as standard-resolution and high-definition images, and it is currently widely used in a broad range of professional and consumer applications. For example, a high-resolution interlaced image of 1920×1088 pixels can be assigned a code amount (bit rate) of 18 to 22 Mbps, achieving a high compression ratio and good image quality.
- MPEG2 was mainly intended for high-quality encoding suitable for broadcasting and did not support coding at a lower code amount (bit rate), that is, a higher compression ratio, than MPEG1. The MPEG4 coding system was standardized to meet this demand, and its image coding part was approved as the international standard ISO/IEC 14496-2 in December 1998.
- H.264 and MPEG-4 Part 10 (hereinafter H.264/AVC) became an international standard. H.264/AVC is based on H.26L and also incorporates functions that are not supported by H.26L.
- Patent Document 1 discloses a technique for encoding image data more efficiently using H.264/AVC.
- A mode-dependent directional transform (MDDT), in which the transform is switched according to the intra prediction mode, has also been proposed.
- an object of the present invention is to provide an image decoding device, an image encoding device, a method thereof, and a program that can improve the encoding efficiency.
- A first aspect of the present invention is an image decoding device that decodes image data from an encoded bit stream generated by orthogonally transforming prediction error data, which is the error between image data and predicted image data, for each transform block and processing the coefficient data after the orthogonal transform. The device includes a data processing unit that processes the encoded bit stream to obtain the orthogonally transformed coefficient data and encoding parameter information, an inverse orthogonal transform unit that obtains the prediction error by inverse orthogonally transforming the coefficient data using a basis set in advance according to the position of the transform block in the macroblock indicated by the encoding parameter information, a predicted image data generation unit that generates the predicted image data, and an addition unit that decodes the image data by adding the predicted image data generated by the predicted image data generation unit to the prediction error obtained by the inverse orthogonal transform unit.
- The inverse orthogonal transform unit performs, for example, an inverse Karhunen-Loeve transform using the set basis.
- The basis used in the inverse orthogonal transform unit corresponds to the inverse of the basis used when the prediction error data was orthogonally transformed for each transform block.
- Such bases are provided in advance, and the prediction error data before the orthogonal transform is restored by selecting and using the basis corresponding to the block position and the like and performing the inverse orthogonal transform.
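- As an illustration of this decoding operation, the following sketch selects a pre-stored basis from the encoding parameter information and applies the inverse KL transform to one block of dequantized coefficients. It is a minimal Python sketch, assuming a hypothetical basis table KL_BASES indexed by (macroblock size, transform block size, block position, prediction mode); the names are illustrative and do not come from the patent.

```python
import numpy as np

# Hypothetical table of pre-learned KL bases (illustrative only).
# Each entry is an orthonormal matrix whose columns are the eigenvectors
# learned offline for that (mb_size, tb_size, position, mode) combination.
KL_BASES = {
    (16, 4, 0, 0): np.eye(16),   # placeholder basis for demonstration
}

def inverse_kl_transform(coeff_block, mb_size, tb_size, position, mode):
    """Restore a prediction-error block from its KL coefficients.

    coeff_block: (tb_size, tb_size) array of dequantized coefficients.
    The basis is selected by the encoding parameter information, so no
    basis has to be transmitted in the bit stream.
    """
    basis = KL_BASES[(mb_size, tb_size, position, mode)]
    # Forward transform was y = B.T @ x (block scanned as a vector),
    # so the inverse is x = B @ y because B is orthonormal.
    x = basis @ coeff_block.reshape(-1)
    return x.reshape(tb_size, tb_size)

# Example: decode one 4x4 block at position 0 of a 16x16 macroblock, mode 0.
prediction_error = inverse_kl_transform(np.zeros((4, 4)), 16, 4, 0, 0)
```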
- Another aspect of the present invention is an image decoding method and a program, executed by a computer, for decoding image data from an encoded bit stream generated by orthogonally transforming prediction error data, which is the error between image data and predicted image data, for each transform block and processing the coefficient data after the orthogonal transform. They include a procedure for obtaining the coefficient data and the encoding parameter information from the encoded bit stream, a procedure for obtaining the prediction error by inverse orthogonally transforming the coefficient data using a basis set in advance according to the position of the transform block in the macroblock indicated by the encoding parameter information, a predicted image data generation procedure for generating the predicted image data, and an addition procedure for decoding the image data by adding the generated predicted image data to the prediction error obtained by the inverse orthogonal transform.
- A further aspect of the present invention is an image encoding device including a predicted image data generation unit that generates predicted image data of the image data, a subtraction unit that generates prediction error data, which is the error between the image data and the predicted image data, an orthogonal transform unit that orthogonally transforms the prediction error for each transform block using a basis set in advance according to the position of the transform block in the macroblock, and a data processing unit that processes output data of the orthogonal transform unit to generate an encoded bit stream.
- In the orthogonal transform unit, a basis set in advance according to the block position of the transform block in the macroblock and the prediction mode used when generating the predicted image data is used, and an orthogonal transform, for example a Karhunen-Loeve transform, is performed.
- Further, a Karhunen-Loeve transform is performed on the block formed from the lowest frequency component coefficients obtained by the orthogonal transform of each transform block. In this transform, a basis set in advance according to the prediction mode is used.
- Each basis is obtained using a plurality of images prepared in advance for basis learning: for each macroblock size, each transform block size, each transform block position in the macroblock, and each prediction mode, the basis consists of the eigenvectors, corresponding to the eigenvalues, of a matrix calculated from the prediction error data of the transform blocks.
- the bases are grouped according to the distance between the bases or the distance from the reference pixel.
- Such a base is provided in advance, and an orthogonal transform is performed by selecting and using a base corresponding to a block position or the like. Further, the coefficient data after the orthogonal transform is subjected to processing such as quantization and lossless encoding, and an encoded bit stream is generated.
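- The basis learning mentioned above can be sketched as follows: for one combination of macroblock size, transform block size, block position, and prediction mode, prediction error blocks are collected from training images and the eigenvectors of a matrix computed from them form the basis. This minimal sketch assumes that matrix is the covariance of the vectorized prediction errors; the function and variable names are illustrative.

```python
import numpy as np

def learn_kl_basis(error_blocks):
    """Learn a KL basis from training prediction-error blocks.

    error_blocks: iterable of (N, N) arrays, the prediction errors observed
    for one (macroblock size, transform block size, block position,
    prediction mode) combination during basis learning.
    Returns an (N*N, N*N) orthonormal matrix whose columns are eigenvectors
    sorted by decreasing eigenvalue.
    """
    vectors = np.stack([b.reshape(-1) for b in error_blocks])  # (samples, N*N)
    # Matrix calculated from the prediction error data (here: covariance).
    cov = vectors.T @ vectors / len(vectors)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # strongest components first
    return eigvecs[:, order]

# Example: learn a basis for 4x4 blocks from random stand-in training data.
rng = np.random.default_rng(0)
basis = learn_kl_basis([rng.standard_normal((4, 4)) for _ in range(500)])
coeffs = basis.T @ rng.standard_normal(16)   # forward KL transform of a block
```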
- A further aspect of the present invention is an image encoding method including a predicted image data generation step of generating predicted image data of the image data, a subtraction step of generating prediction error data, which is the error between the image data and the predicted image data, and an orthogonal transform step of orthogonally transforming the prediction error for each transform block using a basis set in advance according to the position of the transform block in the macroblock.
- The program of the present invention is, for example, a program that can be provided in a computer-readable format to a general-purpose computer system capable of executing various program codes, via a storage medium such as an optical disk, a magnetic disk, or a semiconductor memory, or via a communication medium such as a network.
- In the present invention, the orthogonal transform is performed using a basis set in advance according to the block position of the transform block in the macroblock.
- Encoding parameter information indicating the block position and the like is included in the encoded bit stream.
- The inverse orthogonal transform is performed using a basis set in advance according to the block position in the macroblock indicated by the encoding parameter information, so that the coefficient data after the orthogonal transform can be restored to the prediction error data before the orthogonal transform.
- Since the orthogonal transform and the inverse orthogonal transform are performed using a basis corresponding to the block position in the macroblock, a transform optimized for the block position can be performed and coding efficiency can be improved.
- FIG. 1 shows the configuration of an image encoding device.
- The image encoding device 10 includes an analog/digital conversion unit (A/D conversion unit) 11, a screen rearrangement buffer 12, a subtraction unit 13, an orthogonal transform unit 14, a quantization unit 15, a lossless encoding unit 16, an accumulation buffer 17, and a rate control unit 18.
- The image encoding device 10 further includes an inverse quantization unit 21, an inverse orthogonal transform unit 22, an addition unit 23, a deblocking filter 24, a frame memory 27, an intra prediction unit 31, a motion prediction/compensation unit 32, and a predicted image/optimum mode selection unit 33.
- the A / D converter 11 converts an analog image signal into digital image data and outputs the digital image data to the screen rearrangement buffer 12.
- the screen rearrangement buffer 12 rearranges the frames of the image data output from the A / D conversion unit 11.
- The screen rearrangement buffer 12 rearranges the frames according to the GOP (Group of Pictures) structure used in the encoding process, and outputs the rearranged image data to the subtraction unit 13, the intra prediction unit 31, and the motion prediction/compensation unit 32.
- the subtraction unit 13 is supplied with the image data output from the screen rearrangement buffer 12 and the predicted image data selected by the predicted image / optimum mode selection unit 33 described later.
- The subtraction unit 13 calculates prediction error data, which is the difference between the image data output from the screen rearrangement buffer 12 and the predicted image data supplied from the predicted image/optimum mode selection unit 33, and outputs the prediction error data to the orthogonal transform unit 14.
- the quantization unit 15 is supplied with coefficient data output from the orthogonal transform unit 14 and a rate control signal from a rate control unit 18 described later.
- the quantization unit 15 quantizes the coefficient data and outputs the quantized data to the lossless encoding unit 16 and the inverse quantization unit 21. Further, the quantization unit 15 changes the bit rate of the quantized data by switching the quantization parameter (quantization scale) based on the rate control signal from the rate control unit 18.
- the lossless encoding unit 16 performs lossless encoding on the encoding parameter information and adds it to, for example, header information of the encoded bit stream.
- the quantization unit 15 and the lossless encoding unit 16 correspond to a data processing unit that processes output data of the orthogonal transform unit 14 to generate an encoded bit stream.
- the rate control unit 18 monitors the free capacity of the storage buffer 17, generates a rate control signal according to the free capacity, and outputs it to the quantization unit 15.
- the rate control unit 18 acquires information indicating the free capacity from the accumulation buffer 17, for example.
- When the free capacity is low, the rate control unit 18 lowers the bit rate of the quantized data via the rate control signal.
- When the free capacity is sufficiently large, the rate control unit 18 raises the bit rate of the quantized data via the rate control signal.
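- The following is a minimal sketch of such buffer-based rate control: the quantization parameter is raised when the accumulation buffer is nearly full (lowering the bit rate) and lowered when there is ample free capacity. The thresholds, step size, and 0-51 parameter range are illustrative assumptions rather than values taken from the patent.

```python
def rate_control(qp, buffer_fill, buffer_size, low=0.25, high=0.75,
                 qp_min=0, qp_max=51):
    """Adjust the quantization parameter from the buffer occupancy.

    A fuller buffer (less free capacity) means the bit rate must drop,
    which is achieved by coarser quantization (a larger qp).
    """
    occupancy = buffer_fill / buffer_size
    if occupancy > high:        # little free capacity: reduce bit rate
        qp = min(qp + 1, qp_max)
    elif occupancy < low:       # plenty of free capacity: raise bit rate
        qp = max(qp - 1, qp_min)
    return qp

# Example: the buffer is 90% full, so the next QP is increased by one step.
next_qp = rate_control(qp=30, buffer_fill=900, buffer_size=1000)
```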
- the inverse quantization unit 21 performs an inverse quantization process on the quantized data supplied from the quantization unit 15.
- the inverse quantization unit 21 outputs coefficient data obtained by performing the inverse quantization process to the inverse orthogonal transform unit 22.
- the inverse orthogonal transform unit 22 outputs data obtained by performing an inverse orthogonal transform process on the coefficient data supplied from the inverse quantization unit 21 to the addition unit 23.
- The addition unit 23 adds the data supplied from the inverse orthogonal transform unit 22 and the predicted image data supplied from the predicted image/optimum mode selection unit 33 to generate reference image data, and outputs the reference image data to the deblocking filter 24 and the intra prediction unit 31.
- the deblocking filter 24 performs a filter process for reducing block distortion that occurs during image coding.
- the deblocking filter 24 performs a filtering process for removing block distortion from the reference image data supplied from the adding unit 23, and outputs the filtered reference image data to the frame memory 27.
- the frame memory 27 holds the reference image data after the filtering process supplied from the deblocking filter 24.
- the intra prediction unit 31 performs an intra prediction process using the image data of the encoding target image output from the screen rearrangement buffer 12 and the reference image data supplied from the addition unit 23.
- the intra prediction unit 31 performs an intra prediction process for each transform block size in orthogonal transform and for each prediction mode of intra prediction.
- the intra prediction unit 31 outputs the generated predicted image data to the predicted image / optimum mode selection unit 33.
- the intra prediction unit 31 generates coding parameter information related to the intra prediction process, and outputs the coding parameter information to the lossless coding unit 16 and the predicted image / optimum mode selection unit 33.
- the intra prediction unit 31 includes, for example, the macro block size, the transform block size, the position of the transform block in the macro block, the prediction mode, and the like in the encoding parameter information.
- the intra prediction unit 31 calculates a cost function value in each intra prediction process, and selects an intra prediction process that minimizes the calculated cost function value, that is, an optimal intra prediction process that maximizes the coding efficiency.
- the intra prediction unit 31 outputs the encoding parameter information and the cost value in the optimal intra prediction process, and the predicted image data generated in the optimal intra prediction process to the predicted image / optimum mode selection unit 33.
- the motion prediction / compensation unit 32 performs inter prediction processing with all motion compensation block sizes for the macroblock, generates predicted image data, and outputs the prediction image data to the predicted image / optimum mode selection unit 33.
- The motion prediction/compensation unit 32 detects a motion vector for each motion compensation block size in the encoding target image read from the screen rearrangement buffer 12, using the filtered reference image data read from the frame memory 27. Further, the motion prediction/compensation unit 32 performs motion compensation processing on the reference image based on the detected motion vector to generate predicted image data.
- The motion prediction/compensation unit 32 generates encoding parameter information related to the inter prediction processing, for example, encoding parameter information indicating the macroblock size, the motion compensation block size, the motion vector, and the like, and outputs it to the lossless encoding unit 16 and the predicted image/optimum mode selection unit 33.
- the motion prediction / compensation unit 32 calculates a cost function value for each motion compensation block size, and performs an inter prediction process in which the calculated cost function value is the minimum, that is, an inter prediction process in which the encoding efficiency is the highest. Select.
- the motion prediction / compensation unit 32 outputs the encoding parameter information and cost value in the optimal inter prediction process, and the predicted image data generated in the optimal inter prediction process to the predicted image / optimum mode selection unit 33.
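- As an illustration of the motion vector detection performed here, the sketch below does a brute-force block-matching search that minimizes the sum of absolute differences (SAD) within a small window. The patent does not specify the search strategy; the window size, cost measure, and function names are illustrative assumptions.

```python
import numpy as np

def find_motion_vector(current, reference, bx, by, block, search=8):
    """Full-search block matching around (bx, by) in the reference frame."""
    cur = current[by:by + block, bx:bx + block]
    best_mv, best_sad = (0, 0), float("inf")
    h, w = reference.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue
            cand = reference[y:y + block, x:x + block]
            sad = np.abs(cur.astype(int) - cand.astype(int)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad

# Example with random frames and a 16x16 motion compensation block size.
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))   # shifted copy of the reference
mv, sad = find_motion_vector(cur, ref, bx=16, by=16, block=16)
```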
- When the optimal intra prediction processing selected by the intra prediction unit 31, which performs intra prediction processing for each transform block size and prediction mode, is chosen as the optimal mode, the predicted image/optimum mode selection unit 33 outputs the corresponding encoding parameter information to the orthogonal transform unit 14 and the lossless encoding unit 16, and outputs the predicted image data to the subtraction unit 13.
- Likewise, when the optimal inter prediction processing is chosen as the optimal mode, the predicted image/optimum mode selection unit 33 outputs the encoding parameter information to the orthogonal transform unit 14 and the lossless encoding unit 16, and outputs the predicted image data to the subtraction unit 13.
- FIG. 2 shows the prediction modes for a block of 4×4 pixels, for example.
- each prediction mode in FIG. 2 will be briefly described.
- the arrow indicates the prediction direction.
- In prediction mode 0 (vertical), prediction proceeds from the reference pixels above the block, so the pixels P0 to P3 tend to have smaller prediction errors than the pixels P12 to P15.
- Likewise, in the horizontal prediction mode, the pixels P0, P4, P8, and P12 tend to have smaller prediction errors than the pixels P3, P7, P11, and P15.
- In prediction mode 4 (diagonal down-right), the pixel P0 tends to have a smaller prediction error than the pixel P15.
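- For example, mode 0 (vertical) copies the reconstructed pixels immediately above the block downwards, which is why the top row P0 to P3 matches the source best; mode 1 (horizontal) does the same from the left. A minimal sketch for 4×4 blocks, assuming the neighboring reference pixels have already been decoded:

```python
import numpy as np

def predict_vertical_4x4(top_row):
    """Mode 0 (vertical): each column is filled with the pixel above it."""
    return np.tile(np.asarray(top_row, dtype=np.int32), (4, 1))

def predict_horizontal_4x4(left_col):
    """Mode 1 (horizontal): each row is filled with the pixel to its left."""
    return np.tile(np.asarray(left_col, dtype=np.int32).reshape(4, 1), (1, 4))

# Example: the prediction error is the source block minus the prediction.
block = np.arange(16).reshape(4, 4)
pred = predict_vertical_4x4(top_row=[10, 11, 12, 13])
prediction_error = block - pred
```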
- If the calculated basis were added to the encoded bit stream, the coding efficiency would deteriorate. Therefore, an optimal basis is calculated in advance for each block position and prediction mode at which the orthogonal transform is performed in the macroblock. If these pre-calculated bases are used in the image encoding device and the image decoding device, the bases do not have to be calculated in either device, so the configuration is simpler than when the bases are calculated there. Furthermore, since the bases do not have to be transmitted, the coding efficiency of the KL transform can be exploited. The basis learning will be described later.
- The transform block size, that is, the block size of the encoding target image, is, for example, 16×16 pixels, 8×8 pixels, or 4×4 pixels.
- Alternatively, the transform block size is, for example, 8×8 pixels or 4×4 pixels. Therefore, as shown in FIG. 4, when the macroblock is 16×16 pixels, the orthogonal transform unit 14 is configured to be able to perform transforms with block sizes of 16×16, 8×8, and 4×4 pixels according to the prediction mode.
- The 16×16 KL transform unit 141 performs a KL transform of the prediction error data in units of 16×16-pixel blocks using an optimal basis learned in advance for each prediction mode, and outputs the obtained coefficients to the coefficient selection unit 148.
- The 2×2 KL transform unit 143 performs a KL transform of the block of 2×2 coefficients supplied from the 8×8 KL transform unit 142, using the basis corresponding to the prediction mode among the optimal bases learned in advance for each prediction mode, and outputs the obtained coefficients to the coefficient selection unit 148.
- The 4×4 KL transform unit 144 performs a KL transform of the prediction error data in units of 4×4-pixel blocks using an optimal basis learned in advance for each prediction mode and each block position in the macroblock.
- When the prediction error data corresponds to a block size of 16×16 pixels, the 16×16-pixel block contains sixteen 4×4-pixel blocks. In this case the 4×4 KL transform unit 144 outputs the lowest frequency component coefficient of each 4×4-pixel block to the 4×4 KL transform unit 145 and outputs the other coefficients to the coefficient selection unit 148.
- When the prediction error data corresponds to a block size of 8×8 pixels, the 8×8-pixel block contains four 4×4-pixel blocks. In this case the 4×4 KL transform unit 144 outputs the lowest frequency component coefficient of each 4×4-pixel block to the 2×2 KL transform unit 146 and outputs the other coefficients to the coefficient selection unit 148.
- The 2×2 KL transform unit 146 performs a KL transform on the 2×2 block of lowest frequency component coefficients supplied from the 4×4 KL transform unit 144, using the basis corresponding to the prediction mode among the optimal bases learned in advance for each prediction mode.
- The 2×2 KL transform unit 146 outputs the coefficients obtained by the KL transform to the coefficient selection unit 148.
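- The behavior of the 4×4 KL transform unit 144 can be sketched as a basis lookup keyed by prediction mode and block position, followed by a matrix multiplication. The basis table below is a placeholder (identity matrices); in the described scheme it would hold the bases learned offline for every (prediction mode, block position) pair.

```python
import numpy as np

# Placeholder basis table: identity bases for 9 intra modes x 16 positions.
BASES_4x4 = {(mode, loc): np.eye(16) for mode in range(9) for loc in range(16)}

def kl_transform_4x4(error_block, mode, loc):
    """Forward KL transform of one 4x4 prediction-error block.

    The basis is chosen by the prediction mode and by the block position
    loc of the transform block inside the macroblock.
    """
    basis = BASES_4x4[(mode, loc)]
    return (basis.T @ error_block.reshape(-1)).reshape(4, 4)

# Example: transform the 4x4 block at position loc=5 under prediction mode 0.
coeffs = kl_transform_4x4(np.ones((4, 4)), mode=0, loc=5)
dc = coeffs[0, 0]   # lowest frequency component, fed to the secondary transform
```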
- In step ST12, the screen rearrangement buffer 12 performs image rearrangement.
- the screen rearrangement buffer 12 stores the image data supplied from the A / D conversion unit 11, and rearranges from the display order of each picture to the encoding order.
- In step ST13, the subtraction unit 13 generates prediction error data.
- the subtraction unit 13 calculates a difference between the image data of the images rearranged in step ST12 and the predicted image data selected by the predicted image / optimum mode selection unit 33, and generates prediction error data.
- the prediction error data has a smaller data amount than the original image data. Therefore, the data amount can be compressed as compared with the case where the image is encoded as it is.
- In step ST16, the inverse quantization unit 21 performs an inverse quantization process.
- the inverse quantization unit 21 inversely quantizes the coefficient data quantized by the quantization unit 15 with characteristics corresponding to the characteristics of the quantization unit 15.
- In step ST19, the deblocking filter 24 performs filter processing.
- the deblocking filter 24 filters the reference image data output from the addition unit 23 to remove block distortion.
- In step ST23, the predicted image/optimum mode selection unit 33 performs encoding parameter information generation processing.
- the prediction image / optimum mode selection unit 33 outputs the encoding parameter information regarding the selected prediction image data to the orthogonal transform unit 14 and the lossless encoding unit 16 as the encoding parameter information of the optimal mode.
- In step ST32, the motion prediction/compensation unit 32 performs inter prediction.
- the motion prediction / compensation unit 32 uses the reference image data after filter processing stored in the frame memory 27 to perform inter prediction processing with each motion compensation block size.
- In the inter prediction, inter prediction processing is performed with each motion compensation block size and a cost function value is calculated for each. Then, based on the calculated cost function values, the inter prediction process with the highest coding efficiency is selected.
- the motion prediction / compensation unit 32 calculates a cost function value for each motion compensation block size.
- the motion prediction / compensation unit 32 calculates the cost function value using the above-described equation (1) or equation (2).
- For the code amount used in the cost function, the generated code amount including the encoding parameter information and the like is used.
- The calculation of the cost function value for the inter prediction modes also includes evaluation of the cost function values of the Skip Mode and Direct Mode defined in the H.264/AVC format.
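- The patent refers to equations (1) and (2) for the cost function, which are not reproduced in this excerpt. As a hedged illustration only, a typical rate-distortion cost of the form J = D + λ·R (distortion plus Lagrange-weighted generated code amount, including the encoding parameter information) could be evaluated as follows; the actual equations (1) and (2) may differ.

```python
import numpy as np

def rd_cost(original, reconstructed, generated_bits, lam):
    """Rate-distortion cost J = D + lambda * R.

    D is the sum of squared differences between the original block and the
    locally reconstructed block; R is the generated code amount in bits,
    including the encoding parameter information.
    """
    d = np.sum((original.astype(np.int64) - reconstructed.astype(np.int64)) ** 2)
    return d + lam * generated_bits

def select_best_mode(candidates):
    """candidates: list of (mode_id, cost); pick the minimum-cost mode."""
    return min(candidates, key=lambda c: c[1])

# Example: choose between two hypothetical prediction modes.
orig = np.full((4, 4), 128, dtype=np.uint8)
best = select_best_mode([
    ("intra4x4_mode0", rd_cost(orig, orig - 1, generated_bits=40, lam=10.0)),
    ("intra4x4_mode1", rd_cost(orig, orig - 3, generated_bits=25, lam=10.0)),
])
```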
- In step ST61, the intra prediction unit 31 determines whether or not the macroblock size is 16×16 pixels.
- The intra prediction unit 31 proceeds to step ST62 when the macroblock size is 16×16 pixels, and proceeds to step ST63 when the macroblock size is not 16×16 pixels.
- In step ST73, the orthogonal transform unit 14 determines whether the transform block size is 4×4 pixels.
- The orthogonal transform unit 14 proceeds to step ST74 when the encoding parameter information indicates that the transform block size is 4×4 pixels, and proceeds to step ST75 when it does not.
- The orthogonal transform unit 14 performs 16×16 orthogonal transform processing.
- The orthogonal transform unit 14 performs a KL transform of the 16×16-pixel block using a basis learned in advance according to the prediction mode, and outputs the obtained coefficients to the quantization unit 15. That is, the coefficient selection unit 148 of the orthogonal transform unit 14 illustrated in FIG. 5 selects the coefficients output from the 16×16 KL transform unit 141 and outputs them to the quantization unit 15.
- The orthogonal transform unit 14 determines whether or not the transform block size is 4×4 pixels.
- The orthogonal transform unit 14 proceeds to step ST79 when the encoding parameter information indicates that the transform block size is 4×4 pixels, and proceeds to step ST80 when it does not.
- The orthogonal transform unit 14 performs 4×4 orthogonal transform processing.
- The orthogonal transform unit 14 performs a KL transform for each 4×4-pixel block using a basis learned in advance according to the prediction mode and the block position.
- In this case the KL transform is performed four times, once for each 4×4-pixel block.
- The lowest frequency component coefficient is then selected from the coefficients obtained by the KL transform of each 4×4-pixel block, and a KL transform is performed on the selected 2×2 coefficients using the basis corresponding to the prediction mode.
- The orthogonal transform unit 14 outputs, to the quantization unit 15, the coefficients obtained by the KL transform of the lowest frequency component coefficients together with the other coefficients excluding the lowest frequency component coefficients.
- The orthogonal transform unit 14 performs orthogonal transform in units of 8×8-pixel blocks.
- The orthogonal transform unit 14 performs a KL transform of the 8×8-pixel block using a basis learned in advance according to the prediction mode, and outputs the obtained coefficients to the quantization unit 15. That is, the coefficient selection unit 148 of the orthogonal transform unit 14 illustrated in FIG. 5 selects the coefficients output from the 8×8 KL transform unit 142 and outputs them to the quantization unit 15.
- The orthogonal transform unit 14 performs a discrete cosine transform (DCT).
- The orthogonal transform unit 14 outputs the coefficients obtained by the discrete cosine transform to the quantization unit 15. That is, the coefficient selection unit 148 of the orthogonal transform unit 14 illustrated in FIG. 5 selects the coefficients output from the DCT unit 147 and outputs them to the quantization unit 15.
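- For reference, the discrete cosine transform performed by the DCT unit 147 can be written as C·X·Cᵀ with an orthonormal DCT-II matrix C. The sketch below builds the 4×4 case explicitly; real codecs often use an integer approximation, so this floating-point version is only illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(block):
    """2-D DCT of a square block: C @ block @ C.T."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def idct2(coeffs):
    """Inverse 2-D DCT: C.T @ coeffs @ C (C is orthonormal)."""
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c

# Round-trip check on a 4x4 prediction-error block.
block = np.arange(16, dtype=float).reshape(4, 4)
assert np.allclose(idct2(dct2(block)), block)
```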
- FIG. 12 is a diagram for explaining the orthogonal transform operation.
- When the macroblock size is 16×16 pixels and the transform block size is 4×4 pixels, the macroblock contains 16 transform blocks.
- The number in each block indicates the block position loc.
- The 4×4 KL transform unit 144 of the orthogonal transform unit 14 performs a KL transform on each transform block using a basis optimized for the prediction mode and the block position of that block, and generates coefficients for each block as shown in (C) of FIG. 12.
- The 4×4 KL transform unit 145 forms a 4×4 block as shown in (D) of FIG. 12 using the lowest frequency component coefficient (indicated by hatching) of each block.
- The 4×4 KL transform unit 145 performs a KL transform on this block using a basis optimized according to the prediction mode, and generates the coefficients shown in (E) of FIG. 12.
- The orthogonal transform unit 14 outputs the coefficients shown in (E) of FIG. 12 and the other coefficients, excluding the lowest frequency component coefficients, in (C) of FIG. 12 to the quantization unit 15.
- When the macroblock size is 8×8 pixels as shown in (F) of FIG. 12 and the transform block size is 4×4 pixels, the macroblock contains four transform blocks as shown in FIG. 12.
- The number in each block indicates the block position loc.
- The 4×4 KL transform unit 144 of the orthogonal transform unit 14 performs a KL transform on each transform block using a basis optimized for the prediction mode and the block position of that block, and generates coefficients for each block as shown in (H) of FIG. 12.
- The 2×2 KL transform unit 146 forms a 2×2 block as shown in (I) of FIG. 12 using the lowest frequency component coefficient (indicated by hatching) of each block.
- The 2×2 KL transform unit 146 performs a KL transform on this block using a basis optimized according to the prediction mode, and generates the coefficients shown in (J) of FIG. 12.
- The orthogonal transform unit 14 outputs the coefficients shown in (J) of FIG. 12 and the other coefficients, excluding the lowest frequency component coefficients, in (H) of FIG. 12 to the quantization unit 15.
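- The operation walked through above for a 16×16 macroblock with 4×4 transform blocks can be sketched end to end: sixteen position-dependent 4×4 KL transforms, then a secondary 4×4 KL transform over the sixteen lowest frequency component coefficients. The basis tables are placeholders and the block positions loc are assumed to be in raster order, as in the earlier sketches.

```python
import numpy as np

# Placeholder bases (orthonormal); real bases come from offline learning.
PRIMARY_4x4 = {(mode, loc): np.eye(16) for mode in range(9) for loc in range(16)}
SECONDARY_4x4 = {mode: np.eye(16) for mode in range(9)}

def transform_16x16_macroblock(error_mb, mode):
    """Hierarchical KL transform of a 16x16 macroblock with 4x4 blocks."""
    coeffs = np.zeros((16, 16))
    dc = np.zeros((4, 4))                                # block of DC terms
    for loc in range(16):                                # raster-order blocks
        r, c = (loc // 4) * 4, (loc % 4) * 4
        block = error_mb[r:r + 4, c:c + 4]
        b = PRIMARY_4x4[(mode, loc)]                     # position-dependent
        coef = (b.T @ block.reshape(-1)).reshape(4, 4)
        coeffs[r:r + 4, c:c + 4] = coef
        dc[loc // 4, loc % 4] = coef[0, 0]               # lowest frequency
    s = SECONDARY_4x4[mode]                              # mode-dependent
    dc_coeffs = (s.T @ dc.reshape(-1)).reshape(4, 4)     # secondary KL
    return coeffs, dc_coeffs

coeffs, dc_coeffs = transform_16x16_macroblock(np.ones((16, 16)), mode=2)
```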
- FIG. 13 shows the configuration of the image decoding apparatus.
- The image decoding device 50 includes an accumulation buffer 51, a lossless decoding unit 52, an inverse quantization unit 53, an inverse orthogonal transform unit 54, an addition unit 55, a deblocking filter 56, a screen rearrangement buffer 57, and a digital/analog conversion unit (D/A conversion unit) 58. Furthermore, the image decoding device 50 includes a frame memory 61, an intra prediction unit 62, a motion compensation unit 63, and a selector 64.
- the accumulation buffer 51 accumulates the transmitted encoded bit stream.
- the lossless decoding unit 52 decodes the encoded bit stream supplied from the accumulation buffer 51 by a method corresponding to the encoding method of the lossless encoding unit 16 of FIG.
- The lossless decoding unit 52 outputs the encoding parameter information obtained by decoding the header information of the encoded bit stream to the intra prediction unit 62, the motion compensation unit 63, and the deblocking filter 56. Further, the lossless decoding unit 52 sets prediction motion vector candidates using the motion vectors of the decoded blocks adjacent to the decoding target block. The lossless decoding unit 52 selects a motion vector from the prediction motion vector candidates based on prediction motion vector selection information obtained by lossless decoding of the encoded bit stream, and uses the selected motion vector as the prediction motion vector. The lossless decoding unit 52 then adds the prediction motion vector to the difference motion vector obtained by lossless decoding of the encoded bit stream to calculate the motion vector of the decoding target block, and outputs it to the motion compensation unit 63.
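- The motion vector reconstruction described for the lossless decoding unit 52 can be sketched as follows: candidate prediction motion vectors are built from the decoded adjacent blocks, one candidate is chosen by the prediction motion vector selection information, and the decoded difference motion vector is added. The candidate set and names are illustrative assumptions.

```python
def reconstruct_motion_vector(neighbor_mvs, selection_index, mv_diff):
    """Rebuild the motion vector of the decoding target block.

    neighbor_mvs: list of (dx, dy) motion vectors of decoded adjacent
                  blocks, used as prediction motion vector candidates.
    selection_index: prediction motion vector selection information
                     decoded from the bit stream.
    mv_diff: difference motion vector decoded from the bit stream.
    """
    pmv = neighbor_mvs[selection_index]               # chosen predictor
    return (pmv[0] + mv_diff[0], pmv[1] + mv_diff[1])

# Example: left, above, and above-right neighbors as candidates.
mv = reconstruct_motion_vector(
    neighbor_mvs=[(4, 0), (3, -1), (5, 1)],
    selection_index=1,
    mv_diff=(1, 1),
)   # -> (4, 0)
```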
- the inverse quantization unit 53 inversely quantizes the quantized data decoded by the lossless decoding unit 52 by a method corresponding to the quantization method of the quantization unit 15 in FIG.
- the inverse orthogonal transform unit 54 performs inverse orthogonal transform on the output of the inverse quantization unit 53 by a method corresponding to the orthogonal transform method of the orthogonal transform unit 14 of FIG.
- the addition unit 55 adds the data after inverse orthogonal transform and the predicted image data supplied from the selector 64 to generate decoded image data, and outputs the decoded image data to the deblocking filter 56 and the intra prediction unit 62.
- The deblocking filter 56 performs filter processing on the decoded image data supplied from the addition unit 55 to remove block distortion, supplies the result to the frame memory 61, and stores it in the screen rearrangement buffer 57.
- the screen rearrangement buffer 57 rearranges images. That is, the order of frames rearranged for the encoding order by the screen rearrangement buffer 12 in FIG. 1 is rearranged in the original display order and output to the D / A conversion unit 58.
- the D / A conversion unit 58 performs D / A conversion on the image data supplied from the screen rearrangement buffer 57 and outputs it to a display (not shown) to display an image.
- The frame memory 61 holds the filtered decoded image data supplied from the deblocking filter 56.
- the intra prediction unit 62 generates a predicted image based on the encoding parameter information supplied from the lossless decoding unit 52, and outputs the generated predicted image data to the selector 64.
- the motion compensation unit 63 performs motion compensation based on the encoding parameter information and the motion vector supplied from the lossless decoding unit 52, generates predicted image data, and outputs the prediction image data to the selector 64. That is, the motion compensation unit 63 performs motion compensation on the basis of the motion vector for the reference image indicated by the reference frame information based on the motion vector and the reference frame information supplied from the lossless decoding unit 52, Prediction image data having a compensation block size is generated.
- When intra prediction is used, the selector 64 supplies the predicted image data generated by the intra prediction unit 62 to the addition unit 55; when inter prediction is used, the selector 64 supplies the predicted image data generated by the motion compensation unit 63 to the addition unit 55.
- FIG. 14 shows the configuration of the inverse orthogonal transform unit 54.
- The inverse orthogonal transform unit 54 includes a 16×16 KL inverse transform unit 541, 2×2 KL inverse transform units 542 and 545, an 8×8 KL inverse transform unit 543, 4×4 KL inverse transform units 544 and 546, an IDCT unit 547, and a data selection unit 548.
- The 16×16 KL inverse transform unit 541 performs a KL inverse transform corresponding to the KL transform performed by the 16×16 KL transform unit 141 illustrated in FIG. 5.
- The 16×16 KL inverse transform unit 541 performs the KL inverse transform of the dequantized data output from the inverse quantization unit 53, using the basis corresponding to the prediction mode (optimal prediction mode) indicated by the optimal-mode encoding parameter information supplied from the lossless decoding unit 52.
- The 16×16 KL inverse transform unit 541 outputs the image data obtained by the KL inverse transform to the data selection unit 548.
- The 2×2 KL inverse transform unit 542 performs a KL inverse transform corresponding to the KL transform performed by the 2×2 KL transform unit 143 shown in FIG. 5.
- The 2×2 KL inverse transform unit 542 performs the KL inverse transform of the dequantized data output from the inverse quantization unit 53 using the basis corresponding to the prediction mode indicated by the optimal-mode encoding parameter information.
- The 2×2 KL inverse transform unit 542 outputs the lowest frequency component coefficients obtained by the KL inverse transform to the 8×8 KL inverse transform unit 543.
- The 8×8 KL inverse transform unit 543 performs a KL inverse transform corresponding to the KL transform performed by the 8×8 KL transform unit 142 shown in FIG. 5.
- The 8×8 KL inverse transform unit 543 performs the KL inverse transform based on the optimal-mode encoding parameter information supplied from the lossless decoding unit 52. For example, when the macroblock size is 16×16 pixels, the 8×8 KL inverse transform unit 543 performs the KL inverse transform on the lowest frequency component coefficients output from the 2×2 KL inverse transform unit 542 and the dequantized data output from the inverse quantization unit 53, using the basis corresponding to the prediction mode indicated by the optimal-mode encoding parameter information and the block position.
- The 8×8 KL inverse transform unit 543 outputs the image data obtained by the KL inverse transform to the data selection unit 548. Further, when the macroblock size is 8×8 pixels, the 8×8 KL inverse transform unit 543 performs the KL inverse transform of the dequantized data output from the inverse quantization unit 53 using the basis corresponding to the prediction mode and the block position, and outputs the obtained image data to the data selection unit 548.
- The 4×4 KL inverse transform unit 544 performs a KL inverse transform corresponding to the KL transform performed by the 4×4 KL transform unit 145 illustrated in FIG. 5.
- The 4×4 KL inverse transform unit 544 performs the KL inverse transform of the dequantized data output from the inverse quantization unit 53 using the basis corresponding to the prediction mode indicated by the optimal-mode encoding parameter information.
- The 4×4 KL inverse transform unit 544 outputs the lowest frequency component coefficients obtained by the KL inverse transform to the 4×4 KL inverse transform unit 546.
- The 2×2 KL inverse transform unit 545 performs a KL inverse transform corresponding to the KL transform performed by the 2×2 KL transform unit 146 shown in FIG. 5.
- The 2×2 KL inverse transform unit 545 performs the KL inverse transform of the dequantized data output from the inverse quantization unit 53 using the basis corresponding to the prediction mode indicated by the optimal-mode encoding parameter information.
- The 2×2 KL inverse transform unit 545 outputs the lowest frequency component coefficients obtained by the KL inverse transform to the 4×4 KL inverse transform unit 546.
- The 4×4 KL inverse transform unit 546 performs a KL inverse transform corresponding to the KL transform performed by the 4×4 KL transform unit 144 illustrated in FIG. 5.
- The 4×4 KL inverse transform unit 546 performs the KL inverse transform based on the optimal-mode encoding parameter information supplied from the lossless decoding unit 52. For example, when the macroblock size is 16×16 pixels, the 4×4 KL inverse transform unit 546 performs the KL inverse transform on the lowest frequency component coefficients output from the 4×4 KL inverse transform unit 544 and the dequantized data output from the inverse quantization unit 53, using the basis corresponding to the prediction mode indicated by the optimal-mode encoding parameter information and the block position.
- The 4×4 KL inverse transform unit 546 outputs the image data obtained by the KL inverse transform to the data selection unit 548.
- When the macroblock size is 8×8 pixels, the 4×4 KL inverse transform unit 546 performs the KL inverse transform on the lowest frequency component coefficients output from the 2×2 KL inverse transform unit 545 and the dequantized data output from the inverse quantization unit 53, using the basis corresponding to the prediction mode and the block position.
- The 4×4 KL inverse transform unit 546 outputs the image data obtained by this KL inverse transform to the data selection unit 548.
- the IDCT unit 547 performs inverse discrete cosine transform using the dequantized data output from the inverse quantization unit 53, and outputs the obtained image data to the data selection unit 548.
- Based on the encoding parameter information, the data selection unit 548 selects among the image data output from the 16×16 KL inverse transform unit 541, the 8×8 KL inverse transform unit 543, the 4×4 KL inverse transform unit 546, and the IDCT unit 547.
- The data selection unit 548 outputs the selected image data to the addition unit 55 as prediction error data.
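- The data flow through the 4×4 KL inverse transform units 544 and 546 for a 16×16 macroblock can be sketched as the mirror image of the encoder-side hierarchy: the inverse secondary transform first recovers the sixteen lowest frequency component coefficients, each is put back into its block, and each block is then inverse transformed with the basis for its position. The basis tables and raster-order positions are placeholder assumptions, as before.

```python
import numpy as np

PRIMARY_4x4 = {(mode, loc): np.eye(16) for mode in range(9) for loc in range(16)}
SECONDARY_4x4 = {mode: np.eye(16) for mode in range(9)}

def inverse_transform_16x16_macroblock(coeffs, dc_coeffs, mode):
    """Inverse of the hierarchical KL transform for a 16x16 macroblock."""
    s = SECONDARY_4x4[mode]
    dc = (s @ dc_coeffs.reshape(-1)).reshape(4, 4)   # inverse secondary KL
    errors = np.zeros((16, 16))
    for loc in range(16):
        r, c = (loc // 4) * 4, (loc % 4) * 4
        coef = coeffs[r:r + 4, c:c + 4].copy()
        coef[0, 0] = dc[loc // 4, loc % 4]           # restore lowest frequency
        b = PRIMARY_4x4[(mode, loc)]                 # position-dependent basis
        errors[r:r + 4, c:c + 4] = (b @ coef.reshape(-1)).reshape(4, 4)
    return errors   # prediction error data handed to the addition unit 55

errors = inverse_transform_16x16_macroblock(np.zeros((16, 16)),
                                            np.zeros((4, 4)), mode=0)
```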
- In step ST91, the accumulation buffer 51 accumulates the transmitted encoded bit stream.
- In step ST92, the lossless decoding unit 52 performs lossless decoding processing.
- the lossless decoding unit 52 decodes the encoded bit stream supplied from the accumulation buffer 51. That is, quantized data of each picture encoded by the lossless encoding unit 16 in FIG. 1 is obtained.
- the lossless decoding unit 52 performs lossless decoding of the encoding parameter information included in the header information of the encoded bitstream, and supplies the obtained encoding parameter information to the deblocking filter 56 and the selector 64.
- the lossless decoding unit 52 outputs the encoding parameter information to the intra prediction unit 62 when the encoding parameter information is information related to the intra prediction mode.
- the lossless decoding unit 52 outputs the encoding parameter information to the motion compensation unit 63 when the encoding parameter information is information related to the inter prediction mode.
- In step ST93, the inverse quantization unit 53 performs an inverse quantization process.
- the inverse quantization unit 53 inversely quantizes the quantized data decoded by the lossless decoding unit 52 with characteristics corresponding to the characteristics of the quantization unit 15 in FIG.
- the inverse orthogonal transform unit 54 performs an inverse orthogonal transform process.
- the inverse orthogonal transform unit 54 performs inverse orthogonal transform corresponding to the orthogonal transform of the orthogonal transform unit 14 of FIG. 1 on the data after inverse quantization from the inverse quantization unit 53.
- In step ST95, the addition unit 55 generates decoded image data.
- the adder 55 adds the prediction error data obtained by performing the inverse orthogonal transform process and the prediction image data selected in step ST99 described later to generate decoded image data. As a result, the original image is decoded.
- In step ST96, the deblocking filter 56 performs filter processing.
- the deblocking filter 56 performs a filtering process on the decoded image data output from the adding unit 55 to remove block distortion included in the decoded image.
- In step ST97, the frame memory 61 stores the decoded image data.
- In step ST98, the intra prediction unit 62 and the motion compensation unit 63 perform prediction processing.
- the intra prediction unit 62 and the motion compensation unit 63 perform prediction processing corresponding to the encoding parameter information supplied from the lossless decoding unit 52, respectively.
- When the encoding parameter information supplied from the lossless decoding unit 52 indicates intra prediction, the intra prediction unit 62 performs intra prediction processing based on the encoding parameter information and generates predicted image data. When the encoding parameter information supplied from the lossless decoding unit 52 indicates inter prediction, the motion compensation unit 63 performs motion compensation based on the encoding parameter information and generates predicted image data.
- In step ST99, the selector 64 selects predicted image data. That is, the selector 64 selects either the predicted image data supplied from the intra prediction unit 62 or the predicted image data generated by the motion compensation unit 63 and supplies it to the addition unit 55, where, as described above, it is added to the output of the inverse orthogonal transform unit 54 in step ST95.
- In step ST100, the screen rearrangement buffer 57 performs image rearrangement. That is, the screen rearrangement buffer 57 rearranges the frames, which were reordered for encoding by the screen rearrangement buffer 12 of the image encoding device 10 of FIG. 1, back to the original display order.
- In step ST101, the D/A conversion unit 58 D/A converts the image data from the screen rearrangement buffer 57 and outputs it to a display (not shown), where the image is displayed.
- the inverse orthogonal transform unit 54 determines whether or not intra prediction is performed. For example, the inverse orthogonal transform unit 54 determines whether the block to be decoded is intra prediction based on the encoding parameter information extracted from the encoded bitstream by the lossless decoding unit 52. The inverse orthogonal transform unit 54 proceeds to step ST112 when the encoding parameter information indicates intra prediction, and proceeds to step ST121 when it does not indicate intra prediction, that is, when it is inter prediction.
- In step ST112, the inverse orthogonal transform unit 54 determines whether or not the macroblock size is 16×16 pixels.
- The inverse orthogonal transform unit 54 proceeds to step ST113 when the encoding parameter information indicates that the macroblock size is 16×16 pixels, and proceeds to step ST118 when it does not.
- In step ST113, the inverse orthogonal transform unit 54 determines whether the transform block size is 4×4 pixels.
- The inverse orthogonal transform unit 54 proceeds to step ST114 when the transform block size information in the encoding parameter information is “0”, indicating that the transform block size is 4×4 pixels, and proceeds to step ST115 when it is not “0”.
- The inverse orthogonal transform unit 54 performs 4×4 inverse orthogonal transform processing.
- The inverse orthogonal transform unit 54 performs a 4×4 KL inverse transform using bases learned in advance according to the prediction mode and the block position.
- On the encoding side, the KL transform was performed 16 times, the lowest frequency component coefficients were selected from the resulting coefficients, and a further KL transform was performed on them. Therefore, the inverse orthogonal transform unit 54 first performs a KL inverse transform on the dequantized data of the lowest frequency component coefficients, using the basis corresponding to the prediction mode.
- The inverse orthogonal transform unit 54 then performs a KL inverse transform, using the bases corresponding to the prediction mode and the block positions, on the 16 blocks composed of the lowest frequency component coefficients obtained by this KL inverse transform and the coefficients of the other components.
- The inverse orthogonal transform unit 54 outputs the prediction error data obtained by the KL inverse transform to the addition unit 55. That is, the data selection unit 548 of the inverse orthogonal transform unit 54 illustrated in FIG. 14 selects the data obtained by the KL inverse transform in the 4×4 KL inverse transform unit 546 using the output of the 4×4 KL inverse transform unit 544, and outputs it to the addition unit 55.
- In step ST115, the inverse orthogonal transform unit 54 determines whether the transform block size is 8×8 pixels.
- The inverse orthogonal transform unit 54 proceeds to step ST116 when the transform block size information in the encoding parameter information is “1”, indicating that the transform block size is 8×8 pixels, and proceeds to step ST117 when it is not “1”.
- The inverse orthogonal transform unit 54 performs 8×8 inverse orthogonal transform processing.
- The inverse orthogonal transform unit 54 performs an 8×8 KL inverse transform using bases learned in advance according to the prediction mode and the block position.
- On the encoding side, the KL transform was performed four times, the lowest frequency component coefficients were selected from the resulting coefficients, and a further KL transform was performed on them. Therefore, the inverse orthogonal transform unit 54 first performs a KL inverse transform on the dequantized data of the lowest frequency component coefficients, using the basis corresponding to the prediction mode.
- The inverse orthogonal transform unit 54 then performs a KL inverse transform, using the bases corresponding to the prediction mode and the block positions, on the four blocks composed of the lowest frequency component coefficients obtained by this KL inverse transform and the coefficients of the other components.
- The inverse orthogonal transform unit 54 outputs the prediction error data obtained by the KL inverse transform to the addition unit 55. That is, the data selection unit 548 of the inverse orthogonal transform unit 54 illustrated in FIG. 14 selects the data obtained by the KL inverse transform in the 8×8 KL inverse transform unit 543 using the output of the 2×2 KL inverse transform unit 542, and outputs it to the addition unit 55.
- In step ST117, the inverse orthogonal transform unit 54 performs 16×16 inverse orthogonal transform processing.
- The inverse orthogonal transform unit 54 performs 16×16 KL inverse transform using bases learned in advance according to the prediction mode.
- The inverse orthogonal transform unit 54 outputs the prediction error data obtained by the KL inverse transform to the addition unit 55. That is, the data selection unit 548 of the inverse orthogonal transform unit 54 illustrated in FIG. 14 selects the data obtained by the KL inverse transform in the 16×16 KL inverse transform unit 541 and outputs the data to the addition unit 55.
- In step ST118, the inverse orthogonal transform unit 54 determines whether or not the transform block size is 4×4 pixels. The inverse orthogonal transform unit 54 proceeds to step ST119 when the transform block size information in the encoding parameter information is “0”, indicating that the transform block size is 4×4 pixels, and proceeds to step ST120 when the information is not “0”.
- In step ST119, the inverse orthogonal transform unit 54 performs 4×4 inverse orthogonal transform processing.
- The inverse orthogonal transform unit 54 performs 4×4 KL inverse transform processing using bases learned in advance according to the prediction mode and the block position.
- In the corresponding orthogonal transform, KL transform was performed four times, the lowest frequency component coefficient was selected from each resulting block, and the selected coefficients were KL transformed again. Therefore, the inverse orthogonal transform unit 54 first performs KL inverse transform on the inverse-quantized data of the lowest frequency component coefficients, using the basis corresponding to the prediction mode.
- Next, the inverse orthogonal transform unit 54 performs KL inverse transform on the four blocks, each made up of the lowest frequency component coefficient obtained by this KL inverse transform and the coefficients of the other components, using the bases corresponding to the prediction mode and the block position.
- The inverse orthogonal transform unit 54 outputs the prediction error data obtained by the KL inverse transform to the addition unit 55. That is, the data selection unit 548 of the inverse orthogonal transform unit 54 illustrated in FIG. 14 selects the data obtained by the KL inverse transform that the 4×4 KL inverse transform unit 546 performs using the output of the 2×2 KL inverse transform unit 545, and outputs the selected data to the addition unit 55.
- In step ST120, the inverse orthogonal transform unit 54 performs 8×8 inverse orthogonal transform processing.
- The inverse orthogonal transform unit 54 performs 8×8 KL inverse transform using bases learned in advance according to the prediction mode.
- The inverse orthogonal transform unit 54 outputs the prediction error data obtained by the KL inverse transform to the addition unit 55. That is, the data selection unit 548 of the inverse orthogonal transform unit 54 illustrated in FIG. 14 selects the data obtained by the KL inverse transform in the 8×8 KL inverse transform unit 543 and outputs the selected data to the addition unit 55.
- In step ST121, the inverse orthogonal transform unit 54 performs inverse discrete cosine transform (IDCT) processing.
- The inverse orthogonal transform unit 54 outputs the data obtained by performing the inverse discrete cosine transform to the addition unit 55. That is, the data selection unit 548 of the inverse orthogonal transform unit 54 illustrated in FIG. 14 selects the data output from the IDCT unit 547 and outputs the data to the addition unit 55.
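The selection among the units of FIG. 14 that results from steps ST111 to ST121 can be summarized compactly. The following Python sketch only restates that decision logic; the function name and the return strings are illustrative and not part of the patent.

```python
def select_inverse_transform(is_intra, mb_size, tb_size_flag):
    """Restates the selection of steps ST111 to ST121 (sketch only).

    mb_size is the macroblock size in pixels (16 or otherwise) and
    tb_size_flag is the transform block size information from the encoding
    parameters ("0" for 4x4, "1" for 8x8). Return values name the units of
    FIG. 14 that produce the data selected by the data selection unit 548.
    """
    if not is_intra:                              # ST111 -> ST121
        return "IDCT unit 547"
    if mb_size == 16:                             # ST112 -> ST113
        if tb_size_flag == "0":                   # ST113 -> ST114
            return "4x4 KL inverse transform units 544 and 546"
        if tb_size_flag == "1":                   # ST115 -> ST116
            return "2x2/8x8 KL inverse transform units 542 and 543"
        return "16x16 KL inverse transform unit 541"              # ST117
    if tb_size_flag == "0":                       # ST118 -> ST119
        return "2x2/4x4 KL inverse transform units 545 and 546"
    return "8x8 KL inverse transform unit 543"                    # ST120
```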
- FIG. 17 is a diagram for explaining the inverse orthogonal transform operation, and illustrates the inverse orthogonal transform of the transform coefficients generated by the orthogonal transform operation of FIG.
- The 4×4 KL inverse transform unit 544 uses the basis corresponding to the prediction mode indicated by the encoding parameter information of the optimal mode, and performs KL inverse transform of the KL-transformed data (inverse-quantized data) of the lowest frequency component coefficients shown in FIG. 17.
- The 4×4 KL inverse transform unit 544 generates the lowest frequency component coefficients shown in FIG. 17B by this KL inverse transform.
- As shown in FIG. 17, the 4×4 KL inverse transform unit 546 returns the lowest frequency component coefficients and the other KL-transformed data (inverse-quantized data) to coefficients for each block.
- Further, the 4×4 KL inverse transform unit 546 performs KL inverse transform on each of the 16 4×4 blocks, using the bases corresponding to the prediction mode indicated by the encoding parameter information and to the block position, and generates the prediction error data shown in FIG. 17.
- the data selection unit 548 selects the generated prediction error data and outputs it to the addition unit 55.
- The 2×2 KL inverse transform unit 545 uses the basis corresponding to the prediction mode indicated by the encoding parameter information of the optimum mode, and performs KL inverse transform of the KL-transformed data (inverse-quantized data) of the lowest frequency component coefficients shown in FIG. 17F. The 2×2 KL inverse transform unit 545 generates the lowest frequency component coefficients shown in FIG. 17G by this KL inverse transform. As shown in FIG. 17H, the 4×4 KL inverse transform unit 546 returns the lowest frequency component coefficients and the other KL-transformed data (inverse-quantized data) to coefficients for each block.
- Further, the 4×4 KL inverse transform unit 546 performs KL inverse transform on each of the four 4×4 blocks, using the bases corresponding to the prediction mode indicated by the encoding parameter information and to the block position, and generates the prediction error data shown in FIG. 17.
- the data selection unit 548 selects the generated prediction error data and outputs it to the addition unit 55.
- Next, the processing in step ST98 of FIG. 15 will be described with reference to the flowchart in FIG.
- In step ST131, the lossless decoding unit 52 determines whether or not the target block has been intra-coded.
- When the encoding parameter information is intra prediction information, the lossless decoding unit 52 supplies the encoding parameter information to the intra prediction unit 62 and proceeds to step ST132. When the encoding parameter information is not intra prediction information, the lossless decoding unit 52 supplies the encoding parameter information to the motion compensation unit 63 and proceeds to step ST133.
- In step ST133, the motion compensation unit 63 performs inter prediction processing.
- the motion compensation unit 63 performs motion compensation on the decoded image data supplied from the frame memory 61 based on the encoding parameter information and the motion vector from the lossless decoding unit 52. Further, the motion compensation unit 63 outputs predicted image data generated by motion compensation to the selector 64.
- In this way, when decoding an encoded bitstream generated by processing coefficient data obtained by orthogonal transformation using bases set in advance according to the block position, inverse orthogonal transformation is performed using the basis set in advance according to the block position within the macroblock indicated by the encoding parameter information included in the encoded bitstream. The coefficient data after orthogonal transformation can therefore be restored to the prediction error data before orthogonal transformation, so even when the orthogonal transformation was performed using bases corresponding to the block position within the macroblock, the data can be returned to the prediction error data before orthogonal transformation.
- Likewise, coefficient data obtained by orthogonal transformation using a basis set in advance according to the prediction mode indicated by the encoding parameter information can be restored to the prediction error data before the orthogonal transformation.
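As a simple illustration of why this restoration works, the following toy round-trip check uses a random orthonormal matrix as a stand-in for one learned KL basis. The matrices used by the encoder and decoder are of course the learned bases, not random ones, so this is only a sketch of the invertibility argument.

```python
import numpy as np

# Toy round-trip for one 4x4 transform block: with an orthonormal basis B,
# the forward transform c = B q and the inverse transform q = B^T c recover
# the prediction error exactly, regardless of which (mode, block position)
# basis was selected.
rng = np.random.default_rng(0)
q = rng.standard_normal(16)                            # prediction error arranged as a 16-dimensional vector
b, _ = np.linalg.qr(rng.standard_normal((16, 16)))     # stand-in orthonormal basis for one mode/position
coeff = b @ q                                          # orthogonal transform on the encoder side
restored = b.T @ coeff                                 # inverse orthogonal transform on the decoder side
assert np.allclose(restored, q)
```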
- In step ST142, the base generation unit determines whether or not any macroblocks not yet used for learning remain. The base generation unit proceeds to step ST143 when macroblocks not yet used for learning remain in the images used for learning, and returns to step ST141 when learning has been performed using all macroblocks.
- The base generation unit calculates a symmetric matrix for the 4×4 orthogonal transform.
- The base generation unit divides the 16×16 prediction error data into 16 transform blocks of 4×4 pixels each, and calculates a symmetric matrix M for each combination of the prediction mode and the block position of the transform block within the macroblock.
- The base generation unit arranges the prediction error data of each 4×4-pixel transform block into a 16-dimensional vector and calculates the difference between each vector and the average of the 16-dimensional vectors.
- The base generation unit calculates the symmetric matrix M by performing the calculation of Expression (3) using this difference as “q” (a sketch of this accumulation is given after the symbol definitions below).
- In Expression (3), mdt is transform mode information from which the macroblock size and the transform block size can be determined.
- Mod is the prediction mode of intra prediction.
- Loc is the block position of the transform block within the macroblock.
- Na is the number of times learning is performed.
- T indicates a transposed matrix.
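The following Python sketch shows one plausible way to carry out this accumulation and to obtain a basis from the resulting matrix. It assumes that Expression (3) averages the outer products q·qᵀ of the mean-removed prediction error vectors, and that the basis is taken from the eigenvectors of M (as stated for the encoder side in claim 11); the function names and the exact normalization are assumptions.

```python
import numpy as np

def accumulate_symmetric_matrix(error_blocks):
    """Accumulate the symmetric matrix M for one (mdt, Mod, Loc) combination.

    Sketch only: it assumes Expression (3) averages the outer products q q^T
    of the mean-removed prediction error vectors over the Na learning samples.

    error_blocks : (Na, n) array, each row the prediction error data of one
                   transform block arranged as an n-dimensional vector
                   (n = 16 for 4x4 blocks, n = 64 for 8x8 blocks).
    """
    mean = error_blocks.mean(axis=0)
    diffs = error_blocks - mean            # the vectors "q"
    na = error_blocks.shape[0]             # number of learning samples Na
    return (diffs.T @ diffs) / na          # (1/Na) * sum of q q^T

def kl_basis_from_matrix(m):
    # The basis is formed from eigenvectors of M ordered by decreasing
    # eigenvalue (cf. claim 11); rows of the returned matrix are basis vectors.
    eigvals, eigvecs = np.linalg.eigh(m)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T
```

The same pair of routines applies unchanged to the 8×8 cases described next; only the vector length changes.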
- The base generation unit calculates a symmetric matrix for the 8×8 orthogonal transform.
- The base generation unit divides the 16×16 prediction error data into four transform blocks of 8×8 pixels each, and calculates a symmetric matrix M for each combination of the prediction mode and the block position of the transform block within the macroblock.
- The base generation unit arranges the prediction error data of each 8×8-pixel transform block into a 64-dimensional vector and calculates the difference between each vector and the average of the 64-dimensional vectors.
- The base generation unit calculates the symmetric matrix M by performing the calculation of Expression (3) using this difference as “q”.
- The base generation unit calculates a symmetric matrix for the 4×4 orthogonal transform.
- The base generation unit divides the 8×8 prediction error data into four transform blocks of 4×4 pixels each, and calculates a symmetric matrix M for each combination of the prediction mode and the block position of the transform block within the macroblock.
- The base generation unit arranges the prediction error data of each 4×4-pixel transform block into a 16-dimensional vector and calculates the difference between each vector and the average of the 16-dimensional vectors.
- The base generation unit calculates the symmetric matrix M by performing the calculation of Expression (3) using this difference as “q”.
- In step ST151, the base generation unit calculates a symmetric matrix for the 8×8 orthogonal transform.
- The base generation unit arranges the prediction error data of each 8×8-pixel transform block into a 64-dimensional vector for each prediction mode and calculates the difference between each vector and the average of the 64-dimensional vectors.
- The base generation unit calculates the symmetric matrix M for each prediction mode by performing the calculation of Expression (3) using this difference as “q”.
- The prediction errors of the pixels P1, P5, P9, and P13 often have similar characteristics. Therefore, the same basis is adopted for all of Group 1.
- Likewise, by adopting the same basis within each of Groups 0, 2, and 3, the number of bases can be reduced from 16 types to 4 types.
- the program can be recorded in advance on a hard disk or ROM (Read Only Memory) as a recording medium.
- Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disc, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory.
- Such a removable recording medium can be provided as so-called package software.
- In the above description, H.264/AVC was used as the encoding method/decoding method.
- However, the present invention can also be applied to image encoding devices/image decoding devices that use other encoding/decoding methods.
- the decoder 904 performs packet decoding processing, and outputs video data generated by the decoding processing to the video signal processing unit 905 and audio data to the audio signal processing unit 907.
- the video signal processing unit 905 performs noise removal, video processing according to user settings, and the like on the video data.
- The video signal processing unit 905 generates video data of a program to be displayed on the display unit 906, image data resulting from processing based on an application supplied via a network, and the like.
- the video signal processing unit 905 generates video data for displaying a menu screen for selecting an item and the like, and superimposes the video data on the video data of the program.
- the video signal processing unit 905 generates a drive signal based on the video data generated in this way, and drives the display unit 906.
- the television device 90 is provided with a bus 912 for connecting the tuner 902, the demultiplexer 903, the video signal processing unit 905, the audio signal processing unit 907, the external interface unit 909, and the control unit 910.
- FIG. 22 illustrates a schematic configuration of a mobile phone to which the present invention is applied.
- the cellular phone 92 includes a communication unit 922, an audio codec 923, a camera unit 926, an image processing unit 927, a demultiplexing unit 928, a recording / reproducing unit 929, a display unit 930, and a control unit 931. These are connected to each other via a bus 933.
- the image data generated by the camera unit 926 is supplied to the image processing unit 927.
- the image processing unit 927 performs encoding processing of image data and generates encoded data.
- the image processing unit 927 is provided with the functions of the image encoding device (image encoding method) and the image decoding device (image decoding method) of the present application. Therefore, encoding efficiency and image quality can be improved when communicating image data.
- FIG. 23 exemplifies a schematic configuration of a recording / reproducing apparatus to which the present invention is applied.
- the recording / reproducing apparatus 94 records, for example, audio data and video data of a received broadcast program on a recording medium, and provides the recorded data to the user at a timing according to a user instruction.
- the recording / reproducing device 94 can also acquire audio data and video data from another device, for example, and record them on a recording medium.
- the recording / reproducing device 94 decodes and outputs the audio data and video data recorded on the recording medium, thereby enabling image display and audio output on the monitor device or the like.
- the external interface unit 942 includes at least one of an IEEE 1394 interface, a network interface unit, a USB interface, a flash memory interface, and the like.
- the external interface unit 942 is an interface for connecting to an external device, a network, a memory card, and the like, and receives data such as video data and audio data to be recorded.
- the selector 946 selects one of the encoded bit streams from the tuner 941 or the encoder 943 and supplies it to either the HDD unit 944 or the disk drive 945 when recording video or audio. Further, the selector 946 supplies the encoded bit stream output from the HDD unit 944 or the disk drive 945 to the decoder 947 at the time of reproduction of video and audio.
- The encoder 943 is provided with the function of the image encoding apparatus (image encoding method) of the present application, and the decoder 947 is provided with the function of the image decoding apparatus (image decoding method) of the present application. Therefore, video recording and reproduction can be performed efficiently with improved encoding efficiency and image quality.
- the imaging device 96 includes an optical block 961, an imaging unit 962, a camera signal processing unit 963, an image data processing unit 964, a display unit 965, an external interface unit 966, a memory unit 967, a media drive 968, an OSD unit 969, and a control unit 970. Have. In addition, a user interface unit 971 is connected to the control unit 970. Furthermore, the image data processing unit 964, the external interface unit 966, the memory unit 967, the media drive 968, the OSD unit 969, the control unit 970, and the like are connected via a bus 972.
- the camera signal processing unit 963 performs various camera signal processing such as knee correction, gamma correction, and color correction on the electrical signal supplied from the imaging unit 962.
- the camera signal processing unit 963 supplies the image data after the camera signal processing to the image data processing unit 964.
- the image data processing unit 964 performs an encoding process on the image data supplied from the camera signal processing unit 963.
- the image data processing unit 964 supplies the encoded data generated by performing the encoding process to the external interface unit 966 and the media drive 968. Further, the image data processing unit 964 performs a decoding process on the encoded data supplied from the external interface unit 966 and the media drive 968.
- The image data processing unit 964 supplies the image data generated by performing the decoding process to the display unit 965. Further, the image data processing unit 964 performs processing for supplying the image data supplied from the camera signal processing unit 963 to the display unit 965, and superimposes the display data acquired from the OSD unit 969 on the image data and supplies the result to the display unit 965.
- the OSD unit 969 generates display data such as a menu screen and icons made up of symbols, characters, or figures and outputs them to the image data processing unit 964.
- the external interface unit 966 includes, for example, a USB input / output terminal, and is connected to a printer when printing an image.
- a drive is connected to the external interface unit 966 as necessary, a removable medium such as a magnetic disk or an optical disk is appropriately mounted, and a computer program read from them is installed as necessary.
- the external interface unit 966 has a network interface connected to a predetermined network such as a LAN or the Internet.
- The control unit 970 can read encoded data from the memory unit 967 in accordance with an instruction from the user interface unit 971 and supply it from the external interface unit 966 to another device connected via the network.
- The control unit 970 can also acquire, via the external interface unit 966, encoded data and image data supplied from another device via the network, and supply them to the image data processing unit 964.
- As the recording medium driven by the media drive 968, for example, any readable/writable removable medium such as a magnetic disk, a magneto-optical disk, an optical disk, or a semiconductor memory is used.
- the recording medium may be any type of removable medium, and may be a tape device, a disk, or a memory card. Of course, a non-contact IC card or the like may be used.
- Further, the media drive 968 and the recording medium may be integrated so as to be configured by a non-portable storage medium such as a built-in hard disk drive or an SSD (Solid State Drive).
- the control unit 970 is configured using a CPU, a memory, and the like.
- the memory stores programs executed by the CPU, various data necessary for the CPU to perform processing, and the like.
- the program stored in the memory is read and executed by the CPU at a predetermined timing such as when the imaging device 96 is activated.
- the CPU executes the program to control each unit so that the imaging device 96 operates according to the user operation.
- the image data processing unit 964 is provided with the functions of the image encoding device (image encoding method) and the image decoding device (image decoding method) of the present application. Therefore, when the captured image is recorded in the memory unit 967, a recording medium, or the like, it is possible to improve the encoding efficiency and the image quality and efficiently record and reproduce the captured image.
- the present invention should not be construed as being limited to the embodiments of the invention described above.
- For example, the present invention is not limited to the macroblock sizes, transform block sizes, and prediction modes described above.
- the embodiments of the present invention disclose the present invention in the form of examples, and it is obvious that those skilled in the art can make modifications and substitutions of the embodiments without departing from the gist of the present invention. That is, in order to determine the gist of the present invention, the claims should be taken into consideration.
- In the image decoding apparatus, the image encoding apparatus, and the methods and programs of the present invention, bases are set in advance according to the block position of the transform block within the macroblock.
- In decoding an encoded bitstream generated by processing coefficient data obtained by orthogonal transformation using bases set in advance according to the block position, inverse orthogonal transformation is performed using the basis set in advance according to the block position within the macroblock indicated by the encoding parameter information included in the encoded bitstream, and the coefficient data after orthogonal transformation is returned to the prediction error data before orthogonal transformation.
- Therefore, the present invention is suitable for an image decoding device, an image encoding device, and the like used when image information (encoded bitstreams) obtained by encoding in block units, as in MPEG and H.26x, is transmitted and received via network media such as satellite broadcasting, cable TV, the Internet, and cellular phones, or is processed on storage media such as optical disks, magnetic disks, and flash memory.
- Explanation of reference signs (partial): … 902: tuner, 903: demultiplexer, 904, 947: decoder, 905: video signal processing unit, 906: display unit, 907: audio signal processing unit, 908: speaker, 909, 942, 966: external interface unit, 910, 931, 949, 970: control unit, 911, 932, 971: user interface unit, 912, 933, 972: bus, 922: communication unit, 923: audio codec, 924: speaker, 925: microphone, 926: camera unit, 927: image processing unit, 928: demultiplexing unit, 929: recording/reproducing unit, 930: display unit, 943: encoder, 944: HDD unit, 945: disk drive, 948, 969: OSD unit, 961: optical block, 962: imaging unit, 964: image data processing unit, 965: display unit, 967: memory unit, 968: media drive.
Abstract
Description
1. Configuration of the image encoding apparatus
2. Configuration of the orthogonal transform unit
3. Operation of the image encoding apparatus
4. Configuration of the image decoding apparatus
5. Configuration of the inverse orthogonal transform unit
6. Operation of the image decoding apparatus
7. Basis learning operation
8. Case of software processing
9. Case of application to electronic equipment
FIG. 1 shows the configuration of the image encoding apparatus. The image encoding apparatus 10 includes an analog/digital conversion unit (A/D conversion unit) 11, a screen rearrangement buffer 12, a subtraction unit 13, an orthogonal transform unit 14, a quantization unit 15, a lossless encoding unit 16, an accumulation buffer 17, and a rate control unit 18. The image encoding apparatus 10 further includes an inverse quantization unit 21, an inverse orthogonal transform unit 22, an addition unit 23, a deblocking filter 24, a frame memory 27, an intra prediction unit 31, a motion prediction/compensation unit 32, and a predicted image/optimum mode selection unit 33.
In intra prediction processing, prediction is performed using the pixels of encoded adjacent blocks, and the optimum prediction direction is selected from a plurality of prediction directions. For example, in H.264/AVC, four prediction modes, prediction mode 0 to prediction mode 3, are defined for 16×16-pixel blocks. Nine prediction modes, prediction mode 0 to prediction mode 8, are defined for 8×8-pixel blocks, and nine prediction modes, prediction mode 0 to prediction mode 8, are likewise defined for 4×4-pixel blocks.
Next, the image encoding processing operation will be described. FIG. 6 is a flowchart showing the image encoding processing operation. In step ST11, the A/D conversion unit 11 performs A/D conversion of the input image signal.
Cost(Mode ∈ Ω) = D + λ·R    … (1)
Cost(Mode ∈ Ω) = D + QPtoQuant(QP)·Header_Bit    … (2)
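As a small illustration, the two cost functions can be evaluated as follows. Here D, R, λ, Header_Bit, and QPtoQuant are taken to have their usual rate-distortion meanings (distortion, generated code amount, Lagrange multiplier, header bits of the candidate mode, and a QP-dependent multiplier); this is an assumption, since their definitions appear elsewhere in the description.

```python
def cost_high_complexity(distortion, rate, lam):
    # Expression (1): full rate-distortion cost; lam is the Lagrange
    # multiplier determined from the quantization parameter.
    return distortion + lam * rate

def cost_low_complexity(distortion, header_bits, qp, qp_to_quant):
    # Expression (2): low-complexity cost; qp_to_quant maps the quantization
    # parameter QP to a multiplier and is assumed to be supplied by the caller.
    return distortion + qp_to_quant(qp) * header_bits
```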
The encoded bitstream generated by encoding the input image is supplied to the image decoding apparatus via a predetermined transmission path, recording medium, or the like, and is decoded.
FIG. 14 shows the configuration of the inverse orthogonal transform unit 54. The inverse orthogonal transform unit 54 includes a 16×16 KL inverse transform unit 541, 2×2 KL inverse transform units 542 and 545, an 8×8 KL inverse transform unit 543, 4×4 KL inverse transform units 544 and 546, an IDCT unit 547, and a data selection unit 548.
Next, the image decoding processing operation performed by the image decoding apparatus 50 will be described with reference to the flowchart of FIG. 15.
Next, a base generation unit that generates in advance, by a learning operation, the bases used by the orthogonal transform unit 14 and the inverse orthogonal transform unit 54 will be described. FIG. 19 is a flowchart showing the basis learning operation; the base generation unit performs the processing shown in FIG. 19 using images prepared for learning and generates the bases. As the learning images, as many different images as possible are used so that the learning is not biased by the content of the images.
The series of processing described in the specification can be executed by hardware, by software, or by a combined configuration of both. When the processing is executed by software, a program in which the processing sequence is recorded is installed in a memory in a computer incorporated in dedicated hardware and executed. Alternatively, the program can be installed and executed on a general-purpose computer capable of executing various kinds of processing.
In the above, the H.264/AVC scheme was used as the encoding scheme/decoding scheme, but the present invention can also be applied to image encoding apparatuses/image decoding apparatuses that use other encoding/decoding schemes.
Claims (16)
- 1. An image decoding apparatus that decodes image data from an encoded bitstream generated by orthogonally transforming, for each transform block, prediction error data that is the error between the image data and predicted image data and processing the coefficient data after the orthogonal transformation, the image decoding apparatus comprising: a data processing unit that processes the encoded bitstream to obtain the coefficient data after the orthogonal transformation and encoding parameter information; an inverse orthogonal transform unit that obtains prediction error data by performing inverse orthogonal transformation of the coefficient data using a basis set in advance according to the position of the transform block within the macroblock indicated by the encoding parameter information; a predicted image data generation unit that generates the predicted image data; and an addition unit that decodes the image data by adding the predicted image data generated by the predicted image data generation unit to the prediction error data obtained by the inverse orthogonal transform unit.
- 2. The image decoding apparatus according to claim 1, wherein the inverse orthogonal transform unit performs the inverse orthogonal transformation using a basis set in advance according to the position of the transform block and the prediction mode indicated by the encoding parameter information.
- 3. The image decoding apparatus according to claim 2, wherein, when a macroblock contains a plurality of transform blocks according to the encoding parameter information, the inverse orthogonal transform unit performs the inverse orthogonal transformation, using a basis set in advance according to the prediction mode, on the coefficient data obtained by further orthogonally transforming the lowest frequency component coefficient data of each transform block included in the macroblock.
- 4. The image decoding apparatus according to claim 2, wherein the basis used by the inverse orthogonal transform unit is the inverse matrix of the basis used when the prediction error data is orthogonally transformed for each transform block.
- 5. The image decoding apparatus according to claim 1, wherein the inverse orthogonal transform unit performs an inverse Karhunen-Loève transform using the basis.
- 6. An image decoding method for decoding image data from an encoded bitstream generated by orthogonally transforming, for each transform block, prediction error data that is the error between the image data and predicted image data and processing the coefficient data after the orthogonal transformation, the image decoding method comprising: a data processing step of processing the encoded bitstream to obtain the coefficient data after the orthogonal transformation and encoding parameter information; an inverse orthogonal transform step of obtaining a prediction error by performing inverse orthogonal transformation of the coefficient data using a basis set in advance according to the position of the transform block within the macroblock indicated by the encoding parameter information; a predicted image data generation step of generating the predicted image data; and an addition step of decoding the image data by adding the generated predicted image data to the prediction error obtained in the inverse orthogonal transform step.
- 7. A program for causing a computer to execute image decoding in which image data is decoded from an encoded bitstream generated by orthogonally transforming, for each transform block, prediction error data that is the error between the image data and predicted image data and processing the coefficient data after the orthogonal transformation, the program causing the computer to execute: a data processing procedure of processing the encoded bitstream to obtain the coefficient data after the orthogonal transformation and encoding parameter information; an inverse orthogonal transform procedure of obtaining a prediction error by performing inverse orthogonal transformation of the coefficient data using a basis set in advance according to the position of the transform block within the macroblock indicated by the encoding parameter information; a predicted image data generation procedure of generating the predicted image data; and an addition procedure of decoding the image data by adding the generated predicted image data to the prediction error obtained in the inverse orthogonal transform procedure.
- 8. An image encoding apparatus that encodes image data, comprising: a prediction unit that generates predicted image data of the image data; a subtraction unit that generates prediction error data that is the error between the image data and the predicted image data; an orthogonal transform unit that performs orthogonal transformation of the prediction error for each transform block, using a basis set in advance according to the position of the transform block within a macroblock; and a data processing unit that processes the output data of the orthogonal transform unit to generate an encoded bitstream.
- 9. The image encoding apparatus according to claim 8, wherein the orthogonal transform unit performs the orthogonal transformation using a basis set in advance according to the position of the transform block and the prediction mode used when the prediction unit generated the predicted image data.
- 10. The image encoding apparatus according to claim 9, wherein, when the macroblock contains a plurality of transform blocks, the orthogonal transform unit performs orthogonal transformation, using a basis set in advance according to the prediction mode, on the block made up of the lowest frequency component coefficients obtained by the orthogonal transformation of the transform blocks included in the macroblock.
- 11. The image encoding apparatus according to claim 9, wherein the basis used by the orthogonal transform unit is an eigenvector corresponding to an eigenvalue of a matrix calculated, using a plurality of images prepared in advance, from the prediction error data in each transform block for each macroblock size, transform block size, position of the transform block within the macroblock, and prediction mode.
- 12. The image encoding apparatus according to claim 11, wherein the bases used by the orthogonal transform unit are grouped according to the distance between the bases.
- 13. The image encoding apparatus according to claim 11, wherein the bases used by the orthogonal transform unit are grouped according to the distance from reference pixels.
- 14. The image encoding apparatus according to claim 8, wherein the orthogonal transform unit performs a Karhunen-Loève transform using the basis.
- 15. An image encoding method for encoding image data, comprising: a predicted image data generation step of generating predicted image data of the image data; a subtraction step of generating prediction error data that is the error between the image data and the predicted image data; and an orthogonal transform step of performing orthogonal transformation of the prediction error for each transform block, using a basis set in advance according to the position of the transform block within a macroblock.
- 16. A program for causing a computer to execute encoding of image data, the program causing the computer to execute: a predicted image data generation procedure of generating predicted image data of the image data; a subtraction procedure of generating prediction error data that is the error between the image data and the predicted image data; and an orthogonal transform procedure of performing orthogonal transformation of the prediction error for each transform block, using a basis set in advance according to the position of the transform block within a macroblock.