WO2014088316A2

WO2014088316A2 - Video encoding and decoding method, and apparatus using same

Info

Publication number: WO2014088316A2
Application number: PCT/KR2013/011165
Authority: WO
Inventors: 심동규; 조현호
Original assignee: 인텔렉추얼 디스커버리 주식회사
Priority date: 2012-12-04
Filing date: 2013-12-04
Publication date: 2014-06-12
Also published as: WO2014088316A3

Abstract

The present invention relates to a method and apparatus for referencing reconstructed pictures in a scalable video codec. The decoding method comprises the steps of: selecting a reference point in a reference layer; selecting an interpolation filter; performing interpolation; and reconstructing the reference picture list of an enhancement layer using the interpolated picture.

Description

Video encoding and decoding method, apparatus using same

The present invention relates to an image processing technique, and more particularly, to a method and apparatus for encoding / decoding an enhancement layer by interpolating a picture of a reference layer in a scalable video codec and using the same as a reference picture of the enhancement layer.

Recently, as demand for high-resolution, high-quality video has increased, a need has arisen for highly efficient video compression technology for next-generation video services. In response to this market demand, MPEG and VCEG formed the Joint Collaborative Team on Video Coding (JCT-VC) in January 2010, and in January 2013 established, through JCT-VC, a next-generation video standard called High Efficiency Video Coding (HEVC). In terms of subjective picture quality, HEVC improves compression efficiency by about 50% or more over the H.264/AVC High profile, previously known as the video standard with the highest compression efficiency. In addition, HEVC can effectively support 4K-UHD and 8K-UHD resolution video, and because the basic coding block is variable in size, it can support a wider range of video resolutions than conventional video compression standards.

HEVC standardization for the base layer was completed in January 2013 under the name HEVC version 1, and an HEVC-based scalable video compression standard and an HEVC-based multiview video compression standard are planned to be developed by 2014. HEVC is similar in structure to conventional video codecs such as H.264/AVC, but because it additionally uses new coding techniques, an HEVC-based scalable video compression technique needs inter-layer prediction techniques that take these additions into account.

An object of the present invention is to provide a method and apparatus for improving the encoding performance of an enhancement layer in an HEVC-based scalable video compression codec by interpolating a reconstructed picture of a reference layer according to the resolution of the enhancement layer and adding it to the reference picture list of the enhancement layer.

An inter-layer reconstructed picture reference method according to an embodiment of the present invention for solving the above problem includes: selecting a picture to be interpolated in a reference layer; performing interpolation on the selected picture through a plurality of interpolators; and adding the interpolated pictures to the reference picture list of the enhancement layer.

An inter-layer reconstructed picture reference apparatus according to an embodiment of the present invention includes: a unit for selecting a picture to be interpolated in a reference layer; a unit for performing interpolation on the selected picture through a plurality of interpolators; and a unit for adding the interpolated pictures to the reference picture list of the enhancement layer.

An inter-layer reconstructed picture reference method according to a second embodiment of the present invention for solving the above problem includes: selecting a plurality of interpolation points in a picture of a reference layer; performing interpolation, through an interpolation unit, on the input picture at each selected point; and adding the interpolated pictures to the reference picture list of the enhancement layer.

An inter-layer reconstructed picture reference apparatus according to an embodiment of the present invention includes: a unit for selecting a plurality of interpolation points in a picture of a reference layer; a unit for performing interpolation, through an interpolation unit, on the input picture at each selected point; and a unit for adding the interpolated pictures to the reference picture list of the enhancement layer.

An inter-layer reconstructed picture reference method according to a third embodiment of the present invention for solving the above problem includes: selecting a plurality of interpolation points in a picture of a reference layer; performing interpolation through different interpolators according to the input picture at each selected point; and adding the interpolated pictures to the reference picture list of the enhancement layer.

An inter-layer reconstructed picture reference apparatus according to an embodiment of the present invention includes: a unit for selecting a plurality of interpolation points in a picture of a reference layer; a unit for performing interpolation through different interpolators according to the input picture at each selected point; and a unit for adding the interpolated pictures to the reference picture list of the enhancement layer.

According to an embodiment of the present invention, the inter-layer reconstructed picture reference encoding/decoding apparatus generates a plurality of reference pictures having different characteristics by interpolating, through a plurality of interpolators, a picture of the reference layer to which the sample adaptive offset has been applied. The plurality of reference pictures generated from the reference layer may be inserted into the reference picture list of the enhancement layer and used for inter prediction, improving the encoding performance of the enhancement layer.

According to an embodiment of the present invention, the inter-layer reconstructed picture reference encoding/decoding apparatus generates interpolated pictures having different characteristics by interpolating a reference picture of the reference layer at a plurality of interpolation points. By using these pictures in various combinations in the reference picture list of the enhancement layer, the encoding performance of the enhancement layer can be improved.

According to an embodiment of the present invention, the inter-layer reconstructed picture reference encoding/decoding apparatus selects a plurality of interpolation points for a reference picture of the reference layer and, taking the characteristics of each interpolation point into account, performs interpolation through a plurality of interpolators having different characteristics, thereby generating a plurality of reference pictures having different characteristics. The generated plurality of reference pictures may be inserted into the reference picture list of the enhancement layer and used for inter prediction, improving the encoding performance of the enhancement layer.

FIG. 1 is a block diagram illustrating a configuration of a scalable video encoder.

FIG. 2 is a block diagram illustrating a configuration of an image decoding apparatus according to an embodiment of the present invention.

FIG. 3 is a block diagram illustrating a configuration of an image decoding apparatus according to an embodiment of the present invention.

FIG. 4 is a block diagram illustrating a configuration of a video decoding apparatus according to an embodiment of the present invention.

FIG. 5 is a conceptual diagram for describing adding pictures interpolated in a reference layer to a reference picture list of an enhancement layer in an image encoding/decoding apparatus according to the present invention.

FIG. 6 is a conceptual diagram for describing adding pictures interpolated in a reference layer to a reference picture list of an enhancement layer when decoding a random access picture in the image encoding/decoding apparatus according to the present invention.

FIG. 7 is a conceptual diagram illustrating an example in which an image encoding apparatus to which the present invention is applied performs inter-layer intra prediction in an enhancement layer using pictures interpolated in a reference layer, derives a difference coefficient in the enhancement layer, and encodes it.

FIG. 8 is a conceptual diagram illustrating an example in which an image decoding apparatus to which the present invention is applied performs inter-layer intra prediction in an enhancement layer using pictures interpolated in a reference layer and decodes a block using the reconstructed difference coefficients of the enhancement layer.

FIG. 9 is a flowchart schematically illustrating a method of constructing a reference picture list of an enhancement layer and performing inter-layer prediction in the enhancement layer according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the drawings. In describing the embodiments, where a detailed description of a related well-known configuration or function could obscure the gist of this specification, that description is omitted.

When a component is said to be "connected" or "coupled" to another component, it should be understood that it may be directly connected or coupled to that other component, or that another component may be present in between. In addition, describing the present invention as "including" a specific configuration does not exclude other configurations; it means that additional configurations may be included in the practice of the present invention or within the scope of its technical idea.

Terms such as first and second may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.

In addition, the components shown in the embodiments of the present invention are shown independently to represent different characteristic functions; this does not mean that each component consists of separate hardware or a single software unit. In other words, each component is listed as a separate component for convenience of description; at least two of the components may be combined into one component, or one component may be divided into a plurality of components to perform a function. Integrated and separate embodiments of the components are also included within the scope of the present invention as long as they do not depart from the spirit of the invention.

In addition, some of the components may not be essential components that perform essential functions of the present invention, but optional components that merely improve performance. The present invention can be implemented with only the components essential to realizing its essence, excluding the components used to improve performance, and a structure including only the essential components, excluding the optional performance-improving components, is also included in the scope of the present invention.

FIG. 1 is a block diagram illustrating a configuration of a scalable video encoder.

Referring to FIG. 1, the scalable video encoder provides spatial scalability, temporal scalability, and quality (SNR) scalability. For spatial scalability, a multi-layer structure using upsampling is used, and for temporal scalability, a hierarchical B picture structure is used. For quality scalability, either only the quantization coefficient is changed within the same structure as used for spatial scalability, or a progressive coding method for the quantization error is used.

The input video 110 is down-sampled through spatial decimation 115. The down-sampled image 120 is used as the input of the reference layer, and to code the coding blocks in a picture of the reference layer efficiently, either intra prediction through the intra prediction unit 135 or inter prediction through the motion compensation unit 130 is used. The difference coefficient, which is the difference between the original block to be encoded and the prediction block generated by the motion compensation unit 130 or the intra prediction unit 135, undergoes a discrete cosine transform or an integer transform in the transform unit 140. The transformed difference coefficient is quantized in the quantization unit 145, and the quantized transformed difference coefficient is entropy coded by the entropy encoding unit 150. To generate prediction values for use in adjacent blocks or adjacent pictures, the quantized transformed difference coefficient is restored to a difference coefficient through the inverse quantization unit 152 and the inverse transform unit 154. Because of the error introduced in the quantization unit 145, the restored difference coefficient may not match the difference coefficient that was input to the transform unit 140. The restored difference coefficient is added to the prediction block previously generated by the motion compensation unit 130 or the intra prediction unit 135 to reconstruct the pixel values of the block currently being encoded. The reconstructed block passes through the in-loop filter 156. When all blocks in the picture have been reconstructed, the reconstructed picture is input to the reconstructed picture buffer 158 and used for inter prediction in the reference layer.
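The coding loop of the reference layer can be illustrated with a minimal sketch. It assumes a one-dimensional block of four samples, an orthonormal DCT-II standing in for the transform unit 140, and a uniform scalar quantizer with a hypothetical step size `qstep`; none of these specifics are taken from this specification. The sketch only shows why the restored difference coefficient can differ from the one that entered the transform unit.

```python
import math

def dct(block):
    """Orthonormal DCT-II of a 1-D block (stand-in for transform unit 140)."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out

def idct(coeffs):
    """Inverse of dct() (stand-in for inverse transform unit 154)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def encode_block(original, prediction, qstep):
    """Return (quantized levels, reconstructed block) for one block.

    qstep is a hypothetical uniform quantizer step, not a value from the text.
    """
    # Difference between the original block and the prediction block
    residual = [o - p for o, p in zip(original, prediction)]
    # Transform, then quantize (quantization unit 145); this is the lossy step
    levels = [round(c / qstep) for c in dct(residual)]
    # Inverse quantize and inverse transform (units 152 and 154)
    restored = idct([lv * qstep for lv in levels])
    # Add the restored difference back onto the prediction block
    recon = [p + round(r) for p, r in zip(prediction, restored)]
    return levels, recon

original = [52, 55, 61, 66]
prediction = [50, 50, 60, 60]
levels, recon = encode_block(original, prediction, qstep=4)
```

With a very small step the reconstruction matches the original block, while a very large step quantizes every coefficient to zero and the reconstruction collapses to the prediction block.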

In the enhancement layer, the input video 110 itself is used as the input and encoded. As in the reference layer, to code the coding blocks in a picture efficiently, inter prediction through the motion compensation unit 172 or intra prediction through the intra prediction unit 170 is performed and an optimal prediction block is generated. The block to be encoded in the enhancement layer is predicted from the prediction block generated by the motion compensation unit 172 or the intra prediction unit 170, and as a result a difference coefficient is generated in the enhancement layer. As in the reference layer, the difference coefficient of the enhancement layer is encoded through a transform unit, a quantization unit, and an entropy encoding unit. In a multi-layer structure such as that of FIG. 1, encoded bits are generated in each layer, and the multiplexer 192 combines them into a single bitstream 194.

Although each of the multiple layers may be independently encoded in FIG. 1, since the input video of the lower layer is down-sampled from the video of the upper layer, it has very similar characteristics. Therefore, when the reconstructed pixel values, motion vectors, and residual signals of the lower layer video are used in the enhancement layer, encoding efficiency may be increased.

In FIG. 1, inter-layer intra prediction 162 reconstructs an image of the reference layer, interpolates the reconstructed image 180 to the image size of the enhancement layer, and uses it as a reference image. When reconstructing the image of the reference layer, either frame-by-frame decoding of the reference image or, to reduce complexity, block-by-block decoding may be used. In particular, because decoding complexity is high when the reference layer is encoded in the inter prediction mode, H.264/SVC allows this inter-layer prediction only when the reference layer is encoded in the intra prediction mode. The image 180 reconstructed in the reference layer is input to the intra prediction unit 170 of the enhancement layer, which improves coding efficiency compared with using neighboring pixel values within the picture of the enhancement layer.

In FIG. 1, inter-layer motion prediction 160 lets the enhancement layer refer to motion information 185 of the reference layer, such as motion vectors or reference frame indices. In particular, because motion information accounts for a large share of the bits when video is encoded at a low bit rate, referring to this information from the reference layer improves the coding efficiency of the enhancement layer.

In FIG. 1, inter-layer difference coefficient prediction 164 predicts the difference coefficient of the enhancement layer from the difference coefficient 190 decoded in the reference layer. This allows the difference coefficient of the enhancement layer to be encoded more efficiently. Depending on the encoder implementation, the difference coefficient 190 decoded in the reference layer may be input to the motion compensation unit 172 of the enhancement layer, so that an optimal motion vector is derived during motion estimation of the enhancement layer in consideration of the decoded difference coefficient 190 of the reference layer.

Referring to FIG. 2, the image decoding apparatus includes a portion for decoding a reference layer and a portion for decoding an enhancement layer. The portion 205 for decoding the reference layer includes an entropy decoding unit 210, an inverse quantization unit 211, an inverse transform unit 212, a motion compensation unit 213, an intra prediction unit 214, a deblocking filter unit 215, a sample adaptive offset unit 216, a reconstructed picture buffer 217, and the like.

In the scalable video codec, the bitstreams of the reference layer and the enhancement layer are combined into a single bitstream. When the scalable video decoder decodes only the reference layer, the demultiplexer 200 extracts only the bitstream of the reference layer from the received bitstream and inputs it to the video decoder of the reference layer. From the input bitstream, the syntax elements encoded with CABAC or VLC are decoded by the entropy decoding unit 210. The entropy-decoded coefficients are inverse quantized in the inverse quantization unit 211, and the inverse-quantized coefficients are inverse transformed in the inverse transform unit 212 to restore the difference coefficient values. When the block to be decoded was encoded in the inter prediction mode, the motion compensation unit 213 generates a prediction block from a reference picture stored in the reconstructed picture buffer 217. When the block to be decoded was encoded in the intra prediction mode, the intra prediction unit 214 generates a prediction block according to the intra prediction mode using already decoded pixel values around the current block. The prediction block generated through inter or intra prediction is added to the difference coefficient values in the pixel domain to decode the block. Deblocking filtering is performed on the decoded block, slice, or picture through the deblocking filter unit 215. The deblocking filter of HEVC is basically applied to prediction block (PB) and transform block (TB) boundaries. For the decoded block to which deblocking filtering has been applied, the sample adaptive offset unit 216 selects either a band offset or an edge offset and compensates each pixel with an offset value according to its class. The reconstructed picture that has passed through the deblocking filter unit 215 and the sample adaptive offset unit 216 is stored in the reconstructed picture buffer 217 for inter prediction of subsequent pictures. When the video of the reference layer is played back, the picture 220 that has passed through the sample adaptive offset unit 216 is copied to the video output buffer and then output as video.
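The two sample adaptive offset modes mentioned above can be sketched as follows. This is a simplified one-dimensional illustration, not the HEVC reference process: it assumes 8-bit samples, 32 equal-width bands for the band offset, and a single horizontal direction for the edge offset. The function names and the category labels (-2, -1, 1, 2) are hypothetical.

```python
def sao_band_offset(samples, band_start, offsets, bit_depth=8):
    """Add offsets[i] to samples that fall in band band_start + i."""
    shift = bit_depth - 5                 # 2**5 = 32 equal-width bands
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        idx = (s >> shift) - band_start
        if 0 <= idx < len(offsets):
            s = min(max(s + offsets[idx], 0), max_val)
        out.append(s)
    return out

def edge_category(left, cur, right):
    """Classify a sample against its two neighbours along one direction.

    -2: local minimum, -1: lower than one neighbour and equal to the other,
     1: higher than one neighbour and equal to the other, 2: local maximum,
     0: none of these (no offset applied).
    """
    sign = lambda d: (d > 0) - (d < 0)
    return sign(cur - left) + sign(cur - right)

def sao_edge_offset(samples, offsets, bit_depth=8):
    """Apply a per-category offset; border samples are left untouched."""
    max_val = (1 << bit_depth) - 1
    out = list(samples)
    for i in range(1, len(samples) - 1):
        # Classification always uses the unmodified input samples
        cat = edge_category(samples[i - 1], samples[i], samples[i + 1])
        if cat != 0:
            out[i] = min(max(samples[i] + offsets.get(cat, 0), 0), max_val)
    return out
```

The band offset corrects samples by intensity range, while the edge offset corrects them by local shape, which is why the text describes the offset as depending on the class of each pixel.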

To code pictures of the enhancement layer efficiently, the decoding apparatus according to an embodiment of the present invention interpolates the reconstructed picture 220 of the reference layer and then uses it as a reference picture of the enhancement layer. Like the image decoding apparatus of the reference layer, the image decoding apparatus for the enhancement layer includes an entropy decoding unit 250, an inverse quantization unit 251, an inverse transform unit 252, a motion compensation unit 253, an intra prediction unit 254, a deblocking filter unit 255, a sample adaptive offset unit 256, and a reconstructed picture buffer 257. The image decoding apparatus further includes interpolation unit A 240 and interpolation unit B 230, which have different characteristics, to interpolate the reconstructed picture of the reference layer.

When the enhancement layer is decoded in the scalable video codec, the reference layer must be decoded first, because the enhancement layer refers to the reference layer. After the reference layer has been decoded, the syntax elements of the enhancement layer encoded with CABAC or VLC are decoded by the entropy decoding unit 250 from the enhancement layer bitstream extracted by the demultiplexer 200. The entropy-decoded coefficients are inverse quantized in the inverse quantization unit 251, and the inverse-quantized coefficients are inverse transformed in the inverse transform unit 252 to restore the difference coefficient values. When the block of the enhancement layer to be decoded was encoded in the inter prediction mode, the motion compensation unit 253 generates a prediction block from a picture stored in the reconstructed picture buffer 257. When the block of the enhancement layer to be decoded was encoded in the intra prediction mode, the intra prediction unit 254 generates a prediction block according to the intra prediction mode using already decoded pixel values around the current block. The prediction block generated through inter or intra prediction is added to the difference coefficient values in the pixel domain to decode the block. Deblocking filtering is performed on the decoded block, slice, or picture through the deblocking filter unit 255. The deblocking filter of HEVC is basically applied to prediction block (PB) and transform block (TB) boundaries. For the decoded block to which deblocking filtering has been applied, the sample adaptive offset unit 256 selects either a band offset or an edge offset and compensates each pixel with an offset value according to its class.

The image decoding apparatus decodes a picture of the reference layer, interpolates it according to the resolution of the enhancement layer, and uses the interpolated picture as a prediction value in inter or intra prediction of the enhancement layer. The image decoding apparatus uses a plurality of interpolation filters to interpolate the picture 220 decoded in the reference layer. Interpolation unit A 240 interpolates, to the resolution of the enhancement layer, the picture that was decoded in the reference layer and then processed by the in-loop filters, namely the deblocking filter 215 and the sample adaptive offset 216. Interpolation unit A 240 may use a DCT-IF based interpolation filter, an adaptive filter coefficient based interpolation filter, or a fixed coefficient based interpolation filter. Interpolation unit B 230 receives the same input picture as interpolation unit A 240 and likewise performs interpolation according to the resolution of the enhancement layer. Interpolation unit B 230 may use a DCT-IF based interpolation filter, an adaptive filter coefficient based interpolation filter, or a fixed coefficient based interpolation filter with fewer filter taps than interpolation unit A 240.
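The contrast between a longer-tap and a shorter-tap interpolator can be sketched for 2x upsampling of a single row of samples. The specification does not give filter coefficients; as an assumption, the sketch borrows the 8-tap and 4-tap half-sample DCT-IF coefficients that HEVC uses for luma and chroma motion compensation, and it replicates border samples. The names `LONG_TAPS`, `SHORT_TAPS`, and `upsample_2x` are hypothetical.

```python
# Half-sample DCT-IF coefficients borrowed from HEVC motion compensation
# (an assumption; the specification does not fix the coefficients).
LONG_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)   # 8 taps, sum 64
SHORT_TAPS = (-4, 36, 36, -4)                  # 4 taps, sum 64

def upsample_2x(row, taps, bit_depth=8):
    """Double the length of a 1-D row of samples using a half-sample filter."""
    n = len(row)
    half = len(taps) // 2
    max_val = (1 << bit_depth) - 1
    out = []
    for i in range(n):
        out.append(row[i])                     # integer position: copy the sample
        acc = 0
        for t, c in enumerate(taps):
            # Taps cover samples i-half+1 .. i+half; replicate border samples
            j = min(max(i - half + 1 + t, 0), n - 1)
            acc += c * row[j]
        # Round, divide by 64, and clip to the valid sample range
        out.append(min(max((acc + 32) >> 6, 0), max_val))
    return out

step = [50, 50, 50, 50, 150, 150, 150, 150]
picture_a = upsample_2x(step, LONG_TAPS)       # higher-efficiency interpolator
picture_b = upsample_2x(step, SHORT_TAPS)      # lower-complexity interpolator
```

Both filters reproduce flat regions exactly, but near the step edge they produce different half-sample values, so the two interpolated pictures offer the enhancement layer predictors with different characteristics.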

To interpolate pictures of the reference layer, the image decoding apparatus uses the high-efficiency interpolation unit A 240 and the low-complexity interpolation unit B 230, and is characterized by generating different prediction images from the same picture of the reference layer. Although this figure describes only two interpolation units, one representing high efficiency and one representing low complexity, three or more interpolation units having different characteristics may be applied. In that case, information about which interpolation unit to select may be explicitly signaled in the enhancement layer or may be derived from context information of the enhancement layer or the reference layer. The pictures 260 and 270 interpolated by interpolation unit A 240 and interpolation unit B 230 are input to the reconstructed picture buffer 257 of the enhancement layer and used for motion compensation in the inter prediction mode of the enhancement layer. The picture 260 interpolated by interpolation unit A 240 may additionally be used as an inter-layer intra prediction value in the intra prediction unit of the enhancement layer.
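How interpolated pictures might enter the reference picture list of the enhancement layer can be sketched as below. The ordering rule used here (temporal references nearest in picture order count first, followed by the inter-layer pictures of the current picture) and all names (`Picture`, `build_reference_list`, `max_refs`) are assumptions made for illustration; the text above only states that the interpolated pictures are placed in the reconstructed picture buffer and used for motion compensation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Picture:
    poc: int          # picture order count
    source: str       # "EL" for enhancement-layer pictures, else an interpolator label

def build_reference_list(current_poc, el_pictures, interlayer_pictures, max_refs):
    """Hypothetical list construction: temporal references, then inter-layer pictures."""
    # Temporal references of the enhancement layer, nearest in POC first
    temporal = sorted((p for p in el_pictures if p.poc != current_poc),
                      key=lambda p: abs(p.poc - current_poc))
    # Interpolated reference-layer pictures belonging to the current time instant
    interlayer = [p for p in interlayer_pictures if p.poc == current_poc]
    # Reserve slots so the inter-layer pictures are not pushed out of the list
    keep = max(max_refs - len(interlayer), 0)
    return (temporal[:keep] + interlayer)[:max_refs]

el = [Picture(0, "EL"), Picture(4, "EL"), Picture(8, "EL")]
il = [Picture(6, "A"), Picture(6, "B")]   # outputs of interpolation units A and B
ref_list = build_reference_list(6, el, il, max_refs=4)
```

In this sketch the enhancement layer can choose, per block, between a temporal reference and either interpolated picture simply by signaling a reference index.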

Referring to FIG. 3, the image decoding apparatus includes a portion for decoding a reference layer and a portion for decoding an enhancement layer. The portion 305 for decoding the reference layer may include an entropy decoding unit 310, an inverse quantization unit 311, an inverse transform unit 312, a motion compensation unit 313, an intra prediction unit 314, a deblocking filter unit 315, a sample adaptive offset unit 316, a reconstructed picture buffer 317, and the like.

In the scalable video codec, the bitstreams of the reference layer and the enhancement layer are combined into a single bitstream. When the scalable video decoder decodes only the reference layer, the demultiplexer 300 extracts only the bitstream of the reference layer from the received bitstream and inputs it to the video decoder of the reference layer. From the input bitstream, the syntax elements encoded with CABAC or VLC are decoded by the entropy decoding unit 310. The entropy-decoded coefficients are inverse quantized in the inverse quantization unit 311, and the inverse-quantized coefficients are inverse transformed in the inverse transform unit 312 to restore the difference coefficient values. When the block to be decoded was encoded in the inter prediction mode, the motion compensation unit 313 generates a prediction block from a reconstructed picture stored in the reconstructed picture buffer 317. When the block to be decoded was encoded in the intra prediction mode, the intra prediction unit 314 generates a prediction block according to the intra prediction mode using already decoded pixel values around the current block. The prediction block generated through inter or intra prediction is added to the difference coefficient values in the pixel domain to decode the block. Deblocking filtering is performed on the decoded block, slice, or picture through the deblocking filter unit 315. The deblocking filter of HEVC is basically applied to prediction block (PB) and transform block (TB) boundaries. For the decoded block to which deblocking filtering has been applied, the sample adaptive offset unit 316 selects either a band offset or an edge offset and compensates each pixel with an offset value according to its class. The reconstructed picture that has passed through the deblocking filter unit 315 and the sample adaptive offset unit 316 is stored in the reconstructed picture buffer 317 for inter prediction of subsequent pictures. When the video of the reference layer is played back, the picture 320 that has passed through the sample adaptive offset unit 316 is copied to the video output buffer and then output as video.

According to a second embodiment of the present invention, to code pictures of the enhancement layer efficiently, the decoding apparatus interpolates a reconstructed picture of the reference layer and then uses it as a reference picture of the enhancement layer. Like the image decoding apparatus of the reference layer, the image decoding apparatus for the enhancement layer includes an entropy decoding unit 350, an inverse quantization unit 351, an inverse transform unit 352, a motion compensation unit 353, an intra prediction unit 354, a deblocking filter unit 355, a sample adaptive offset unit 356, and a reconstructed picture buffer 357. The image decoding apparatus additionally includes an interpolation unit 330 for interpolating a reconstructed picture of the reference layer.

When the enhancement layer is decoded in the scalable video codec, the reference layer must be decoded first, because the enhancement layer refers to the reference layer. After the reference layer has been decoded, the syntax elements of the enhancement layer encoded with CABAC or VLC are decoded by the entropy decoding unit 350 from the enhancement layer bitstream extracted by the demultiplexer 300. The entropy-decoded coefficients are inverse quantized in the inverse quantization unit 351, and the inverse-quantized coefficients are inverse transformed in the inverse transform unit 352 to restore the difference coefficient values. When the block of the enhancement layer to be decoded was encoded in the inter prediction mode, the motion compensation unit 353 generates a prediction block from a picture stored in the reconstructed picture buffer 357. When the block of the enhancement layer to be decoded was encoded in the intra prediction mode, the intra prediction unit 354 generates a prediction block according to the intra prediction mode using already decoded pixel values around the current block. The prediction block generated through inter or intra prediction is added to the difference coefficient values in the pixel domain to decode the block. Deblocking filtering is performed on the decoded block, slice, or picture through the deblocking filter unit 355. The deblocking filter of HEVC is basically applied to prediction block (PB) and transform block (TB) boundaries. For the decoded block to which deblocking filtering has been applied, the sample adaptive offset unit 356 selects either a band offset or an edge offset and compensates each pixel with an offset value according to its class.

The image decoding apparatus decodes a picture of a reference layer and interpolates it according to the resolution of the enhancement layer, and then uses the interpolated picture as a prediction value in inter-screen or intra-picture prediction of the enhancement layer. The image decoding apparatus uses a plurality of interpolation points in interpolating a picture decoded in a reference layer. When a plurality of in-loop filters are used in the reference layer, the image decoding apparatus may include points before applying the deblocking filter 340, points after applying the deblocking filter 342, and points after applying the sample adaptive offset 345. It is used as an input point of the executive 330. The point 340 before the deblocking filter is applied has a feature that edges are relatively well preserved since the deblocking filter is not applied. However, when the quantization value is large, a deblocking phenomenon may exist in the image. After the deblocking filter is applied, the point 342 may have a small deblocking phenomenon, but edges may not be well preserved due to low-pass filtering. After the sample adaptive offset is applied, the point 345 may be similar in terms of the PSNR and the original image of the reconstructed point because the band offset and the edge offset may be applied in units of coding tree units (CTUs). The interpolator 330 selects two points from the picture 340 before applying the deblocking filter, the picture 342 after applying the picture, and the picture 345 applied up to the sample adaptive offset in the reference layer and interpolates the image according to the resolution of the enhancement layer. Do this. Information about selecting two interpolation points in the interpolator 330 may be explicitly informed by the enhancement layer. Two pictures interpolated at two points selected by the interpolator 330 are added to the reconstructed picture buffer 357 of the enhancement layer. 
The interpolated pictures added to the reconstructed picture buffer 357 of the enhancement layer are used to construct reference picture lists for L0, L1, or bi-directional prediction in the inter prediction process of the enhancement layer. One of the two interpolation points selected by the interpolator 330 may additionally be used in the intra prediction unit of the enhancement layer, and the selection information about that interpolation point may be explicitly signaled in the enhancement layer.
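
As a non-limiting illustration, the selection of two of the three interpolation points and the addition of the interpolated pictures to the reconstructed picture buffer may be sketched as follows. The point names, the function names, and the integer-factor nearest-neighbour filter are assumptions made only for this sketch; the description does not specify the interpolation filter or the form of the signaled selection information.

```python
# Hypothetical labels for the three interpolation points 340, 342 and 345.
TAP_POINTS = ("pre_deblock", "post_deblock", "post_sao")


def upsample_nearest(picture, scale):
    """Integer-factor nearest-neighbour upsampling (placeholder filter)."""
    return [
        [row[x // scale] for x in range(len(row) * scale)]
        for row in picture
        for _ in range(scale)
    ]


def add_interlayer_references(ref_layer_pictures, selected, scale, enh_buffer):
    """Interpolate the pictures at two selected points and append them
    to the enhancement-layer reconstructed picture buffer."""
    if len(selected) != 2 or len(set(selected)) != 2:
        raise ValueError("exactly two distinct interpolation points are required")
    for name in selected:
        if name not in TAP_POINTS:
            raise ValueError(f"unknown interpolation point: {name}")
        enh_buffer.append((name, upsample_nearest(ref_layer_pictures[name], scale)))
    return enh_buffer


# Usage: one tiny picture per point, 2x spatial scalability (assumed values).
pictures = {
    "pre_deblock": [[1, 2]],
    "post_deblock": [[3, 4]],
    "post_sao": [[5, 6]],
}
buffer_357 = add_interlayer_references(pictures, ("pre_deblock", "post_sao"), 2, [])
```

In this sketch the selection is passed in as a pair of labels, standing in for the selection information that the enhancement layer may signal explicitly.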

Referring to FIG. 4, the apparatus for decoding an image includes a portion for decoding a reference layer and a portion for decoding an enhancement layer. In the scalable video codec, the bitstreams of the reference layer and the enhancement layer are combined into a single bitstream. When the scalable video decoder decodes only the reference layer, the demultiplexer 400 extracts only the bitstream of the reference layer from the received bitstream and inputs it to the video decoder of the reference layer. From the input bitstream, the syntax elements encoded by CABAC or VLC are decoded by the entropy decoding unit 410. The entropy-decoded coefficients are inversely quantized by the inverse quantization unit 411, and the inversely quantized coefficients are inversely transformed by the inverse transformer 412 to restore the difference coefficient values. When the block to be decoded is in the inter prediction mode, the motion compensation unit 413 generates a prediction block from a picture stored in the reconstructed picture buffer 417. When the block to be decoded is encoded in the intra prediction mode, the intra predictor 414 generates a prediction block according to the intra prediction mode using the decoded pixel values around the block currently being decoded. The prediction block generated through inter or intra prediction is added to the difference coefficient values in the pixel domain to decode the block. Deblocking filtering is performed on the decoded block, slice, or picture by the deblocking filter 415. The deblocking filter of HEVC is basically applied at prediction block (PB) and transform block (TB) boundaries. For the decoded block to which deblocking filtering has been applied, the sample adaptive offset unit 416 selects either a band offset or an edge offset and compensates an offset value in units of pixels according to each class.
The reconstructed picture that has passed through the deblocking filter unit 415 and the sample adaptive offset unit 416 is stored in the reconstructed picture buffer 417 for use in inter prediction of subsequent pictures. When the video of the reference layer is played, the picture 420 that has passed through the sample adaptive offset unit 416 is copied to the video output buffer and then output as video.
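
As a non-limiting illustration of compensating an offset value per pixel according to its class, a simplified band-offset step in the style of HEVC may be sketched as follows. In HEVC band offset, the sample range is divided into 32 equal bands and offsets are applied to four consecutive bands starting at a signaled band position. The function and parameter names are assumptions for this sketch, and the edge offset is not shown.

```python
def sao_band_offset(samples, band_position, offsets, bit_depth=8):
    """Apply a band offset to a flat list of samples.

    samples       -- deblocked sample values
    band_position -- index (0..31) of the first of four consecutive bands
    offsets       -- four offsets, one per band
    """
    shift = bit_depth - 5              # 32 equal-width bands
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        k = ((s >> shift) - band_position) & 31   # band index relative to start
        if k < 4:
            s = min(max(s + offsets[k], 0), max_val)  # clip to the sample range
        out.append(s)
    return out


# Usage: with 8-bit samples, band 8 covers the values 64..71.
compensated = sao_band_offset([0, 64, 72, 95, 96, 255], 8, [2, -1, 3, 1])
```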

According to a third embodiment of the present invention, in order to effectively encode a picture of an enhancement layer, the decoding apparatus interpolates a reconstructed picture of the reference layer and then uses it as a reference picture of the enhancement layer. Like the image decoding apparatus for the reference layer, the image decoding apparatus for the enhancement layer includes an entropy decoding unit 450, an inverse quantization unit 451, an inverse transform unit 452, a motion compensator 453, an intra predictor 454, a deblocking filter 455, a sample adaptive offset unit 456, and a reconstructed picture buffer 457. The image decoding apparatus further includes interpolation unit A 440 and interpolation unit B 430, which have different characteristics, in order to interpolate the reconstructed picture of the reference layer.

When decoding the enhancement layer in the scalable video codec, the reference layer must be decoded first, because the enhancement layer refers to it. After the reference layer has been decoded, the syntax elements of the enhancement layer, encoded by CABAC or VLC, are decoded by the entropy decoding unit 450 from the enhancement-layer bitstream extracted by the demultiplexer 400. The entropy-decoded coefficients are inversely quantized by the inverse quantization unit 451, and the inversely quantized coefficients are inversely transformed by the inverse transform unit 452 to restore the difference coefficient values. When the block of the enhancement layer to be decoded is in the inter prediction mode, the motion compensation unit 453 generates a prediction block from a picture stored in the reconstructed picture buffer 457. When the block of the enhancement layer to be decoded is encoded in the intra prediction mode, the intra prediction unit 454 generates a prediction block according to the intra prediction mode using the decoded pixel values around the block currently being decoded. The prediction block generated through inter or intra prediction is added to the difference coefficient values in the pixel domain to decode the block. Deblocking filtering is performed on the decoded block, slice, or picture by the deblocking filter 455. The deblocking filter of HEVC is basically applied at prediction block (PB) and transform block (TB) boundaries. For the decoded block to which deblocking filtering has been applied, the sample adaptive offset unit 456 selects either a band offset or an edge offset and compensates an offset value in units of pixels according to each class.

The image decoding apparatus decodes a picture of the reference layer, interpolates it to the resolution of the enhancement layer, and then uses the interpolated picture as a prediction value in inter or intra prediction of the enhancement layer. In interpolating a picture decoded in the reference layer, the image decoding apparatus uses a plurality of interpolation units that take into account the interpolation points and the characteristics of each interpolation point. When a plurality of in-loop filters are used in the reference layer, the image decoding apparatus uses the point 425 after the deblocking filter and the point 420 after the sample adaptive offset as interpolation points. Because the two interpolation points have different characteristics depending on whether the sample adaptive offset has been applied, interpolation unit A 440 and interpolation unit B 430, which have different interpolation characteristics, are used.

The pictures interpolated by interpolation unit A 440 and interpolation unit B 430 are used to construct reference picture lists for L0, L1, or bi-directional prediction in the inter prediction process of the enhancement layer. The picture 460 interpolated by interpolation unit A 440 may additionally be used in the intra prediction unit of the enhancement layer.
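
As a non-limiting illustration, two interpolation units having different characteristics may be sketched as two 2x upsampling filters applied separably to rows and columns. The choice of a sample-repeating filter and a linear filter, and all function names, are assumptions for this sketch; the description does not state which filters the interpolation units use or which filter is paired with which interpolation point.

```python
def upsample2x_repeat(row):
    """Repeat each sample; keeps sharp transitions between samples."""
    return [v for v in row for _ in range(2)]


def upsample2x_linear(row):
    """Insert the rounded midpoint between neighbours; smooths transitions."""
    out = []
    for i, v in enumerate(row):
        nxt = row[min(i + 1, len(row) - 1)]   # repeat the last sample at the edge
        out.append(v)
        out.append((v + nxt + 1) // 2)
    return out


def interpolate_picture(picture, row_filter):
    """Apply a 1-D 2x filter along the rows, then along the columns."""
    rows = [row_filter(list(r)) for r in picture]
    cols = [row_filter(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# Usage: the same reference-layer picture through two different filters.
reference_picture = [[10, 20], [30, 40]]
picture_sharp = interpolate_picture(reference_picture, upsample2x_repeat)
picture_smooth = interpolate_picture(reference_picture, upsample2x_linear)
```

The two outputs have the same enhancement-layer size but differ in how they treat edges, which is the property that makes them useful as distinct reference pictures.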

FIG. 5 is a conceptual diagram for explaining adding pictures interpolated in a reference layer to a reference picture list of an enhancement layer in an image encoding / decoding apparatus according to the present invention.

Referring to FIG. 5, when encoding / decoding a picture of an enhancement layer, the image encoding / decoding apparatus to which the present invention is applied interpolates the picture 510 decoded at the same time in the reference layer to generate interpolation picture A 530 and interpolation picture B 520, which have different characteristics. How the interpolation pictures having different characteristics are generated may vary depending on the embodiment, as in the first, second, and third embodiments of the present invention.

The generated interpolation picture A 530 and the interpolation picture B 520 are added to the L0 list 540 and the L1 list 550 in the enhancement layer to enable prediction through various picture combinations in the prediction mode of the enhancement layer.

FIG. 6 is a conceptual diagram for describing adding pictures interpolated in a reference layer to a reference picture list of an enhancement layer when encoding / decoding a random access picture is performed in an image encoding / decoding apparatus according to the present invention.

Referring to FIG. 6, when encoding / decoding a picture of an enhancement layer, the image encoding / decoding apparatus to which the present invention is applied interpolates the picture 610 decoded at the same time in the reference layer to generate interpolation picture A 630 and interpolation picture B 620, which have different characteristics. The generated interpolation picture A 630 and interpolation picture B 620 may be randomly accessible pictures such as a clean random access (CRA) or instantaneous decoding refresh (IDR) picture of HEVC. Even when the inter-layer intra prediction technique is used in the corresponding picture, the interpolation pictures are added to the L0 list 640 and the L1 list 650 in the enhancement layer to enable prediction through various picture combinations in the enhancement layer.

FIG. 7 is a conceptual view illustrating encoding a block by applying an inter-layer intra prediction technique to a coding block of an enhancement layer in an image encoding apparatus to which the present invention is applied.

Referring to FIG. 7, the image encoding apparatus to which the present invention is applied may apply an inter-layer intra prediction technique using a picture interpolated in a reference layer when encoding a coding block of an enhancement layer. For the picture to be encoded in the enhancement layer, interpolation frame A 710 and interpolation frame B 720 having different characteristics may be generated from the picture 700 of the reference layer using a plurality of interpolation points or a plurality of interpolation filters. In inter-layer intra prediction, two prediction blocks 730 and 732 are generated at the positions in the plurality of interpolation pictures that correspond to the coding block being encoded in the enhancement layer. To serve as a single prediction block for the coding block of the enhancement layer, the two prediction blocks are combined into the final prediction block 735 through an operation such as an average or a weighted average. The coding block 740 of the enhancement layer and the generated prediction block 735 are input to the difference module 750, which calculates the difference coefficients. The calculated difference coefficients are encoded (760) through the transform, quantization, and entropy coding processes of the enhancement layer.
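
As a non-limiting illustration, combining the two prediction blocks by an average or a weighted average and then forming the difference coefficients may be sketched as follows. The function names, the integer weights, and the rounding rule are assumptions for this sketch; the description does not specify them.

```python
def combine_predictions(block_a, block_b, weight_a=1, weight_b=1):
    """Weighted average of two co-located prediction blocks, with rounding.
    Equal weights give the plain average."""
    total = weight_a + weight_b
    return [
        [(weight_a * a + weight_b * b + total // 2) // total
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block_a, block_b)
    ]


def difference_block(coding_block, prediction):
    """Per-pixel difference between the coding block and its prediction."""
    return [
        [c - p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(coding_block, prediction)
    ]


# Usage with 1x2 blocks standing in for blocks 730, 732 and 740 (assumed values).
prediction_735 = combine_predictions([[100, 104]], [[102, 108]])
differences = difference_block([[103, 105]], prediction_735)
```

The difference values would then go through the transform, quantization, and entropy coding stages, which are not shown here.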

FIG. 8 is a conceptual diagram illustrating decoding of a block by applying an inter-layer intra prediction technique to a coding block of an enhancement layer in an image decoding apparatus according to the present invention.

Referring to FIG. 8, an image decoding apparatus to which the present invention is applied may apply an inter-layer intra prediction technique using a picture interpolated in a reference layer when decoding a coding block of an enhancement layer. For the picture to be decoded in the enhancement layer, interpolation frame A 810 and interpolation frame B 820 having different characteristics may be generated from the picture 800 of the reference layer using a plurality of interpolation points or a plurality of interpolation filters. In inter-layer intra prediction, two prediction blocks 830 and 832 are generated at the positions in the plurality of interpolation pictures that correspond to the coding block being decoded in the enhancement layer. To serve as a single prediction block for the coding block of the enhancement layer, the two prediction blocks are combined into the final prediction block 835 through an operation such as an average or a weighted average. The difference coefficients reconstructed through entropy decoding, inverse quantization, and inverse transform in the enhancement layer and the prediction block generated from the reference layer are input to the reconstruction unit 850. The reconstruction unit 850 finally decodes the block 860 by adding, in units of pixels, the reconstructed difference coefficients of the enhancement layer and the prediction block generated from the reference layer.
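
As a non-limiting illustration, the reconstruction performed for one block on the decoding side may be sketched as follows. The plain rounded average, the clipping of the result to the sample range, and the function name are assumptions for this sketch; the description states only that the prediction block and the reconstructed differences are added in units of pixels.

```python
def decode_block(pred_a, pred_b, differences, bit_depth=8):
    """Average two co-located prediction blocks and add the reconstructed
    differences per pixel, clipping to the valid sample range."""
    max_val = (1 << bit_depth) - 1
    decoded = []
    for row_a, row_b, row_d in zip(pred_a, pred_b, differences):
        row = []
        for a, b, d in zip(row_a, row_b, row_d):
            prediction = (a + b + 1) >> 1          # rounded average
            row.append(min(max(prediction + d, 0), max_val))
        decoded.append(row)
    return decoded


# Usage with 1x2 blocks standing in for blocks 830 and 832 (assumed values).
block_860 = decode_block([[100, 254]], [[102, 254]], [[2, 5]])
```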

FIG. 9 is a flowchart schematically illustrating a method of constructing a reference picture list of an enhancement layer and performing inter-layer prediction in the enhancement layer according to an embodiment of the present invention. The method of FIG. 9 may be performed by the encoding apparatus and the decoding apparatus of FIGS. 1 to 4 described above. In FIG. 9, the method is described as being performed by the decoding apparatus for convenience of description, but it may also be performed by the encoding apparatus.

Referring to FIG. 9, when inter-layer prediction is applied in decoding a current picture of an enhancement layer, the decoding apparatus may generate a reference picture used for the inter-layer prediction (S900).

The reference picture may be a picture decoded in the reference layer at the same time as the current picture of the enhancement layer. The picture decoded in the reference layer may be used as a reference picture after resampling according to the resolution of the enhancement layer through interpolation or the like.

For example, the decoding apparatus may generate the reference picture A and the reference picture B by interpolating the decoded picture of the reference layer according to the resolution of the enhancement layer. Here, the reference picture A and the reference picture B may be interpolation pictures generated according to interpolation characteristics of the interpolation unit as in the first, second, and third embodiments of the present invention.

The decoding apparatus may construct a reference picture list of the enhancement layer by using the reference picture (S910).

For example, when the reference picture A and the reference picture B are generated as described above, the decoding apparatus may use the reference picture A and the reference picture B to construct reference picture lists such as L0 and L1 used in inter prediction or inter-layer prediction. In this case, reference pictures in a reference picture list may be specified by a reference picture index (refIdx) value. For example, the first reference picture in the reference picture list may have a reference picture index (refIdx) of 0, and the second reference picture in the reference picture list may have a reference picture index (refIdx) of 1.

The reference picture lists L0 and L1 may be configured as shown in FIGS. 5 and 6 described above.

For example, as shown in FIG. 5, the reference picture list L0 may be configured by adding a reference picture having a POC smaller than the picture order count (POC) of the current picture at reference picture index (refIdx) 0, the reference picture A at reference picture index (refIdx) 1, and a reference picture having a POC larger than the POC of the current picture at reference picture index (refIdx) 2. The reference picture list L1 may be configured by adding a reference picture having a POC larger than the POC of the current picture at reference picture index (refIdx) 0, the reference picture B at reference picture index (refIdx) 1, and a reference picture having a POC smaller than the POC of the current picture at reference picture index (refIdx) 2.

In addition, the reference picture list L0 may be configured by further adding a long-term reference picture having a large POC difference from the current picture at reference picture index (refIdx) 3 and the reference picture B at reference picture index (refIdx) 4. Likewise, the reference picture list L1 may be configured by further adding a long-term reference picture having a large POC difference from the current picture at reference picture index (refIdx) 3 and the reference picture A at reference picture index (refIdx) 4.
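
As a non-limiting illustration, the example list layout described above may be sketched as follows, with each picture's position in a list serving as its reference picture index (refIdx). The function name and the string labels are assumptions for this sketch.

```python
def build_reference_lists(before, after, long_term, ref_a, ref_b):
    """Return (L0, L1) following the example layout for refIdx 0 to 4.

    before    -- reference picture with a POC smaller than the current POC
    after     -- reference picture with a POC larger than the current POC
    long_term -- long-term reference picture
    ref_a     -- interpolated reference picture A
    ref_b     -- interpolated reference picture B
    """
    l0 = [before, ref_a, after, long_term, ref_b]
    l1 = [after, ref_b, before, long_term, ref_a]
    return l0, l1


# Usage with placeholder labels instead of real picture buffers.
l0, l1 = build_reference_lists(
    "smaller-POC picture", "larger-POC picture", "long-term picture",
    "reference picture A", "reference picture B",
)
ref_idx_of_a_in_l0 = l0.index("reference picture A")
```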

The decoding apparatus may perform inter-layer prediction on the current picture of the enhancement layer using the reference picture lists L0 and L1 (S920).

For example, the decoding apparatus may receive information (reference picture index information) about a reference picture used for inter-layer prediction from the encoding apparatus. Based on this information, the decoding apparatus can obtain a reference picture of the current picture from the reference picture lists L0 and L1, and can generate a prediction value of a prediction block in the current picture using the reference picture.
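
As a non-limiting illustration, obtaining a reference picture from a reference picture list by its reference picture index and reading a co-located prediction block from it may be sketched as follows. The function name, the dictionary holding the two lists, and the use of zero displacement between the current block and the reference block are assumptions for this sketch.

```python
def prediction_block_from_list(ref_lists, list_name, ref_idx, x, y, width, height):
    """Look up a reference picture by list name and refIdx, then copy the
    block at the same position (zero displacement assumed)."""
    picture = ref_lists[list_name][ref_idx]
    return [row[x:x + width] for row in picture[y:y + height]]


# Usage: a 4x4 stand-in for an interpolated reference picture at refIdx 1 of L0.
reference_picture_a = [[r * 10 + c for c in range(4)] for r in range(4)]
other_picture = [[0] * 4 for _ in range(4)]
ref_lists = {
    "L0": [other_picture, reference_picture_a],
    "L1": [other_picture],
}
block = prediction_block_from_list(ref_lists, "L0", 1, x=1, y=2, width=2, height=2)
```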

The method according to the present invention described above may be produced as a program for execution on a computer and stored in a computer-readable recording medium. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include media implemented in the form of carrier waves (e.g., transmission over the Internet).

The computer readable recording medium can be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. In addition, functional programs, codes, and code segments for implementing the method can be easily inferred by programmers in the art to which the present invention belongs.

In addition, although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described above. Various modifications can of course be made by those skilled in the art to which the invention belongs without departing from the spirit of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or scope of the present invention.

Claims

A reconstructed picture reference method for decoding an enhancement layer in a scalable video codec, the method comprising:

selecting a plurality of interpolation points in a picture of a reference layer;

performing interpolation through different interpolators according to the input picture of each selected point; and

adding the interpolated picture to a reference picture list of the enhancement layer.
The method of claim 1,

wherein the selecting of the plurality of interpolation points includes selecting a point before a deblocking filter is applied as an interpolation point when a plurality of in-loop filters are used in the reference layer.
The method of claim 1,

wherein the selecting of the plurality of interpolation points includes selecting two points from among a picture before a deblocking filter is applied, a picture after the deblocking filter is applied, and a picture to which a sample adaptive offset has been applied in the reference layer.
The method of claim 1,

wherein the picture to be interpolated includes at least one of the reconstructed pictures included in the reference layer.
The method of claim 1,

wherein the plurality of interpolators each have different characteristics.
A reconstructed picture reference apparatus for decoding an enhancement layer in a scalable video codec, the apparatus comprising:

a plurality of interpolators for selecting a plurality of interpolation points in a picture of a reference layer and performing interpolation through different interpolators according to the input picture of each selected point; and

a buffer unit for adding the interpolated picture to a reference picture list of the enhancement layer.
The apparatus of claim 6,

wherein the plurality of interpolators further select a point before a deblocking filter is applied as an interpolation point when a plurality of in-loop filters are used in the reference layer.
The apparatus of claim 6,

wherein the plurality of interpolators select two points from among the picture before the deblocking filter is applied, the picture after the deblocking filter is applied, and the picture to which the sample adaptive offset is applied.
The apparatus of claim 6,

wherein the picture to be interpolated includes at least one of the reconstructed pictures included in the reference layer.
The apparatus of claim 6,

wherein the plurality of interpolators have different characteristics.