WO2015072626A1 - Interlayer reference picture generation method and apparatus for multiple layer video coding - Google Patents


Info

Publication number
WO2015072626A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
reference layer
image
enhancement layer
inter
Prior art date
Application number
PCT/KR2014/001197
Other languages
French (fr)
Korean (ko)
Inventor
김경혜
조현호
심동규
유지우
Original Assignee
광운대학교 산학협력단
Priority date
Filing date
Publication date
Application filed by 광운대학교 산학협력단 (Kwangwoon University Industry-Academic Collaboration Foundation)
Publication of WO2015072626A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/187 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors

Definitions

  • the present invention relates to an image processing technique, and more particularly, to a method and apparatus for more effectively compressing an enhancement layer by using a reconstructed picture of a reference layer in inter-layer video coding.
  • Conventional video coding generally provides a service by encoding and decoding a single image at a resolution and bit rate suitable for the application.
  • Scalable Video Coding (SVC) can represent one or more temporal/spatial resolutions and quality levels, and Multi-view Video Coding (MVC) can express various viewpoints and depth information; standardization and related research on both are underway.
  • H.264/AVC, a video compression standard widely used in the market, also includes the SVC and MVC extension standards, and standardization of extensions to High Efficiency Video Coding (HEVC), which was finalized in January 2013, is underway.
  • In SVC, images having one or more temporal/spatial resolutions and image qualities may be coded with reference to one another, and in MVC, multiple images at different viewpoints may be coded with reference to one another.
  • Here, the coding of one such image is called a layer.
  • Conventional video coding performs encoding/decoding by referring to previously encoded/decoded information within a single image, whereas extended video coding performs encoding/decoding by referring not only to the current layer but also to other layers at different resolutions and/or different viewpoints.
  • Hierarchical or multi-view video data transmitted and decoded for various display environments should support compatibility with existing single layer and viewpoint systems as well as stereoscopic image display systems.
  • The concepts introduced for this purpose are the base layer (or reference layer) and the enhancement layer (or extended layer) in hierarchical video coding, and the base view (or reference view) and the enhancement view (or extended view) in multi-view video coding. If a bitstream is encoded using an HEVC-based hierarchical or multi-view video coding technique, at least one base layer/view or reference layer/view can be correctly decoded by an HEVC decoding apparatus in the decoding process of the corresponding bitstream.
  • An extended layer/view or enhancement layer/view is an image decoded by referring to information of another layer/view, so it can be correctly decoded only when the referenced layer/view is present and has already been decoded. Therefore, the decoding order must follow the coding order of the layer/view images.
  • The reason why the enhancement layer/view has a dependency on the reference layer/view is that encoding information or an image of the reference layer/view is used in the encoding process of the enhancement layer/view; this is referred to as inter-layer prediction in hierarchical video coding and as inter-view prediction in multi-view video coding.
  • By performing inter-layer/inter-view prediction, additional bit savings of about 20 to 30% can be achieved compared to general intra-picture prediction and inter-picture prediction.
  • Accordingly, research is in progress on how to use or refine the information of the reference layer/view that the enhancement layer/view references.
  • In other words, the enhancement layer may refer to a reconstructed picture of the reference layer, and if there is a difference in resolution between the reference layer and the enhancement layer, the reference layer picture may be upsampled before it is referenced.
  • An object of the present invention is to improve the up-sampled reference layer image by predicting inter-layer difference coefficients, in order to improve enhancement layer coding performance when the encoder/decoder of the enhancement layer refers to the reconstructed image of the reference layer.
  • Another object of the present invention is to provide a method and apparatus for predicting difference coefficients without applying an interpolation filter to the reconstructed images of the reference layer and the enhancement layer, by adjusting the motion information of the reference layer when predicting and encoding inter-layer difference coefficients.
  • The inter-layer reference picture generator includes an upsampling unit and an inter-layer reference picture enhancement unit.
  • The reference layer motion information limiter restricts the precision of the motion vector of the reference layer when predicting the inter-layer difference signal, thereby avoiding the application of an additional interpolation filter to the upsampled reference layer and enhancement layer pictures.
  • The reference layer motion information adjusting unit adjusts the precision of the motion vector of the reference layer when predicting the inter-layer difference signal for improving the inter-layer reference picture, so that inter-layer difference signal prediction can be performed without applying an additional interpolation filter to the reconstructed pictures of the reference layer and the enhancement layer.
  • FIG. 1 is a block diagram illustrating a configuration of a scalable video encoder.
  • FIG. 2 is a block diagram of an extended decoder according to an embodiment of the present invention.
  • FIG. 3 is a block diagram of an extension encoder according to an embodiment of the present invention.
  • FIG. 4 is a block diagram of an apparatus for upsampling a reconstructed frame of a reference layer in a scalable video encoder / decoder, enhancing an upsampled reference layer image, and using the same as a reference value of an enhancement layer.
  • FIG. 5 is a conceptual diagram illustrating a generalized residual prediction (GRP) for inter-layer difference coefficients according to an embodiment of the present invention.
  • FIG. 6 is a block diagram of an extended decoder according to an embodiment of the present invention.
  • FIG. 7 is a block diagram of an extension encoder according to an embodiment of the present invention.
  • FIG. 8A is a diagram illustrating the reference layer upsampling and enhancement operation of an extension encoder/decoder according to an embodiment of the present invention.
  • FIG. 8B is a diagram illustrating the operation of the motion information adjusting unit of an extension encoder/decoder according to an embodiment of the present invention.
  • FIG. 9 illustrates an example in which the motion information adjusting unit of an extension encoder/decoder according to an embodiment of the present invention maps a motion vector of the reference layer to integer pixels.
  • FIG. 10 is a diagram illustrating an example of a method of constructing an enhancement layer reference list in an extended encoder/decoder according to an embodiment of the present invention.
  • first and second may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another.
  • the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
  • Each component shown in the embodiments of the present invention is shown independently to represent different characteristic functions; this does not mean that each component consists of separate hardware or a single software unit.
  • Each component is listed separately for convenience of description; at least two of the components may be combined into one component, or one component may be divided into a plurality of components, each performing part of the function.
  • Integrated and separated embodiments of the components are also included within the scope of the present invention without departing from the spirit of the invention.
  • Some components may not be essential components performing essential functions of the present invention, but optional components for improving performance.
  • The present invention can be implemented including only the components essential to its essence, excluding components used merely to improve performance, and a structure including only the essential components, excluding the optional performance-improving components, is also included in the scope of the present invention.
  • FIG. 1 is a block diagram illustrating a configuration of a scalable video encoder.
  • a scalable video encoder provides spatial scalability, temporal scalability, and SNR scalability.
  • For spatial scalability, multiple layers with upsampling are used, and for temporal scalability, a hierarchical B-picture structure is used.
  • For quality (SNR) scalability, either only the quantization parameter is changed, or a progressive coding method for the quantization error is used in the same manner as the spatial scalability technique.
  • Input video 110 is down sampled through spatial decimation 115.
  • the down-sampled image 120 is used as an input of the reference layer, and the coding blocks in the picture of the reference layer may be obtained through intra prediction using the intra prediction unit 135 or inter prediction using the motion compensation unit 130.
  • the difference coefficient which is a difference value between the original block to be encoded and the prediction block generated by the motion compensation unit 130 or the intra prediction unit 135, is discrete cosine transformed or integer transformed through the transform unit 140.
  • The transformed difference coefficients are quantized while passing through the quantization unit 145 and are then entropy coded by the entropy encoder 150.
  • the quantized transform difference coefficients are reconstructed back into differential coefficients through the inverse quantizer 152 and the inverse transform unit 154 to generate predicted values for use in adjacent blocks or adjacent pictures.
  • Due to the error introduced by the quantization unit 145, the reconstructed difference coefficient values may not match the difference coefficient values used as the input of the transform unit 140.
  • the reconstructed difference coefficient value is added to a prediction block previously generated by the motion compensator 130 or the intra predictor 135 to reconstruct the pixel value of the block currently encoded.
  • the reconstructed block passes through the in-loop filter 156. When all blocks in the picture are reconstructed, the reconstructed picture is input to the reconstructed picture buffer 158 and used for inter prediction in the reference layer.
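The predict/transform/quantize/reconstruct loop described above can be sketched in a few lines. This is only a toy illustration: the transform step is omitted, and `QSTEP` and all sample values are hypothetical, not taken from the patent. It shows why the reconstructed difference coefficients may not match the coefficients that entered the transform unit 140.

```python
# Toy sketch of the residual coding loop: form the difference coefficients,
# quantize/dequantize them, and reconstruct the block from the prediction.
# QSTEP and the sample values are illustrative assumptions only.

QSTEP = 8  # hypothetical quantization step size

def quantize(residual, qstep=QSTEP):
    return [round(r / qstep) for r in residual]

def dequantize(levels, qstep=QSTEP):
    return [l * qstep for l in levels]

original   = [23, -7, 14, 0, -31]   # original block (1-D for simplicity)
prediction = [20, -5, 9, 2, -28]    # from motion compensation / intra prediction

residual = [o - p for o, p in zip(original, prediction)]   # input to the transform
recon_residual = dequantize(quantize(residual))            # after inverse steps
reconstructed = [p + r for p, r in zip(prediction, recon_residual)]

print(residual)        # [3, -2, 5, -2, -3]
print(recon_residual)  # [0, 0, 8, 0, 0] -> quantization error
print(reconstructed)   # [20, -5, 17, 2, -28], not equal to the original block
```

The mismatch between `reconstructed` and `original` is exactly the quantization error the text refers to; the encoder uses the reconstructed values (not the originals) for later prediction so that it stays in sync with the decoder.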
  • For the enhancement layer, the input video 110 is used as the input and encoded.
  • As in the reference layer, prediction is performed by the motion compensator 172 or the intra predictor 170 in order to effectively encode the coding blocks in the picture.
  • Inter or intra prediction is performed and an optimal prediction block is generated.
  • The block to be encoded in the enhancement layer is predicted from the prediction block generated by the motion compensator 172 or the intra predictor 170, and as a result a difference coefficient is generated in the enhancement layer.
  • the difference coefficients of the enhancement layer are encoded through the transform unit, the quantization unit, and the entropy encoding unit like the reference layer.
  • Encoded bits are generated in each layer.
  • The multiplexer 180 combines the bitstreams of the layers into one single bitstream 185.
  • Although each of the multiple layers in FIG. 1 may be encoded independently, the input video of the lower layer is down-sampled from the video of the upper layer and therefore has very similar characteristics. Accordingly, when the reconstructed pixel values, motion vectors, and the like of the lower layer video are used in the enhancement layer, coding efficiency can be increased.
  • The motion compensation unit 172 of the enhancement layer may interpolate the image 164 reconstructed in the reference layer according to the image size of the enhancement layer and use it as a reference image.
  • a method of decoding the reference image in units of frames and a method of decoding in units of blocks may be used in consideration of a reduction in complexity.
  • the image 164 reconstructed in the reference layer is input to the motion compensation unit 172 of the enhancement layer, thereby improving the coding efficiency in the enhancement layer.
  • the motion information 162 of the reference layer may be upsampled through the upsampling unit 160 according to the enhancement layer resolution, and then referred to when motion information is encoded by the motion compensation unit 172 of the enhancement layer.
  • The extended decoder includes decoders for both the reference layer 200 and the enhancement layer 210.
  • the reference layer 200 and the enhancement layer 210 may be one or multiple depending on the number of layers of the SVC.
  • The decoder 200 of the reference layer has a structure similar to a general video decoder: an entropy decoder 201, an inverse quantizer 202, an inverse transformer 203, a motion compensator 204, an intra prediction unit 205, a loop filter unit 206, a reconstructed image buffer 207, and the like.
  • the entropy decoding unit 201 receives an extracted bitstream of the reference layer through the demultiplexer unit 224 and then performs an entropy decoding process.
  • the quantized coefficient values reconstructed through the entropy decoding process are inversely quantized by the inverse quantizer 202.
  • The inverse-quantized coefficient values are restored to difference coefficients through the inverse transform unit 203.
  • the decoder of the reference layer performs motion compensation through the motion compensation unit 204.
  • the reference layer motion compensation unit 204 performs motion compensation after performing interpolation according to the precision of a motion vector.
  • When the coding block of the reference layer has been encoded through intra prediction, the decoder generates a prediction value through the intra prediction unit 205.
  • the intra prediction unit 205 generates a prediction value from the reconstructed neighboring pixel values in the current frame according to the intra prediction mode.
  • the difference coefficient reconstructed in the reference layer and the predicted value are added to each other to generate a reconstructed value.
  • the reconstructed frame is stored in the reconstructed image buffer 207 after passing through the loop filter unit 206 and used as a predicted value in the inter prediction of the next frame.
  • The extended decoder including the reference layer and the enhancement layer decodes the image of the reference layer and uses it for prediction in the motion compensation unit 214 and the intra prediction unit 215 of the enhancement layer.
  • the upsampling unit 221 upsamples the picture and motion information 223 reconstructed in the reference layer according to the resolution of the enhancement layer.
  • the motion vector included in the motion information 223 may be used in the original form or in the compressed form.
  • the upsampled image 225 may be used as a reference image by the motion compensator 214 of the enhancement layer.
  • the enhanced inter-layer reference image 226 may be used as a reference image in the motion compensation unit 214 of the enhancement layer.
  • the bitstream input to the extended decoder is input to the entropy decoding unit 211 of the enhancement layer through the demultiplexer 224 to perform bitstream parsing according to the syntax structure of the enhancement layer.
  • a reconstructed differential image is generated through the inverse quantization unit 212 and the inverse transform unit 213, which is further added to the prediction image acquired by the motion compensation unit 214 or the intra prediction unit 215 of the enhancement layer.
  • the reconstructed image is stored in the reconstructed image buffer 217 via the loop filter 216 and used in the predictive image generation process by the motion compensator 214 of frames continuously positioned in the enhancement layer.
  • FIG. 3 is a block diagram of an extension encoder according to an embodiment of the present invention.
  • The scalable video encoder downsamples the input video 300 through spatial decimation 310 and then uses the downsampled video 320 as the input of the video encoder of the reference layer.
  • Video input to the reference layer video encoder is predicted in an intra or inter mode in units of coding blocks in the reference layer.
  • The difference image, which is the difference between the original block and the prediction block, is transformed and quantized through the transform unit 330 and the quantizer 335.
  • the quantized difference coefficients are expressed in bits in units of syntax elements through the entropy encoder 340.
  • the encoder for the enhancement layer uses input video 300 as input.
  • the input video is predicted through the intra predictor 360 or the motion compensator 370 in units of coding blocks in the enhancement layer.
  • The difference image, which is the difference between the original block and the prediction block, undergoes transform coding and quantization through the transformer 371 and the quantizer 372.
  • the quantized difference coefficients are expressed in bits in units of syntax elements through the entropy encoder 373.
  • the bitstreams encoded in the reference layer and the enhancement layer are composed of a single bitstream 385 through the multiplexer 380.
  • the motion compensation unit 370 of the enhancement layer encoder may generate a prediction value by using the reconstructed picture of the reference layer.
  • the reconstructed reference layer picture is upsampled by the upsampling unit 350 according to the resolution of the enhancement layer, and the upsampled reference layer image 355 is used by the motion compensator 370.
  • the motion compensation unit 370 of the enhancement layer may upsample the motion information 345 of the reference layer by the upsampling unit 350 to use the reference information when encoding the motion vector.
  • motion vector information compressed in the reference layer may be used.
  • the enhanced inter-layer reference image 395 may be used as a reference image in the motion compensator 370 of the enhancement layer.
  • FIG. 4 is a block diagram of an apparatus for upsampling and improving a reconstructed picture of a reference layer in a scalable video encoder / decoder.
  • The apparatus includes a reconstructed picture buffer 401 of the reference layer, an N-fold upsampling unit 402, an inter-layer reference picture enhancement unit 403, and an inter-layer reference picture buffer 404.
  • the reference layer reconstructed picture buffer 401 is a buffer that stores a reconstructed picture of the reference layer.
  • the reconstructed image of the reference layer should be upsampled to a size corresponding to the image size of the enhancement layer, and the upsampling is performed through the N-fold upsampling unit 402.
  • the upsampled image of the reference layer is enhanced by the inter-layer reference image enhancer 403 and then stored in the inter-layer reference image buffer 404 of the enhancement layer.
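As a rough illustration of the N-fold upsampling step (402), the sketch below performs nearest-neighbour upsampling of a small reconstructed reference-layer picture to the enhancement-layer size. Actual scalable codecs use normative interpolation filters for this step; the function, the factor, and the pixel values here are illustrative assumptions only.

```python
# Nearest-neighbour N-fold upsampling sketch (illustrative, not the normative
# SHVC resampling filter). The picture is a 2-D list of luma samples.

def upsample_nearest(picture, n):
    """Upsample a 2-D list of pixels by an integer factor n in each dimension."""
    out = []
    for row in picture:
        up_row = [px for px in row for _ in range(n)]      # repeat each pixel n times
        out.extend([list(up_row) for _ in range(n)])       # repeat each row n times
    return out

ref = [[10, 20],
       [30, 40]]            # tiny reconstructed reference-layer "picture"
enh_size_ref = upsample_nearest(ref, 2)
print(enh_size_ref)
# [[10, 10, 20, 20], [10, 10, 20, 20], [30, 30, 40, 40], [30, 30, 40, 40]]
```

The upsampled picture would then be passed to the inter-layer reference image enhancer 403 before being stored in the inter-layer reference picture buffer 404.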
  • FIG. 5 is a conceptual diagram illustrating a generalized residual prediction (GRP) technique for improving an inter-layer reference picture according to an embodiment of the present invention.
  • block 530 of a corresponding position of an upsampled reference layer may be selected as a prediction block.
  • A difference coefficient is predicted using the motion information 510 of the reference layer block 530 at the position corresponding to the enhancement layer block currently being coded, together with the reconstructed images of the enhancement layer and the reference layer.
  • In this way, the inter-layer reference picture is improved.
  • compressed motion vector information may be used in the reference layer, or uncompressed original motion information may be used.
  • The difference coefficient 560 is calculated as the difference between the prediction block 520 in the enhancement layer reconstructed image, generated using the upsampled motion information 510 of the reference layer, and the prediction block 550 in the upsampled reference layer reconstructed image.
  • the final prediction block 570 of the enhancement layer may be generated by adding the generated difference coefficient 560 and the reference layer block 530, and the difference coefficient 560 may be multiplied by a weight.
  • The weight may be selected as 0, 0.5, 1, or the like.
  • When the motion information of the reference layer indicates bidirectional prediction, the GRP calculates the prediction block 580 of the enhancement layer using a weighted sum of the reference layer block 530 and the average of the difference coefficient in the L0 direction and the difference coefficient in the L1 direction.
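The unidirectional GRP computation amounts to: final prediction = co-located upsampled reference-layer block (530) + weight * (motion-compensated block 520 in the enhancement-layer reconstruction - motion-compensated block 550 in the upsampled reference-layer reconstruction). A minimal sketch, with block fetching and motion compensation abstracted into plain lists and all sample values hypothetical:

```python
# GRP prediction sketch: the inter-layer difference coefficient (560) is the
# difference between the two motion-compensated blocks; it is weighted and
# added to the co-located upsampled reference-layer block (530).

def grp_predict(ref_block_530, enh_pred_520, refup_pred_550, weight=1.0):
    diff_560 = [e - r for e, r in zip(enh_pred_520, refup_pred_550)]
    return [c + weight * d for c, d in zip(ref_block_530, diff_560)]

ref_block  = [100, 102, 98, 101]  # co-located upsampled reference-layer block
enh_pred   = [104, 103, 99, 105]  # MC block in enhancement-layer reconstruction
refup_pred = [101, 101, 97, 102]  # MC block in upsampled ref-layer reconstruction

for w in (0, 0.5, 1):             # the weight may be chosen as 0, 0.5 or 1
    print(w, grp_predict(ref_block, enh_pred, refup_pred, w))
```

With weight 0 the prediction falls back to the plain upsampled reference-layer block; with weight 1 the full predicted difference coefficient is applied.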
  • FIG. 6 is a block diagram of an extended decoder according to an embodiment of the present invention.
  • a single bitstream input to the scalable video decoder configures a bitstream for each layer through the demultiplexer 624.
  • the bitstream for the reference layer is entropy decoded through the entropy decoding unit 601 of the reference layer.
  • the entropy decoded difference coefficient is decoded into a difference coefficient after passing through the inverse quantization unit 602 and the inverse transform unit 603.
  • the coding block decoded in the reference layer generates a predictive block through the motion compensator 604 or the intra predictor 605, which is added to the difference coefficient to decode the block.
  • the decoded image is filtered through the in-loop filter 606 and then stored in the reconstructed image buffer 607 of the reference layer.
  • the bitstream of the enhancement layer extracted through the demultiplexer 624 is entropy decoded by the entropy decoding unit 611 of the enhancement layer.
  • the entropy-decoded difference coefficient is decoded into the difference coefficient after passing through the inverse quantization unit 612 and the inverse transform unit 613.
  • the coding block decoded in the enhancement layer generates a prediction block through the motion compensation unit 614 or the intra prediction unit 615 of the enhancement layer, and the prediction block is added to the difference coefficient to decode the block.
  • the decoded image is filtered through the in-loop filter 616 and then stored in the reconstructed image buffer 617 of the enhancement layer.
  • The extended decoder upsamples the image and motion information of the reference layer, derives the difference coefficient from the reconstructed images of the reference layer and the enhancement layer using the motion vector of the reference layer, adds the derived difference coefficient value to the reference layer block, and uses the result as a prediction value.
  • motion vector information compressed in the reference layer may be used.
  • the upsampling unit 621 performs upsampling according to the resolution of the image of the enhancement layer by using the reconstructed image of the reference layer.
  • The motion information adjusting unit 625 adjusts the precision of the reference layer motion vector to integer pixels in order to use the motion vector information of the reference layer in the GRP.
  • The inter-layer reference image enhancer 622 receives the coding block 530 at the same position as the coding block 500 of the enhancement layer from the reconstructed picture buffer of the reference layer, and receives the motion vector adjusted to integer units through the motion information adjuster 625.
  • Using the motion vector adjusted to integer units, the blocks for generating the difference coefficients are compensated in the upsampled image and in the reconstructed image of the enhancement layer.
  • The difference between the two compensated prediction blocks is added to the co-located coding block 530 to generate the prediction image 627 to be used in the enhancement layer.
  • FIG. 7 is a block diagram of an extension encoder according to an embodiment of the present invention.
  • The scalable video encoder downsamples the input video 700 through spatial decimation 715 and then uses the downsampled video 710 as the input of the video encoder of the reference layer.
  • Video input to the reference layer video encoder is predicted in an intra or inter mode in units of coding blocks in the reference layer.
  • The difference image, which is the difference between the original block and the prediction block, undergoes transform coding and quantization through the transform unit 730 and the quantization unit 732.
  • the quantized difference coefficients are expressed in bits in units of syntax elements through the entropy encoder 734.
  • the encoder for the enhancement layer uses input video 700 as input.
  • the input video is predicted through the intra predictor 760 or the motion compensator 765 in units of coding blocks in the enhancement layer.
  • The difference image, which is the difference between the original block and the prediction block, is transformed and quantized through the transform unit 770 and the quantizer 772.
  • the quantized difference coefficients are expressed in bits in units of syntax elements through the entropy encoder 774.
  • the bitstreams encoded in the reference layer and the enhancement layer consist of a single bitstream 785 through the multiplexer 780.
  • The difference coefficient is derived from the reconstructed pictures of the reference layer and the enhancement layer by using the motion vector of the reference layer, and the derived difference coefficient value is added to the reference layer block and used as a prediction value of the enhancement layer.
  • motion vector information compressed in the reference layer may be used.
  • the upsampling unit 750 upsamples the reconstructed image of the reference layer to the resolution of the enhancement layer image.
  • the motion information adjusting unit 794 adjusts the precision of the upsampled motion vector to integer units in order to use the motion vector information of the reference layer in GRP.
  • the inter-layer reference image enhancer 790 receives, from the reconstructed picture buffer of the reference layer, the coding block 530 co-located with the coding block 500 of the enhancement layer, and receives a motion vector adjusted to integer units through the motion information adjuster 794.
  • using the reference layer motion vector adjusted to integer units, the blocks for generating the difference coefficient are motion-compensated in the upsampled reference layer image and the reconstructed image of the enhancement layer.
  • the inter-layer prediction image 792 to be used in the enhancement layer is generated by adding the difference between the two compensated prediction blocks to the coding block 530 of the reference layer co-located with the coding block 500 of the enhancement layer.
  • FIG. 8 is a diagram illustrating the operation of the motion information adjusting unit of an extended encoder/decoder according to an embodiment of the present invention.
  • the motion information adjusting units 625 and 794 of the extended encoder/decoder adjust the precision of the upsampled reference layer motion vector to an integer position for GRP.
  • GRP derives the difference coefficients from the reference layer and the enhancement layer using the motion vector of the reference layer.
  • in general, the reference picture must be interpolated according to the precision of the motion vector.
  • by adjusting the motion vector to an integer position, however, no interpolation needs to be performed on the reconstructed images of the reference layer and the enhancement layer.
  • the motion information adjusting units 625 and 794 determine whether the motion vector of the reference layer is already at an integer position (810). If it is, no additional motion vector adjustment is performed. If it is not at an integer position, it is mapped 811 to integer pixels so that the motion vector of the reference layer can be used in GRP.
  • FIG. 9 illustrates an example in which the motion information adjusting unit of an extended encoder/decoder according to an embodiment of the present invention maps the motion vector of the reference layer to integer pixels.
  • the motion vector of the reference layer may be located at integer positions 900, 905, 910, and 915 or at a non-integer position 920.
  • in GRP, by mapping the motion vector of the reference layer to integer pixels, the process of interpolating the reconstructed images of the reference layer and the enhancement layer when generating difference coefficients from them can be omitted. If the motion vector of the reference layer corresponds to the non-integer position 920, the motion vector is adjusted to the integer pixel position 900 located at the left-top side of the non-integer position, and the adjusted motion vector is then used for GRP.
  • FIG. 10 is a diagram illustrating a configuration of an enhancement layer reference list of an extended encoder / decoder according to an embodiment of the present invention.
  • the reference layer picture 1010 may be upsampled to fit the enhancement layer, producing reference layer picture A 1020, and enhanced by the inter-layer reference picture enhancer 622 or 790, producing reference layer picture B 1030; these pictures may be used to construct the reference picture list of the enhancement layer.
  • the reference lists L0 and L1 may be configured 1040 using only reference layer picture A 1020, or may be configured 1050 by adding reference layer picture A 1020 to L0 and reference layer picture B 1030 to L1.
  • the reference list configuration 1060 of the enhancement layer may also be formed by adding reference layer picture B 1030 to reference list L0 and reference layer picture A 1020 to reference list L1.
  • Reference layer picture A 1020 and reference layer picture B 1030 added to the reference list may be used to encode an enhancement layer.
  • the method according to the present invention described above may be produced as a program for execution on a computer and stored in a computer-readable recording medium; examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementations in the form of carrier waves (e.g., transmission over the Internet).
  • the computer readable recording medium can be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
  • functional programs, codes, and code segments for implementing the method can be easily inferred by programmers in the art to which the present invention belongs.
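The GRP flow described in the bullets above — map the reference layer motion vector to the integer pixel at its left-top side, motion-compensate blocks in both the upsampled reference layer image and the enhancement layer reconstruction without interpolation, and add the resulting difference to the co-located reference layer block — can be sketched as follows. This is an illustrative sketch, not the patented implementation: it assumes quarter-pel motion vector units, uses 1-D sample rows, and all function names are hypothetical.

```python
def to_integer_mv(mv_qpel):
    """Map a quarter-pel motion vector component to the integer position
    at its left-top side (arithmetic shift = floor division by 4)."""
    return mv_qpel >> 2

def compensate(image, pos, mv_int, size):
    """Copy `size` samples starting at pos + mv_int; no interpolation is
    needed because the motion vector was adjusted to an integer position."""
    start = pos + mv_int
    return image[start:start + size]

def grp_prediction(up_ref, enh_recon, colocated_ref_block, pos, mv_qpel, size):
    """Generalized residual prediction: derive the difference coefficient
    from the reference and enhancement layer reconstructions and add it
    to the co-located reference layer block."""
    mv_int = to_integer_mv(mv_qpel)
    ref_pred = compensate(up_ref, pos, mv_int, size)
    enh_pred = compensate(enh_recon, pos, mv_int, size)
    diff = [e - r for e, r in zip(enh_pred, ref_pred)]
    return [c + d for c, d in zip(colocated_ref_block, diff)]
```

Here `to_integer_mv` realizes the left-top mapping of FIG. 9: the arithmetic right shift floors toward the smaller integer position, so already-integer vectors (multiples of 4 in quarter-pel units) pass through unchanged.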

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention, when an enhancement layer in an SVC decoder refers to a reconstructed picture of a reference layer, adjusts and limits the motion vector of the reference layer to an integer pixel position when deriving the difference coefficients of the reference layer using that motion vector in the GRP process, thereby making it possible to generate difference coefficients without performing additional interpolation on the reference layer picture and the reconstructed picture of the enhancement layer.

Description

Method and apparatus for generating an inter-layer reference picture for multi-layer video coding
The present invention relates to image processing technology, and more particularly, to a method and apparatus for compressing an enhancement layer more effectively by using a reconstructed picture of a reference layer in inter-layer video coding.
Conventional video coding generally encodes and decodes a single picture size, resolution, and bit rate suited to the application. With the development of multimedia, standardization and related research have been conducted on scalable video coding (SVC), a video coding technology that supports various spatial and temporal resolutions and picture qualities for diverse application environments, and on multi-view video coding (MVC), which can represent multiple viewpoints and depth information. MVC and SVC are collectively referred to as extended video encoding/decoding.
H.264/AVC, the video compression standard most widely used in the market today, also includes the SVC and MVC extended video standards, and standardization of extended video technology is also under way for High Efficiency Video Coding (HEVC), whose standardization was completed in January 2013.
SVC can code pictures having one or more temporal/spatial resolutions and qualities with reference to one another, and MVC can code multiple pictures from several viewpoints with reference to one another. Here, the coding of one such video is called a layer. Conventional video coding can encode/decode only by referring to previously encoded/decoded information within a single video, whereas extended video encoding/decoding can be performed by referring not only to the current layer but also to other layers of different resolutions and/or different viewpoints.
Hierarchical or multi-view video data transmitted and decoded for various display environments should support compatibility with existing single-layer, single-view systems as well as stereoscopic display systems. The concepts introduced for this purpose are the base layer (or reference layer) and the enhancement layer (or extended layer) in hierarchical video coding, and the base view (or reference view) and the enhancement view (or extended view) in multi-view video coding. If a bitstream has been encoded with an HEVC-based hierarchical or multi-view video coding technique, at least one base/reference layer or view of that bitstream can be decoded correctly by an HEVC decoding apparatus. In contrast, an enhancement layer/view is decoded by referring to the information of another layer/view, and can be decoded correctly only after the referenced layer/view information is available and the corresponding picture has been decoded. Therefore, the decoding order must follow the coding order of each layer/view picture.
An enhancement layer/view depends on its reference layer/view because the coding information or pictures of the reference layer/view are used in encoding the enhancement layer/view; this is called inter-layer prediction in hierarchical video coding and inter-view prediction in multi-view video coding. Performing inter-layer/inter-view prediction enables additional bit savings of about 20-30% compared with ordinary intra and inter prediction alone, and research is in progress on how the enhancement layer/view should use or refine the reference layer/view information. In hierarchical video coding, when referencing between layers, the enhancement layer may refer to the reconstructed picture of the reference layer, and if the reference layer and the enhancement layer differ in resolution, the reference layer picture is upsampled before being referenced.
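As a toy illustration of the upsampling step mentioned above (the reference layer reconstruction resampled to the enhancement layer resolution before being referenced), the sketch below uses nearest-neighbor sampling. Standardized codecs use multi-tap interpolation filters for this; the function name and the 2-D list representation are assumptions for the example only.

```python
def upsample_nearest(ref_picture, scale):
    """Upsample a 2-D reference layer reconstruction by an integer factor
    using nearest-neighbor sampling (illustrative; real SVC/SHVC codecs
    use longer interpolation filters)."""
    height = len(ref_picture)
    width = len(ref_picture[0])
    return [[ref_picture[y // scale][x // scale]
             for x in range(width * scale)]
            for y in range(height * scale)]
```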
An object of the present invention is to enhance the upsampled reference layer picture through inter-layer difference coefficient prediction, in order to improve enhancement layer coding performance when the encoder/decoder of the enhancement layer refers to the reconstructed picture of the reference layer.
Another object of the present invention is to provide a method and apparatus for predicting difference coefficients without applying an interpolation filter to the reconstructed pictures of the reference layer and the enhancement layer, by adjusting the motion information of the reference layer when predictively coding the inter-layer difference coefficients.
An inter-layer reference picture generator according to an embodiment of the present invention includes an upsampling unit and an inter-layer reference picture enhancement unit.
A reference layer motion information limiter according to an embodiment of the present invention limits the precision of the reference layer motion vector when predicting the inter-layer difference signal, so that no additional interpolation filter is applied to the upsampled reference layer and enhancement layer pictures.
According to an embodiment of the present invention, the reference layer motion information adjusting unit adjusts the precision of the reference layer motion vector when predicting the inter-layer difference signal for inter-layer reference picture enhancement, so that the inter-layer difference signal can be predicted without applying an additional interpolation filter to the reconstructed pictures of the reference layer and the enhancement layer.
FIG. 1 is a block diagram illustrating the configuration of a scalable video encoder.
FIG. 2 is a block diagram of an extended decoder according to an embodiment of the present invention.
FIG. 3 is a block diagram of an extended encoder according to an embodiment of the present invention.
FIG. 4 is a block diagram of an apparatus that upsamples a reconstructed frame of the reference layer in a scalable video encoder/decoder, enhances the upsampled reference layer picture, and uses it as a reference for the enhancement layer.
FIG. 5 is a conceptual diagram illustrating generalized residual prediction (GRP) of inter-layer difference coefficients related to an embodiment of the present invention.
FIG. 6 is a block diagram of an extended decoder according to an embodiment of the present invention.
FIG. 7 is a block diagram of an extended encoder according to an embodiment of the present invention.
FIG. 8A is a diagram illustrating the reference layer upsampling and enhancement operation of an extended encoder/decoder according to an embodiment of the present invention.
FIG. 8B is a diagram illustrating the operation of the motion information adjusting unit of an extended encoder/decoder according to an embodiment of the present invention.
FIG. 9 illustrates an embodiment in which the motion information adjusting unit of an extended encoder/decoder according to an embodiment of the present invention maps a reference layer motion vector to integer pixels.
FIG. 10 is a diagram illustrating an example of a method of constructing an enhancement layer reference list in an extended encoder/decoder according to an embodiment of the present invention.
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In describing the embodiments herein, when it is determined that a detailed description of a related well-known configuration or function may obscure the gist of this specification, that detailed description is omitted.
When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that another component may be present in between. In addition, the statement that the present invention "includes" a specific configuration does not exclude other configurations; additional configurations may be included within the practice of the present invention or the scope of its technical idea.
Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component.
The components shown in the embodiments of the present invention are depicted independently to represent distinct characteristic functions; this does not mean that each component consists of separate hardware or a single software unit. That is, the components are listed separately for convenience of description: at least two components may be combined into one, or one component may be divided into several components that perform its functions. Embodiments with such integrated or separated components are also included within the scope of the present invention without departing from its essence.
In addition, some components may not be essential for performing the essential functions of the present invention but may be optional components merely for improving performance. The present invention may be implemented with only the components essential to its essence, excluding those used merely for performance improvement, and a structure including only those essential components is also within the scope of the present invention.
FIG. 1 is a block diagram illustrating the configuration of a scalable video encoder.
Referring to FIG. 1, a scalable video encoder provides spatial scalability, temporal scalability, and SNR (quality) scalability. Spatial scalability uses a multi-layer scheme with upsampling, and temporal scalability uses a hierarchical B-picture structure. For quality scalability, either only the quantization parameter is changed in the same scheme used for spatial scalability, or progressive coding of the quantization error is used.
The input video 110 is downsampled through spatial decimation 115. The downsampled video 120 is used as the input of the reference layer, and the coding blocks in a reference layer picture are effectively encoded through intra prediction in the intra predictor 135 or inter prediction in the motion compensator 130. The difference coefficient, the difference between the original block to be encoded and the prediction block generated by the motion compensator 130 or the intra predictor 135, is discrete-cosine or integer transformed by the transform unit 140. The transformed difference coefficients are quantized by the quantizer 145, and the quantized transform coefficients are entropy-coded by the entropy encoder 150. To generate prediction values for use in adjacent blocks or adjacent pictures, the quantized transform coefficients are restored to difference coefficients through the inverse quantizer 152 and the inverse transform unit 154. Because of the error introduced by the quantizer 145, the restored difference coefficient values may not match the difference coefficient values that entered the transform unit 140. The restored difference coefficient values are added to the prediction block generated earlier by the motion compensator 130 or the intra predictor 135 to reconstruct the pixel values of the block being encoded. The reconstructed block passes through the in-loop filter 156, and when all blocks of the picture have been reconstructed, the reconstructed picture is stored in the reconstructed picture buffer 158 and used for inter prediction in the reference layer.
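The encode-and-reconstruct loop described above can be reduced to the following sketch. The transform step is omitted for brevity (identity transform), a scalar quantizer stands in for the quantizer 145 / inverse quantizer 152 pair, and all names and values are illustrative assumptions. The point is that the reconstruction is built from the dequantized residual, so it can differ from the original by exactly the quantization error, keeping encoder and decoder in sync.

```python
def quantize(residual, qstep):
    """Scalar quantization of residual samples (a real encoder applies a
    DCT/integer transform first; omitted here for brevity)."""
    return [round(r / qstep) for r in residual]

def dequantize(levels, qstep):
    """Inverse scalar quantization; generally lossy."""
    return [l * qstep for l in levels]

def encode_block(original, prediction, qstep):
    """One block of the encode/reconstruct loop: the quantized levels go
    to the entropy coder, the reconstruction feeds later prediction."""
    residual = [o - p for o, p in zip(original, prediction)]
    levels = quantize(residual, qstep)
    recon_residual = dequantize(levels, qstep)  # may differ from `residual`
    reconstruction = [p + r for p, r in zip(prediction, recon_residual)]
    return levels, reconstruction
```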
The enhancement layer encodes the input video 110 directly as its input. As in the reference layer, to effectively encode the coding blocks in a picture, inter or intra prediction is performed through the motion compensator 172 or the intra predictor 170 and an optimal prediction block is generated. A block to be encoded in the enhancement layer is predicted from the prediction block generated by the motion compensator 172 or the intra predictor 170, producing the enhancement layer difference coefficients. The difference coefficients of the enhancement layer, like those of the reference layer, are encoded through a transform unit, a quantizer, and an entropy encoder. In the multi-layer structure of FIG. 1, encoded bits are produced in each layer, and the multiplexer 180 assembles them into one single bitstream 185.
Although each of the layers in FIG. 1 can be encoded independently, the input video of the lower layer is downsampled from the video of the upper layer and therefore has very similar characteristics. Coding efficiency can thus be increased by using the reconstructed pixel values, motion vectors, and so on of the lower layer in the enhancement layer.
In FIG. 1, the inter prediction 172 of the enhancement layer may, after the reference layer picture has been reconstructed, interpolate the reconstructed picture 164 to the picture size of the enhancement layer and use it as a reference picture. When reconstructing the reference layer picture, either frame-level or block-level decoding of the reference picture may be used in consideration of complexity reduction. The picture 164 reconstructed in the reference layer is input to the motion compensator 172 of the enhancement layer, which improves coding efficiency in the enhancement layer.
In FIG. 1, the motion information 162 of the reference layer may be upsampled to the enhancement layer resolution by the upsampling unit 160 and then referenced when the motion compensator 172 of the enhancement layer encodes motion information.
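Upsampling a reference layer motion vector to the enhancement layer resolution amounts to scaling each component by the resolution ratio. The fixed-point arithmetic below is one common way to do this and is an illustrative assumption, not the scheme mandated by this patent:

```python
def scale_mv(mv, ref_size, enh_size):
    """Scale a reference layer motion vector (x, y) to the enhancement
    layer resolution using an 8-bit fixed-point scale factor; the
    rounding offset follows common practice, not a specific standard."""
    sx = (enh_size[0] << 8) // ref_size[0]  # horizontal scale, Q8 fixed point
    sy = (enh_size[1] << 8) // ref_size[1]  # vertical scale, Q8 fixed point
    return ((mv[0] * sx + 128) >> 8, (mv[1] * sy + 128) >> 8)
```

For example, with a 2x spatial ratio (960x540 reference, 1920x1080 enhancement), each motion vector component is doubled.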
FIG. 2 is a block diagram of an extended decoder according to an embodiment of the present invention. The extended decoder includes decoders for both the reference layer 200 and the enhancement layer 210. There may be one or several reference layers 200 and enhancement layers 210 depending on the number of SVC layers. The reference layer decoder 200 has the same structure as a general video decoder and may include an entropy decoder 201, an inverse quantizer 202, an inverse transform unit 203, a motion compensator 204, an intra predictor 205, a loop filter unit 206, a reconstructed picture buffer 207, and so on. The entropy decoder 201 receives the bitstream extracted for the reference layer through the demultiplexer 224 and performs entropy decoding. The quantized coefficient values restored by entropy decoding are inverse-quantized by the inverse quantizer 202. The inverse-quantized coefficient values are restored to difference coefficients (residuals) through the inverse transform unit 203. In generating the prediction value for a reference layer coding block, if the block was coded with inter prediction, the reference layer decoder performs motion compensation through the motion compensator 204. In general, the reference layer motion compensator 204 performs interpolation according to the precision of the motion vector and then performs motion compensation. If the reference layer coding block was coded with intra prediction, the decoder generates the prediction value through the intra predictor 205, which derives it from the reconstructed neighboring pixel values in the current frame according to the intra prediction mode. The difference coefficients and prediction values restored in the reference layer are added together to produce the reconstruction. The reconstructed frame passes through the loop filter unit 206, is stored in the reconstructed picture buffer 207, and is used as a prediction reference in inter prediction of the next frame.
The extended decoder including the reference layer and the enhancement layer decodes the reference layer picture and then uses it as a prediction reference in the motion compensator 214 and the intra predictor 215 of the enhancement layer. To this end, the upsampling unit 221 upsamples the picture and motion information 223 reconstructed in the reference layer to the resolution of the enhancement layer. The motion vectors included in the motion information 223 may be used in original or compressed form. The upsampled picture 225 may be used as a reference picture by the motion compensator 214 of the enhancement layer. Alternatively, the upsampled picture may first be enhanced by the inter-layer reference picture enhancement unit 222, and the enhanced inter-layer reference picture 226 may then be used as a reference picture by the motion compensator 214 of the enhancement layer.
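Conceptually, the upsampled picture 225 and the enhanced inter-layer reference picture 226 are simply additional entries appended to the enhancement layer's reference picture lists, as FIG. 10 later illustrates for lists L0 and L1. A minimal sketch, where the list layout and names are assumptions showing one possible configuration among several:

```python
def build_reference_lists(temporal_refs, ilr_a, ilr_b=None):
    """Append inter-layer reference pictures to the enhancement layer
    reference lists: picture A (upsampled) to L0 and picture B
    (enhanced) to L1 when available, else A to both (illustrative)."""
    list0 = temporal_refs + [ilr_a]
    list1 = temporal_refs + ([ilr_b] if ilr_b is not None else [ilr_a])
    return list0, list1
```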
The bitstream input to the extended decoder is fed through the demultiplexer 224 to the entropy decoder 211 of the enhancement layer, which parses the bitstream according to the enhancement layer syntax structure. A restored difference picture is then generated through the inverse quantizer 212 and the inverse transform unit 213 and is added to the prediction picture obtained from the motion compensator 214 or the intra predictor 215 of the enhancement layer. The reconstructed picture passes through the loop filter unit 216, is stored in the reconstructed picture buffer 217, and is used by the motion compensator 214 in generating prediction pictures for subsequent frames of the enhancement layer.
도 3은 본 발명의 일 실시 예에 따른 확장 부호화기의 블록도이다. 3 is a block diagram of an extension encoder according to an embodiment of the present invention.
도 3을 참조하면, 스케일러블 비디오 인코더는 입력 비디오(300)를 Spatial Decimation(310)을 통하여 다운 샘플링한 후 다운 샘플링된 비디오(320)를 참조 계층의 비디오 인코더의 입력으로 사용한다. 참조 계층 비디오 인코더에 입력된 비디오는 참조 계층에서 코딩 블록 단위로 인트라 또는 인터 모드로 예측된다. 원본 블록과 코딩 블록의 차이인 차분 영상은 변환부(330), 양자화부(335)를 거치면서 변환 부호화 및 양자화 과정을 거친다. 양자화된 차분 계수들은 엔트로피 부호화부(340)를 통해서 각 신택스 요소 단위로 비트로 표현된다. Referring to FIG. 3, the scalable video encoder downsamples the input video 300 through spatial decimation 310 and then uses the downsampled video 320 as the input to the video encoder of the reference layer. Video input to the reference layer video encoder is predicted in intra or inter mode in units of coding blocks in the reference layer. The difference image, which is the difference between the original block and the coding block, undergoes transform coding and quantization through the transform unit 330 and the quantization unit 335. The quantized difference coefficients are expressed as bits in units of syntax elements through the entropy encoding unit 340.
향상 계층을 위한 인코더는 입력 비디오(300)를 입력으로 사용한다. 입력 된 비디오는 향상 계층에서 코딩 블록 단위로 인트라 예측부(360) 또는 움직임 보상부(370)를 통해 예측된다. 원본 블록과 코딩 블록의 차이인 차분 영상은 변환부(371), 양자화부(372)를 거치면서 변환 부호화 및 양자화 과정을 거친다. 양자화된 차분 계수들은 엔트로피 부호화부(373)를 통해서 각 신택스 요소 단위로 비트로 표현된다. 참조 계층과 향상 계층에서 인코딩된 비트스트림은 멀티플렉서(380)를 통해서 단일의 비트스트림(385)으로 구성된다. The encoder for the enhancement layer uses the input video 300 as its input. The input video is predicted through the intra prediction unit 360 or the motion compensation unit 370 in units of coding blocks in the enhancement layer. The difference image, which is the difference between the original block and the coding block, undergoes transform coding and quantization through the transform unit 371 and the quantization unit 372. The quantized difference coefficients are expressed as bits in units of syntax elements through the entropy encoding unit 373. The bitstreams encoded in the reference layer and the enhancement layer are combined into a single bitstream 385 by the multiplexer 380.
향상 계층 인코더의 움직임 보상부(370)는 참조 계층의 복원된 픽쳐를 사용하여 예측 값을 생성할 수 있다. 이러한 경우에 복원된 참조 계층의 픽쳐를 업 샘플링 수행부(350)에서 향상 계층의 해상도에 맞춰 업 샘플링하고, 업 샘플링 된 참조 계층 영상(355)을 움직임 보상부(370)에서 사용한다. 또한 향상 계층의 움직임 보상부(370)는 움직임 벡터를 부호화 할 때 참조 계층의 움직임 정보(345)를 업 샘플링 수행부(350)에서 업 샘플링 하여 참조 정보로 사용할 수 있다. 참조 계층의 움직임 정보(345)를 사용할 때 참조 계층에서 압축된 움직임 벡터 정보를 사용할 수도 있다. 업 샘플링된 영상은 계층 간 참조 영상 향상부(390)를 통해 향상된 후, 향상된 계층 간 참조 영상(395)은 향상 계층의 움직임 보상부(370)에서 참조 영상으로 사용될 수도 있다. The motion compensation unit 370 of the enhancement layer encoder may generate a prediction value using the reconstructed picture of the reference layer. In this case, the reconstructed reference layer picture is upsampled by the upsampling unit 350 to match the resolution of the enhancement layer, and the upsampled reference layer image 355 is used by the motion compensation unit 370. In addition, when encoding a motion vector, the motion compensation unit 370 of the enhancement layer may upsample the motion information 345 of the reference layer in the upsampling unit 350 and use it as reference information. When the motion information 345 of the reference layer is used, the motion vector information compressed in the reference layer may also be used. After the upsampled image is enhanced by the inter-layer reference image enhancement unit 390, the enhanced inter-layer reference image 395 may be used as a reference image by the motion compensation unit 370 of the enhancement layer.
도 4는 스케일러블 비디오 부/복호화기에서 참조 계층의 복원 영상을 업 샘플링 하고 향상시키는 장치의 블록도이다.4 is a block diagram of an apparatus for upsampling and improving a reconstructed picture of a reference layer in a scalable video encoder / decoder.
도 4를 참조하면, 해당 장치는 참조 계층의 복원 영상 버퍼(401), N배 업 샘플링 수행부(402), 계층 간 참조 영상 향상부(403), 계층 간 참조 영상 버퍼(404)를 포함한다. Referring to FIG. 4, the apparatus includes a reconstructed picture buffer 401 of the reference layer, an N-fold upsampling unit 402, an inter-layer reference picture enhancement unit 403, and an inter-layer reference picture buffer 404.
참조 계층 복원 영상 버퍼(401)은 참조 계층의 복원 영상을 저장하는 버퍼이다. 향상 계층에서 참조 계층의 영상을 사용하기 위하여 참조 계층의 복원 영상은 향상 계층의 영상 크기에 준하는 크기로 업 샘플링 되어야 하는데, N배 업 샘플링 수행부(402)를 통해 업 샘플링이 수행된다. 업 샘플링 된 참조 계층의 영상은 계층 간 참조 영상 향상부(403)에서 향상 된 후 향상 계층의 계층 간 참조 영상 버퍼(404)에 저장된다.The reference layer reconstructed picture buffer 401 is a buffer that stores a reconstructed picture of the reference layer. In order to use the image of the reference layer in the enhancement layer, the reconstructed image of the reference layer should be upsampled to a size corresponding to the image size of the enhancement layer, and the upsampling is performed through the N-fold upsampling unit 402. The upsampled image of the reference layer is enhanced by the inter-layer reference image enhancer 403 and then stored in the inter-layer reference image buffer 404 of the enhancement layer.
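The upsampling step described above can be sketched as follows. This is a minimal illustration, not the codec's normative procedure: it uses nearest-neighbour replication for an integer factor N, whereas an actual scalable codec applies interpolation filters. The function and variable names are hypothetical.

```python
import numpy as np

def upsample_n(reference_picture, n):
    """Enlarge a reference-layer picture by an integer factor n using
    nearest-neighbour replication (a stand-in for the codec's filters)."""
    return np.repeat(np.repeat(reference_picture, n, axis=0), n, axis=1)

# A 2x2 reference-layer picture upsampled x2 to the enhancement-layer size,
# ready to be stored in the inter-layer reference picture buffer.
base = np.array([[10, 20],
                 [30, 40]], dtype=np.uint8)
up = upsample_n(base, 2)   # 4x4 picture at enhancement-layer resolution
```

Each reference-layer sample is simply repeated n times horizontally and vertically, so the output matches the enhancement layer's dimensions for integer scaling ratios.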
도 5는 본 발명의 일 실시 예와 관련된 계층 간 참조 영상 향상을 위한 계층 간 차분 계수 예측 기술 (generalized residual prediction; GRP)을 설명하기 위한 개념도이다.FIG. 5 is a conceptual diagram illustrating a generalized residual prediction (GRP) technique for improving an inter-layer reference picture according to an embodiment of the present invention.
도 5를 참조하면, 스케일러블 비디오 인코더에서 향상 계층의 블록(500)을 코딩할 때, 업 샘플링 된 참조 계층의 대응되는 위치의 블록(530)을 예측 블록으로 선택할 수 있다.Referring to FIG. 5, when coding a block 500 of an enhancement layer in a scalable video encoder, block 530 of a corresponding position of an upsampled reference layer may be selected as a prediction block.
GRP에서는 현재 코딩중인 향상 계층 블록에 대해 대응되는 위치에 존재하는 참조 계층 블록(530)의 움직임 정보(510)와 향상 계층 및 참조 계층의 복원 영상을 이용하여 차분 계수를 예측하고, 이를 참조 계층 블록(530)에 더해줌으로써 참조 계층 영상을 향상시킨다. 참조 계층의 움직임 정보(510)를 사용할 때 참조 계층에서 압축된 움직임 벡터 정보를 사용할 수도 있고, 압축되지 않은 원본 움직임 정보를 사용할 수도 있다. 차분 계수(560)는 참조 계층의 업 샘플링 된 움직임 정보(510)을 이용하여 생성된 향상 계층 복원 영상 내의 예측 블록(520)과 업 샘플링 된 참조 계층 복원 영상 내의 예측 블록(550)의 차로 계산한다. 향상 계층의 최종 예측 블록(570)은 생성된 차분 계수(560)와 참조 계층 블록(530)을 더함으로써 생성할 수 있으며, 차분 계수(560)에는 가중치가 곱해질 수 있다. 이때 가중치의 계수는 0, 0.5, 1 등이 선택될 수 있다. In GRP, a difference coefficient is predicted using the motion information 510 of the reference layer block 530 located at the position corresponding to the enhancement layer block currently being coded, together with the reconstructed images of the enhancement layer and the reference layer, and the predicted difference coefficient is added to the reference layer block 530, thereby enhancing the reference layer picture. When the motion information 510 of the reference layer is used, either the motion vector information compressed in the reference layer or the uncompressed original motion information may be used. The difference coefficient 560 is calculated as the difference between the prediction block 520 in the enhancement layer reconstructed image, generated using the upsampled motion information 510 of the reference layer, and the prediction block 550 in the upsampled reference layer reconstructed image. The final prediction block 570 of the enhancement layer may be generated by adding the generated difference coefficient 560 to the reference layer block 530, and the difference coefficient 560 may be multiplied by a weight. The weight coefficient may be selected from 0, 0.5, 1, and the like.
GRP에서 참조 계층의 움직임 정보가 양방향 예측인 경우에는 향상 계층의 예측 블록(580)을 계산하기 위하여 참조 계층의 블록(530)과 L0 방향으로의 차분 계수와 L1 방향으로의 차분 계수의 평균 값에 대한 가중치 합을 이용한다. In GRP, when the motion information of the reference layer is bidirectional prediction, the prediction block 580 of the enhancement layer is computed as a weighted sum of the reference layer block 530 and the average of the difference coefficient in the L0 direction and the difference coefficient in the L1 direction.
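The GRP computation in the two paragraphs above — the difference coefficient, its weighting, and the bidirectional average — can be sketched as follows. This is a simplified sketch under assumed inputs (blocks are small NumPy arrays that have already been motion-compensated and upsampled); the function names are hypothetical, and a real codec's rounding and clipping rules are omitted.

```python
import numpy as np

def grp_block(ref_block, el_pred, rl_pred, weight):
    """Uni-directional GRP: final prediction (570) = co-located block (530)
    + weight * (enhancement-layer prediction block (520)
                - up-sampled reference-layer prediction block (550))."""
    diff = el_pred.astype(np.int32) - rl_pred.astype(np.int32)        # 560
    return ref_block.astype(np.int32) + (weight * diff).astype(np.int32)

def grp_block_bi(ref_block, el_l0, rl_l0, el_l1, rl_l1, weight):
    """Bi-directional GRP: block (530) plus the weighted average of the
    L0- and L1-direction difference coefficients (prediction 580)."""
    diff_l0 = el_l0.astype(np.int32) - rl_l0.astype(np.int32)
    diff_l1 = el_l1.astype(np.int32) - rl_l1.astype(np.int32)
    avg = (diff_l0 + diff_l1) / 2.0
    return ref_block.astype(np.int32) + (weight * avg).astype(np.int32)
```

With weight 0 the prediction degenerates to the plain upsampled reference-layer block, while weights 0.5 and 1 blend in half or all of the predicted residual, mirroring the selectable weight coefficients mentioned above.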
도 6은 본 발명의 이 실시 예 에 따른 확장 복호화기의 블록도이다. 6 is a block diagram of an extended decoder according to this embodiment of the present invention.
도 6을 참조하면, 스케일러블 비디오 디코더로 입력된 단일 비트스트림은 디멀티플렉서(624)를 통해서 각 계층을 위한 비트스트림으로 구성된다. 참조 계층을 위한 비트스트림은 참조 계층의 엔트로피 복호화부(601)를 통해서 엔트로피 복호화된다. 엔트로피 복호화된 차분 계수는 역양자화부(602)와 역변환부(603)를 거친 후 차분 계수로 복호화된다. 참조 계층에서 복호화하는 코딩 블록은 움직임 보상부(604) 또는 인트라 예측부(605)를 통해 예측 블록을 생성하며 이 예측 블록은 차분 계수와 더해져 블록을 복호화한다. 복호된 영상은 인-루프 필터(606)를 통해 필터링 된 후 참조 계층의 복원 영상 버퍼(607)에 저장된다. Referring to FIG. 6, a single bitstream input to the scalable video decoder is demultiplexed into a bitstream for each layer through the demultiplexer 624. The bitstream for the reference layer is entropy decoded by the entropy decoding unit 601 of the reference layer. The entropy decoded data passes through the inverse quantization unit 602 and the inverse transform unit 603 and is decoded into difference coefficients. For a coding block decoded in the reference layer, a prediction block is generated through the motion compensation unit 604 or the intra prediction unit 605, and this prediction block is added to the difference coefficients to decode the block. The decoded image is filtered through the in-loop filter 606 and then stored in the reconstructed image buffer 607 of the reference layer.
디멀티플렉서(624)를 통해서 추출된 향상 계층의 비트스트림은 향상 계층의 엔트로피 복호화부(611)를 통해서 엔트로피 복호화된다. 엔트로피 복호화된 차분 계수는 역양자화부(612)와 역변환부(613)를 거친 후 차분 계수로 복호화된다. 향상 계층에서 복호화하는 코딩 블록은 향상 계층의 움직임 보상부(614) 또는 인트라 예측부(615)를 통해 예측 블록을 생성하며 이 예측 블록은 차분 계수와 더해져 블록을 복호화한다. 복호된 영상은 인-루프 필터(616)를 통해 필터링 된 후 향상 계층의 복원 영상 버퍼(617)에 저장된다.The bitstream of the enhancement layer extracted through the demultiplexer 624 is entropy decoded by the entropy decoding unit 611 of the enhancement layer. The entropy-decoded difference coefficient is decoded into the difference coefficient after passing through the inverse quantization unit 612 and the inverse transform unit 613. The coding block decoded in the enhancement layer generates a prediction block through the motion compensation unit 614 or the intra prediction unit 615 of the enhancement layer, and the prediction block is added to the difference coefficient to decode the block. The decoded image is filtered through the in-loop filter 616 and then stored in the reconstructed image buffer 617 of the enhancement layer.
향상 계층에서 GRP 기술을 사용하는 경우 참조 계층의 영상과 움직임 정보를 업 샘플링한 후 참조 계층의 움직임 벡터를 사용하여 참조 계층 및 향상 계층 복원 영상에서 차분 계수를 유도하고, 유도 된 차분 계수 값을 참조 계층에 더하여 이를 예측 값으로 사용한다. 참조 계층의 움직임 정보(623)를 사용할 때 참조 계층에서 압축된 움직임 벡터 정보를 사용할 수도 있다. 업 샘플링 수행부(621)에서는 참조 계층의 복원 영상을 사용하여 향상 계층의 영상의 해상도에 맞춰 업 샘플링을 수행한다. 움직임 정보 조정부(625)에서는 GRP에서 참조 계층의 움직임 벡터 정보를 사용하기 위하여 참조 계층 움직임 벡터의 정밀도를 정수 픽셀 단위로 조정한다. 계층 간 참조 영상 향상부(622)에서는 참조 계층의 복원 픽쳐 버퍼에서 향상 계층의 코딩 블록(500)과 동일 위치의 코딩 블록(530)을 입력 받고 움직임 정보 조정부(625)를 통해서 정수 단위로 조정된 움직임 벡터를 입력 받는다. 정수 단위로 조정된 움직임 벡터를 사용하여 업 샘플링 수행부(621)에서 업샘플링 된 영상과 향상 계층의 복원 영상에서 차분 계수 생성을 위한 블록을 보상한다. 보상 된 두 예측 블록의 차와 향상 계층의 코딩 블록(500)과 동일 위치의 코딩 블록(530)를 더해줌으로써 향상 계층에서 사용할 예측 영상(627)을 생성한다. When the GRP technique is used in the enhancement layer, the image and motion information of the reference layer are upsampled, difference coefficients are derived from the reconstructed images of the reference layer and the enhancement layer using the motion vector of the reference layer, and the derived difference coefficient values are added to the reference layer to be used as a prediction value. When the motion information 623 of the reference layer is used, the motion vector information compressed in the reference layer may be used. The upsampling unit 621 upsamples the reconstructed image of the reference layer to match the resolution of the enhancement layer image. The motion information adjustment unit 625 adjusts the precision of the reference layer motion vector to integer pixel units so that the motion vector information of the reference layer can be used in GRP. The inter-layer reference image enhancement unit 622 receives, from the reconstructed picture buffer of the reference layer, the coding block 530 co-located with the coding block 500 of the enhancement layer, and receives the motion vector adjusted to integer units through the motion information adjustment unit 625. Using the motion vector adjusted to integer units, blocks for generating the difference coefficients are compensated in the image upsampled by the upsampling unit 621 and in the reconstructed image of the enhancement layer. The prediction image 627 to be used in the enhancement layer is generated by adding the difference between the two compensated prediction blocks to the coding block 530 co-located with the coding block 500 of the enhancement layer.
도 7은 본 발명의 이 실시 예에 따른 확장 부호화기의 블록도이다. 7 is a block diagram of an extension encoder according to this embodiment of the present invention.
도 7을 참조하면, 스케일러블 비디오 인코더는 입력 비디오(700)를 Spatial Decimation(715)을 통하여 다운 샘플링한 후 다운 샘플링된 비디오(710)를 참조 계층의 비디오 인코더의 입력으로 사용한다. 참조 계층 비디오 인코더에 입력된 비디오는 참조 계층에서 코딩 블록 단위로 인트라 또는 인터 모드로 예측된다. 원본 블록과 코딩 블록의 차이인 차분 영상은 변환부(730), 양자화부(732)를 거치면서 변환 부호화 및 양자화 과정을 거친다. 양자화된 차분 계수들은 엔트로피 부호화부(734)를 통해서 각 신택스 요소 단위로 비트로 표현된다. Referring to FIG. 7, the scalable video encoder downsamples the input video 700 through spatial decimation 715 and then uses the downsampled video 710 as the input to the video encoder of the reference layer. Video input to the reference layer video encoder is predicted in intra or inter mode in units of coding blocks in the reference layer. The difference image, which is the difference between the original block and the coding block, undergoes transform coding and quantization through the transform unit 730 and the quantization unit 732. The quantized difference coefficients are expressed as bits in units of syntax elements through the entropy encoding unit 734.
향상 계층을 위한 인코더는 입력 비디오(700)를 입력으로 사용한다. 입력 된 비디오는 향상 계층에서 코딩 블록 단위로 인트라 예측부(760) 또는 움직임 보상부(765)를 통해 예측된다. 원본 블록과 코딩 블록의 차이인 차분 영상은 변환부(770), 양자화부(772)를 거치면서 변환 부호화 및 양자화 과정을 거친다. 양자화된 차분 계수들은 엔트로피 부호화부(774)를 통해서 각 신택스 요소 단위로 비트로 표현된다. 참조 계층과 향상 계층에서 인코딩된 비트스트림은 멀티플렉서(780)를 통해서 단일의 비트스트림(785)으로 구성된다. The encoder for the enhancement layer uses the input video 700 as its input. The input video is predicted through the intra prediction unit 760 or the motion compensation unit 765 in units of coding blocks in the enhancement layer. The difference image, which is the difference between the original block and the coding block, undergoes transform coding and quantization through the transform unit 770 and the quantization unit 772. The quantized difference coefficients are expressed as bits in units of syntax elements through the entropy encoding unit 774. The bitstreams encoded in the reference layer and the enhancement layer are combined into a single bitstream 785 by the multiplexer 780.
GRP 기술에서는 참조 계층의 복원 영상 및 움직임 정보(752)를 업 샘플링한 후 참조 계층의 움직임 벡터를 사용하여 참조 계층 및 향상 계층의 복원 영상에서 차분 계수를 유도하고, 유도 된 차분 계수 값을 참조 계층 블록에 더하여 향상 계층의 예측 값으로 사용한다. 참조 계층의 움직임 정보(752)를 사용할 때 참조 계층에서 압축된 움직임 벡터 정보를 사용할 수도 있다. 업 샘플링 수행부(750)에서는 참조 계층의 복원 영상을 사용하여 향상 계층의 영상의 해상도에 맞춰 업 샘플링을 수행한다. 움직임 정보 조정부(794)에서는 GRP에서 참조 계층의 움직임 벡터 정보를 사용하기 위하여 업 샘플링 된 움직임 벡터의 정밀도를 정수 픽셀 단위로 조정한다. 계층 간 참조 영상 향상부(790)에서는 참조 계층의 복원 픽쳐 버퍼에서 향상 계층의 코딩 블록(500)과 동일 위치의 코딩 블록(530)을 입력 받고 움직임 정보 조정부(794)를 통해서 정수 단위로 조정된 움직임 벡터를 입력 받는다. 정수 단위로 조정된 참조 계층의 움직임 벡터를 사용하여 업 샘플링 수행부(750)에서 업 샘플링 된 참조 계층 영상과 향상 계층의 복원 영상에서 차분 계수 생성을 위한 블록을 보상한다. 보상된 두 예측 블록의 차와 향상 계층의 코딩 블록(500)과 동일한 위치에 있는 참조 계층의 코딩 블록(530)을 더해줌으로써 향상 계층에서 사용할 계층 간 예측 영상(792)을 생성한다. In the GRP technique, the reconstructed image and motion information 752 of the reference layer are upsampled, difference coefficients are derived from the reconstructed images of the reference layer and the enhancement layer using the motion vector of the reference layer, and the derived difference coefficient values are added to the reference layer block to be used as the prediction value of the enhancement layer. When the motion information 752 of the reference layer is used, the motion vector information compressed in the reference layer may be used. The upsampling unit 750 upsamples the reconstructed image of the reference layer to match the resolution of the enhancement layer image. The motion information adjustment unit 794 adjusts the precision of the upsampled motion vector to integer pixel units so that the motion vector information of the reference layer can be used in GRP. The inter-layer reference image enhancement unit 790 receives, from the reconstructed picture buffer of the reference layer, the coding block 530 co-located with the coding block 500 of the enhancement layer, and receives the motion vector adjusted to integer units through the motion information adjustment unit 794. Using the reference layer motion vector adjusted to integer units, blocks for generating the difference coefficients are compensated in the reference layer image upsampled by the upsampling unit 750 and in the reconstructed image of the enhancement layer. The inter-layer prediction image 792 to be used in the enhancement layer is generated by adding the difference between the two compensated prediction blocks to the coding block 530 of the reference layer co-located with the coding block 500 of the enhancement layer.
도 8은 본 발명의 일 실시 예에 따른 확장 부/복호화기의 움직임 정보 조정부의 동작을 설명하는 도면이다. 8 is a diagram illustrating the operation of the motion information adjustment unit of an extended encoder/decoder according to an embodiment of the present invention.
도 8a를 참조하면, 본 발명의 이 실시 예에 따른 확장 부/복호화기의 움직임 정보 조정부(625, 794)는 GRP를 위해서 참조 계층의 업 샘플링 된 움직임 벡터의 정밀도를 정수 위치로 조정한다. GRP에서는 참조 계층의 움직임 벡터를 사용하여 참조 계층 및 향상 계층에서 차분 계수를 유도하는데 이러한 경우에 참조 영상은 움직임 벡터의 정밀도에 맞게 보간 되어야 한다. 본 발명의 일 실시 예에 따른 확장 부/복호화기에서는 GRP에서 참조 계층의 움직임 벡터를 사용할 때 움직임 벡터를 정수 위치로 조정함으로써 참조 계층 및 향상 계층의 복원 영상에서 보간을 수행하지 않도록 한다. Referring to FIG. 8A, the motion information adjustment units 625 and 794 of the extended encoder/decoder according to this embodiment adjust the precision of the upsampled motion vector of the reference layer to an integer position for GRP. GRP derives the difference coefficients in the reference layer and the enhancement layer using the motion vector of the reference layer; in this case, the reference picture would normally have to be interpolated to match the precision of the motion vector. In the extended encoder/decoder according to an embodiment of the present invention, when the motion vector of the reference layer is used in GRP, the motion vector is adjusted to an integer position so that no interpolation is performed on the reconstructed images of the reference layer and the enhancement layer.
도 8b를 참조하면, 움직임 정보 조정부(625, 794)는 참조 계층의 움직임 벡터가 이미 정수 위치에 있는지를 판단한다(810). 참조 계층의 움직임 벡터가 이미 정수 위치에 있는 경우에는 추가적인 움직임 벡터의 조정이 수행되지 않는다. 참조 계층의 움직임 벡터가 정수 위치가 아닌 경우에는 참조 계층의 움직임 벡터가 GRP에서 사용될 수 있도록 정수 화소로의 매핑(811)이 수행된다. Referring to FIG. 8B, the motion information adjustment units 625 and 794 determine whether the motion vector of the reference layer is already at an integer position (810). If the motion vector of the reference layer is already at an integer position, no additional motion vector adjustment is performed. If the motion vector of the reference layer is not at an integer position, mapping 811 to an integer pixel is performed so that the motion vector of the reference layer can be used in GRP.
도 9는 본 발명의 이 실시 예에 따른 확장 부/복호화기의 움직임 정보 조정부가 향상 계층의 움직임 벡터를 정수 화소로 매핑하는 실시 예에 대한 것이다. FIG. 9 illustrates an example in which a motion information adjusting unit of an extension / decoder according to an embodiment of the present invention maps a motion vector of an enhancement layer to integer pixels.
도 9를 참조하면, 향상 계층의 움직임 벡터는 정수 위치 (900, 905, 910, 915)에 위치하거나 비 정수 위치 (920)에 위치할 수 있다. GRP에서 참조 계층의 움직임 벡터를 사용하여 참조 계층 및 향상 계층의 복원 영상에서 차분 계수를 생성하고자 할 때 참조 계층의 움직임 벡터를 정수 화소로 매핑하여 사용함으로써 참조 계층 및 향상 계층 복원 영상을 보간하는 과정을 생략할 수 있다. 참조 계층의 움직임 벡터가 비 정수 위치(920)에 해당하는 경우 해당 비 정수 위치의 픽셀의 좌-상에 위치하는 정수 화소 위치 (900)로 움직임 벡터를 조정한 후 조정된 움직임 벡터를 GRP에 사용한다. Referring to FIG. 9, the motion vector of the enhancement layer may be located at an integer position 900, 905, 910, 915 or at a non-integer position 920. When GRP generates difference coefficients from the reconstructed images of the reference layer and the enhancement layer using the motion vector of the reference layer, the process of interpolating the reference layer and enhancement layer reconstructed images can be omitted by mapping the motion vector of the reference layer to an integer pixel. If the motion vector of the reference layer corresponds to the non-integer position 920, the motion vector is adjusted to the integer pixel position 900 located at the top-left of the pixel at the non-integer position, and the adjusted motion vector is then used for GRP.
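The mapping just described — snapping a fractional motion vector to the integer sample at its top-left — can be sketched as follows. This assumes motion vectors stored in quarter-sample units (as in HEVC-style codecs); the function name is hypothetical.

```python
def map_to_integer_pel(mv, frac_units=4):
    """Snap a fractional motion vector to the integer-pel sample at its
    top-left, so GRP needs no interpolation of the reconstructed images.
    Python's floor division rounds negative components down (left/up),
    which matches the 'top-left integer position' rule of Fig. 9."""
    mvx, mvy = mv
    return ((mvx // frac_units) * frac_units,
            (mvy // frac_units) * frac_units)

map_to_integer_pel((8, 4))    # already integer-pel: unchanged
map_to_integer_pel((5, -3))   # fractional: snapped to the top-left sample
```

A vector already on an integer position passes through unchanged, mirroring the decision 810/811 of Fig. 8B.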
도 10은 본 발명의 일 실시 예에 따른 확장 부/복호화기의 향상 계층 참조 리스트 구성을 설명하는 도면이다.FIG. 10 is a diagram illustrating a configuration of an enhancement layer reference list of an extended encoder / decoder according to an embodiment of the present invention.
도 10을 참조하면, 향상 계층에 맞게 업 샘플링 된 참조 계층 영상A(1020)와 계층 간 참조 영상 향상부(622, 790)를 통해 향상된 참조 계층 영상B(1030)는 참조 계층 영상(1010)으로부터 생성되며, 향상 계층의 참조 영상 리스트를 구성하는 데 사용될 수 있다. 참조 계층 영상A(1020)만을 이용하여 참조 리스트 L0, L1을 구성(1040)할 수 있으며, 참조 계층 영상A(1020)는 L0에, 참조 계층 영상B(1030)는 L1에 추가함으로써 참조 영상 리스트를 구성(1050)할 수 있다. 또한 참조 리스트 L0에 참조 계층 영상B(1030), 참조 리스트 L1에 참조 계층 영상A(1020)를 추가함으로써 향상 계층의 참조 리스트 구성(1060)을 수행할 수 있다. 참조 리스트에 추가된 참조 계층 영상A(1020)와 참조 계층 영상B(1030)는 향상 계층을 부호화 하는 데 사용될 수 있다. Referring to FIG. 10, the reference layer picture A 1020, obtained by upsampling the reference layer picture 1010 to fit the enhancement layer, and the reference layer picture B 1030, enhanced through the inter-layer reference picture enhancement unit 622 or 790, may be used to construct the reference picture list of the enhancement layer. The reference lists L0 and L1 may be constructed 1040 using only reference layer picture A 1020, or the reference picture lists may be constructed 1050 by adding reference layer picture A 1020 to L0 and reference layer picture B 1030 to L1. Alternatively, the reference list construction 1060 of the enhancement layer may be performed by adding reference layer picture B 1030 to reference list L0 and reference layer picture A 1020 to reference list L1. Reference layer picture A 1020 and reference layer picture B 1030 added to the reference lists may be used to encode the enhancement layer.
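The three list configurations of Fig. 10 can be sketched as follows. Pictures are represented here by plain strings for illustration, and the helper name and the use of the figure's reference numerals as mode selectors are hypothetical.

```python
def build_enh_layer_lists(temporal_refs, pic_a, pic_b, config):
    """Append the upsampled reference-layer picture A and the enhanced
    picture B to the enhancement layer's L0/L1 reference lists,
    following one of the configurations 1040/1050/1060 of Fig. 10."""
    l0, l1 = list(temporal_refs), list(temporal_refs)
    if config == 1040:        # only picture A, in both lists
        l0.append(pic_a); l1.append(pic_a)
    elif config == 1050:      # A in L0, B in L1
        l0.append(pic_a); l1.append(pic_b)
    elif config == 1060:      # B in L0, A in L1
        l0.append(pic_b); l1.append(pic_a)
    return l0, l1

l0, l1 = build_enh_layer_lists(["t0"], "ilrA", "ilrB", 1050)
```

The temporal reference pictures of the enhancement layer come first in each list; the inter-layer pictures are appended at the end, which is one common placement, though the patent text does not fix the insertion position.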
상술한 본 발명에 따른 방법은 컴퓨터에서 실행되기 위한 프로그램으로 제작되어 컴퓨터가 읽을 수 있는 기록 매체에 저장될 수 있으며, 컴퓨터가 읽을 수 있는 기록 매체의 예로는 ROM, RAM, CD-ROM, 자기 테이프, 플로피디스크, 광 데이터 저장장치 등이 있으며, 또한 캐리어 웨이브(예를 들어 인터넷을 통한 전송)의 형태로 구현되는 것도 포함한다. The method according to the present invention described above may be implemented as a program to be executed on a computer and stored in a computer-readable recording medium. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementations in the form of carrier waves (e.g., transmission over the Internet).
컴퓨터가 읽을 수 있는 기록 매체는 네트워크로 연결된 컴퓨터 시스템에 분산되어, 분산방식으로 컴퓨터가 읽을 수 있는 코드가 저장되고 실행될 수 있다. 그리고, 상기 방법을 구현하기 위한 기능적인(function) 프로그램, 코드 및 코드 세그먼트들은 본 발명이 속하는 기술분야의 프로그래머들에 의해 용이하게 추론될 수 있다.The computer readable recording medium can be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. In addition, functional programs, codes, and code segments for implementing the method can be easily inferred by programmers in the art to which the present invention belongs.
또한, 이상에서는 본 발명의 바람직한 실시 예에 대하여 도시하고 설명하였지만, 본 발명은 상술한 특정의 실시 예에 한정되지 아니하며, 청구범위에서 청구하는 본 발명의 요지를 벗어남이 없이 당해 발명이 속하는 기술분야에서 통상의 지식을 가진 자에 의해 다양한 변형 실시가 가능한 것은 물론이고, 이러한 변형 실시들은 본 발명의 기술적 사상이나 전망으로부터 개별적으로 이해 되어서는 안될 것이다. In addition, although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described above; various modifications can of course be made by those of ordinary skill in the art to which the invention belongs without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or prospect of the present invention.

Claims (2)

  1. 계층 간 참조 구조를 제공하는 부/복호화 방법에서, 계층 간 참조 영상을 생성함에 있어, 참조 계층의 움직임 정보를 이용한 계층 간 차분 계수 예측 시 움직임 벡터의 정밀도를 제한하여 계층 간 차분 계수 예측을 수행하는 방법. In an encoding/decoding method providing an inter-layer reference structure, a method of performing inter-layer difference coefficient prediction in generating an inter-layer reference picture, by limiting the precision of a motion vector when predicting inter-layer difference coefficients using motion information of a reference layer.
  2. 계층 간 참조 구조를 제공하는 부/복호화 방법에서, 계층 간 참조 영상을 참조하여 향상 계층의 참조 리스트를 구성함에 있어, 업 샘플링 된 참조 계층 복원 영상과 업 샘플링 및 향상 된 참조 계층 복원 영상을 이용해 향상 계층의 참조 리스트를 구성하는 방법. In an encoding/decoding method providing an inter-layer reference structure, a method of constructing a reference list of an enhancement layer with reference to inter-layer reference pictures, using an upsampled reference layer reconstructed picture and an upsampled and enhanced reference layer reconstructed picture.
PCT/KR2014/001197 2013-11-15 2014-02-13 Interlayer reference picture generation method and apparatus for multiple layer video coding WO2015072626A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2013-0138706 2013-11-15
KR1020130138706A KR20150056679A (en) 2013-11-15 2013-11-15 Apparatus and method for construction of inter-layer reference picture in multi-layer video coding

Publications (1)

Publication Number Publication Date
WO2015072626A1 true WO2015072626A1 (en) 2015-05-21

Family

ID=53057547

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2014/001197 WO2015072626A1 (en) 2013-11-15 2014-02-13 Interlayer reference picture generation method and apparatus for multiple layer video coding

Country Status (2)

Country Link
KR (1) KR20150056679A (en)
WO (1) WO2015072626A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109219958A (en) * 2016-08-22 2019-01-15 联发科技股份有限公司 The method for video coding and equipment of do not apply loop filtering to handle the reconstructed blocks for being located at picture material discontinuity edge and relevant video encoding/decoding method and equipment
CN114051137A (en) * 2021-10-13 2022-02-15 上海工程技术大学 Spatial scalable video coding method and decoding method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10924747B2 (en) * 2017-02-27 2021-02-16 Apple Inc. Video coding techniques for multi-view video

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050190979A1 (en) * 2004-02-27 2005-09-01 Microsoft Corporation Barbell lifting for multi-layer wavelet coding
KR100703778B1 (en) * 2005-04-29 2007-04-06 삼성전자주식회사 Method and apparatus for coding video supporting fast FGS
JP5019054B2 (en) * 2005-04-27 2012-09-05 日本電気株式会社 Image decoding method, apparatus and program thereof
KR20130095282A (en) * 2010-09-24 2013-08-27 퀄컴 인코포레이티드 Coding stereo video data
WO2013160277A1 (en) * 2012-04-27 2013-10-31 Canon Kabushiki Kaisha A method, device, computer program, and information storage means for encoding and decoding an image comprising blocks of pixels


Also Published As

Publication number Publication date
KR20150056679A (en) 2015-05-27


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14862795

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14862795

Country of ref document: EP

Kind code of ref document: A1