KR20130045784A

KR20130045784A - Method and apparatus for scalable video coding using inter prediction mode

Info

Publication number: KR20130045784A
Application number: KR1020110131156A
Authority: KR
Inventors: 심동규; 남정학; 조현호; 최효민
Original assignee: 인텔렉추얼디스커버리 주식회사
Priority date: 2011-10-26
Filing date: 2011-12-08
Publication date: 2013-05-06
Also published as: KR20130045783A; KR20240000416A; KR102324462B1; KR20210134590A; KR20200128375A; KR20220121231A; KR102508989B1; KR102435739B1; KR20230037527A; KR20220165710A; KR102176539B1; KR102475242B1; KR102616143B1; KR101979284B1

Abstract

PURPOSE: A scalable coding method using an inter prediction mode, and a device therefor, are provided that selectively use the motion information of neighboring blocks or the motion information of a corresponding base-layer block to predict the motion information of an enhancement layer in multi-layer scalable video coding, thereby reducing the number of bits required for encoding and decoding. CONSTITUTION: When the determined motion information prediction mode is a first mode, an image decoding device predicts the motion information of a decoding target block of the enhancement layer using the motion information of a neighboring block of the enhancement layer (S310, S330). When the determined motion information prediction mode is a second mode, the image decoding device predicts the motion information of the decoding target block of the enhancement layer using the motion information of a corresponding block of a reference layer (S320). [Reference numerals] (BB) Yes; (CC) No; (S300) Inter-layer motion predicted?; (S310) Obtain motion information on neighboring blocks of an enhancement layer; (S320) Obtain motion information on a corresponding block of a reference layer; (S330) Predict motion information on a target decoding block of the enhancement layer using the obtained motion information

Description

Inter prediction mode scalable coding method and apparatus {METHOD AND APPARATUS FOR SCALABLE VIDEO CODING USING INTER PREDICTION MODE}

The present invention relates to image processing technology and, more particularly, to a scalable video coding method and apparatus for encoding and decoding an image.

Recently, as broadcasting services with HD (High Definition) resolution (1280x1024 or 1920x1080) have expanded not only in Korea but worldwide, many users have become accustomed to high-resolution, high-quality video, and many organizations are accordingly accelerating the development of next-generation video devices. In addition, as interest has grown in UHD (Ultra High Definition), which has at least four times the resolution of HDTV, video standardization bodies have recognized the need for compression technology for higher-resolution, higher-quality video. There is also demand for a new standard that, through higher compression efficiency than H.264/AVC (Advanced Video Coding), the video compression standard currently used in HDTV, mobile phones, and similar devices, provides the same image quality as existing coding methods while offering substantial gains in bandwidth and storage. MPEG (Moving Picture Experts Group) and VCEG (Video Coding Experts Group) are currently working jointly to standardize HEVC (High Efficiency Video Coding), the next-generation video codec. The broad goal of HEVC is to encode video, including UHD video, with twice the compression efficiency of H.264/AVC. HEVC can provide high-quality video at a lower frequency (bandwidth) than at present, not only for HD and UHD video but also in 3D broadcasting and mobile communication networks.

In HEVC, a prediction image may be generated by performing spatial or temporal prediction on an image, and the difference between the original image and the prediction image may be encoded. Such predictive coding can increase the efficiency of image encoding.

An object of the present invention is to provide a scalable video coding method and apparatus capable of improving encoding/decoding efficiency.

To achieve the above object, a scalable video decoding method according to an embodiment of the present invention includes: determining a motion information prediction mode for a decoding target block of an enhancement layer; when the determined motion information prediction mode is a first mode, predicting motion information for the decoding target block of the enhancement layer using motion information of a neighboring block of the enhancement layer; and when the determined motion information prediction mode is a second mode, predicting motion information for the decoding target block of the enhancement layer using motion information of a corresponding block of a reference layer.
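The mode-dependent branching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `MotionPredMode`, `MotionInfo`, `median_mv`, and `predict_motion_info` are hypothetical, the component-wise median over neighboring motion vectors is one common predictor chosen only for illustration, and the `mv_scale` factor for spatial scalability is likewise an assumption not stated in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class MotionPredMode(Enum):
    NEIGHBOR = 1      # first mode: neighboring blocks of the enhancement layer
    INTER_LAYER = 2   # second mode: corresponding block of the reference layer


@dataclass(frozen=True)
class MotionInfo:
    mv: Tuple[int, int]   # motion vector (x, y)
    ref_idx: int          # reference picture index


def median_mv(neighbors: List[MotionInfo]) -> Tuple[int, int]:
    """Component-wise median (upper median for even counts); an assumed predictor."""
    xs = sorted(n.mv[0] for n in neighbors)
    ys = sorted(n.mv[1] for n in neighbors)
    mid = len(neighbors) // 2
    return (xs[mid], ys[mid])


def predict_motion_info(
    mode: MotionPredMode,
    neighbors: List[MotionInfo],
    ref_layer_block: MotionInfo,
    mv_scale: Tuple[int, int] = (1, 1),
) -> MotionInfo:
    """Predict motion information for an enhancement-layer target block."""
    if mode is MotionPredMode.NEIGHBOR:
        if not neighbors:
            # Assumed fallback when no neighbor is available.
            return MotionInfo((0, 0), 0)
        return MotionInfo(median_mv(neighbors), neighbors[0].ref_idx)
    # Second mode: reuse the reference-layer motion, scaled to the
    # enhancement-layer resolution (the scaling is an assumption).
    scaled = (ref_layer_block.mv[0] * mv_scale[0],
              ref_layer_block.mv[1] * mv_scale[1])
    return MotionInfo(scaled, ref_layer_block.ref_idx)
```

In this sketch the mode would be parsed from the bitstream; the decoder then calls `predict_motion_info` with either the enhancement-layer neighbors or the reference-layer block supplying the prediction.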

A scalable video decoding apparatus according to an embodiment of the present invention includes: a first motion predictor that predicts motion information for a decoding target block of an enhancement layer using motion information of a neighboring block; and a second motion predictor that predicts motion information for the decoding target block of the enhancement layer using motion information of a corresponding block of a reference layer, wherein one of the first and second motion predictors is used to predict the motion information for the decoding target block of the enhancement layer according to a motion information prediction mode signaled by an encoding apparatus.

According to an embodiment of the present invention, in multi-layer scalable video coding, the motion information of an enhancement layer is predicted by selectively using either the motion information of neighboring blocks or the motion information of the corresponding block of a base layer. This reduces the number of bits required for encoding and decoding and thus increases coding efficiency, so that better image quality can be provided at the same bit rate.

FIG. 1 is a block diagram illustrating a configuration of an image encoding apparatus to which the present invention is applied, according to an embodiment.
FIG. 2 is a block diagram illustrating a configuration of an image decoding apparatus to which the present invention is applied, according to an embodiment.
FIG. 3 is a conceptual diagram illustrating the concepts of an image and a block used in an embodiment of the present invention.
FIG. 4 is a conceptual diagram schematically illustrating an embodiment of a multi-layer scalable video coding structure.
FIG. 5 is a block diagram illustrating an embodiment of the configuration of the motion compensation unit shown in FIG. 2.
FIG. 6 is a block diagram illustrating an embodiment of the configuration of the second motion predictor shown in FIG. 5.
FIG. 7 is a flowchart illustrating a scalable video coding method according to a first embodiment of the present invention.
FIG. 8 is a diagram for describing an embodiment of an inter-layer motion prediction method.
FIG. 9 is a flowchart illustrating a scalable video coding method according to a second embodiment of the present invention.
FIG. 10 is a flowchart illustrating a scalable video coding method according to a third embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In describing the embodiments, detailed descriptions of related well-known configurations or functions are omitted where they could obscure the gist of this specification.

When an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to that other element, but it should be understood that other elements may be present in between. In addition, a statement in the present invention that a specific configuration is "included" does not exclude other configurations; it means that additional configurations may be included in the practice of the present invention or within the scope of its technical idea.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component.

In addition, the components shown in the embodiments of the present invention are illustrated independently in order to represent different characteristic functions; this does not mean that each component consists of separate hardware or of a single software unit. That is, the components are listed separately for convenience of description. At least two of the components may be combined into one component, or one component may be divided into a plurality of components that each perform part of the function. Such integrated and separated embodiments are also included in the scope of the present invention as long as they do not depart from its essence.

In addition, some components may not be essential components that perform the essential functions of the present invention, but merely optional components for improving performance. The present invention may be implemented with only the components essential to realizing its essence, excluding components used merely to improve performance, and a structure that includes only the essential components, excluding the optional performance-improving components, is also included in the scope of the present invention.

FIG. 1 is a block diagram illustrating a configuration of an image encoding apparatus to which the present invention is applied, according to an embodiment.

Referring to FIG. 1, the image encoding apparatus 100 includes a motion predictor 111, a motion compensator 112, an intra predictor 120, a switch 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy encoding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 175, a filter unit 180, and a reference image buffer 190.

The image encoding apparatus 100 encodes an input image in intra mode or inter mode and outputs a bitstream. Intra prediction means prediction within a picture, and inter prediction means prediction between pictures. In intra mode the switch 115 is switched to intra, and in inter mode the switch 115 is switched to inter. The image encoding apparatus 100 generates a prediction block for an input block of the input image and then encodes the difference between the input block and the prediction block.

In intra mode, the intra predictor 120 generates a prediction block by performing spatial prediction using the pixel values of already encoded blocks around the current block.

In inter mode, the motion predictor 111 obtains a motion vector during motion prediction by finding the region in a reference image stored in the reference image buffer 190 that best matches the input block. The motion compensator 112 generates a prediction block by performing motion compensation using the motion vector.
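The search for the best-matching region and the subsequent motion compensation can be sketched as follows. This is an illustrative sketch only: the text does not specify a matching criterion or search strategy, so the sum of absolute differences (SAD), the exhaustive integer-pixel search, and the function names `sad`, `motion_search`, and `motion_compensate` are all assumptions.

```python
from typing import List, Tuple

Frame = List[List[int]]


def sad(cur: Frame, ref: Frame, bx: int, by: int, dx: int, dy: int, size: int) -> int:
    """Sum of absolute differences between the current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur: Frame, ref: Frame, bx: int, by: int,
                  size: int, search_range: int) -> Tuple[Tuple[int, int], int]:
    """Exhaustive integer-pixel search; returns ((dx, dy), cost) of the best match."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference picture.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def motion_compensate(ref: Frame, bx: int, by: int,
                      mv: Tuple[int, int], size: int) -> Frame:
    """Build the prediction block by copying the displaced reference region."""
    dx, dy = mv
    return [row[bx + dx: bx + dx + size] for row in ref[by + dy: by + dy + size]]
```

A practical encoder would typically use sub-pixel interpolation and a faster search pattern; the exhaustive search is used here only because it is the simplest correct form of block matching.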

The subtractor 125 generates a residual block from the difference between the input block and the generated prediction block. The transform unit 130 performs a transform on the residual block and outputs transform coefficients. The quantization unit 140 quantizes the input transform coefficients according to a quantization parameter and outputs quantized coefficients. The entropy encoding unit 150 entropy-encodes the input quantized coefficients according to a probability distribution and outputs a bitstream.

Because HEVC performs inter prediction coding, that is, prediction coding between pictures, the currently encoded image must be decoded and stored so that it can be used as a reference image. The quantized coefficients are therefore inverse-quantized in the inverse quantization unit 160 and inverse-transformed in the inverse transform unit 170. The inverse-quantized and inverse-transformed coefficients are added to the prediction block by the adder 175 to generate a reconstructed block.
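The forward path (subtraction, quantization) and the reconstruction loop (inverse quantization, addition) can be sketched as follows. This is a simplified illustration under stated assumptions: the transform and entropy coding stages are omitted, a plain step size stands in for the quantization parameter, and the function names and the 8-bit clipping range are hypothetical choices for the example.

```python
from typing import List

Block = List[List[int]]


def residual_block(inp: Block, pred: Block) -> Block:
    """Subtractor: input block minus prediction block."""
    return [[a - b for a, b in zip(r_in, r_pr)] for r_in, r_pr in zip(inp, pred)]


def quantize(block: Block, step: int) -> Block:
    """Uniform scalar quantization with rounding to the nearest level."""
    def q(v: int) -> int:
        level = (abs(v) + step // 2) // step
        return -level if v < 0 else level
    return [[q(v) for v in row] for row in block]


def dequantize(block: Block, step: int) -> Block:
    """Inverse quantization: scale the levels back by the step size."""
    return [[v * step for v in row] for row in block]


def reconstruct(pred: Block, recon_residual: Block, max_val: int = 255) -> Block:
    """Adder: prediction plus reconstructed residual, clipped to the sample range."""
    return [[min(max_val, max(0, p + r)) for p, r in zip(r_pr, r_rs)]
            for r_pr, r_rs in zip(pred, recon_residual)]
```

Because the encoder reconstructs from the quantized levels rather than from the original residual, it stores the same reference samples that a decoder will produce, which keeps the two sides in step.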

The reconstructed block passes through the filter unit 180, which may apply at least one of a deblocking filter, SAO (Sample Adaptive Offset), and ALF (Adaptive Loop Filter) to the reconstructed block or reconstructed picture. The filter unit 180 may also be called an adaptive in-loop filter. The deblocking filter can remove block distortion that occurs at boundaries between blocks. SAO can add an appropriate offset value to pixel values to compensate for coding errors. ALF can perform filtering based on a comparison of the reconstructed image with the original image, and may be performed only when high efficiency is applied. The reconstructed block that has passed through the filter unit 180 is stored in the reference image buffer 190.
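The idea of adding an offset to pixel values can be illustrated with a simplified band-offset sketch. This is an assumption-laden example, not the filter of the embodiment: it classifies each sample into one of 32 equal-width intensity bands and adds a per-band offset, omitting edge-offset classification and all signaling details. The function name `sao_band_offset` and the dictionary form of `band_offsets` are hypothetical.

```python
from typing import Dict, List

Block = List[List[int]]


def sao_band_offset(block: Block, band_offsets: Dict[int, int],
                    bit_depth: int = 8) -> Block:
    """Add a per-band offset to each sample and clip to the valid range."""
    shift = bit_depth - 5            # 32 equal-width bands across the sample range
    max_val = (1 << bit_depth) - 1
    out = []
    for row in block:
        out.append([
            min(max_val, max(0, p + band_offsets.get(p >> shift, 0)))
            for p in row
        ])
    return out
```

Samples whose band has no entry in `band_offsets` are left unchanged, so the offsets only touch the intensity ranges where the encoder has measured a systematic error.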

FIG. 2 is a block diagram illustrating a configuration of an image decoding apparatus to which the present invention is applied, according to an embodiment.

Referring to FIG. 2, the image decoding apparatus 200 includes an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an intra predictor 240, a motion compensator 250, a filter unit 260, and a reference image buffer 270.

The image decoding apparatus 200 receives the bitstream output from the encoder, decodes it in intra mode or inter mode, and outputs a reconstructed image. In intra mode the switch is switched to intra, and in inter mode the switch is switched to inter. The image decoding apparatus 200 obtains a residual block from the received bitstream, generates a prediction block, and then adds the residual block and the prediction block to generate a reconstructed block.

The entropy decoding unit 210 entropy-decodes the input bitstream according to a probability distribution and outputs quantized coefficients. The quantized coefficients are inverse-quantized in the inverse quantization unit 220 and inverse-transformed in the inverse transform unit 230, and a residual block is generated as a result.

In intra mode, the intra predictor 240 generates a prediction block by performing spatial prediction using the pixel values of already decoded blocks around the current block.

In inter mode, the motion compensator 250 generates a prediction block by performing motion compensation using a motion vector and a reference image stored in the reference image buffer 270.

The residual block and the prediction block are added by the adder 255, and the resulting block passes through the filter unit 260. The filter unit 260 may apply at least one of a deblocking filter, SAO, and ALF to the reconstructed block or reconstructed picture. The filter unit 260 outputs the reconstructed image. The reconstructed image may be stored in the reference image buffer 270 and used for inter prediction.

Methods for improving the prediction performance of an encoding/decoding apparatus include increasing the accuracy of the interpolated image and predicting the difference signal. Here, the difference signal is a signal representing the difference between the original image and the prediction image. In the present invention, "difference signal" may be used interchangeably with "differential signal", "residual block", or "difference block" depending on the context, and a person of ordinary skill in the art will be able to distinguish these to the extent that doing so does not affect the spirit and essence of the invention.

Even if the accuracy of the interpolated image increases, a difference signal inevitably arises. It is therefore necessary to improve encoding performance by improving difference signal prediction so that the difference signal to be encoded is reduced as much as possible.

A filtering method using fixed filter coefficients may be used as a difference signal prediction method. However, because such a method cannot adapt the filter coefficients to the characteristics of the image, its prediction performance is limited. It is therefore necessary to improve prediction accuracy by performing filtering suited to the characteristics of each prediction block.

FIG. 3 is a conceptual diagram illustrating the concepts of an image and a block used in an embodiment of the present invention.

Referring to FIG. 3, an encoding target block is a set of spatially connected pixels in the current encoding target image. The encoding target block is the unit in which encoding and decoding are performed, and may be rectangular or of arbitrary shape. A neighboring reconstructed block is a block in the current encoding target image for which encoding and decoding were completed before the current encoding target block is encoded.

The prediction image is an image that collects, within the current encoding target image, the prediction blocks used to encode each block from the first encoding target block of the image up to the current encoding target block. Here, a prediction block is a block holding the prediction signal used to encode each encoding target block in the current encoding target image. In other words, a prediction block is each of the blocks in the prediction image.

A neighboring block means a neighboring reconstructed block of the current encoding target block and a neighboring prediction block, which is the prediction block of each neighboring reconstructed block. That is, the term neighboring block refers to neighboring reconstructed blocks and neighboring prediction blocks together.

The prediction block of the current encoding target block may be a prediction block generated by the motion compensator 112 or the intra predictor 120 according to the embodiment of FIG. 1. In this case, after a prediction block filtering process is performed on the prediction block generated by the motion compensator 112 or the intra predictor 120, the subtractor 125 may compute the difference between the filtered final prediction block and the original block.

A neighboring block may be a block stored in the reference image buffer 190 of the embodiment of FIG. 1 or in a separate memory. A neighboring reconstructed block or neighboring prediction block generated during image encoding may also be used directly as a neighboring block.

FIG. 4 is a conceptual diagram schematically illustrating an embodiment of a scalable video coding structure using multiple layers, to which the present invention can be applied. In FIG. 4, GOP (Group of Pictures) denotes a group of pictures.

Transmitting video data requires a transmission medium, and the performance of transmission media differs depending on the network environment. A scalable video coding method may be provided so that video can be delivered over these varied transmission media and network environments.

A scalable video coding method is a coding method that improves encoding/decoding performance by removing inter-layer redundancy using texture information, motion information, residual signals, and the like between layers. A scalable video coding method can provide various kinds of scalability in spatial, temporal, and quality terms, according to surrounding conditions such as transmission bit rate, transmission error rate, and system resources.

Scalable video coding may be performed using a multiple-layer structure so that it can provide a bitstream applicable to various network conditions. For example, a scalable video coding structure may include a base layer that compresses image data using a general image encoding method, and an enhancement layer that compresses image data using both the encoding information of the base layer and a general image encoding method.

Here, a layer means a set of images and bitstreams distinguished on the basis of space (for example, image size), time (for example, encoding order or image output order), image quality, complexity, and the like. The multiple layers may also have dependencies on one another.

Referring to FIG. 4, for example, the base layer may be defined by QCIF (Quarter Common Intermediate Format), a frame rate of 15 Hz, and a bit rate of 3 Mbps; the first enhancement layer may be defined by CIF (Common Intermediate Format), a frame rate of 30 Hz, and a bit rate of 0.7 Mbps; and the second enhancement layer may be defined by SD (Standard Definition), a frame rate of 60 Hz, and a bit rate of 0.19 Mbps. These formats, frame rates, and bit rates are only one example and may be set differently as needed. The number of layers used is likewise not limited to this embodiment and may be set differently depending on the situation.

In this case, if a CIF 0.5 Mbps bitstream is required, the bitstream of the first enhancement layer may be truncated so that its bit rate becomes 0.5 Mbps and then transmitted. The scalable video coding method can provide temporal, spatial, and quality scalability by the method described above in the embodiment of FIG. 3.
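The selection and truncation of a layer for a requested operating point can be sketched as follows. The layer values are copied from the FIG. 4 example as stated in the text; the `Layer` type, the layer names, and the `select_operating_point` function are hypothetical, and actual bitstream truncation is represented here only by lowering the recorded bit rate.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str
    fmt: str
    frame_rate_hz: int
    bitrate_mbps: float


# Example values as given for FIG. 4 in the text.
LAYERS = [
    Layer("base", "QCIF", 15, 3.0),
    Layer("enhancement-1", "CIF", 30, 0.7),
    Layer("enhancement-2", "SD", 60, 0.19),
]


def select_operating_point(layers: List[Layer], fmt: str,
                           required_mbps: float) -> Layer:
    """Pick the layer with the requested format and truncate it to the required rate."""
    for layer in layers:
        if layer.fmt == fmt:
            if required_mbps > layer.bitrate_mbps:
                raise ValueError(
                    f"{fmt} layer provides only {layer.bitrate_mbps} Mbps")
            # Truncation is modeled as lowering the bit rate of the layer.
            return replace(layer, bitrate_mbps=required_mbps)
    raise ValueError(f"no layer with format {fmt}")
```

Calling `select_operating_point(LAYERS, "CIF", 0.5)` corresponds to the CIF 0.5 Mbps case in the text: the first enhancement layer is chosen and cut from 0.7 Mbps down to 0.5 Mbps.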

Hereinafter, the target layer, target image, target slice, target unit, target block, target symbol, and target bin mean, respectively, the layer, image, slice, unit, block, symbol, and bin currently being encoded or decoded. Thus, for example, the target layer may be the layer to which the target symbol belongs. Another layer is a layer other than the target layer that is available to the target layer. That is, another layer may be used when decoding in the target layer. Layers available to the target layer may include, for example, temporally, spatially, or quality-wise lower layers.

In addition, hereinafter, the corresponding layer, corresponding image, corresponding slice, corresponding unit, corresponding block, corresponding symbol, and corresponding bin refer to the layer, image, slice, unit, block, symbol, and bin that correspond to the target layer, target image, target slice, target unit, target block, target symbol, and target bin, respectively. The corresponding image is an image of another layer that lies on the same time axis as the target image. When an image in the target layer and an image in another layer have the same display order, the two images can be said to lie on the same time axis. Whether images lie on the same time axis may be identified using a coding parameter such as the picture order count (POC). The corresponding slice is the slice in the corresponding image located at a position that is spatially identical or similar to that of the target slice in the target image. The corresponding unit is the unit in the corresponding image located at a position that is spatially identical or similar to that of the target unit in the target image. The corresponding block is the block in the corresponding image located at a position that is spatially identical or similar to that of the target block in the target image.
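As an illustration of the POC-based identification described above, the following is a minimal sketch rather than the patent's implementation. The `Picture` class, its `layer_id` field, and the function name are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Picture:
    layer_id: int  # hypothetical layer identifier (assumption)
    poc: int       # picture order count


def find_corresponding_picture(target: Picture,
                               decoded: Iterable[Picture]) -> Optional[Picture]:
    """Return a picture of another layer whose POC equals the target's POC."""
    for pic in decoded:
        # Same POC means same display order, i.e. the same time axis.
        if pic.layer_id != target.layer_id and pic.poc == target.poc:
            return pic
    return None
```

A picture of the same layer is never returned, because the corresponding image is defined as belonging to another layer.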

In addition, hereinafter, the term slice, which denotes a unit into which an image is partitioned, is used collectively for partitioning units such as tiles and entropy slices. The partitioned units can be encoded and decoded independently of one another.

In addition, hereinafter, a block means a unit of image encoding and decoding. In image encoding and decoding, the encoding or decoding unit is the unit obtained when a single image is partitioned into smaller units for encoding or decoding, and may therefore be called a macroblock, a coding unit (CU), a prediction unit (PU), a transform unit (TU), a transform block, or the like. A block may be further divided into smaller sub-blocks.

In view of the characteristics of scalable video coding described above, inter-layer intra prediction, inter-layer inter prediction, inter-layer residual signal prediction, or the like may be performed to remove inter-layer redundancy.

Inter-layer inter prediction is a method in which the enhancement layer uses the motion information of the corresponding block of a reference layer; it is described in detail below.

Hereinafter, a scalable video coding method according to an embodiment of the present invention is described in detail with reference to FIGS. 5 to 10. The description below concerns a method of coding the enhancement layer as described with reference to FIG. 4.

FIG. 5 is a simplified block diagram of a decoding apparatus according to an embodiment of the present invention, showing the specific configuration of the motion compensator 250 illustrated in FIG. 2.

Referring to FIG. 5, the motion compensator 250 predicts motion information (for example, a motion vector) using a plurality of motion prediction schemes. To this end, it may include a first motion predictor 251 and a second motion predictor 255, which predict the decoding target block of the enhancement layer in different ways.

The first motion predictor 251 may use the motion information of neighboring blocks within the enhancement layer to predict the motion information of the decoding target block of that enhancement layer.

For example, the motion merger 252 included in the first motion predictor 251 uses the motion information of a neighboring candidate block as the motion information of the decoding target block. It may predict the motion information of the decoding target block of the enhancement layer using, for example, the motion merge method defined in HEVC.

More specifically, in the motion merge method, the encoding apparatus selects one candidate to merge from a merge candidate list composed of the motion information of neighboring blocks, and may signal the index of the selected merge candidate.

The decoding apparatus, in turn, may use the merge candidate index signaled by the encoding apparatus to select one entry of the previously constructed merge candidate list as the motion vector of the decoding target block.
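The decoder-side selection in the merge method can be sketched as follows. This is a minimal illustration that assumes the merge candidate list has already been built; the `MotionInfo` type and the function name are hypothetical and are not taken from HEVC or from the patent.

```python
from typing import List, NamedTuple, Tuple


class MotionInfo(NamedTuple):
    mv: Tuple[int, int]  # motion vector (x, y)
    ref_idx: int         # reference picture index


def select_merge_candidate(merge_list: List[MotionInfo],
                           merge_idx: int) -> MotionInfo:
    """Pick the candidate named by the signaled merge candidate index."""
    if not 0 <= merge_idx < len(merge_list):
        raise ValueError("merge index outside the candidate list")
    # In merge mode the candidate's motion information is reused as is.
    return merge_list[merge_idx]
```

Because the encoder and decoder build the same list, only the index has to be transmitted.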

The motion vector predictor 253 uses, among the motion information of the neighboring candidate blocks, the one candidate that performs best in rate-distortion terms. It may predict the motion information of the decoding target block of the enhancement layer using, for example, the Advanced Motion Vector Prediction (AMVP) method defined in HEVC.

Specifically, in the AMVP method, the encoding apparatus compares the rate-distortion cost of the candidates in a motion prediction candidate list (AMVP candidate list), which contains the motion information of neighboring blocks, selects one motion prediction candidate, and may signal the index of the selected candidate.
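The encoder-side choice can be sketched as a minimum search over a cost of the common form J = D + λR. The text only says that rate-distortion cost values are compared, so the cost form, the callables `distortion` and `rate_bits`, and the parameter `lam` are assumptions made for this example.

```python
from typing import Callable, Sequence, Tuple

MV = Tuple[int, int]


def choose_candidate_index(candidates: Sequence[MV],
                           distortion: Callable[[MV], float],
                           rate_bits: Callable[[int], float],
                           lam: float) -> int:
    """Return the index of the candidate with the lowest cost D + lam * R."""
    costs = [distortion(c) + lam * rate_bits(i)
             for i, c in enumerate(candidates)]
    # The index of the cheapest candidate is what the encoder would signal.
    return min(range(len(candidates)), key=costs.__getitem__)
```

In a real encoder the distortion would come from comparing the prediction with the source block; here it is left as a caller-supplied function.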

The decoding apparatus, in turn, uses the motion prediction candidate index signaled by the encoding apparatus to select one entry of the previously constructed motion prediction candidate list, and may add the motion vector difference (MVD) to the selected motion prediction candidate to generate the motion vector of the decoding target block of the enhancement layer.
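The decoder-side reconstruction amounts to adding the signaled difference to the selected predictor. The sketch below assumes motion vectors are integer (x, y) pairs in the same units; the function name is hypothetical.

```python
from typing import Sequence, Tuple

MV = Tuple[int, int]


def reconstruct_mv(amvp_list: Sequence[MV], cand_idx: int, mvd: MV) -> MV:
    """Motion vector = selected predictor + motion vector difference (MVD)."""
    mvp = amvp_list[cand_idx]  # predictor chosen by the signaled index
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

Unlike merge, the predictor here is only a starting point, so the MVD must also be decoded.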

Meanwhile, the second motion predictor 255 may predict the motion information of the decoding target block of the enhancement layer using the motion information of the corresponding block of the reference layer.

Referring to FIG. 6, the second motion predictor 255 may include a scaler 256 and a motion vector generator 257.

The scaler 256 adaptively scales the motion information of the corresponding block of the reference layer according to the resolution difference between the layers, and the motion vector generator 257 may add a motion vector difference (MVD) to the scaled motion information to generate the motion vector of the decoding target block of the enhancement layer.
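The two stages can be sketched as below. The text does not give a scaling formula, so scaling each component by the width and height ratio of the two layers, and the integer rounding used here, are assumptions for illustration only.

```python
from typing import Tuple

MV = Tuple[int, int]
Size = Tuple[int, int]  # (width, height) in luma samples


def scale_mv(mv: MV, ref_size: Size, enh_size: Size) -> MV:
    """Scale a reference-layer vector by the enhancement/reference size ratio."""
    (rw, rh), (ew, eh) = ref_size, enh_size
    # Integer rounding to nearest; the exact rounding rule is an assumption.
    return ((mv[0] * ew + rw // 2) // rw, (mv[1] * eh + rh // 2) // rh)


def generate_mv(ref_mv: MV, mvd: MV, ref_size: Size, enh_size: Size) -> MV:
    """Add the decoded MVD to the scaled reference-layer vector."""
    sx, sy = scale_mv(ref_mv, ref_size, enh_size)
    return (sx + mvd[0], sy + mvd[1])
```

With 2x spatial scalability (for example 960x540 to 1920x1080), each vector component is simply doubled before the MVD is added.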

For example, the reference layer may be the base layer.

In the decoding apparatus according to an embodiment of the present invention, the module selected according to the motion prediction mode from among the motion merger 252, the motion vector predictor 253, and the second motion predictor 255 may construct the motion vector of the decoding target block of the enhancement layer, to be used by the motion compensator 250, from motion information delivered by the entropy unit 101 or from motion information derived from the reference layer.

FIG. 7 is a flowchart illustrating a scalable video coding method according to a first embodiment of the present invention. The method is described in conjunction with the block diagrams of FIGS. 5 and 6, which show the configuration of the decoding apparatus according to an embodiment of the present invention.

Referring to FIG. 7, it is determined whether the motion information prediction mode for the current decoding target block of the enhancement layer is one that performs inter-layer motion prediction (step S300).

For example, the motion information prediction mode is determined according to information signaled by the encoding apparatus. Specifically, the signaled information may include a flag indicating whether inter-layer motion prediction (inter-layer inter coding) is performed.

In addition, step S300 of determining the motion information prediction mode and the series of steps that follow it may be performed per coding unit (CU).

When the determined motion information prediction mode is a first mode, in which inter-layer motion prediction is not performed, the first motion predictor 251 obtains the motion information of neighboring blocks of the enhancement layer (step S310) and then uses the obtained motion information of the neighboring blocks to predict the motion information of the decoding target block of the enhancement layer (step S330).

When the determined motion information prediction mode is a second mode, in which inter-layer motion prediction is performed, the second motion predictor 255 obtains the motion information of the corresponding block of the reference layer (step S320) and then uses the obtained motion information of the corresponding block to predict the motion information of the decoding target block of the enhancement layer (step S330).
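The branch on the signaled flag can be sketched as a small dispatcher. The step labels in the comments follow the flowchart legend (S300 to S330); the function name and the three callables are hypothetical placeholders for the decoder modules.

```python
from typing import Callable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def predict_motion_info(inter_layer_flag: bool,
                        get_neighbor_info: Callable[[], T],
                        get_ref_layer_info: Callable[[], T],
                        predict: Callable[[T], R]) -> R:
    """Choose the source of motion information, then run the prediction."""
    if inter_layer_flag:             # S300: inter-layer motion predicted?
        info = get_ref_layer_info()  # S320: corresponding block, reference layer
    else:
        info = get_neighbor_info()   # S310: neighboring blocks, enhancement layer
    return predict(info)             # S330: predict target-block motion info
```

Only the source of the motion information differs between the two modes; the final prediction step is shared.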

Referring to FIG. 8, to predict the motion information of the decoding target block B1 of the enhancement layer, the scaler 256 of the second motion predictor 255 may scale the motion vector of the corresponding block B2 of the base layer based on the resolution difference between the enhancement layer and the base layer, thereby generating the motion vector of the decoding target block B1.

For example, the corresponding block B2 of the base layer may be the block in the base layer that best matches the decoding target block B1 of the enhancement layer, or the block located at the position corresponding to that of the decoding target block B1 (the co-located block).
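For the co-located case, one plausible way to find the corresponding position is to map the block's top-left coordinate by the resolution ratio. The text does not specify this mapping, so the formula and the function name below are assumptions.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height) in luma samples


def colocated_position(x_enh: int, y_enh: int,
                       enh_size: Size, base_size: Size) -> Tuple[int, int]:
    """Map an enhancement-layer sample position to the base layer."""
    (ew, eh), (bw, bh) = enh_size, base_size
    # Integer division keeps the result on the base-layer sample grid.
    return (x_enh * bw // ew, y_enh * bh // eh)
```

The base-layer block covering the returned position would then serve as block B2.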

In addition, when the current decoding target block belongs to the base layer rather than the enhancement layer, the first motion predictor 251 may predict the motion information of the decoding target block of the base layer using the motion information of neighboring blocks.

Meanwhile, regarding the generation of a predicted image in video coding, in the skip mode motion information is derived from neighboring blocks and a predicted image (or block) is generated using the derived motion information, and neither motion information nor residual image information need be encoded or decoded.

The video coding method according to an embodiment of the present invention may be performed differently depending on whether the skip mode is applied.

FIG. 9 is a flowchart illustrating a scalable video coding method according to a second embodiment of the present invention, showing an example of a video decoding method performed when the skip mode is not used.

Referring to FIG. 9, it is first determined whether inter-layer motion prediction is performed for the current decoding target block of the enhancement layer (step S400).

When inter-layer motion prediction is not performed, the motion merger 252 decodes the merge candidate index signaled by the encoding apparatus (step S410) and then uses the decoded index to select one entry of the previously constructed merge candidate list as the motion vector of the decoding target block of the enhancement layer (step S420).

When inter-layer motion prediction is performed, the second motion predictor 255 derives the motion vector of the corresponding block of the reference layer (step S430) and then scales the derived motion vector according to the resolution (step S440).

For example, the scaler 256 of the second motion predictor 255 scales the motion vector obtained from the corresponding block of the reference layer to match the inter-layer resolution. When the enhancement layer and the reference layer have the same resolution, step S440 is omitted and the motion vector of the corresponding block of the reference layer may be used as the motion vector of the decoding target block of the enhancement layer.
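The non-skip flow of FIG. 9 can be sketched end to end as follows. The step labels in the comments come from the text; the ratio-based scaling, its rounding, and all function and parameter names are assumptions made for this example.

```python
from typing import Sequence, Tuple

MV = Tuple[int, int]
Size = Tuple[int, int]  # (width, height)


def scale_mv(mv: MV, ref_size: Size, enh_size: Size) -> MV:
    """Scale by the size ratio, rounding to nearest (assumed rule)."""
    (rw, rh), (ew, eh) = ref_size, enh_size
    return ((mv[0] * ew + rw // 2) // rw, (mv[1] * eh + rh // 2) // rh)


def decode_mv_non_skip(inter_layer: bool,
                       merge_list: Sequence[MV], merge_idx: int,
                       ref_layer_mv: MV,
                       ref_size: Size, enh_size: Size) -> MV:
    """Return the target block's motion vector when skip mode is not used."""
    if not inter_layer:                # S400: no inter-layer prediction
        return merge_list[merge_idx]   # S410-S420: merge candidate by index
    if ref_size == enh_size:           # same resolution: S440 is omitted
        return ref_layer_mv            # S430: reuse the reference-layer vector
    return scale_mv(ref_layer_mv, ref_size, enh_size)  # S440
```

Note that no MVD is involved on either branch of this flow.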

FIG. 10 is a flowchart illustrating a scalable video coding method according to a third embodiment of the present invention, showing an example of a video decoding method performed in the skip mode.

Referring to FIG. 10, it is first determined whether inter-layer motion prediction is performed for the current decoding target block of the enhancement layer (step S500).

When inter-layer motion prediction is not performed, the motion vector predictor 253 decodes the motion prediction candidate index signaled by the encoding apparatus (step S510) and then uses the decoded index to select a motion vector from the previously constructed motion prediction candidate list (step S520).

The motion vector predictor 253 then decodes the motion vector difference (MVD) signaled by the encoding apparatus (step S530) and adds the decoded MVD to the motion vector selected in step S520 to generate the motion vector of the decoding target block of the enhancement layer (step S540).

When inter-layer motion prediction is performed, the second motion predictor 255 derives the motion vector of the corresponding block of the reference layer (step S550) and then, through the scaler 256, scales the derived motion vector according to the inter-layer resolution difference (step S560).

The motion vector generator 257 then decodes the motion vector difference (MVD) signaled by the encoding apparatus (step S570) and adds the decoded MVD to the motion vector scaled in step S560 to generate the motion vector of the decoding target block of the enhancement layer (step S580).
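The skip-mode flow of FIG. 10 differs from the non-skip flow in that an MVD is added on both branches. The sketch below is illustrative only; the ratio-based scaling, its rounding, and all names are assumptions.

```python
from typing import Sequence, Tuple

MV = Tuple[int, int]
Size = Tuple[int, int]  # (width, height)


def scale_mv(mv: MV, ref_size: Size, enh_size: Size) -> MV:
    """Scale by the size ratio, rounding to nearest (assumed rule)."""
    (rw, rh), (ew, eh) = ref_size, enh_size
    return ((mv[0] * ew + rw // 2) // rw, (mv[1] * eh + rh // 2) // rh)


def decode_mv_skip(inter_layer: bool,
                   amvp_list: Sequence[MV], cand_idx: int,
                   ref_layer_mv: MV, mvd: MV,
                   ref_size: Size, enh_size: Size) -> MV:
    """Return the target block's motion vector in the skip-mode flow."""
    if not inter_layer:
        base = amvp_list[cand_idx]     # S510-S520: candidate by signaled index
    else:
        # Derive the reference-layer vector and scale it to the
        # enhancement-layer resolution.
        base = scale_mv(ref_layer_mv, ref_size, enh_size)
    # Add the decoded MVD (S530-S540, or S570-S580 on the inter-layer branch).
    return (base[0] + mvd[0], base[1] + mvd[1])
```

Both branches end with the same addition, so only the choice of base vector depends on the inter-layer flag.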

The scalable video coding method and apparatus according to an embodiment of the present invention have been described above with a focus on the video decoding method and apparatus. The scalable video encoding method according to an embodiment of the present invention may likewise be implemented by performing the series of steps of the decoding method described with reference to FIGS. 5 to 10.

Specifically, the scalable video encoding method and apparatus according to an embodiment of the present invention may perform inter prediction with the same configuration as the decoding method and apparatus described with reference to FIGS. 5 to 10 to predict the motion information of the encoding target block of the enhancement layer, and may generate a prediction signal according to the predicted motion information.

In the embodiments described above, the methods are described as a series of steps or blocks on the basis of flowcharts, but the present invention is not limited to the order of the steps; some steps may occur in a different order from, or simultaneously with, other steps described above. In addition, a person of ordinary skill in the art will understand that the steps shown in the flowcharts are not exclusive, that other steps may be included, and that one or more steps of a flowchart may be deleted without affecting the scope of the present invention.

The embodiments described above include examples of various aspects. Although not every possible combination for representing the various aspects can be described, a person of ordinary skill in the art will recognize that other combinations are possible. Accordingly, the present invention is intended to include all other replacements, modifications, and changes that fall within the scope of the following claims.

Claims

A scalable video decoding method based on a plurality of layers, the method comprising:
determining a motion information prediction mode for a decoding target block of an enhancement layer;
when the determined motion information prediction mode is a first mode, predicting motion information of the decoding target block of the enhancement layer using motion information of a neighboring block of the enhancement layer; and
when the determined motion information prediction mode is a second mode, predicting motion information of the decoding target block of the enhancement layer using motion information of a corresponding block of a reference layer.

The method of claim 1, wherein the motion information prediction mode is determined according to information signaled by an encoding apparatus, and the signaled information includes a flag indicating whether inter-layer motion prediction is performed.

The method of claim 1, wherein the determining of the motion information prediction mode is performed per coding unit (CU).

The method of claim 1, wherein predicting the motion information in the first mode comprises:
when not in a skip mode, decoding a motion merge candidate index; and selecting, using the decoded motion merge candidate index, one entry of a motion merge candidate list composed of motion information of neighboring blocks as a motion vector of the decoding target block of the enhancement layer.

The method of claim 1, wherein predicting the motion information in the first mode comprises:
when in a skip mode, decoding a motion prediction candidate index; selecting, using the decoded motion prediction candidate index, one entry of a motion prediction candidate list including motion information of neighboring blocks; and generating a motion vector of the decoding target block of the enhancement layer by adding a motion vector difference (MVD) to the selected motion prediction candidate.

The method of claim 1, wherein predicting the motion information in the second mode comprises:
scaling the motion information of the corresponding block of the reference layer according to a resolution difference between the layers.

The method of claim 6, wherein predicting the motion information in the second mode further comprises:
when in a skip mode, generating a motion vector of the decoding target block of the enhancement layer by adding a motion vector difference (MVD) to the scaled motion information.

The method of claim 1, wherein the reference layer is a base layer.

A scalable video decoding apparatus based on a plurality of layers, the apparatus comprising:
a first motion predictor that predicts motion information of a decoding target block of an enhancement layer using motion information of a neighboring block; and
a second motion predictor that predicts motion information of the decoding target block of the enhancement layer using motion information of a corresponding block of a reference layer,
wherein one of the first and second motion predictors is used to predict the motion information of the decoding target block of the enhancement layer according to a motion information prediction mode signaled by an encoding apparatus.

The apparatus of claim 9, wherein the first motion predictor comprises:
a motion merger that, when not in a skip mode, uses a motion merge candidate index to select one entry of a motion merge candidate list including motion information of neighboring blocks as a motion vector of the decoding target block of the enhancement layer.

The apparatus of claim 9, wherein the first motion predictor comprises:
a motion vector predictor that, when in a skip mode, uses a motion prediction candidate index to select one entry of a motion prediction candidate list including motion information of neighboring blocks, and adds a motion vector difference (MVD) to the selected motion prediction candidate to construct a motion vector of the decoding target block of the enhancement layer.

The apparatus of claim 9, wherein the second motion predictor comprises:
a scaler that scales the motion information of the corresponding block of the reference layer according to a resolution difference between the layers; and
a motion vector generator that adds a motion vector difference (MVD) to the scaled motion information to generate a motion vector of the decoding target block of the enhancement layer.