KR20070116527A

KR20070116527A - A method and apparatus for decoding/encoding a video signal

Info

Publication number: KR20070116527A
Application number: KR1020060110122A
Authority: KR
Inventors: 전용준; 전병문; 구한서; 박승욱
Original assignee: 엘지전자 주식회사
Priority date: 2006-06-05
Filing date: 2006-11-08
Publication date: 2007-12-10

Abstract

A method and an apparatus for decoding/encoding a video signal are provided, which define flag information indicating whether to perform inter-view weighted prediction. Weighted prediction for decoding a current block, using information about a picture located in a view different from that of the current picture, is performed according to a slice type extracted from the video signal (S310, S320). The slice type includes a macroblock to which inter-view prediction is applied.

Description

A method and apparatus for decoding/encoding a video signal

FIG. 1 illustrates a multi-view sequence encoding and decoding system to which the present invention is applied.

FIG. 2 shows a prediction structure between pictures in multi-view video coding to which the present invention is applied.

FIG. 3 is a flowchart of performing weighted prediction according to slice type in video signal coding to which the present invention is applied.

FIG. 4 illustrates an embodiment of the macroblock types allowed in each slice type in video signal coding to which the present invention is applied.

FIGS. 5A and 5B illustrate syntax for performing weighted prediction according to the newly defined slice types, according to an embodiment to which the present invention is applied.

FIG. 6 is a flowchart of performing weighted prediction using flag information indicating whether to perform inter-view weighted prediction in video signal coding to which the present invention is applied.

FIG. 7 is a diagram for describing a weighted prediction method based on flag information indicating whether to perform weighted prediction using information of a picture in a view different from that of the current picture, according to an embodiment to which the present invention is applied.

FIG. 8 illustrates syntax for performing weighted prediction according to the newly defined flag information, according to an embodiment to which the present invention is applied.

FIG. 9 is a flowchart of performing weighted prediction according to the NAL (Network Abstraction Layer) unit type, according to an embodiment to which the present invention is applied.

FIGS. 10A and 10B illustrate syntax for performing weighted prediction when the NAL unit type is a NAL unit type for multi-view video coding, according to an embodiment to which the present invention is applied.

The present invention relates to a method and apparatus for decoding/encoding a video signal.

Mainstream broadcast video today is single-view video acquired with one camera. Even video shot with several cameras is edited and treated as a single video. Multi-view video, by contrast, is a field of three-dimensional (3D) image processing in which images captured by one or more cameras are geometrically corrected and spatially synthesized to offer the user a variety of viewpoints in several directions. Multi-view video gives the user greater freedom of viewpoint and covers a larger area than can be captured with a single camera.

Because the views of a multi-view video are highly correlated, redundant information can be removed through spatial prediction between views, which makes inter-view prediction efficient. However, since the view sequences of a multi-view video are acquired by different cameras, illumination differences arise from factors internal and external to the cameras. Causes include camera heterogeneity, differences in camera calibration, and differences in camera alignment. Such illumination differences markedly reduce the correlation between views and thereby hinder efficient coding, so a weighted prediction coding technique is needed to compensate for them.

An object of the present invention is to define a slice type to which inter-view prediction is applied, and to provide a weighted prediction method based on it.

Another object of the present invention is to define flag information indicating whether to perform inter-view weighted prediction, and to provide a weighted prediction method based on it.

Another object of the present invention is to provide a weighted prediction method according to a NAL unit type for multi-view video coding.

Another object of the present invention is to efficiently compensate for illumination differences between views in multi-view video.

Still another object of the present invention is to improve the coding efficiency of video by making effective use of the correlation between views.

To achieve the above objects, the present invention provides a method of decoding a video signal, comprising performing weighted prediction for decoding a current block, according to a slice type extracted from the video signal, using information about a picture in a view different from that of the current picture.

The present invention also provides a method of decoding a video signal, comprising extracting NAL unit type information from the video signal and, when the NAL unit type is a NAL unit type for multi-view video coding, performing weighted prediction for decoding a current block using information about a picture in a view different from that of the current picture.

The above objects and structural features will become clearer from the following detailed description taken in conjunction with the accompanying drawings. Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

The terms used in the present invention are, as far as possible, general terms in wide current use. In certain cases, however, terms have been chosen at the applicant's discretion, and their meanings are set out in detail in the relevant part of the description. The present invention should therefore be understood through the meanings the terms carry rather than through their names alone.

FIG. 1 illustrates a multi-view sequence encoding and decoding system to which the present invention is applied. As shown in FIG. 1, the multi-view video encoding system includes a multi-view image generator 10, a preprocessing part 20, and an encoding part 30. The decoding system includes an extractor 40, a decoding part 50, a post-processing part 60, and a display part 70.

The multi-view image generator 10 has as many image acquisition devices (for example, cameras #1 to #N) as there are views, and acquires an independent image for each view. When multi-view image data is input, the preprocessing part 20 removes noise and resolves imbalancing problems, raising the correlation between the multi-view image data through preprocessing. The encoding part 30 includes a motion estimator, a motion compensator, an inter-view disparity estimator, a disparity compensator, an illumination compensator, a bit rate controller, a residual image encoder, and the like. The encoding part 30 may apply a generally known scheme.

The extractor 40 extracts, from the transmitted MVC bitstream, only the bitstream corresponding to a desired view. The extractor 40 can inspect the header so that only the desired views are selectively decoded. View scalability can also be implemented by using a view identifier, which distinguishes the views of pictures, to extract only the bitstream corresponding to a desired view. Because MVC must be fully compatible with H.264/AVC, it may be necessary to decode only a particular view that is compatible with H.264/AVC. In that case, the view identifier can be used to decode only the compatible view. The bitstream extracted by the extractor 40 is passed to the decoding part 50. The decoding part 50 includes a motion compensator, an illumination compensator, a weighted predictor, a deblocking filter, and the like. The decoding part 50 receives the bitstream encoded in the manner described above and decodes it by the inverse process. The post-processing part 60 increases the reliability and resolution of the decoded data.

Finally, the display part 70 presents the decoded data to the user in a manner that depends on the capabilities of the display, in particular its ability to handle multi-view video. For example, it may be a 2D display 71 that provides only planar two-dimensional images, a stereo display 73 that provides two views as a stereoscopic image, or a display 75 that provides M views (M > 2) as a stereoscopic image.
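
The view selection performed by the extractor 40 can be illustrated with a minimal Python sketch. It assumes each NAL unit has already been tagged with its view identifier; the `NalUnit` class, the `extract_views` function, and the sample stream are hypothetical names introduced only for illustration and do not come from the source.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical container: a NAL unit tagged with the view it belongs to."""
    view_id: int
    payload: bytes


def extract_views(stream: Iterable[NalUnit], wanted_views: Set[int]) -> List[NalUnit]:
    """Keep only the NAL units whose view identifier is wanted, preserving order."""
    return [nal for nal in stream if nal.view_id in wanted_views]


# Illustrative stream with three views (0, 1, 2); the payloads are dummies.
stream = [NalUnit(0, b"a"), NalUnit(1, b"b"), NalUnit(2, b"c"), NalUnit(0, b"d")]

base_only = extract_views(stream, {0})   # e.g. a single compatible view for a 2D display
stereo = extract_views(stream, {0, 2})   # e.g. two views for a stereo display
```

Filtering on the view identifier alone is what makes the sub-bitstream selectable without decoding; a real extractor would also have to respect inter-view reference dependencies, which this sketch leaves out.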

As shown in FIG. 2, T0 to T100 on the horizontal axis denote frames over time, and S0 to S7 on the vertical axis denote frames by view. For example, the pictures at T0 are images captured by different cameras at the same time (T0), and the pictures at S0 are images captured by a single camera at different times. The arrows in the drawing indicate the prediction direction and order of the pictures. For example, the P0 picture in view S2 at time T0 is predicted from I0, and this P0 picture serves as a reference picture for another P0 picture in view S4 at time T0. It also serves as a reference picture for the B1 and B2 pictures at times T4 and T2 in view S2.

When predictive coding is performed in this way using information about a picture in a view different from that of the current picture, the weighted prediction method according to the present invention can be applied to compensate for the illumination difference between the current picture and the picture in the other view. Even when the weighted prediction method is applied, an identifier that distinguishes the views of the pictures may be needed.

Weighted prediction is a method of scaling the samples of motion-compensated prediction data within a macroblock of a P or B slice. Weighted prediction methods include an explicit mode, in which weighted prediction for the current picture is performed using weighting coefficient information obtained from information about the reference pictures, and an implicit mode, in which weighted prediction for the current picture is performed using weighting coefficient information obtained from the distance between the current picture and the reference pictures. The weighted prediction method may be applied differently depending on the slice type of the macroblock concerned. For example, in the explicit mode, the weighting coefficient information may differ depending on whether the current macroblock belongs to a P slice or to a B slice. The weighting coefficients in the explicit mode may be determined by the encoder and transmitted in the slice header. In the implicit mode, by contrast, the weighting coefficients may be derived from the relative temporal positions of the List 0 and List 1 reference pictures. For example, a large weighting coefficient may be applied when a reference picture is temporally close to the current picture, and a small one when it is temporally far from it.

Accordingly, in the present invention, the slice type of the macroblock to which weighted prediction is to be applied is first extracted from the video signal (S310). Weighted prediction may then be performed on the current macroblock according to the extracted slice type (S320). Here, the slice type may include a macroblock to which inter-view prediction is applied. Inter-view prediction means predicting the current picture using information of a picture in a view different from that of the current picture. For example, the slice type may include a macroblock to which temporal prediction is applied (prediction using information of a picture in the same view as the current picture), a macroblock to which inter-view prediction is applied, and a macroblock to which temporal and inter-view prediction are applied together. The slice type may also include only macroblocks to which temporal prediction is applied, only macroblocks to which inter-view prediction is applied, only macroblocks to which both kinds of prediction are applied, or two or all three of these macroblock types. This is described in detail with reference to FIG. 4. Thus, when a slice type including a macroblock to which inter-view prediction is applied is extracted from the video signal, weighted prediction is performed using information about a picture in a view different from that of the current picture. Here, a view identifier that distinguishes the views of pictures may be used in order to use the information about the picture in the other view.
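
The sample scaling that explicit weighted prediction performs can be sketched as follows. The arithmetic follows the single-reference explicit weighted sample prediction of H.264/AVC (scale, round, shift, add offset, clip); the function and parameter names (`weighted_sample`, `log2_denom`, and so on) are chosen here for illustration, and whether the inter-view variant of the source uses exactly this arithmetic is an assumption.

```python
def clip1(value: int, bit_depth: int = 8) -> int:
    """Clip a sample to the valid range [0, 2**bit_depth - 1]."""
    return max(0, min((1 << bit_depth) - 1, value))


def weighted_sample(pred: int, weight: int, offset: int,
                    log2_denom: int, bit_depth: int = 8) -> int:
    """Explicit weighted prediction of one sample from a single reference.

    pred       : motion- (or disparity-) compensated prediction sample
    weight     : weighting coefficient signalled by the encoder
    offset     : additive offset signalled by the encoder
    log2_denom : log2 of the weight denominator (cf. luma_log2_weight_denom)
    """
    if log2_denom >= 1:
        # Scale, add the rounding term, then shift back down.
        scaled = (pred * weight + (1 << (log2_denom - 1))) >> log2_denom
    else:
        scaled = pred * weight
    return clip1(scaled + offset, bit_depth)


# With log2_denom = 5 a weight of 32 is unity gain.
unchanged = weighted_sample(100, 32, 0, 5)    # 100
brighter = weighted_sample(100, 40, 3, 5)     # 125 after scaling, 128 with offset
saturated = weighted_sample(250, 64, 10, 5)   # clipped to 255
```

A gain above unity plus a positive offset is the kind of correction that would compensate a reference view captured darker than the current view.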

As shown in FIG. 4, when the P slice type based on inter-view prediction is defined as VP (View_P), that slice type permits an intra macroblock (I), a macroblock predicted from one picture in the current view (P), or a macroblock predicted from one picture in a different view (VP) (410). When the B slice type based on inter-view prediction is defined as VB (View_B), that slice type permits an intra macroblock (I), a macroblock predicted from at least one picture in the current view (P or B), or a macroblock predicted from at least one picture in a different view (VP or VB) (420). Further, when a slice type predicted using temporal prediction, inter-view prediction, or both is defined as Mixed, the Mixed slice type permits an intra macroblock (I), a macroblock predicted from at least one picture in the current view (P or B), a macroblock predicted from at least one picture in a different view (VP or VB), or a macroblock predicted using both a picture in the current view and a picture in a different view (Mixed) (430). Here, a view identifier that distinguishes the views of pictures may be used in order to use a picture in a different view.
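
The macroblock types permitted in each newly defined slice type, as listed for FIG. 4, can be captured in a small lookup table. This is only an illustrative encoding of the lists above; the names `ALLOWED_MB_TYPES`, `is_allowed`, and `uses_inter_view` are hypothetical.

```python
# Macroblock types permitted per slice type, transcribed from the description of FIG. 4.
ALLOWED_MB_TYPES = {
    "VP": {"I", "P", "VP"},                           # 410
    "VB": {"I", "P", "B", "VP", "VB"},                # 420
    "Mixed": {"I", "P", "B", "VP", "VB", "Mixed"},    # 430
}

# Macroblock types that reference a picture in a different view.
INTER_VIEW_MB_TYPES = {"VP", "VB", "Mixed"}


def is_allowed(slice_type: str, mb_type: str) -> bool:
    """Return True if mb_type may appear in a slice of slice_type."""
    return mb_type in ALLOWED_MB_TYPES.get(slice_type, set())


def uses_inter_view(mb_type: str) -> bool:
    """Return True if the macroblock type needs a picture from another view."""
    return mb_type in INTER_VIEW_MB_TYPES
```

A decoder could use such a table to reject a macroblock type that the slice type does not permit, and to decide when a view identifier is needed to locate the reference picture.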

As described with reference to FIG. 4, when the slice types VP, VB, and Mixed are defined, the existing (for example, H.264) syntax for performing weighted prediction may be changed as shown in FIGS. 5A and 5B. For example, when the slice type is a P slice based on temporal prediction, the portion "if(slice_type != VP || slice_type != VB)" is added (510), and when the slice type is a B slice based on temporal prediction, the if statement may be changed to "if(slice_type == B || slice_type == Mixed)" (520). In addition, because the VP and VB slice types are newly defined, corresponding portions may be newly added in a form similar to that of FIG. 5A (530, 540). In this case, because view information is added, the syntax elements contain a "view" part. Examples include "luma_log2_view_weight_denom" and "chroma_log2_view_weight_denom".

In video signal coding to which the present invention is applied, more efficient coding becomes possible when flag information indicating whether to perform weighted prediction is used. Such flag information may be defined on the basis of the slice type. For example, there may be flag information indicating whether weighted prediction is applied to P slices and SP slices, and flag information indicating whether it is applied to B slices. As a specific example, these may be defined as "weighted_pred_flag" and "weighted_bipred_idc", respectively. weighted_pred_flag = 0 indicates that weighted prediction is not applied to P and SP slices, and weighted_pred_flag = 1 indicates that it is applied to P and SP slices. weighted_bipred_idc = 0 indicates that default weighted prediction is applied to B slices, weighted_bipred_idc = 1 indicates that explicit weighted prediction is applied to B slices, and weighted_bipred_idc = 2 indicates that implicit weighted prediction is applied to B slices. In multi-view video coding, flag information indicating whether to perform weighted prediction using information about pictures in other views may likewise be defined on the basis of the slice type.

First, the slice type and the flag information indicating whether to perform inter-view weighted prediction are extracted from the video signal (S610, S620). Here, the slice type may include, for example, a macroblock to which temporal prediction is applied (prediction using information of a picture in the same view as the current picture) and a macroblock to which inter-view prediction is applied (prediction using information of a picture in a view different from that of the current picture). The weighted prediction mode may be determined on the basis of the extracted slice type and flag information (S630), and weighted prediction may be performed according to the determined mode (S640). Here, the flag information may include, in addition to "weighted_pred_flag" and "weighted_bipred_idc" described above, flag information indicating whether to perform weighted prediction using information of a picture in a view different from that of the current picture. This is described in detail with reference to FIG. 7. Thus, when the slice type of the current macroblock is a slice type that includes a macroblock to which inter-view prediction is applied, coding is more efficient if flag information indicating whether to perform weighted prediction using information about a picture in another view is used.

FIG. 7 is a diagram for describing a weighted prediction method based on flag information indicating whether to perform weighted prediction using information of a picture in a view different from that of the current picture, according to an embodiment to which the present invention is applied.

For example, the flag information indicating whether to perform weighted prediction using information of a picture in a view different from that of the current picture may be defined as "view_weighted_pred_flag" and "view_weighted_bipred_idc". view_weighted_pred_flag = 0 indicates that weighted prediction is not applied to VP slices, and view_weighted_pred_flag = 1 indicates that explicit weighted prediction is applied to VP slices. view_weighted_bipred_idc = 0 indicates that default weighted prediction is applied to VB slices, view_weighted_bipred_idc = 1 indicates that explicit weighted prediction is applied to VB slices, and view_weighted_bipred_idc = 2 indicates that implicit weighted prediction is applied to VB slices. When implicit weighted prediction is applied to a VB slice, the weighting coefficients may be obtained from the relative distance between the current view and the other views. In that case, weighted prediction may be performed using the view identifier that distinguishes the views of pictures, or using a picture order count (POC) constructed so that the individual views can be distinguished. The flag information may be included in the picture parameter set (PPS). Here, the picture parameter set is header information indicating the coding mode of the entire picture (for example, the entropy coding mode and the initial value of the picture-level quantization parameter). However, a picture parameter set is not attached to every picture; when none is present, the most recent preceding picture parameter set is used as the header information.
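
The source states only that implicit weights for a VB slice may be obtained from the relative distance between views; it gives no formula. The following Python sketch is therefore an assumption made for illustration: it treats view identifiers as positions along the camera arrangement and splits a total weight of 64 in inverse proportion to the view distance, mirroring the spirit of H.264 implicit temporal weighting. The function name `implicit_view_weights` is hypothetical.

```python
from typing import Tuple


def implicit_view_weights(cur_view: int, ref0_view: int, ref1_view: int) -> Tuple[int, int]:
    """Return (w0, w1) with w0 + w1 == 64; the nearer reference view weighs more.

    Assumption: view identifiers are ordered along the camera arrangement,
    so their difference is a usable measure of view distance.
    """
    d0 = abs(cur_view - ref0_view)   # distance to the first reference view
    d1 = abs(cur_view - ref1_view)   # distance to the second reference view
    total = d0 + d1
    if total == 0:
        return 32, 32                # degenerate case: fall back to equal weights
    # The weight of reference 1 grows with the distance to reference 0 (rounded).
    w1 = (64 * d0 + total // 2) // total
    return 64 - w1, w1


equal = implicit_view_weights(1, 0, 2)    # equidistant references
skewed = implicit_view_weights(1, 0, 4)   # reference 0 is three times closer
```

Because the weights are derived from identifiers both encoder and decoder already know, nothing extra has to be transmitted, which is the point of the implicit mode.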

In multi-view video coding to which the present invention is applied, when a slice type including a macroblock to which inter-view prediction is applied and flag information indicating whether to perform weighted prediction using information of a picture in a different view are both defined, it is necessary to determine which weighted prediction to perform for which slice type. For example, as shown in FIG. 8, when the slice type extracted from the video signal is a P slice or an SP slice, weighted prediction can be performed only if weighted_pred_flag = 1, and when the slice type is a B slice, only if weighted_bipred_idc = 1. Likewise, when the slice type is a VP slice, weighted prediction can be performed only if view_weighted_pred_flag = 1, and when the slice type is a VB slice, only if view_weighted_bipred_idc = 1.
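
The mapping from slice type and flag values to a weighted prediction mode (steps S630 and S640, and the conditions of FIG. 8) can be sketched in Python as follows. The flag names are those given in the text; the function names, the dictionary representation of the flags, and the mode strings are assumptions introduced for illustration.

```python
from typing import Mapping

BIPRED_MODES = {0: "default", 1: "explicit", 2: "implicit"}


def weighted_prediction_mode(slice_type: str, flags: Mapping[str, int]) -> str:
    """Select the weighted prediction mode from the slice type and the flags."""
    if slice_type in ("P", "SP"):
        return "explicit" if flags.get("weighted_pred_flag", 0) == 1 else "none"
    if slice_type == "VP":
        return "explicit" if flags.get("view_weighted_pred_flag", 0) == 1 else "none"
    if slice_type == "B":
        idc = flags.get("weighted_bipred_idc", 0)
    elif slice_type == "VB":
        idc = flags.get("view_weighted_bipred_idc", 0)
    else:
        return "none"                # e.g. I slices: no weighted prediction
    if idc not in BIPRED_MODES:
        raise ValueError(f"unsupported idc value: {idc}")
    return BIPRED_MODES[idc]


def reads_weight_table(slice_type: str, flags: Mapping[str, int]) -> bool:
    """Condition under which explicit weights would be parsed (cf. FIG. 8)."""
    return weighted_prediction_mode(slice_type, flags) == "explicit"


# Illustrative flag values, as they might be carried in a picture parameter set.
pps_flags = {
    "weighted_pred_flag": 1,
    "weighted_bipred_idc": 2,
    "view_weighted_pred_flag": 0,
    "view_weighted_bipred_idc": 1,
}
```

Keeping the temporal flags and the view flags separate lets an encoder enable illumination compensation across views without paying for weight tables on temporally predicted slices.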

First, the NAL unit type (nal_unit_type) is extracted from the video signal (S910). Here, the NAL unit type is an identifier indicating the kind of NAL unit. For example, nal_unit_type = 5 indicates that the NAL unit is a slice of an IDR picture. An IDR (Instantaneous Decoding Refresh) picture is the first picture of a video sequence. Next, it is checked whether the extracted NAL unit type is a NAL unit type for multi-view video coding (S920). If it is, weighted prediction is performed using information about a picture in a view different from that of the current picture (S930). The NAL unit type may be one applicable to both scalable video coding and multi-view video coding, or one dedicated to multi-view video coding. Because a NAL unit type for multi-view video coding must allow weighted prediction using information of a picture in a view different from that of the current picture, new syntax needs to be defined. This is described in detail below with reference to FIGS. 10A and 10B.
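
Steps S910 to S930 can be sketched in Python as follows. In H.264/AVC, nal_unit_type occupies the low five bits of the first byte of the NAL unit header. The source does not state which nal_unit_type values denote multi-view video coding, so the set `MVC_NAL_UNIT_TYPES` below holds a placeholder value chosen purely for illustration, and the function names are likewise hypothetical.

```python
IDR_SLICE = 5                               # nal_unit_type of an IDR picture slice

# Placeholder assumption: the source gives no concrete value for MVC NAL unit types.
MVC_NAL_UNIT_TYPES = frozenset({20})


def nal_unit_type(header_byte: int) -> int:
    """S910: extract nal_unit_type from the first NAL header byte.

    Layout: forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), nal_unit_type (5 bits).
    """
    return header_byte & 0x1F


def is_mvc_nal_unit(header_byte: int) -> bool:
    """S920: check whether the NAL unit type is one for multi-view video coding."""
    return nal_unit_type(header_byte) in MVC_NAL_UNIT_TYPES


def prediction_path(header_byte: int) -> str:
    """S930: choose inter-view weighted prediction only for MVC NAL units."""
    return "inter-view weighted prediction" if is_mvc_nal_unit(header_byte) else "existing weighted prediction"


idr_type = nal_unit_type(0x65)              # 0b0_11_00101: nal_ref_idc 3, type 5
mvc_path = prediction_path(0x74)            # 0b0_11_10100: type 20 (placeholder MVC type)
```

Branching on the NAL unit type means a decoder that does not recognize the multi-view types can skip those units and still decode the remaining stream with the existing syntax.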

When the NAL unit type is a NAL unit type for multi-view video coding, the existing (for example, H.264) syntax for performing weighted prediction may be changed as shown in FIGS. 10A to 10B. For example, part 1010 of FIG. 10A corresponds to the syntax for performing the existing weighted prediction, and part 1020 of FIG. 10A corresponds to the syntax for performing weighted prediction in multi-view video coding. Accordingly, in part 1020, weighted prediction is performed only when the NAL unit type is a NAL unit type for multi-view video coding. In this case, because information about the view is added, the syntax elements include a "view" portion in their names. Examples include luma_view_log2_weight_denom and chroma_view_log2_weight_denom. Similarly, part 1030 of FIG. 10B corresponds to the syntax for performing the existing weighted prediction, and part 1040 of FIG. 10B corresponds to the syntax for performing weighted prediction in multi-view video coding. Accordingly, in part 1040, weighted prediction is performed only when the NAL unit type is a NAL unit type for multi-view video coding.
Here too, because information about the view is added, the syntax elements include a "view" portion in their names. Examples include luma_view_weight_l1_flag and chroma_view_weight_l1_flag. Thus, when a NAL unit type for multi-view video coding is defined, more efficient coding may be achieved by performing weighted prediction using information about a picture in a view different from that of the current picture.
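To illustrate how a denominator such as luma_view_log2_weight_denom might be used, the following sketch applies the single-reference explicit weighted sample formula of H.264 to one predicted sample. It assumes, without confirmation from the specification, that the view-specific syntax elements play the same role as their H.264 counterparts. The function name and parameters are hypothetical.

```python
def weighted_sample(pred: int, weight: int, offset: int,
                    log2_denom: int, bit_depth: int = 8) -> int:
    """Scale one predicted sample by weight / 2**log2_denom, then add offset.

    Assumption: log2_denom would be taken from an element such as
    luma_view_log2_weight_denom when the reference is in another view.
    """
    if log2_denom >= 1:
        # Rounded fixed-point scaling, as in H.264 explicit weighting.
        value = ((pred * weight + (1 << (log2_denom - 1))) >> log2_denom) + offset
    else:
        value = pred * weight + offset
    # Clip to the valid sample range for the given bit depth.
    return max(0, min((1 << bit_depth) - 1, value))
```

With log2_denom = 5, a weight of 32 leaves the sample unchanged, while a larger weight or a positive offset brightens the prediction, which is how an illumination difference between views could be compensated.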

The images of the individual views of a multi-view video exhibit illumination differences caused by factors internal and external to the cameras. These illumination differences significantly lower the correlation between views and hinder efficient coding. Therefore, the present invention newly defines a slice type that includes macroblocks to which inter-view prediction is applied, or flag information indicating whether to perform inter-view weighted prediction, and performs weighted prediction using them, which enables more efficient coding. In addition, more efficient coding is possible by defining a NAL unit type for multi-view video coding and performing weighted prediction using it.

Claims

A method of decoding a video signal, the method comprising:

performing weighted prediction for decoding a current block by using information about a picture in a view different from that of a current picture, according to a slice type extracted from the video signal.

The method of claim 1,

wherein the slice type includes a macroblock to which inter-view prediction is applied.

The method of claim 1, further comprising:

extracting flag information indicating whether to perform inter-view weighted prediction; and

performing weighted prediction according to a weighted prediction mode determined based on the slice type and the flag information.

The method of claim 3,

wherein the weighted prediction mode is a mode for performing weighted prediction using coefficient information obtained from information about a picture in a view different from that of the current picture.

The method of claim 3,

wherein the weighted prediction mode is a mode for performing weighted prediction using a distance between the current picture and a reference picture in a different view.

The method of claim 5,

wherein the distance between the current picture and the reference picture in a different view is obtained by using a view identifier indicating the view of a picture, or by using picture output order information capable of distinguishing the view of a picture.

A method of decoding a video signal, the method comprising:

extracting a NAL unit type from the video signal; and

if the NAL unit type is a NAL unit type for multi-view video coding, performing weighted prediction for decoding a current block by using information about a picture in a view different from that of a current picture.