KR20070074495A

KR20070074495A - Method and apparatus for inter-viewing reference in multi-viewpoint video coding

Info

Publication number: KR20070074495A
Application number: KR1020070001621A
Authority: KR
Inventors: 조숙희; 허남호; 이수인; 이영렬; 김대연; 허재호; 이융기
Original assignee: 한국전자통신연구원; 세종대학교산학협력단
Priority date: 2006-01-07
Filing date: 2007-01-05
Publication date: 2007-07-12
Also published as: EP1972141A4; EP1972141A1; WO2007081117A1

Abstract

An inter-view frame reference method and apparatus for multi-view video coding are provided. When multi-view video data are encoded, a frame of a different view at the same time, or a frame of a different view at a different time, is referenced so as to reduce the residual signal and raise the compression rate. A multi-view GOP (Group of Pictures) arranging unit (110) includes GOPs corresponding to one or more views and arranges them so that the GOPs of the respective views have the same coding structure. A cross-view reference unit (120) refers to already-encoded pictures of other views in the respective GOPs, in order of spatial proximity of the view.

Description

Method and apparatus for inter-view frame reference in multi-view video coding {method and Apparatus for inter-viewing reference in multi-viewpoint video coding}

FIG. 1 is a block diagram of an inter-view frame reference apparatus for encoding multi-view video data according to a preferred embodiment of the present invention.

FIG. 2 illustrates the coding structure of the multi-view GOP proposed for inter-view frame reference of multi-view video data according to a preferred embodiment of the present invention.

FIG. 3 illustrates an inter-view frame reference scheme for P picture encoding with cameras in a one-dimensional (1D) parallel or arc arrangement.

FIG. 4 illustrates an inter-view frame reference scheme for B picture encoding with cameras in a 1D parallel or arc arrangement.

FIG. 5 illustrates an inter-view frame reference scheme for encoding P pictures and B pictures with cameras in a two-dimensional (2D) cross arrangement.

FIG. 6 illustrates another inter-view frame reference scheme for encoding P pictures and B pictures with cameras in a 2D cross arrangement.

FIG. 7 is a diagram of an inter-view frame reference scheme for encoding P pictures and B pictures with cameras in a 2D parallel arrangement (3 × 5).

FIG. 8 is a flowchart of an inter-view frame reference method for encoding multi-view video data according to a preferred embodiment of the present invention.

FIGS. 9A and 9B illustrate the test environment used to evaluate the effect of the inter-view frame reference method of the present invention.

FIGS. 10(a) to 10(g) show rate-distortion curves based on the test environment of FIG. 9.

The present invention relates to a structure and an apparatus for inter-view frame reference used to encode multi-view video data captured by a plurality of cameras, each having a different viewpoint.

Under the H.264 standard currently used to encode video data, a picture refers to multiple frames that lie before and after it in time, but only within the same view.

Existing techniques have thus been limited to referencing frames within the same view. The present invention proposes that, when multi-view video data are encoded, a frame of a different view at the same time or a frame of a different view at a different time be referenced. Compared with referencing only temporally past or future frames of the same view, this reduces the residual signal and thereby raises the compression rate.

According to a preferred embodiment of the present invention, an inter-view frame reference apparatus for encoding multi-view video data includes: a multi-view GOP that includes GOPs each corresponding to one of at least one view, each GOP containing an IDR (Instantaneous Decoder Refresh) picture at the first time at which coding starts, with the GOPs of the respective views having the same coding structure; and a cross-view reference unit that, except for the base GOP (which performs only temporal frame reference within its own view), the IDR pictures, and the last view, refers to already-encoded pictures of other views in order of spatial proximity of the view.

In a preferred embodiment, the inter-view frame reference apparatus further includes a same-view reference unit that refers to already-encoded pictures of the same view.

In a preferred embodiment, the base GOP is an H.264/AVC GOP.

In a preferred embodiment, the GOP of each view is encoded in the order of I picture, P picture, rB picture, and B picture.

In a preferred embodiment, when the P picture to be encoded at the current i-th view and time t is denoted b(i, t), the cross-view reference unit refers to at least one of b(i-1, t), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before t at which an I picture or a P picture was encoded.

In a preferred embodiment, when the rB picture to be encoded at the current i-th view and time t is denoted b(i, t), the cross-view reference unit refers to three of b(i+1, t+n), b(i-1, t+n), b(i-1, t), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before t at which an I picture or a P picture was encoded, and t+n is the closest time after t at which an I picture or a P picture was encoded.

In a preferred embodiment, when the B picture to be encoded at the current i-th view and time t is denoted b(i, t), the cross-view reference unit refers to three of b(i+1, t+n), b(i-1, t+n), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before t at which an I picture, a P picture, or an rB picture was encoded, and t+n is the closest time after t at which an I picture, a P picture, or an rB picture was encoded.

According to another preferred embodiment of the present invention, an inter-view frame reference method for encoding multi-view video data includes: a multi-view GOP arranging step in which GOPs each corresponding to one of at least one view are included, each GOP containing an IDR picture at the first time at which coding starts, with the GOPs of the respective views having the same coding structure; and a cross-view reference step in which, except for the base GOP (which performs only temporal frame reference within its own view), the IDR pictures, and the last view, already-encoded pictures of other views are referenced in order of spatial proximity of the view.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Note that the same elements are denoted by the same reference numerals and symbols wherever possible, even when they appear in different drawings. In the following description, a detailed description of a related known function or configuration is omitted where it would unnecessarily obscure the subject matter of the present invention.

The inter-view frame reference apparatus for encoding multi-view video data includes a multi-view GOP arranging unit (110), a cross-view reference unit (120), and a same-view reference unit (130).

The multi-view GOP arranging unit (110) includes GOPs each corresponding to one of at least one view. Each GOP contains an IDR (Instantaneous Decoder Refresh) picture at the first time at which coding starts, and the GOPs of the respective views have the same coding structure (see FIG. 2).

The multi-view GOP arranged by the multi-view GOP arranging unit (110) includes, along the view axis, not only the H.264/AVC GOP (Group of Pictures) but also the GOPs of all other views, and each GOP contains one IDR picture at the earliest time on the time axis. Within the GOP of each view, encoding is performed in the order of I picture, P picture, rB picture, and B picture.

The coding structure of the multi-view GOP proposed in the present invention for inter-view frame reference of multi-view video data is described in more detail with reference to FIG. 2.

The cross-view reference unit (120) raises the compression rate by referencing frames of the views that are closest in space and time. More specifically, except for the base GOP (which performs only temporal frame reference within its own view), the IDR pictures, and the pictures of the last view, the cross-view reference unit (120) refers to a frame of a different view at the same time or a frame of a different view at a different time.

Specifically, the pictures that the cross-view reference unit (120) references to encode P pictures, rB (reference B) pictures, and B pictures are as follows.

When the P picture to be encoded at the current i-th view and time t is denoted b(i, t), the cross-view reference unit (120) refers to at least one of b(i-1, t), b(i-1, t-n), and b(i+1, t-n) to encode it. Here, t-n denotes the closest time before t at which an I picture or a P picture was encoded. A concrete example is described with reference to FIG. 3.
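The P-picture rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pictures are represented as hypothetical `(view, time)` tuples, the time t-n is found by scanning the per-view picture types backwards, and the function names are not taken from the patent.

```python
from typing import List, Optional, Sequence, Tuple

Picture = Tuple[int, int]  # (view index i, time index t); hypothetical representation


def previous_anchor_time(types: Sequence[str], t: int) -> Optional[int]:
    """Return the closest time before t that holds an I or P picture, if any."""
    for s in range(t - 1, -1, -1):
        if types[s] in ("I", "P"):
            return s
    return None


def p_picture_candidates(i: int, t: int, types: Sequence[str]) -> List[Picture]:
    """List b(i-1, t), b(i-1, t-n) and b(i+1, t-n) for the P picture b(i, t)."""
    t_prev = previous_anchor_time(types, t)
    if t_prev is None:
        raise ValueError("no earlier I or P picture in this GOP")
    return [(i - 1, t), (i - 1, t_prev), (i + 1, t_prev)]


# One of the GOP structures given in the description, one picture type per time index.
gop = "I B rB B P B rB B P".split()
print(p_picture_candidates(i=2, t=8, types=gop))  # [(1, 8), (1, 4), (3, 4)]
```

An encoder following the text would then use at least one of the listed candidates; which ones are actually chosen is not fixed by this sketch.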

When the rB picture to be encoded at the current i-th view and time t is denoted b(i, t), the cross-view reference unit (120) refers to three of the pictures b(i+1, t+n), b(i-1, t+n), b(i-1, t), b(i-1, t-n), and b(i+1, t-n) to encode the rB picture. Here, t-n is the closest time before t at which an I picture or a P picture was encoded, and t+n is the closest time after t at which an I picture or a P picture was encoded. A concrete example is described with reference to FIG. 4.

When the B picture to be encoded at the current i-th view and time t is denoted b(i, t), three of b(i+1, t+n), b(i-1, t+n), b(i-1, t-n), and b(i+1, t-n) are referenced. Here, t-n is the closest time before t at which an I picture, a P picture, or an rB picture was encoded, and t+n is the closest time after t at which an I picture, a P picture, or an rB picture was encoded. A concrete example is described with reference to FIG. 4.
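The rB and B rules above differ only in their candidate lists, which the following Python sketch makes explicit. It assumes, as the text's notation does, a single distance n in both temporal directions; the `(view, time)` tuples and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Picture = Tuple[int, int]  # (view index, time index); hypothetical representation


def rb_picture_candidates(i: int, t: int, n: int) -> List[Picture]:
    """Five candidates for the rB picture b(i, t)."""
    return [(i + 1, t + n), (i - 1, t + n), (i - 1, t), (i - 1, t - n), (i + 1, t - n)]


def b_picture_candidates(i: int, t: int, n: int) -> List[Picture]:
    """Four candidates for the B picture b(i, t); b(i-1, t) is not among them."""
    return [(i + 1, t + n), (i - 1, t + n), (i - 1, t - n), (i + 1, t - n)]


def is_valid_reference_set(chosen: Iterable[Picture], candidates: List[Picture]) -> bool:
    """True if exactly three distinct pictures were chosen, all from the candidates."""
    chosen_set = set(chosen)
    return len(chosen_set) == 3 and chosen_set <= set(candidates)


rb = rb_picture_candidates(i=1, t=2, n=2)
print(rb)  # [(2, 4), (0, 4), (0, 2), (0, 0), (2, 0)]
print(is_valid_reference_set([(0, 2), (0, 4), (2, 4)], rb))  # True
print(is_valid_reference_set([(0, 2), (0, 4)], rb))          # False
```

In a real GOP the backward and forward distances need not be equal; a fuller sketch would take two separate offsets.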

In the cross-view reference unit (120), the inter-view frame reference method for multi-view video data is applied with a different structure depending on the camera arrangement. FIGS. 3 and 4 correspond to video captured with cameras in a 1D arrangement, FIGS. 5 and 6 to a 2D arrangement, and FIG. 7 to a 2D parallel arrangement.

The same-view reference unit (130) refers to already-encoded pictures of the same view, that is, to temporally preceding and following pictures within the same view.

In FIG. 2, the x-axis represents the view and the y-axis represents time. The multi-view GOP (200) includes at least one H.264/AVC GOP (210) together with the GOPs corresponding to the respective views. All of the GOPs share the same coding structure.

The H.264/AVC GOP performs only temporal frame reference within its own view and performs no inter-view frame reference. Any view of the multi-view video data may be selected as the H.264/AVC GOP.

At each view, the coding structure of the GOP may be I B rB B P B rB B P. As another example, it may be I rB rB rB P rB rB rB P. Here, rB is a B picture that can be used as a reference picture.
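To make the stated encoding order (I, then P, then rB, then B) concrete, the short Python sketch below sorts the time positions of a GOP by picture type. It is a literal reading of that ordering; the function name is hypothetical, and an actual encoder may interleave pictures differently.

```python
from typing import List

# Coding priority taken from the description: I first, then P, rB, and B.
PRIORITY = {"I": 0, "P": 1, "rB": 2, "B": 3}


def encoding_order(structure: str) -> List[int]:
    """Return time indices in the order they would be encoded.

    Python's sort is stable, so pictures of the same type keep their temporal order.
    """
    types = structure.split()
    return sorted(range(len(types)), key=lambda k: PRIORITY[types[k]])


print(encoding_order("I B rB B P B rB B P"))      # [0, 4, 8, 2, 6, 1, 3, 5, 7]
print(encoding_order("I rB rB rB P rB rB rB P"))  # [0, 4, 8, 1, 2, 3, 5, 6, 7]
```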

In the 1D parallel and arc structure, video is captured with a plurality of cameras arranged in a 1D parallel or arc configuration. The indices 0 to 9 indicate the order in which the P pictures are encoded, and the arrows indicate the inter-view frames that are referenced.

As an example of inter-view frame reference for P pictures, the P pictures of the multi-view GOP, other than those of the H.264/AVC GOP and the last view, refer to the P picture of the left view at the same time, the I or P picture of the left view encoded at the immediately preceding time, and one I or P picture of the right view encoded at the immediately preceding time. The last camera refers to the P picture of the left camera at the same time and the I or P picture of the left camera encoded at the immediately preceding time.
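The boundary cases in this example (no inter-view reference for the H.264/AVC GOP, two references for the last camera) can be sketched as follows. The sketch assumes that views are numbered from left to right and that the leftmost view, view 0, is the H.264/AVC GOP; the text allows any view to be chosen, so this numbering is an illustrative assumption.

```python
from typing import List, Tuple

Picture = Tuple[int, int]  # (view index, time index); hypothetical representation


def p_refs_1d(view: int, t: int, n: int, num_views: int) -> List[Picture]:
    """Inter-view references of the P picture at (view, t) in a 1D camera row.

    n is the distance back to the immediately preceding I or P picture.
    """
    if view == 0:
        return []  # assumed base (H.264/AVC) GOP: temporal references only
    refs = [(view - 1, t), (view - 1, t - n)]  # left view: same time, previous anchor
    if view < num_views - 1:
        refs.append((view + 1, t - n))  # right view: previous anchor
    return refs


for v in range(4):
    print(v, p_refs_1d(v, t=8, n=4, num_views=4))
# 0 []
# 1 [(0, 8), (0, 4), (2, 4)]
# 2 [(1, 8), (1, 4), (3, 4)]
# 3 [(2, 8), (2, 4)]
```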

FIG. 4 illustrates an inter-view frame reference scheme for B picture encoding with cameras in a 1D parallel or arc arrangement. First, the inter-view frame reference for the reference B picture (rB) located temporally midway between an I picture and a P picture is as follows.

As an example of inter-view frame reference for rB pictures, the GOPs of the multi-view GOP other than the H.264/AVC GOP and the last view refer to the rB picture of the left camera encoded at the same time, the P picture of the left camera at the next time, and the P picture of the right camera at the next time. The last camera refers to the B picture of the left camera encoded at the same time and the P picture of the left camera at the next time.

As a more concrete example, the picture with index 1 may refer to any three of the indices (0, 20, 22, 15, 17), such as (0, 20, 22) or (0, 15, 17).

As another example, the inter-view frame reference for a B picture located between an rB picture and a P picture or an I picture is as follows.

The GOPs of the multi-view GOP other than the H.264/AVC GOP and the last view refer to the rB picture of the left camera encoded at the previous time, the P or rB picture of the left camera at the next time, and the P or rB picture of the right camera at the next time. The last camera refers to the rB picture of the left camera encoded at the previous time and the P or rB picture of the left camera at the next time.

As a more concrete example, the picture with index 6 may refer to any three of the indices (0, 2, 15, 17), such as (0, 2, 15) or (0, 15, 17).
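The two index examples (index 1 choosing three of five candidates, index 6 choosing three of four) can be enumerated with the Python standard library. The candidate tuples are copied from the text; the variable names are illustrative only.

```python
from itertools import combinations

# Candidate indices from the text for the pictures with index 1 and index 6.
candidates_index_1 = (0, 20, 22, 15, 17)
candidates_index_6 = (0, 2, 15, 17)

triples_1 = list(combinations(candidates_index_1, 3))
triples_6 = list(combinations(candidates_index_6, 3))

print(len(triples_1))  # 10
print(len(triples_6))  # 4
print((0, 20, 22) in triples_1, (0, 15, 17) in triples_1)  # True True
print((0, 2, 15) in triples_6, (0, 15, 17) in triples_6)   # True True
```

The description does not say how an encoder picks among these triples, so the enumeration only shows the size of the choice.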

FIG. 5 illustrates an inter-view frame reference scheme for encoding P pictures and B pictures with cameras in a 2D cross arrangement.

In the 2D cross structure, video is captured with a plurality of cameras arranged in a 2D cross configuration. The indices 0 to 4 indicate the order in which the pictures are encoded, and the arrows indicate the inter-view frames that are referenced.

The GOPs of the multi-view GOP other than the H.264/AVC GOP perform inter-view frame reference as follows. The center view has no view already encoded at the same time, so it performs no inter-view frame reference. The left view refers to the picture of the center view encoded at the same time. The upper view refers to the pictures of the center and left views encoded at the same time. The right view refers to the pictures of the center and upper views encoded at the same time. The lower view refers to the pictures of the center, left, and right views encoded at the same time.
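Because every view may only reference views that were encoded before it, the cross-structure references above imply an encoding order. The Python sketch below (standard-library `graphlib`, Python 3.9 or later) derives that order from the dependencies; the view names are labels chosen for this illustration.

```python
from graphlib import TopologicalSorter

# For each view of the cross arrangement, the same-time views it references.
references = {
    "center": set(),
    "left": {"center"},
    "upper": {"center", "left"},
    "right": {"center", "upper"},
    "lower": {"center", "left", "right"},
}

# static_order() yields every view only after all the views it references.
order = list(TopologicalSorter(references).static_order())
print(order)  # ['center', 'left', 'upper', 'right', 'lower']
```

The dependencies form a chain, so only one order is possible, and it has five positions, matching the indices 0 to 4 mentioned for FIG. 5.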

As another method of inter-view frame reference, the following is also possible. The GOPs of the multi-view GOP other than the H.264/AVC GOP perform inter-view frame reference as follows. The center view has no view already encoded at the same time, so it performs no inter-view frame reference. The left view refers to the picture of the center view encoded at the same time. The upper view refers to the pictures of the center and left views encoded at the same time. The right view refers to the pictures of the center and upper views encoded at the same time. The lower view refers to the pictures of the center, left, and right views encoded at the same time.

FIG. 6 illustrates another inter-view frame reference scheme for encoding P pictures and B pictures with cameras in a 2D cross arrangement. As an alternative to FIG. 5, the cross structure may perform inter-view frame reference as shown in FIG. 6.

That is, the GOPs of the multi-view GOP other than the H.264/AVC GOP perform inter-view frame reference as follows. The center view has no view already encoded at the same time, so it performs no inter-view frame reference. The left view refers to the picture of the center view encoded at the same time. The right view refers to the pictures of the center and left views encoded at the same time. The upper view refers to the pictures of the center, left, and right views encoded at the same time. The lower view refers to the pictures of the center, left, and right views encoded at the same time.

FIG. 7 is a diagram of an inter-view frame reference scheme for encoding P pictures and B pictures with cameras in a 2D parallel arrangement (3 × 5).

In the 2D parallel structure, video is captured with a plurality of cameras arranged in a 2D parallel configuration. In FIG. 7, the indices 0 to 14 indicate the order in which the pictures are encoded, and the arrows indicate the inter-view frames that are referenced.

The GOPs of the multi-view GOP other than the H.264/AVC GOP perform inter-view frame reference as follows. The center view has no view already encoded at the same time, so it performs no inter-view frame reference.

View 1 refers to the picture of the center view encoded at the same time. View 2 refers to the pictures of the center and left views encoded at the same time. View 3 refers to the pictures of the center, upper, and left views encoded at the same time. View 4 refers to the pictures of the center, left, and right views encoded at the same time. Among the views in the center row, excluding the center view and the views adjacent to it (1, 3), the view on the left (9) refers to the pictures to its upper right, right, and lower right. The view on the right (10) refers to the pictures to its upper left, left, and lower left.

Relative to the center, the views at the upper left (5, 11) refer to the pictures encoded at the same time that lie to their right, lower right, and below. The views at the upper right (6, 12) refer to the pictures encoded at the same time that lie to their left, lower left, and below. The views at the lower left (7, 13) refer to the pictures encoded at the same time that lie to their right, upper right, and above. The views at the lower right (8, 14) refer to the pictures encoded at the same time that lie to their left, upper left, and above.
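The rules for the outer views all point toward the center of the array, which the following Python sketch expresses with grid coordinates. It assumes a grid of 3 rows by 5 columns with the center view at row 1, column 2, and it deliberately does not map coordinates to the view indices of FIG. 7, since that layout is not given in the text. The function name is hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column) in an assumed 3-row by 5-column camera grid


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


def toward_center_refs(row: int, col: int, center: Cell = (1, 2)) -> List[Cell]:
    """Same-time reference positions for views in the four quadrants and for
    the two outer views of the center row."""
    dr = sign(center[0] - row)  # +1 if the center is below, -1 if above
    dc = sign(center[1] - col)  # +1 if the center is to the right, -1 if left
    if dr != 0 and dc != 0:
        # Quadrant view: horizontal, diagonal and vertical neighbours facing the center.
        return [(row, col + dc), (row + dr, col + dc), (row + dr, col)]
    if dr == 0 and abs(center[1] - col) > 1:
        # Outer view of the center row: the three neighbours in the next column inward.
        return [(row - 1, col + dc), (row, col + dc), (row + 1, col + dc)]
    raise ValueError("center view and its four direct neighbours follow other rules")


print(toward_center_refs(0, 0))  # [(0, 1), (1, 1), (1, 0)]  right, lower right, below
print(toward_center_refs(1, 0))  # [(0, 1), (1, 1), (2, 1)]  upper right, right, lower right
print(toward_center_refs(2, 4))  # [(2, 3), (1, 3), (1, 4)]  left, upper left, above
```

The center view and its four direct neighbours (views 0 to 4) are excluded here because the text gives them separate, cross-like rules.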

FIG. 8 is a flowchart of an inter-view frame reference method for encoding multi-view video data according to a preferred embodiment of the present invention.

The inter-view frame reference method for encoding multi-view video data first arranges the multi-view GOP for same-view and cross-view reference, and then refers to pictures of spatio-temporally close views among the pictures of already-encoded views.

The multi-view GOP used for cross-view reference includes GOPs each corresponding to one of at least one view; each GOP contains an IDR picture at the first time at which coding starts, and the GOPs of the respective views have the same coding structure (S810).

In the cross-view reference step (S820), except for the base GOP (which performs only temporal frame reference within its own view), the IDR pictures, and the last view, already-encoded pictures of other views are referenced in order of spatial proximity of the view.

In addition, the same view is referenced for encoding the multi-view video (S830); that is, pictures encoded earlier or later within the same view are referenced. Same-view reference and cross-view reference may be performed simultaneously or in either order.

FIGS. 9A and 9B give, for the multi-view sequences of the test data set, the spatial resolution, temporal resolution, and camera arrangement, together with the bitrates specified for the rate-distortion curves.

FIGS. 10(a) to 10(g) show the rate-distortion curves for the multi-view test sequences under the conditions set out in FIGS. 9A and 9B. They show a PSNR gain of 1.2 to 2 dB at every bitrate for every multi-view test sequence.
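To put the reported gain in perspective, recall the standard definition PSNR = 10·log10(MAX² / MSE). At a fixed bitrate and peak value, a PSNR gain of Δ dB therefore corresponds to the mean squared error being multiplied by 10^(-Δ/10). The Python sketch below evaluates this for the two ends of the reported range; it is plain arithmetic on the quoted figures, not a reproduction of the experiment.

```python
def mse_ratio_for_psnr_gain(gain_db: float) -> float:
    """Factor by which the MSE changes for a given PSNR gain in dB.

    Follows from PSNR = 10 * log10(MAX**2 / MSE) with MAX held constant.
    """
    return 10 ** (-gain_db / 10)


for gain in (1.2, 2.0):
    print(gain, round(mse_ratio_for_psnr_gain(gain), 3))
# 1.2 0.759
# 2.0 0.631
```

In other words, the quoted range corresponds to roughly a 24% to 37% reduction in mean squared error at the same bitrate.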

The present invention can also be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system.

Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, as well as media implemented in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

이상 도면과 명세서에서 최적 실시예들이 개시되었다. 여기서 특정한 용어들이 사용되었으나, 이는 단지 본 발명을 설명하기 위한 목적에서 사용된 것이지 의미 한정이나 특허청구범위에 기재된 본 발명의 범위를 제한하기 위하여 사용된 것은 아니다. The best embodiments have been disclosed in the drawings and specification above. Although specific terms have been used herein, they are used only for the purpose of describing the present invention and are not used to limit the scope of the present invention as defined in the meaning or claims.

그러므로 본 기술 분야의 통상의 지식을 가진 자라면 이로부터 다양한 변형 및 균등한 타 실시예가 가능하다는 점을 이해할 것이다. 따라서 본 발명의 진정한 기술적 보호 범위는 첨부된 특허청구범위의 기술적 사상에 의해 정해져야 할 것이다.Therefore, those skilled in the art will understand that various modifications and equivalent other embodiments are possible from this. Therefore, the true technical protection scope of the present invention will be defined by the technical spirit of the appended claims.

According to the present invention, the compression rate is increased by referring to frames of spatio-temporally close viewpoints in multi-view video data. Because frames of a different viewpoint at the same time and frames of a different viewpoint at a different time are referred to, the residual signal is reduced and a higher compression rate is obtained than with a method that refers only to temporally past or future frames of the same viewpoint.
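The effect described above can be illustrated with a toy sketch: when the set of candidate reference blocks is widened from same-view temporal frames to include frames of neighboring views, the best achievable residual can only stay the same or shrink. The code below is not the encoder of the specification. It assumes blocks given as flat lists of sample values, uses the sum of absolute differences (SAD) as the residual measure, and omits motion and disparity search entirely. The names `sad` and `best_reference` and the sample values are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def best_reference(current, candidates):
    """Return (name, sad) of the candidate block with the smallest residual.

    candidates maps a descriptive name to a reference block.
    """
    return min(
        ((name, sad(current, block)) for name, block in candidates.items()),
        key=lambda pair: pair[1],
    )


if __name__ == "__main__":
    current_block = [10, 12, 14, 16]

    # Temporal reference only: an earlier frame of the same view.
    temporal_only = {"same view, earlier time": [20, 22, 24, 26]}

    # Temporal reference plus a neighboring view captured at the same time.
    with_inter_view = dict(temporal_only)
    with_inter_view["neighboring view, same time"] = [10, 12, 15, 16]

    print(best_reference(current_block, temporal_only))
    print(best_reference(current_block, with_inter_view))
```

In this made-up data the neighboring view at the same time matches the current block far better than the earlier frame of the same view, which is the situation the inter-view reference is meant to exploit.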

Claims

An inter-view frame reference apparatus for encoding multi-view video data, comprising:

a multi-view GOP arrangement unit that includes GOPs corresponding to at least one viewpoint and arranges the viewpoints among the GOPs so that they have the same coding structure; and

a mutual view reference unit that refers to pictures in spatially close views among pictures of other views already encoded in the respective GOPs.

The apparatus of claim 1, further comprising a same-view reference unit that refers to an already-encoded picture of the same view.

The apparatus of claim 1, wherein the base GOP is an H.264/AVC GOP.

The apparatus of claim 1, wherein encoding within the GOP of each view is performed in the order of an I picture, a P picture, an rB picture, and a B picture.

The apparatus of claim 1, wherein, when the P picture to be encoded at the current i-th view and time t is b(i, t), at least one of b(i-1, t), b(i-1, t-n), and b(i+1, t-n) is referred to, where t-n is the closest time before time t at which an I picture or a P picture was encoded.
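The candidate set for a P picture can be sketched as follows. This is only an illustration under stated assumptions: the earlier anchor time is taken to be t − n (n time units before t), pictures are identified by (view, time) pairs, and a candidate is usable only if it has already been encoded. The function name `p_picture_candidates` and the `encoded` set are hypothetical.

```python
def p_picture_candidates(i, t, n, encoded):
    """Return the inter-view reference candidates for P picture b(i, t).

    Candidates are b(i-1, t), b(i-1, t-n) and b(i+1, t-n), filtered to
    those already present in `encoded`, a set of (view, time) pairs.
    Assumption: t-n is the nearest earlier time with an I or P picture.
    """
    candidates = [(i - 1, t), (i - 1, t - n), (i + 1, t - n)]
    return [picture for picture in candidates if picture in encoded]


if __name__ == "__main__":
    # Views 0 and 2 already have anchors at time 0; view 0 also at time 8.
    already_encoded = {(0, 0), (0, 8), (2, 0)}
    print(p_picture_candidates(i=1, t=8, n=8, encoded=already_encoded))
```

The filter step reflects that only previously encoded pictures of other views can serve as references; which of the surviving candidates is actually chosen is left open here.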

The apparatus of claim 1, wherein the mutual view reference unit refers to three of b(i+1), b(i-1, t+n), b(i-1, t), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before time t at which an I picture or a P picture was encoded, and t+n is the closest time after time t at which an I picture or a P picture is encoded.

The apparatus of claim 1, wherein, when the B picture to be encoded at the current i-th view and time t is b(i, t), the mutual view reference unit refers to pictures among b(i+1, t+n), b(i-1, t+n), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before time t at which an I picture, a P picture, or an rB picture was encoded, and t+n is the closest time after time t at which an I picture, a P picture, or an rB picture is encoded.

The apparatus of claim 1, wherein, for camera types of a two-dimensional (second-order) cross structure and a two-dimensional (second-order) parallel structure, with the viewpoint and the time taken as the respective coordinate axes, the mutual view reference unit takes as a basis the point closest to the center of gravity of the viewpoints and times of the multi-view GOP, and first codes the viewpoint located at the smallest radius from that point.

The apparatus of claim 1, wherein the mutual view reference unit refers to up to three pictures.

The apparatus of claim 1, wherein the coding structure of the GOP at each viewpoint is I B rB B P B rB B P, where rB is a B picture usable as a reference picture.

I rB rB rB P rB rB rB P, in which case rB is a B picture usable as a reference picture.
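The two GOP structures above are given in display order, while the claims state that pictures are encoded in the order I, P, rB, B. One possible way to derive a coding order from a display-order structure is sketched below. It assumes that reordering is applied per anchor interval: each I or P picture is coded first, followed by the rB and then the B pictures that precede it in display order. That per-interval rule and the function name `coding_order` are assumptions made for illustration, not statements of the claimed method.

```python
VALID_TYPES = {"I", "P", "rB", "B"}


def coding_order(structure):
    """Map a display-order GOP string to a list of (position, type) in coding order.

    Assumption: within each anchor interval the anchor (I or P) is coded
    first, then the rB pictures, then the non-reference B pictures.
    """
    order = []
    pending = []  # non-anchor pictures waiting for their next anchor

    def flush():
        # rB pictures are coded before plain B pictures.
        order.extend(p for p in pending if p[1] == "rB")
        order.extend(p for p in pending if p[1] == "B")
        pending.clear()

    for position, kind in enumerate(structure.split()):
        if kind not in VALID_TYPES:
            raise ValueError(f"unknown picture type: {kind}")
        if kind in ("I", "P"):
            order.append((position, kind))
            flush()
        else:
            pending.append((position, kind))
    flush()  # pictures after the last anchor, if any
    return order


if __name__ == "__main__":
    for gop in ("I B rB B P B rB B P", "I rB rB rB P rB rB rB P"):
        print([f"{kind}{pos}" for pos, kind in coding_order(gop)])
```

Under this assumed rule the first structure is coded as I0, P4, rB2, B1, B3, P8, rB6, B5, B7, so every rB picture is available as a reference before the B pictures around it are coded.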

The apparatus of claim 1, wherein each GOP includes an Instantaneous Decoding Refresh (IDR) picture at the initial time at which coding starts.

The apparatus of claim 1, wherein, except for the base GOP, which among the GOPs performs only temporal frame reference within its own view, the IDR picture, and the last view, the mutual view reference unit refers to already-encoded pictures of other views in the order of pictures within spatially close views.

An inter-view frame reference method for encoding multi-view video data, comprising:

a multi-view GOP arrangement step of including GOPs corresponding to at least one viewpoint and arranging the viewpoints among the GOPs so that they have the same coding structure; and

a mutual view reference step of referring to pictures in spatially close views among pictures of other views already encoded in the respective GOPs.

The method of claim 14, further comprising a same-view reference step of referring to an already-encoded picture of the same view.

The method of claim 14, wherein the base GOP is an H.264/AVC GOP.

The method of claim 14, wherein encoding within the GOP of each view is performed in the order of an I picture, a P picture, an rB picture, and a B picture.

The method of claim 14, wherein a P picture refers to pictures among b(i-1, t), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before time t at which an I picture or a P picture was encoded.

The method of claim 14, wherein three of b(i+1), b(i-1, t+n), b(i-1, t), b(i-1, t-n), and b(i+1, t-n) are referred to, where t-n is the closest time before time t at which an I picture or a P picture was encoded, and t+n is the closest time after time t at which an I picture or a P picture is encoded.

The method of claim 14, wherein, when the B picture to be encoded at the current i-th view and time t is b(i, t), the mutual view reference step refers to pictures among b(i+1, t+n), b(i-1, t+n), b(i-1, t-n), and b(i+1, t-n), where t-n is the closest time before time t at which an I picture, a P picture, or an rB picture was encoded, and t+n is the closest time after time t at which an I picture, a P picture, or an rB picture is encoded.

The method of claim 14, wherein, for camera types of a two-dimensional (second-order) cross structure and a two-dimensional (second-order) parallel structure, with the viewpoint and the time taken as the respective coordinate axes, the mutual view reference step takes as a basis the point closest to the center of gravity of the viewpoints and times of the multi-view GOP, and first codes the viewpoint located at the smallest radius from that point.

The method of claim 14, wherein the mutual view reference step refers to up to three pictures.

The method of claim 14, wherein the coding structure of the GOP at each viewpoint is

The method of claim 14, wherein each GOP includes an IDR picture at the initial time at which coding starts.

The method of claim 14, wherein, except for the base GOP, which among the GOPs performs only temporal frame reference within its own view, the IDR picture, and the last view, the mutual view reference step refers to already-encoded pictures of other views in the order of pictures within spatially close views.

A computer-readable recording medium having recorded thereon a program for executing the method according to any one of claims 14 to 26.