KR101396948B1

KR101396948B1 - Method and Equipment for hybrid multiview and scalable video coding

Info

Publication number: KR101396948B1
Application number: KR1020070021299A
Authority: KR
Inventors: 서덕영; 이용헌; 박광훈
Original assignee: 경희대학교 산학협력단
Priority date: 2007-03-05
Filing date: 2007-03-05
Publication date: 2014-05-20
Also published as: KR20080081407A

Abstract

The present invention proposes a hybrid video coding method, HMSVC (Hybrid Multiview and Scalable Video Coding), that combines scalable video coding (SVC) with multi-view video coding (MVC). Multi-view video coding transmits all of the video captured from several angles, so encoding, transmission, and decoding consume a very large amount of resources. The invention proposes providing sufficient quality for the view being displayed while maintaining only limited quality for the other views. Accordingly, the views encoded at limited quality are encoded and decoded using MVC, and the selected view can display an enhancement layer using SVC. The quality enhancement can be achieved through spatial scalability, temporal scalability, quality scalability, or a combination of these. In one embodiment, the scalability scheme provided by JVT SVC can be used.

The present invention includes a method and apparatus that transmit multiple views simultaneously using MVC and allow a selected view to be displayed at high quality, thereby enabling multi-view video service with few resources.

The present invention is used to save resources and to broaden the range of applications in multi-view video services.

MPEG, JVT, scalable video coding (SVC), multi-view video coding (MVC), NAL (Network Abstraction Layer)

Description

Method and Apparatus for Hybrid Multiview and Scalable Video Coding

Figure 1. Example of a prediction structure for multi-view video coding

Figure 2. HMSVC scheme in which eight QCIF streams and one CIF stream are decoded in parallel

Figure 3. HMSVC scheme with four QCIF streams and one CIF stream

Figure 4. 1:1 interactive service with real-time encoding/decoding

Figure 5. Random access during streaming with delay but without quality degradation

Figure 6. Random access with quality degradation but without delay

Figure 7. Random access using SI/SP

Figure 8. Concatenated MVC/SVC NAL header

Figure 9. Usage scenario of the present invention

Figure 10. Block diagram of a streaming system using an HMSVC encoder/decoder and a MANE

Figure 11. Example of a streaming system using HMSVC

The present invention relates to a hybrid video coding method and apparatus that combine scalable video coding and multi-view video coding.

Multi-view Video Coding (MVC), currently being standardized, improves coding performance on the basis of the existing international video standard MPEG-4 Part 10 Advanced Video Coding (AVC; H.264). In the temporal direction it uses hierarchical B-picture coding, the method used in JSVC (Joint Scalable Video Coding) to support temporal scalability, and it additionally performs inter-view prediction to improve coding efficiency.

FIG. 1 shows, for multi-view video coding as currently being standardized, an example prediction structure with eight views and a temporal GOP (Group of Pictures) size of 8. S0, S1, S2, S3, ..., S7 each denote one view, and T0, T1, T2, T3, ..., T100 denote pictures along the time axis.

Typically, views reference one another at the same time instant, and a view that can be decoded independently without referencing any other view is called the base view. Frames within each view reference one another temporally, and anchors are inserted periodically to allow random access and to stop error propagation. An anchor is a frame that can be decoded independently of temporally preceding data. In the figure, S0 is the base view, and the frames at T0, T8, and T96 are anchor frames.

In FIG. 1, temporal coding within each view uses a hierarchical B-picture structure. The pictures at the first time instant (T0) of each view, and the pictures spaced every 8 frames thereafter (that is, one temporal GOP apart: T8, T16, T24, ...), use only prediction from neighboring views. Specifically, S2 is predicted from S0; S1 from S0 and S2; S4 from S2; S3 from S2 and S4; S6 from S4; S5 from S4 and S6; and S7, being the last view, from S6. At the other time instants, every second view (S1, S3, S5, S7) performs prediction from neighboring views together with temporal prediction: S1 is predicted from S0 and S2, S3 from S1 and S4, and S5 from S3 and S6.
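The anchor-picture dependencies listed for FIG. 1 determine which views must be decoded before a given view can be reconstructed. The following is a minimal Python sketch of that lookup, assuming only the anchor-picture references given in the text; the names `ANCHOR_REFS` and `views_needed` are illustrative and do not come from the patent.

```python
# Inter-view references for anchor pictures, as listed for FIG. 1.
# Each view maps to the views it is predicted from.
ANCHOR_REFS = {
    0: [],       # S0 is the base view
    1: [0, 2],
    2: [0],
    3: [2, 4],
    4: [2],
    5: [4, 6],
    6: [4],
    7: [6],      # last view, predicted from S6 only
}


def views_needed(target, refs=ANCHOR_REFS):
    """Return the sorted list of views that must be decoded to decode `target`."""
    needed = set()
    stack = [target]
    while stack:
        view = stack.pop()
        if view not in needed:
            needed.add(view)
            stack.extend(refs[view])  # follow references transitively
    return sorted(needed)


print(views_needed(4))  # [0, 2, 4]
print(views_needed(5))  # [0, 2, 4, 5, 6]
```

With view 4 selected, only S0, S2, and S4 are required, so views such as S1 and S3 can be left out of the decoding chain.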

Because all eight views are decoded simultaneously in this way, a very large amount of computation is required. For example, decoding eight CIF pictures (352×288) at a frame rate of 30 Hz requires decoding 8 × 396 × 30 = 95,040 macroblocks per second, whereas if the eight views are QCIF pictures (176×144) and only one view is CIF, only (8 × 99 + 396) × 30 = 35,640 macroblocks per second must be decoded, so computation falls to about 3/8.

The object of the present invention is to reduce computation and transmission by greatly lowering the quality of every view other than the one selected at the receiver for display. In the example above, the reduced load is comparable to decoding three CIF videos. If the base layer is QCIF and the enhancement layer is 4CIF, computation falls to about 3/16, which is only 50% more than decoding a single 4CIF stream.
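The macroblock counts behind the 3/8 and 3/16 figures can be checked with a short calculation. The sketch below assumes 16×16 macroblocks and the usual 704×576 size for 4CIF (the text gives only the CIF and QCIF dimensions); the helper names are illustrative.

```python
from fractions import Fraction

MB = 16  # macroblock edge length in pixels (assumed 16x16 macroblocks)

# CIF and QCIF sizes are from the text; the 4CIF size is an assumption.
FORMATS = {"QCIF": (176, 144), "CIF": (352, 288), "4CIF": (704, 576)}


def macroblocks(fmt):
    """Macroblocks in one picture of the given format."""
    width, height = FORMATS[fmt]
    return (width // MB) * (height // MB)


def mb_per_second(streams, fps=30):
    """Total macroblocks per second for a list of (format, count) pairs."""
    return fps * sum(macroblocks(fmt) * count for fmt, count in streams)


all_cif = mb_per_second([("CIF", 8)])
hybrid_cif = mb_per_second([("QCIF", 8), ("CIF", 1)])
print(all_cif, hybrid_cif, Fraction(hybrid_cif, all_cif))  # 95040 35640 3/8

all_4cif = mb_per_second([("4CIF", 8)])
hybrid_4cif = mb_per_second([("QCIF", 8), ("4CIF", 1)])
single_4cif = mb_per_second([("4CIF", 1)])
print(Fraction(hybrid_4cif, all_4cif))     # 3/16
print(Fraction(hybrid_4cif, single_4cif))  # 3/2, i.e. 50% more than one 4CIF
```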

According to the invention, the scalability between the base layer and the enhancement layer may be spatial scalability, temporal scalability, quality scalability, or any combination of these. For simplicity, however, the description here covers only one embodiment, in which a single spatial scalability layer is used.

The invention builds on the observation that, in a multi-view video service, many views are decoded simultaneously but only one is displayed. Depending on how much random access (view switching) delay is tolerated, resources such as computation and buffer memory can be saved.

For example, FIG. 2 shows eight QCIF streams and one CIF stream being decoded simultaneously. The eight QCIF streams are decoded by MVC, and one of them serves as the base layer for spatially scalable decoding of the CIF stream. The view whose CIF picture is reconstructed is the one displayed, and is defined as display_view. This saves resources compared with MVC-decoding eight CIF streams simultaneously, but introduces problems such as random access delay and increased complexity. The invention presents encoding and decoding schemes that minimize these problems and specifies the necessary syntax. A decoding scheme that uses multi-view video decoding and scalable video decoding together in this way is defined as HMSVC (Hybrid Multiview and Scalable Video Coding).

Transmitting only the minimum data needed to decode display_view saves further resources. Suppose that, as in FIG. 3, four QCIF streams and one CIF stream are decoded at random access, with MVC operating among the QCIF streams. Suppose the reference relationships are as in the figure: S1 and S2 reference S0, and S3 references S2. Suppose also that CIF is decoded only for the displayed view, display_view. Then, if view 2 is display_view, only the QCIF data of S0 and S2 and the CIF data of view 2 need to be transmitted and decoded. This saves computation, bit rate, decoder buffer memory, and other resources. However, if the user requests a switch from view 2 to view 1, view 1 has not been decoded, so it cannot be decoded until the next anchor (the start of the next GOP).
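The cost of this saving is the wait until the next anchor when the user switches to a view that is not being decoded. The sketch below estimates that wait for the base layer. It assumes anchors fall on frame indices that are multiples of the GOP size (8, borrowed from the FIG. 1 example) and that the reference views of the target are already being decoded; the function names are illustrative.

```python
def frames_until_next_anchor(frame_idx, gop_size=8):
    """Frames to wait from frame_idx until the next anchor frame.

    Assumes anchors sit at multiples of gop_size; returns 0 on an anchor.
    """
    return (-frame_idx) % gop_size


def base_switch_delay(decoded_views, target_view, frame_idx, gop_size=8):
    """Base-layer switching delay in frames.

    A view already being decoded can be shown at once; any other view
    has to wait for the next anchor.
    """
    if target_view in decoded_views:
        return 0
    return frames_until_next_anchor(frame_idx, gop_size)


# FIG. 3 example: view 2 is display_view, so S0 and S2 are decoded.
decoded = {0, 2}
print(base_switch_delay(decoded, 1, frame_idx=3))  # 5 (wait for frame 8)
print(base_switch_delay(decoded, 0, frame_idx=3))  # 0 (already decoded)
```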

Application services can be classified according to the following questions.

1. Is the content encoded and decoded simultaneously in real time? (Examples: video telephony, video conferencing, sports broadcasting)

2. Is the content multicast, so that the server cannot honor individual feedback requests? (Example: sports broadcasting)

3. Is already-encoded content being streamed? (Example: VOD)

Random access delay, picture quality, and resource savings are closely interrelated, and the relationship differs depending on the type of application service.

1. Real-time encoding/decoding

When content is encoded and decoded simultaneously in real time and transmission is 1:1, the service can be implemented with a single AVC or SVC stream. View switching is achieved simply by feeding the encoder input from a different view. Inter-picture reference modes can be applied without restriction even at a switch. MVC is therefore effectively unnecessary in this case.

2. Streaming encoded video

This is the case of streaming an already-encoded multi-view sequence. The following scenario also applies to multicasting with real-time encoding/decoding. All views are decoded at the base layer, and the enhancement layer is transmitted only for the selected view, display_view. On a view switch, the base layer switches immediately, but the enhancement layer can switch only once an intra refresh (Intra or Enhanced Intra) occurs.

If quality must not degrade and view switching delay is acceptable, the previous view continues to be displayed until the enhancement layer intra refresh, as in FIG. 6. During the interval between the view switching request and the actual switch, the enhancement layer data of the previous view must be transmitted.

If quality degradation is acceptable and view switching delay is not, the base layer is switched immediately, as in FIG. 5, and the upsampled base layer is displayed in place of the enhancement layer. During the interval between the view switching request and the actual switch, the enhancement layer data of the previous view need not be transmitted.
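The two switching policies differ only in what is shown between the switch request and the next enhancement layer intra refresh. The sketch below makes that explicit. It assumes frame indices are known for the request and for the refresh, with the refresh not earlier than the request; the function and parameter names are illustrative.

```python
def displayed_source(frame, request_frame, refresh_frame,
                     prev_view, new_view, allow_delay):
    """Return (view, layer) displayed at `frame` around a view switch.

    allow_delay=True  : keep full quality, switch late.
    allow_delay=False : switch at once, show the upsampled base layer
                        until the enhancement layer can be decoded.
    Assumes request_frame <= refresh_frame.
    """
    if frame < request_frame:
        return (prev_view, "enhancement")
    if frame >= refresh_frame:
        return (new_view, "enhancement")
    # Between the request and the enhancement layer intra refresh.
    if allow_delay:
        return (prev_view, "enhancement")
    return (new_view, "upsampled base")


# Switch from view 2 to view 1 requested at frame 10, refresh at frame 16.
print(displayed_source(12, 10, 16, 2, 1, allow_delay=True))   # (2, 'enhancement')
print(displayed_source(12, 10, 16, 2, 1, allow_delay=False))  # (1, 'upsampled base')
```

In the delayed policy the sender keeps transmitting the previous view's enhancement layer during the gap; in the immediate policy it can stop as soon as the request arrives.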

Because frequent intra refresh lowers coding efficiency, it is preferable to insert SI (Switching Intra) or SP (Switching Predictive) pictures at intervals, as in FIG. 7, reducing the view switching delay while largely preserving coding efficiency.

HMSVC decoding process

1. MVC decoding

At the base layer nothing changes from existing MVC. The decoder does, however, need information identifying the view currently being displayed; in one embodiment this is called display_view. If the encoding side or the transmitting end also knows display_view, it can selectively provide the decoder with only the data needed to decode display_view. For example, in FIG. 1, if view 4 is display_view, views 1 and 3 need not be encoded, and if encoded need not be transmitted. If, however, random access at GOP granularity is not acceptable and fully immediate random access is desired, all views must be decoded simultaneously.

Reference picture list management: When the existing MVC inter-view prediction structure is followed, reference picture list management is the same as in the existing method.

2. SVC decoding

Enhancement layer decoding is performed only for the view selected from among the multiple views. One enhancement layer or all of them may be selected. Inter-view referencing among enhancement layers may use a different structure from inter-view referencing at the base layer. In one embodiment, it is preferable not to use inter-view references in the enhancement layer.

View switching: On a view switch, the enhancement layer can be decoded from the nearest intra refresh onward. To reduce the view switching (random access) delay of the enhancement layer, it is preferable to use SP (switching predictive) or SI (switching intra) pictures.

Reference picture list management: A decoded base layer picture of another view can be used as a reference picture for the enhancement layer. This requires syntax describing the reference picture structure.

Syntax for HMSVC

1. NAL header

When only one view uses the SVC enhancement layer at the receiver, fixing display_view lets the receiver recognize an SVC enhancement layer NALU as belonging to display_view automatically, even if no MVC header is transmitted. When SVC NALUs are transmitted for several views, however, both the view and the layer must be identifiable.

■ Concatenated MVC/SVC NAL header

This scheme builds the header by concatenating the MVC and SVC NAL headers. Because the SVC NALU is optional, it is preferable to carry the MVC header by default. In that case, the MVC NAL header must indicate whether the unit is an MVC base layer NALU (NAL unit) or an SVC NALU. In one embodiment, the last bit of the MVC NAL header can be used for this purpose; call it SVCE (SVC Extension). If SVCE=1 the header is an HMSVC NAL header, and if SVCE=0 it is an SVC NAL header.

The H.264/AVC NAL header in the first byte is preferably identical to the first byte of the appended SVC NAL header. The temporal_level(3) in the second byte is that of the base layer of the current SVC NALU, and the temporal_level(3) in the fifth byte is that of the current SVC NALU itself.

■ Restricted SVC NAL header using the MVC NAL header

When the number of SVC layers is kept very small, the SVC information can be carried in view_id(10) or resv(6). This reduces the freedom of the MVC NAL header but also reduces the NAL header size. In HMSVC, one bit suffices to signal SVC if only one kind of SVC layer is allowed, and two bits suffice if at most four kinds are allowed. In one embodiment, view_id(10) can be used as view_id(8)+SVC_id(2). As an example, MVC can be QCIF at 15 Hz, with SVC offering CIF 15 Hz, 4CIF 15 Hz, and 4CIF 30 Hz. Up to 256 views are then possible. One embodiment is as follows; Table 1 shows the scalability information table for the restricted SVC NAL header.

Table 1
Layer     QCIF 15Hz (base)   CIF 15Hz   CIF 30Hz   4CIF 30Hz
SVC_id    00                 01         10         11

Even with SVC_id(1), the enhancement layer can be configured in many ways. In one embodiment, SVC_id=0 denotes QCIF 15 Hz and SVC_id=1 denotes 4CIF 30 Hz. Up to 512 views are then possible. If view_id(10) cannot be used, SVC_id can be placed in resv(6); that is, one or two bits of resv(6) can be used to identify the SVC layer.
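Splitting the 10-bit view_id field into a view number and an SVC_id can be sketched as simple bit packing. The text leaves open whether SVC_id takes the upper or the lower bits, so the sketch below assumes the lower bits; `pack_view_field`, `unpack_view_field`, and `SVC_LAYERS` are illustrative names, and the layer mapping follows Table 1.

```python
# Table 1 mapping for the 2-bit SVC_id.
SVC_LAYERS = {
    0b00: "QCIF 15Hz (base)",
    0b01: "CIF 15Hz",
    0b10: "CIF 30Hz",
    0b11: "4CIF 30Hz",
}


def pack_view_field(view, svc_id, svc_bits=2, total_bits=10):
    """Pack view number and SVC_id into the view_id field.

    Assumption: SVC_id occupies the low `svc_bits` bits.
    """
    view_bits = total_bits - svc_bits
    if not 0 <= view < (1 << view_bits):
        raise ValueError("view number does not fit in the remaining bits")
    if not 0 <= svc_id < (1 << svc_bits):
        raise ValueError("svc_id does not fit in svc_bits")
    return (view << svc_bits) | svc_id


def unpack_view_field(field, svc_bits=2):
    """Split the view_id field back into (view, svc_id)."""
    return field >> svc_bits, field & ((1 << svc_bits) - 1)


field = pack_view_field(view=5, svc_id=0b10)
view, svc_id = unpack_view_field(field)
print(view, SVC_LAYERS[svc_id])  # 5 CIF 30Hz
```

With `svc_bits=2` the view number has 8 bits (256 views); with `svc_bits=1` it has 9 bits (512 views), matching the limits stated in the text.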

The permitted SVC layers can be configured through an SPS (Sequence Parameter Set) or PPS (Picture Parameter Set) message. If the choice between a 1-bit and a 2-bit SVC_id is optional, that choice must also be conveyed through the SPS or PPS message.

● Other methods

■ Using the slice header: impractical because the overhead is too large

■ Assigning a UDP port per view

When an HMSVC streaming service is implemented in a client-server architecture, the MVC bitstream for the base layer is sent on a single UDP port, and the SVC bitstreams for the enhancement layer are sent on UDP ports allocated per view. The base layer bitstreams are then decoded with the existing MVC decoding method, and for the enhancement layer only, the view is identified by the port on which the data arrives. No syntax change to the MVC or SVC NAL header is needed.
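A per-view port plan of this kind can be sketched as a simple mapping. The text does not give concrete port numbers, so the base port 5000 and the rule "enhancement port = base port + 1 + view" are hypothetical choices made only for illustration.

```python
BASE_PORT = 5000  # hypothetical port carrying the MVC base-layer bitstream


def enhancement_port(view, base_port=BASE_PORT):
    """UDP port carrying the SVC enhancement layer of `view` (assumed plan)."""
    return base_port + 1 + view


def view_for_port(port, base_port=BASE_PORT):
    """Map a receiving port back to a view.

    Returns None for the base-layer port, where views are identified
    by the ordinary MVC NAL header instead.
    """
    if port < base_port:
        raise ValueError("port is outside the assumed port plan")
    if port == base_port:
        return None
    return port - base_port - 1


print(enhancement_port(2))  # 5003
print(view_for_port(5003))  # 2
print(view_for_port(5000))  # None
```

A receiver following this plan would listen on the base port plus the enhancement port of its display_view only.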

■ Using labels

Every NAL header is 2 bytes: the first byte is the same as in H.264/AVC and the second byte is a label. A list is made of every kind of NAL header that must be distinguished within a sequence, and each is assigned a label. The labels and their corresponding NAL headers are conveyed through the SPS message. A separate NAL header is also needed for SI/SP. No label is needed for SPS or PPS units, which can be identified from the H.264/AVC NAL header alone.

With a 1-byte label, 256 different NAL headers can be defined; if the first 7 bits are used as the label and the last bit indicates whether the label is extended, an unlimited number of NAL headers can be identified. As in the following table, if a view has three SVC layers including the base layer, each with three temporal levels, nine labels are needed per view. With eight views, 72 labels are needed in total. If SP or SI pictures are used and are to be distinguished by label, nine more labels are needed per view; if they are distinguished in the slice header, no extra labels are needed. Table 2 shows the scalability information table for the HMSVC NAL header using labels.

Table 2
Label   Temporal_ID(3)   Anchor(1)   View_id(8)   SVC(2)
0       0                1           0            00
1       1                0           0            00
2       2                0           0            00
3       0                1           0            01
4       1                0           0            01
5       2                0           0            01
6       0                1           0            10
7       1                0           0            10
8       2                0           0            10
9       0                1           1            00
10      1                0           1            00
11      2                0           1            00
12      0                1           1            01
13      1                0           1            01
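The label assignment in Table 2 follows a regular pattern, which the sketch below reproduces. It assumes that the pattern in the listed rows continues for the remaining views (labels numbered view by view, then SVC layer, then temporal level) and that the anchor flag is set exactly when the temporal ID is 0, as in every row shown; `build_label_table` is an illustrative name.

```python
from itertools import product


def build_label_table(num_views=8, num_svc_layers=3, num_temporal=3):
    """Enumerate labels in the order used by Table 2.

    Assumptions: labels run view by view, then SVC layer, then temporal
    level, and anchor == 1 exactly when temporal_id == 0.
    """
    table = {}
    combos = product(range(num_views), range(num_svc_layers), range(num_temporal))
    for label, (view, svc, temporal) in enumerate(combos):
        table[label] = {
            "temporal_id": temporal,
            "anchor": int(temporal == 0),
            "view_id": view,
            "svc": format(svc, "02b"),  # 2-bit SVC field as a bit string
        }
    return table


table = build_label_table()
print(len(table))  # 72
print(table[13])   # {'temporal_id': 1, 'anchor': 0, 'view_id': 1, 'svc': '01'}
```

Eight views with three SVC layers and three temporal levels give the 72 labels stated in the text, which fits comfortably in a 1-byte label.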

The anchor(1) field is unnecessary if anchor status is redundant with temporal_id. Likewise, if view_level(3) can be inferred from view_id, those three bits are unnecessary. And if temporal_id(3) and the SVC type overlap in content, the number of labels needed per view decreases further. However, the server, MANE, and decoder then need a small amount of computation per NALU to recover the redundant information when making filtering decisions. The MANE needs to be told only the list of labels to filter for each user or each access network; it does not need to be told which layer or view each label belongs to.
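Label-based filtering at a MANE then reduces to a set lookup on the second header byte. The sketch below assumes the 2-byte header of the label scheme (first byte as in H.264/AVC, second byte the label) and lets SPS and PPS units pass unfiltered, since they carry no label. The first-byte value 0x14 used for labelled units is a placeholder, not a type code defined by the patent.

```python
SPS_TYPE, PPS_TYPE = 7, 8  # H.264/AVC nal_unit_type values for SPS and PPS


def mane_filter(nal_units, drop_labels):
    """Return the NAL units a MANE forwards, given the labels to drop.

    Each unit is a bytes object: byte 0 is the H.264/AVC NAL header,
    byte 1 is the label. SPS/PPS units have no label and always pass.
    """
    kept = []
    for nalu in nal_units:
        nal_type = nalu[0] & 0x1F  # low 5 bits hold nal_unit_type
        if nal_type in (SPS_TYPE, PPS_TYPE) or nalu[1] not in drop_labels:
            kept.append(nalu)
    return kept


stream = [
    bytes([0x67, 0x42]),     # SPS (type 7), no label
    bytes([0x14, 0, 0xAA]),  # label 0 (0x14 is a placeholder header byte)
    bytes([0x14, 3, 0xBB]),  # label 3
    bytes([0x14, 9, 0xCC]),  # label 9
]
forwarded = mane_filter(stream, drop_labels={3, 9})
print(len(forwarded))  # 2
```

The MANE never interprets what label 3 or label 9 means; it only needs the drop list for each user or access network.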

2. SPS (Sequence Parameter Set)

The SPS carries information that the encoder and decoder must hold in common. In the present invention, the existing MVC and SVC SPSs are needed at both encoder and decoder. In addition, the SPS must be modified when the apparatus is implemented with the restricted SVC NAL header using the MVC NAL header or with the label-based HMSVC NAL header.

With the concatenated MVC/SVC structure, the HMSVC NAL header carries the scalability information of the slice, so both the MVC SPS and the SVC SPS are needed to support MVC and SVC together; each follows its existing format.

■ Restricted SVC NAL header using the MVC NAL header

For the restricted SVC NAL header using the MVC NAL header, the SPS must specify how many of the upper (or lower) bits of the 10-bit view_id in the MVC NAL header are used to express scalability. The meaning of the scalability information taken from view_id must also be agreed between encoder and decoder. The SPS syntax therefore needs to change.

3. PPS (Picture Parameter Set): no change

Services to which the invention applies: The invention is useful in multicast services, whether or not feedback requests can be delivered. A feedback request is a message in which the user (client) indicates which view they currently want to watch. If there are only a few users, resources can be saved by surveying which views are being watched and not transmitting views that nobody is watching. Where the bandwidth available to each user is known (for example, IEEE 802.16e), it is desirable to decide per user whether to transmit the enhancement layer. These decisions are preferably made at a MANE (Media Aware Network Element) close to the user group, or at the server.

Service quality may also be determined by the user's terminal or by the user's choice. If charges are per received packet, whether the user can choose the quality becomes very important. When there are so many users that individual control is difficult, the invention can be used to save terminal resources rather than to reduce transmission bandwidth.

Consider a service with four views in which views 1, 2, and 3 reference view 0. Terminals (1) and (2) are on a network where all data of all views is broadcast; each selects the streams it needs, decodes them, and displays the result. Terminal (1), which has selected view 1, has a small display and insufficient CPU performance, so it decodes only the base layers of view 0 and view 1; view 0 is needed as the reference view for view 1. Terminal (2), which has selected view 2, has a large display and sufficient CPU performance, so it also receives and decodes the SVC enhancement layer. Terminal (3) has selected view 1 over a 1:1 VOD service, so the MANE filters out unneeded streams based on feedback from terminal (3) before forwarding.
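The stream selection in this scenario can be sketched as follows. The reference map comes from the scenario (views 1, 2, and 3 reference view 0); the function name, the stream tuples, and the assumption that terminal (3) asks for the enhancement layer are illustrative only, since the text does not state which layers terminal (3) wants.

```python
# Views 1, 2, and 3 reference view 0, as in the scenario.
REFS = {0: [], 1: [0], 2: [0], 3: [0]}


def streams_for(view, wants_enhancement):
    """Streams a terminal needs: base layers of the view and its
    reference views, plus the view's enhancement layer if requested."""
    base_views = sorted({view, *REFS[view]})
    streams = [(v, "base") for v in base_views]
    if wants_enhancement:
        streams.append((view, "enhancement"))
    return streams


# Terminals (1) and (2) pick their streams out of the broadcast.
print(streams_for(1, wants_enhancement=False))  # [(0, 'base'), (1, 'base')]
print(streams_for(2, wants_enhancement=True))   # [(0, 'base'), (2, 'base'), (2, 'enhancement')]

# For terminal (3), the MANE forwards only what the feedback asks for.
# Assumption: terminal (3) requested view 1 with its enhancement layer.
all_streams = [(v, layer) for v in range(4) for layer in ("base", "enhancement")]
wanted = streams_for(1, wants_enhancement=True)
forwarded = [s for s in all_streams if s in wanted]
print(len(all_streams) - len(forwarded))  # 5 streams filtered out
```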

The syntax of the NAL header and the SPS must be defined. Here, two kinds of NAL header are specified: the concatenated type and the restricted SVC type.

■ NAL header structure of the HMSVC packet

1. Concatenated MVC/SVC NAL header

Tables 3a and 3b show the concatenated NAL header; Table 3b is the continuation of Table 3a.

2. SPS and MVC NAL header structure for the restricted SVC NAL header

The upper (or lower) bits of view_id (u(10)) in the MVC NAL header are used to carry restricted layering information. The number of view_id bits used for this purpose is set through the SPS.

Table 4 shows the SPS message processing for the restricted SVC NAL header.

Tables 5a, 5b, and 5c show seq_parameter_set_mvc_extension() being executed according to the profile_idc value (XX) in the SPS processing routine.

Tables 6a and 6b show the NAL MVC SPS processing routine.

Table 7 shows the nal_unit_header_svc_mvc_extension processing routine.

3. HMSVC NAL unit processing using labels

A. Read the SPS and configure the decoder

i. Move from the NAL header to the SPS processing routine (see Table 8)

ii. In the SPS processing routine, execute seq_parameter_set_mvc_label_extension() according to the profile_idc value (XXX) (see Tables 9a, 9b, and 9c)

iii. Set the labels through seq_parameter_set_mvc_label_extension() (see Table 10)

B. Definition of the Hybrid MVC SVC NAL unit type code (see Tables 11a and 11b)

■ Semantics

● Concatenated MVC SVC NAL header

■ nal_unit_header_svc_mvc_extension( )

Hybrid_mvc_svc_extension_flag: A flag bit. When this bit is set to 1 (or 0), the 3 bytes that follow are recognized and processed as an SVC header.

● Restricted SVC NAL header

■ seq_parameter_set_mvc_extension( )

restricted_bits: Indicates how many of the upper (or lower) bits of the 10 bits of view_id are used to determine the layering information when an HMSVC NAL header is received.

dependency_id: Indicates the degree of spatial scalability that the HMSVC encoder/decoder can support. Using this value, the encoder/decoder builds a layering information table and uses it to determine the layering information when an HMSVC NAL header is received.

temporal_level: Indicates the degree of temporal scalability that the HMSVC encoder/decoder can support. Using this value, the encoder/decoder builds a layering information table and uses it to determine the layering information when an HMSVC NAL header is received.

quality_level: Indicates the degree of quality scalability that the HMSVC encoder/decoder can support. Using this value, the encoder/decoder builds a layering information table and uses it to determine the layering information when an HMSVC NAL header is received.

view_id: Using restricted_bits from seq_parameter_set_mvc_extension(), an svc_id representing the layering information is extracted from the upper (or lower) bits of view_id. The extracted svc_id is mapped to the corresponding layering information through the layering information table.
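The svc_id extraction described for the restricted SVC NAL header can be sketched as below. The text specifies only that restricted_bits of the 10-bit view_id carry layering information. The function names, the choice to treat the remaining bits as the view number, and the enumeration order of the layering information table are illustrative assumptions, not the layout defined in the tables of this document.

```python
VIEW_ID_BITS = 10  # view_id is u(10) in the MVC NAL header


def split_view_id(view_id, restricted_bits, use_upper_bits=True):
    """Split a 10-bit view_id into (svc_id, view_number)."""
    if not 0 <= view_id < (1 << VIEW_ID_BITS):
        raise ValueError("view_id must fit in 10 bits")
    if not 0 <= restricted_bits <= VIEW_ID_BITS:
        raise ValueError("restricted_bits must be between 0 and 10")
    rest_bits = VIEW_ID_BITS - restricted_bits
    if use_upper_bits:
        svc_id = view_id >> rest_bits
        view_number = view_id & ((1 << rest_bits) - 1)
    else:
        svc_id = view_id & ((1 << restricted_bits) - 1)
        view_number = view_id >> restricted_bits
    return svc_id, view_number


def build_layer_table(dependency_id, temporal_level, quality_level):
    """Hypothetical table: enumerate (spatial, temporal, quality) triples in a
    fixed order so that encoder and decoder map svc_id to the same triple."""
    return [
        (d, t, q)
        for d in range(dependency_id + 1)
        for t in range(temporal_level + 1)
        for q in range(quality_level + 1)
    ]


table = build_layer_table(dependency_id=1, temporal_level=1, quality_level=1)
svc_id, view_number = split_view_id(0b1010000010, restricted_bits=3)
layer_info = table[svc_id]
```

Here the upper 3 bits (101) give svc_id 5 and the lower 7 bits give view number 2. With the assumed enumeration order, svc_id 5 maps to spatial level 1, temporal level 0, quality level 1.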

● HMSVC NAL header using labels

■ seq_parameter_set_mvc_label_extension( )

seq_parameter_set_mvc_label_extension() defines the kinds of layering that the encoder/decoder can support. Using it, the encoder/decoder builds and maintains a layering table (label table).

num_views_minus_1: Represents the total number of views.

temporal_level: Indicates the degree of temporal scalability that the HMSVC encoder/decoder can support. Using this value, the encoder/decoder builds a layering information table and uses it to determine the layering information when an HMSVC NAL header is received.

■ nal_unit_header_HMSVC_label_extension( )

The routine that processes an HMSVC_label NAL header when one is received, according to the new nal_type definition (coded slice of a Hybrid-MVC-SVC picture).

label_number: Indicates the label number in the layering label table that seq_parameter_set_mvc_label_extension() has set up in the encoder/decoder.

next_label_bit_flag: A flag bit used for extension when the label number cannot be expressed in 7 bits. When its value is set to 1 (or 0), the 1 byte that follows is also processed as nal_unit_header_HMSVC_label_extension().
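The extension mechanism of next_label_bit_flag can be illustrated with a small Python sketch. The text states only that a 7-bit label number can be extended by one more header byte when the flag is set. The bit positions (flag in the most significant bit, label bits in the lower 7 bits), the convention that 1 means "continue", and the way the 7-bit groups are concatenated are assumptions made for illustration.

```python
def encode_label_number(label):
    """Encode a label number as 7-bit groups, most significant group first."""
    if label < 0:
        raise ValueError("label number must be non-negative")
    groups = [label & 0x7F]
    label >>= 7
    while label:
        groups.append(label & 0x7F)
        label >>= 7
    groups.reverse()
    # Set the (assumed) next_label_bit_flag on every byte except the last.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def parse_label_number(data, offset=0):
    """Return (label_number, bytes_consumed) starting at data[offset]."""
    label = 0
    consumed = 0
    while True:
        if offset + consumed >= len(data):
            raise ValueError("truncated label header")
        byte = data[offset + consumed]
        consumed += 1
        label = (label << 7) | (byte & 0x7F)
        if not byte & 0x80:  # flag cleared: this was the last header byte
            return label, consumed


short_header = encode_label_number(5)
long_header = encode_label_number(300)
```

A label number below 128 fits in a single byte, while 300 needs one extension byte; parsing either header returns the original label number together with the number of bytes consumed.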

4. Resource-efficient multi-view streaming apparatus using HMSVC

FIG. 10 shows a block diagram of a resource-efficient multi-view streaming apparatus that uses an HMSVC encoder, an HMSVC decoder, and a MANE. For multi-view video, one or more cameras serve as input. The input for the base layer is downsampled, while the input for the enhancement layer is the input video as is, and each is fed to its encoder. The MVC encoder for the base layer uses the prediction structure and other features of an existing MVC encoder unchanged. The high-quality enhancement layer is encoded with reference to the corresponding base layer. The resulting enhancement layer bitstream uses either the Concatenated MVC/SVC NAL header of the present invention or the Restricted SVC NAL header that relies on the SPS and MVC NAL header. The bitstreams generated by the encoders are combined by a multiplexer into one or more bitstreams and sent to the MANE. The MANE forwards the entire base layer, and selectively forwards the enhancement layer of the view that the user wants to watch, according to the user's view change command.

The bitstream arriving at the decoder is split by a demultiplexer into a base layer bitstream and an enhancement layer bitstream, each of which is input to the corresponding decoder and decoded. The enhancement layer bitstream corresponds to the view currently to be watched in high quality, and it is decoded with reference to the same view of the base layer (S2 in the figure). If the user requests a view change while watching, the view change command is delivered to the MANE, and the filter transmits the enhancement layer bitstream of the requested view. The switch in the decoder changes the reference direction of the enhancement layer when the view changes. FIG. 11 shows an example of a streaming system using HMSVC.
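The filtering role of the MANE in this apparatus can be sketched as follows. The `Packet` and `ManeFilter` names are hypothetical, and the sketch deliberately ignores timing: it switches the enhancement layer immediately on a view change command and does not model waiting for an intra refresh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    view: int
    is_enhancement: bool
    payload: bytes = b""


class ManeFilter:
    """Forward every base layer packet, but only the enhancement layer
    packets of the view the user currently wants to watch."""

    def __init__(self, selected_view):
        self.selected_view = selected_view

    def on_view_change(self, new_view):
        # Called when a view change command arrives from the terminal.
        self.selected_view = new_view

    def forward(self, packets):
        return [
            p for p in packets
            if not p.is_enhancement or p.view == self.selected_view
        ]


# Three views, each with one base layer and one enhancement layer packet.
incoming = [Packet(v, enh) for v in range(3) for enh in (False, True)]
mane = ManeFilter(selected_view=2)
before_switch = mane.forward(incoming)
mane.on_view_change(0)
after_switch = mane.forward(incoming)
```

Before the switch, the terminal receives all three base layer packets plus the enhancement layer of view 2; after the view change command, the enhancement layer of view 0 is forwarded instead.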

The scope of application of MVC can be greatly expanded. Conventionally, all views must be received and decoded, which demands large amounts of bandwidth, CPU performance, and memory. With the present invention, the transmitted streams can be selectively adjusted according to the view to be displayed and the required quality. The scope of application widens in the following respects.

1. The user can select the view and quality to display.

2. The range of usable transmission bandwidth is widened (that is, service becomes possible even when the available bandwidth is small).

3. The range of usable terminals is widened (the invention can be applied to terminals with lower CPU performance and smaller displays).

Claims

A video decoding method comprising: receiving a bitstream including information about a plurality of views; and

decoding the bitstream based on the information about the plurality of views,

wherein the information about the plurality of views includes at least one of identification information for the plurality of views and layering information for the plurality of views, and

wherein the bitstream includes an extension flag for indicating the layering information for the plurality of views and further includes the layering information for the plurality of views according to a value of the extension flag.

The method according to claim 1,

wherein, in the decoding of the bitstream,

an enhancement layer image for a display view is decoded based on information about the display view, the display view being the view displayed among the plurality of views, and

the information about the display view includes

identification information for identifying the display view and layering information for the display view.

The method according to claim 1,

wherein the information about the plurality of views

is included in a NAL (Network Abstraction Layer) unit header.

(Deleted)

The method according to claim 1,

wherein the information about the plurality of views is derived from predetermined bits contained in a NAL unit header in the bitstream, and

the upper or lower bits of the predetermined bits indicate identification information for identifying the plurality of views, and the remaining bits of the predetermined bits indicate layering information for the plurality of views.

The method of claim 5,

wherein the number of the upper or lower bits of the predetermined bits is set using an SPS (Sequence Parameter Set) or a PPS (Picture Parameter Set).

The method according to claim 1,

wherein the bitstream includes label information indicating the information about the plurality of views, and

the label information is a table predetermined between an encoder and a decoder, and is composed of identification information for the plurality of views and a label specifying layering information for the plurality of views.

The method according to claim 1,

wherein, when the received bitstream includes information indicating a view switching point at which the view to be decoded switches from a first view to a second view,

the enhancement layer image for the second view is decoded after the intra refresh closest to the point indicated by the view switching point.

The method of claim 8,

wherein, when view switching latency is allowed,

the enhancement layer image for the first view is decoded before the intra refresh, and

the enhancement layer image for the second view is decoded after the intra refresh.

The method of claim 8,

wherein, when view switching latency is not allowed,

the base layer image for the second view is up-sampled before the intra refresh,

The method of claim 2,

wherein the enhancement layer image for the display view

is decoded using at least one of the decoded base layer images for the plurality of views as a reference image.

The method of claim 2, further comprising:

receiving a bitstream including base layer images for the plurality of views on a first UDP (User Datagram Protocol) port; and

receiving a bitstream including the enhancement layer image for the display view on a second UDP port, wherein, when there are a plurality of display views, a different UDP port is used for each of the display views.

A video encoding method comprising: specifying information about a plurality of views; and

encoding the information about the plurality of views,

further comprising encoding extension flag information for indicating layering information for the plurality of views, wherein the encoding of the extension flag information includes encoding the layering information.

The method of claim 13,

wherein the encoding of the information about the plurality of views comprises:

determining a display view to be displayed among the plurality of views; and

encoding information about the display view,

The information on the display view may include,

The method of claim 13,

wherein the information about the plurality of views

is included in a NAL (Network Abstraction Layer) unit header of the video signal.

(Deleted)

The method of claim 13,

wherein, in the specifying of the information about the plurality of views,

the information about the plurality of views is specified using predetermined bits in a NAL unit header, and

the upper or lower bits of the predetermined bits specify identification information for identifying the plurality of views, and the remaining bits of the predetermined bits specify layering information for the plurality of views.

The method of claim 17,

wherein the number of the upper or lower bits of the predetermined bits is set using a sequence parameter set (SPS) or a picture parameter set (PPS).

The method of claim 13,

wherein, in the specifying of the information about the plurality of views,

the information about the plurality of views is specified based on label information, and

the label information is a table predetermined between an encoder and a decoder, and is composed of identification information for the plurality of views and a label specifying layering information for the plurality of views.

The method of claim 13,

further comprising encoding a bitstream that includes information indicating a view switching point at which the view switches from a first view to a second view,

wherein the information indicating the view switching point includes information about when an enhancement layer image for the second view should be decoded to enable switching from the first view to the second view.

The method of claim 20,

wherein the information indicating the view switching point

instructs that the enhancement layer image for the second view be decoded after the intra refresh closest to the point indicated by the view switching point.

The method of claim 21,

wherein, when view switching latency is allowed, the information indicating the view switching point

instructs that the enhancement layer image for the first view be decoded before the intra refresh and that the enhancement layer image for the second view be decoded after the intra refresh.

The method of claim 21,

wherein, when view switching latency is not allowed, the information indicating the view switching point

instructs that the base layer image for the second view be decoded and up-sampled before the intra refresh and that the enhancement layer image for the second view be decoded after the intra refresh.

The method of claim 14,

further comprising encoding an enhancement layer image for the display view based on the information about the display view,

wherein the enhancement layer image for the display view

is encoded using at least one of the encoded base layer images for the plurality of views as a reference image.

The method of claim 24,

wherein the base layer images for the plurality of views are transmitted to a first UDP (User Datagram Protocol) port, and

the enhancement layer image for the display view is transmitted to a second UDP port, wherein, when there are a plurality of display views, the display views are transmitted using different UDP ports.