KR102312285B1

KR102312285B1 - A method for selectively decoding a synchronized multi-view video by using spatial layout information

Info

Publication number: KR102312285B1
Application number: KR1020190101566A
Authority: KR
Inventors: 임정윤; 임화섭
Original assignee: 가온미디어 주식회사
Priority date: 2016-09-08
Filing date: 2019-08-20
Publication date: 2021-10-13
Also published as: KR20190101930A

Abstract

An image decoding method according to an embodiment of the present invention includes: receiving a bitstream including an encoded image; obtaining spatial structure information corresponding to a synchronized multi-view image; and selectively decoding at least a portion of the bitstream based on the spatial structure information.

Description

BACKGROUND OF THE INVENTION Field of the Invention

The present invention relates to a method and apparatus for encoding and decoding an image. More specifically, the present invention relates to a method for selectively decoding a synchronized multi-view image using spatial structure information, a corresponding encoding method, and an apparatus therefor.

With the recent development of digital image processing and computer graphics technology, research on virtual reality (VR) technology, which reproduces the real world and lets users experience it realistically, is being actively conducted.

In particular, recent VR systems such as head mounted displays (HMDs) are receiving much attention because they can not only provide a three-dimensional stereoscopic image to both eyes of a user but also track the user's viewpoint in all directions, and can therefore provide realistic VR video content that can be viewed with 360-degree rotation.

However, since 360 VR content consists of simultaneous omnidirectional multi-view image information in which temporal and binocular images are spatially and compositely synchronized, producing and transmitting it requires encoding, compressing, and delivering two large images synchronized for the binocular space of every viewpoint. This increases complexity and the bandwidth burden. In particular, the decoding apparatus also decodes regions that lie outside the user's viewpoint and are never actually viewed, so processing is wasted.

Accordingly, there is a need for an encoding method that reduces the amount of transmitted image data and the complexity, and that is also efficient in terms of bandwidth and the battery consumption of the decoding apparatus.

The present invention is intended to solve the above problem, and its object is to provide a method and apparatus for efficiently encoding and decoding a synchronized multi-view image, such as an image from a 360-degree camera or an image for VR, by using spatial structure information of the synchronized multi-view image.

As a technical means for achieving the above technical object, an image decoding method according to an embodiment of the present invention includes: receiving a bitstream including an encoded image; obtaining spatial structure information corresponding to a synchronized multi-view image; and selectively decoding at least a portion of the bitstream based on the spatial structure information.

In addition, as a technical means for achieving the above technical object, an image decoding apparatus according to an embodiment of the present invention includes a decoding processing unit that obtains spatial structure information corresponding to a synchronized multi-view image from a bitstream including an encoded image, and selectively decodes at least a portion of the bitstream based on the spatial structure information.

Further, as a technical means for achieving the above technical object, an image encoding method according to an embodiment of the present invention includes: obtaining a synchronized multi-view image; generating spatial structure information of the synchronized multi-view image; encoding the synchronized multi-view image; and transmitting a bitstream including the encoded multi-view image and the spatial structure information to a decoding system, wherein the decoding system selectively decodes at least a portion of the bitstream based on the spatial structure information.

Meanwhile, the video processing method may be implemented as a computer-readable recording medium on which a program to be executed by a computer is recorded.

According to an embodiment of the present invention, spatial structure information optimized for encoding and transmission is extracted from a synchronized multi-view image and signaled, so that the amount of transmitted image data, the bandwidth, and the complexity can be reduced efficiently.

In addition, when a synchronized multi-view image is received, the decoding end can perform optimized partial, selective decoding for each viewpoint according to the signaled information. This reduces wasted system resources, so that an encoding/decoding method and apparatus that are also efficient in terms of complexity and battery consumption can be provided.

Further, according to an embodiment of the present invention, spatial structure information can be supported for synchronized images of various types, which enables appropriate image reproduction according to the specifications of the decoding apparatus and thereby improves device compatibility.

FIG. 1 shows the overall system structure according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the configuration of a time-synchronized multi-view image encoding apparatus according to an embodiment of the present invention.
FIGS. 3 to 6 are diagrams illustrating an example of the spatial structure of a synchronized multi-view image according to an embodiment of the present invention.
FIGS. 7 to 9 are diagrams for explaining methods of signaling spatial structure information according to various embodiments of the present invention.
FIG. 10 is a diagram for explaining the configuration of spatial structure information according to an embodiment of the present invention.
FIGS. 11 to 12 are diagrams for explaining a type index table of spatial structure information according to an embodiment of the present invention.
FIG. 13 is a diagram for explaining a viewpoint information table of spatial structure information according to an embodiment of the present invention.
FIG. 14 is a flowchart illustrating a decoding method according to an embodiment of the present invention.
FIGS. 15 and 18 are diagrams illustrating how the scanning order at the decoding end is determined according to the signaling of spatial structure information according to an embodiment of the present invention.
FIG. 17 is a diagram for explaining independent sub-images and dependent sub-images that are distinguished according to the signaling of spatial structure information.
FIGS. 18 to 19 show that a boundary region between sub-images is decoded with reference to an independent sub-image according to the spatial structure information.
FIGS. 20 to 24 are diagrams illustrating a decoding system and its operation according to an embodiment of the present invention.
FIGS. 25 to 26 are diagrams for explaining encoding and decoding processing according to an embodiment of the present invention.

Hereinafter, embodiments of the present application are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present application pertains can easily implement them. However, the present application may be implemented in several different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present application clearly, and similar reference numerals are attached to similar parts throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element interposed therebetween.

Throughout this specification, when a member is said to be located on another member, this includes not only the case where the member is in contact with the other member but also the case where a further member exists between the two members.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. The terms of degree "about", "substantially", and the like, as used throughout this specification, mean at or near a stated value when manufacturing and material tolerances inherent to the stated meaning are presented, and are used to prevent an unscrupulous infringer from unfairly exploiting a disclosure in which exact or absolute values are stated to aid understanding of the present application. The term "step of (doing)" or "step of", as used throughout this specification, does not mean "step for".

Throughout this specification, the term "a combination thereof" included in a Markush-type expression means a mixture or combination of one or more selected from the group consisting of the components recited in the Markush-type expression, and means including one or more selected from the group consisting of those components.

In an embodiment of the present invention, as one example of a method for encoding a synchronized image, encoding may be performed using HEVC (High Efficiency Video Coding), which was jointly standardized by MPEG (Moving Picture Experts Group) and VCEG (Video Coding Experts Group) and has the highest coding efficiency among the video coding standards developed so far. However, the present invention is not limited thereto.

In general, an encoding apparatus includes an encoding process and a decoding process, and a decoding apparatus includes a decoding process. The decoding process of the decoding apparatus is the same as the decoding process of the encoding apparatus. Therefore, the description below focuses mainly on the encoding apparatus.

FIG. 1 shows the overall system structure according to an embodiment of the present invention.

Referring to FIG. 1, the overall system according to an embodiment of the present invention includes a pre-processing apparatus 10, an encoding apparatus 100, a decoding apparatus 200, and a post-processing apparatus 20.

A system according to an embodiment of the present invention may include: a pre-processing apparatus 10 that pre-processes a plurality of per-viewpoint images through operations such as merging or stitching to obtain a synchronized video frame; an encoding apparatus 100 that encodes the synchronized video frame and outputs a bitstream; a decoding apparatus 200 that receives the bitstream and decodes the synchronized video frame; and a post-processing apparatus 20 that post-processes the video frame so that the synchronized image for each viewpoint is output to its respective display.

Here, the input image may include an individual image for each of the multiple viewpoints, for example sub-image information of various viewpoints captured by one or more cameras in a time- and space-synchronized state. Accordingly, the pre-processing apparatus 10 may obtain synchronized image information by spatially merging or stitching the acquired multi-view sub-image information over time.

The encoding apparatus 100 then scans and predictively encodes the synchronized image information to generate a bitstream, and the generated bitstream may be transmitted to the decoding apparatus 200. In particular, the encoding apparatus 100 according to an embodiment of the present invention may extract spatial structure information from the synchronized image information and signal it to the decoding apparatus 200.

Here, since one or more sub-images from the pre-processing apparatus 10 are merged to form one video frame, the spatial structure information (spatial layout information) may include basic information on the attributes and arrangement of each sub-image. It may further include additional information on each sub-image and on the relationships between sub-images, which is described later.

Accordingly, the spatial structure information according to an embodiment of the present invention may be delivered to the decoding apparatus 200. The decoding apparatus 200 may then determine the decoding targets and the decoding order of the bitstream with reference to the spatial structure information and the user viewpoint information, which enables efficient decoding.
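As a rough illustration of how decoding targets and a decoding order could be derived from spatial structure information and user viewpoint information, the following minimal Python sketch may help. It is not the method of the specification: the `SubImage` fields, the yaw-only viewpoint model, and the rule of decoding the sub-image closest to the viewing direction first are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubImage:
    index: int         # position of the sub-image in the signaled layout
    yaw_center: float  # assumed field: viewing direction of the sub-image, in degrees
    yaw_span: float    # assumed field: horizontal coverage of the sub-image, in degrees


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def select_decoding_targets(layout, user_yaw: float, fov: float):
    """Return the indices of sub-images overlapping the user's field of view.

    The list is ordered so that the sub-image closest to the viewing
    direction comes first (an assumed priority rule).
    """
    visible = [
        s for s in layout
        if angular_distance(s.yaw_center, user_yaw) < (s.yaw_span + fov) / 2.0
    ]
    visible.sort(key=lambda s: (angular_distance(s.yaw_center, user_yaw), s.index))
    return [s.index for s in visible]


# Four sub-images, each covering 90 degrees around the viewer.
layout = [SubImage(i, i * 90.0, 90.0) for i in range(4)]
print(select_decoding_targets(layout, user_yaw=80.0, fov=90.0))
```

With this layout, a user looking toward 80 degrees with a 90-degree field of view needs only two of the four sub-images, so the other two can be skipped by the decoder.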

The decoded video frame is then separated again by the post-processing apparatus 20 into sub-images for each display and provided to a plurality of synchronized display systems such as an HMD. The user can thus be provided with a synchronized multi-view image that is as realistic as virtual reality.
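The merge step of the pre-processing apparatus and the separation step of the post-processing apparatus can be pictured with a toy example. The sketch below assumes sub-images are plain lists of pixel rows placed side by side in a single row; the function names and the use of a list of widths as the layout record are hypothetical, and real stitching involves blending and geometric alignment that is not shown.

```python
def merge_side_by_side(sub_images):
    """Merge sub-images of equal height into one frame, left to right.

    Returns the frame and the list of sub-image widths, which plays the
    role of a very small layout record here.
    """
    height = len(sub_images[0])
    if any(len(img) != height for img in sub_images):
        raise ValueError("all sub-images must have the same height")
    frame = [
        [pixel for img in sub_images for pixel in img[row]]
        for row in range(height)
    ]
    widths = [len(img[0]) for img in sub_images]
    return frame, widths


def split_by_layout(frame, widths):
    """Separate a decoded frame back into per-display sub-images."""
    sub_images = []
    x = 0
    for w in widths:
        sub_images.append([row[x:x + w] for row in frame])
        x += w
    return sub_images


left = [[1, 2], [3, 4]]
right = [[5, 6], [7, 8]]
frame, widths = merge_side_by_side([left, right])
print(frame)                           # merged frame
print(split_by_layout(frame, widths))  # sub-images recovered for each display
```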

FIG. 2 is a block diagram illustrating the configuration of a time-synchronized multi-view image encoding apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the encoding apparatus 100 according to an embodiment of the present invention includes a synchronized multi-view image acquisition unit 110, a spatial structure information generation unit 120, a spatial structure information signaling unit 130, an image encoding unit 140, and a transmission processing unit 150.

The synchronized multi-view image acquisition unit 110 acquires a synchronized multi-view image using a synchronized multi-view image acquisition means such as a 360-degree camera. The synchronized multi-view image may include a plurality of sub-images synchronized in time and space, and may be received from the pre-processing apparatus 10 or from a separate external input device.

The spatial structure information generation unit 120 divides the synchronized multi-view image into video frames in units of time and extracts spatial structure information for each video frame. The spatial structure information may be determined according to the attributes and arrangement of the individual sub-images, or according to information obtained from the pre-processing apparatus 10.

The spatial structure information signaling unit 130 performs the information processing needed to signal the spatial structure information to the decoding apparatus 200. For example, the spatial structure information signaling unit 130 may perform one or more processes to include the information in the image data encoded by the image encoding unit, to construct a separate data format, or to include it in the metadata of the encoded image.
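Of the three signaling options above, the metadata option is the easiest to sketch. The following Python example attaches a layout record to an encoded payload as a length-prefixed JSON header and reads it back. The container format, the function names, and the keys of the `layout` dictionary are purely hypothetical; this is not HEVC syntax and not the signaling format defined in the specification.

```python
import json
import struct


def attach_layout(encoded: bytes, layout: dict) -> bytes:
    """Prefix an encoded payload with a 4-byte length and a JSON layout record."""
    meta = json.dumps(layout, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(meta)) + meta + encoded


def detach_layout(packet: bytes):
    """Recover the layout record and the encoded payload from a packet."""
    (meta_len,) = struct.unpack(">I", packet[:4])
    layout = json.loads(packet[4:4 + meta_len].decode("utf-8"))
    return layout, packet[4 + meta_len:]


layout = {"num_sub_images": 2, "scanning_order": [0, 1]}  # illustrative keys only
packet = attach_layout(b"\x00\x01", layout)
recovered_layout, payload = detach_layout(packet)
print(recovered_layout, payload)
```

Keeping the layout outside the coded picture data, as here, lets a receiver read it before deciding which parts of the payload to decode.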

The image encoding unit encodes the synchronized multi-view image over time. The image encoding unit may also use the spatial structure information generated by the spatial structure information generation unit 120 as reference information to determine the image scanning order, the reference images, and the like.

Accordingly, the image encoding unit may perform encoding using HEVC (High Efficiency Video Coding) as described above, and the encoding may be improved into a more efficient scheme for synchronized multi-view images according to the spatial structure information.

The transmission processing unit 150 may combine the encoded image data with the spatial structure information inserted by the spatial structure information signaling unit 130, and perform one or more conversion and transmission processes for transmitting the result to the decoding apparatus 200.

FIGS. 3 to 6 are diagrams illustrating an example of the spatial structure and image configuration of a synchronized multi-view image according to an embodiment of the present invention.

Referring to FIG. 3, a multi-view image according to an embodiment of the present invention may include a plurality of temporally and spatially synchronized image frames.

Each frame may be synchronized according to its own spatial layout, and may constitute a layout of sub-images corresponding to one or more scenes, perspectives, or views to be displayed at the same time.

Accordingly, the spatial structure information (spatial layout information) may include information on the sub-images and their relationships, either when the sub-images constituting the synchronized multi-view image are combined into one input image through merging, stitching, or the like, or when simultaneous multi-view images (for example, a plurality of images synchronized to the same time, corresponding to the various views within the same POC) constitute the input image. Such information includes arrangement information of the multi-view images or sub-images, position and angle information of the capturing cameras, merging information, the number of sub-images, scanning order information, acquisition time information, camera parameter information, and reference dependency information between sub-images.

For example, as shown in FIG. 4, image information may be captured through a divergent camera arrangement, and a spatial image that can be observed over 360 degrees may be constructed by stitching the arrayed images.

As shown in FIG. 4, the images A', B', C', ... captured by the respective cameras A, B, C, ... of the arrangement may be arranged according to a one-dimensional or two-dimensional spatial structure, and the left-right and up-down region relationship information used for stitching the arranged images may be exemplified as spatial structure information.
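The left-right and up-down relationship information mentioned above can be derived mechanically from a one- or two-dimensional arrangement. The sketch below is a minimal illustration under two assumptions that do not come from the specification: the arrangement is given as a rectangular grid of labels, and the horizontal direction wraps around (as it would for a full 360-degree ring of cameras) unless `wrap_horizontal` is disabled.

```python
def neighbor_relations(grid, wrap_horizontal=True):
    """Map each sub-image label to its left, right, up, and down neighbors.

    `grid` is a list of rows of labels. A missing neighbor is reported as None.
    """
    rows, cols = len(grid), len(grid[0])

    def at(r, c):
        if wrap_horizontal:
            c %= cols  # the leftmost and rightmost images are adjacent
        if 0 <= r < rows and 0 <= c < cols:
            return grid[r][c]
        return None

    relations = {}
    for r in range(rows):
        for c in range(cols):
            relations[grid[r][c]] = {
                "left": at(r, c - 1),
                "right": at(r, c + 1),
                "up": at(r - 1, c),
                "down": at(r + 1, c),
            }
    return relations


# One-dimensional arrangement of the images A', B', C' of FIG. 4.
ring = neighbor_relations([["A'", "B'", "C'"]])
print(ring["A'"])
```

For the single-row arrangement, the left neighbor of A' is C' because of the assumed wrap-around, and there are no up or down neighbors.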

Accordingly, the spatial structure information generation unit 120 may extract spatial structure information including the various attributes described above from the input image, and the spatial structure information signaling unit 130 may signal the spatial structure information by an optimized method described later.
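To make the attributes of the spatial structure information more concrete, the following Python sketch models a few of them (number of sub-images, arrangement position, camera angle, scanning order, and reference dependencies) as a data structure. The class and field names are assumptions for illustration, not the syntax used by the specification. The helper `required_for` shows one possible use of the reference dependency information: finding every sub-image that must be decoded before a requested one can be reconstructed.

```python
from dataclasses import dataclass, field


@dataclass
class SubImageInfo:
    index: int
    position: tuple      # (row, column) in the arrangement
    camera_angle: tuple  # (yaw, pitch) of the capturing camera, in degrees
    references: list = field(default_factory=list)  # sub-images this one depends on


@dataclass
class SpatialLayoutInfo:
    sub_images: list
    scanning_order: list

    @property
    def num_sub_images(self) -> int:
        return len(self.sub_images)

    def required_for(self, targets):
        """Return the targets plus all their references, in scanning order."""
        by_index = {s.index: s for s in self.sub_images}
        needed = set()
        stack = list(targets)
        while stack:
            i = stack.pop()
            if i in needed:
                continue
            needed.add(i)
            stack.extend(by_index[i].references)
        return [i for i in self.scanning_order if i in needed]


info = SpatialLayoutInfo(
    sub_images=[
        SubImageInfo(0, (0, 0), (0.0, 0.0)),
        SubImageInfo(1, (0, 1), (90.0, 0.0), references=[0]),
        SubImageInfo(2, (0, 2), (180.0, 0.0), references=[1]),
        SubImageInfo(3, (0, 3), (270.0, 0.0)),
    ],
    scanning_order=[0, 1, 2, 3],
)
print(info.num_sub_images, info.required_for([2]), info.required_for([3]))
```

In this example, sub-image 2 depends on 1, which depends on 0, so all three must be decoded to show sub-image 2, while sub-image 3 can be decoded on its own.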

The spatial structure information generated and signaled in this way can be used as useful reference information, as described above.

For example, if the content captured by each camera is a pre-stitched image, the pre-stitched images are overlapped before encoding to constitute one scene. The scene may, in turn, be separated according to each view, and mutual compensation between the separated images may be performed according to the type.

Accordingly, in the case of a pre-stitched image, in which one or more images captured from multiple viewpoints are merged and stitched into one image during pre-processing and passed to the encoder as input, the scene information, spatial layout configuration information, and the like of the merged and stitched input image may be delivered to the encoding and decoding stages through separate spatial structure information signaling.

Likewise, in the case of the non-stitched image type, in which images acquired from multiple viewpoints are delivered as one or more input images of temporally synchronized viewpoints and then encoded and decoded, the images may be referenced and compensated according to the spatial structure information in the encoding and decoding stages.

For this purpose, various kinds of spatial layout information and corresponding data fields may be required. The data fields may be encoded together with the compression information of the input image, or may be transmitted in separate metadata.

The data field including the spatial layout information may also be used in the image post-processing apparatus 20 and in the rendering process of the display.

To this end, the data field including the spatial layout information may include position coordinate information and color difference information obtained at the time the image was acquired from each camera.

For example, information such as the three-dimensional coordinate information (X, Y, Z) and color difference information (R, G, B) of the image, obtained when the image information was acquired from each camera, may be acquired and delivered as additional information for each sub-image. After decoding, this information may be used in post-processing and rendering of the image.

The data field including the spatial layout information may also include camera information for each camera.

As shown in FIGS. 5 to 6, one or more cameras that capture a three-dimensional space to provide a spatial image may be arranged.

For example, as shown in FIG. 5, the one or more cameras may be fixed at a central position, with a direction set for each, so that surrounding objects are captured from a single point in the three-dimensional space.

Alternatively, as shown in FIG. 6, the one or more cameras may be arranged so as to capture a single object from various angles. In this case, a VR display device that reproduces the three-dimensional image can analyze the user's movement information (Up/Down, Left/Right, Zoom in/Zoom out) on the basis of the coordinate information (X, Y, Z), distance information, and the like recorded at the time of image acquisition, and decode or post-process the corresponding part of the image to restore the viewpoint or partial image that the user wants.

Meanwhile, as described above, in a system for compressing, transmitting, and reproducing a synchronized multi-view image, exemplified by a VR image, a separate image conversion tool module or the like may be added where needed according to the type or characteristics of the image, the characteristics of the decoding apparatus, and so on.

For example, when the image acquired from the camera is of the Equirectangular type, the image encoding unit 140 may use the conversion tool module to convert it into an image type such as Icosahedron or Cube Map, depending on compression performance, encoding efficiency, and the like, and then encode the converted image. The conversion tool module may also be used in the pre-processing apparatus 10 and the post-processing apparatus 20, and the conversion information resulting from the conversion may be included in the spatial structure information or the like and delivered in the form of metadata to the decoding apparatus 200, the post-processing apparatus 20, or the VR display device.
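The geometric core of an Equirectangular to Cube Map conversion can be sketched in a few lines. The Python example below maps an equirectangular pixel to a viewing direction and then to the cube face that direction hits. The axis convention (y up, z forward), the face names, and the function names are assumptions chosen for the example; the specification does not define them, and a full conversion would also resample pixel values onto each face.

```python
import math


def equirect_pixel_to_direction(x, y, width, height):
    """Convert an equirectangular pixel center to a unit direction (x, y, z)."""
    lon = ((x + 0.5) / width - 0.5) * 2.0 * math.pi  # longitude in [-pi, pi)
    lat = (0.5 - (y + 0.5) / height) * math.pi       # latitude, +pi/2 at the top row
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )


def direction_to_cube_face(direction):
    """Return the cube face whose axis has the largest absolute component."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "right" if x > 0 else "left"
    if ay >= az:
        return "top" if y > 0 else "bottom"
    return "front" if z > 0 else "back"


# Face hit by a pixel near the center and by a pixel in the top row of an 8x4 image.
print(direction_to_cube_face(equirect_pixel_to_direction(4, 2, 8, 4)))
print(direction_to_cube_face(equirect_pixel_to_direction(4, 0, 8, 4)))
```

A record of which projection was applied is the kind of conversion information that the paragraph above describes as being carried in the spatial structure information or metadata.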

Meanwhile, in order to deliver a synchronized multi-view image according to an embodiment of the present invention, a separate VR image compression scheme that supports scalability between the encoding apparatus 100 and the decoding apparatus 200 may be required.

Accordingly, in order to compress the VR image scalably, the encoding apparatus 100 may compress and encode the image in a manner that separates a base layer from an enhancement layer.

In one such method, when compressing a high-resolution VR image in which a single input image has been acquired through multiple cameras, the base layer compresses the original image, and the enhancement layer divides a single picture into regions such as slices or tiles and encodes each sub-image.

이 때, 부호화 장치(100)는 기본 계층의 복원 영상을 참조 영상으로 활용하여 부호화 효율을 높이는 계층간 예측 기법 (Inter layer prediction)을 통해 압축 부호화를 처리할 수 있다.In this case, the encoding apparatus 100 may process compression encoding through an inter-layer prediction technique that increases encoding efficiency by using the reconstructed image of the base layer as a reference image.

Meanwhile, when the decoding apparatus 200 must quickly decode a specific image in response to the user's movement while decoding the base layer, it may decode only a partial region of the enhancement layer, so that the partial image corresponding to the user's movement is decoded quickly.

In this scalable compression method, the encoding apparatus 100 encodes the base layer, and in the base layer it may compress the original image after scaling it down or down-sampling it at an arbitrary ratio. In the enhancement layer, the reconstructed image of the base layer is resized to the same resolution through scale-up or up-sampling, and the corresponding reconstructed base-layer image is used as a reference picture for encoding and decoding.
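
The relationship between the two layers described above can be sketched as follows. This is only an illustration under stated assumptions: nearest-neighbour resampling stands in for whatever scaling filters an actual codec would use, the base layer is treated as if it were coded losslessly, and the function names `downsample`, `upsample`, `subtract`, and `add` are hypothetical and do not come from the specification.

```python
# Illustrative sketch only: nearest-neighbour resampling stands in for the
# codec's real filters, and the base layer is assumed to be coded losslessly.

def downsample(img, factor):
    """Base-layer input: keep every `factor`-th sample in each direction."""
    return [row[::factor] for row in img[::factor]]

def upsample(img, factor):
    """Inter-layer reference: repeat each sample `factor` times in each direction."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def subtract(a, b):
    """Sample-wise difference of two equally sized images."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    """Sample-wise sum of two equally sized images."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

original = [
    [10, 12, 20, 22],
    [11, 13, 21, 23],
    [30, 32, 40, 42],
    [31, 33, 41, 43],
]

base = downsample(original, 2)            # what the base layer would carry
reference = upsample(base, 2)             # reference picture for the enhancement layer
residual = subtract(original, reference)  # what the enhancement layer would carry
reconstructed = add(reference, residual)  # decoder side: reference plus residual
```

In this toy setting the enhancement layer only has to carry the difference between the original and the up-sampled base-layer reconstruction, which is the intuition behind using the base layer as a reference picture.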

With a processing structure that supports such scalability, the decoding apparatus 200 may decode the entire bitstream of the base layer, which is compressed at a low bit rate or low resolution, and decode only a portion of the image in the enhancement layer according to the user's movement. Because the entire image is not fully decoded, the VR image can be reconstructed with low complexity.

In addition, under an image compression method that supports a separate form of scalability with differing resolutions, the encoding apparatus 100 may compress, in the base layer, the original image or an image reflecting the intention of the image producer, and may encode the enhancement layer using an inter-layer prediction method that refers to the reconstructed image of the base layer.

In this case, the input image of the enhancement layer may be an image obtained by dividing a single input image into a plurality of regions through an image division method and encoding them. One divided region may include at most one sub-image, and a plurality of divided regions may together constitute one sub-image. A compressed bitstream encoded with this division method can serve two or more outputs at the service and application level. For example, the service may reconstruct and output the entire image by decoding the base layer, while in the enhancement layer only some regions and some sub-images are decoded, reflecting the user's movement, viewpoint changes, and manipulation through the service or application.

FIGS. 7 to 9 are diagrams for explaining a method of signaling spatial structure information according to various embodiments of the present invention.

As shown in FIGS. 7 to 9, the spatial structure information may be signaled as one class type in NAL (Network Abstraction Layer) unit format on the high-level syntax (HLS), such as the SPS (Sequence Parameter Set) or VPS (Video Parameter Set), which are defined as encoding parameters in general video encoding.

First, FIG. 7 shows a NAL unit type into which the synchronized image encoding flag according to an embodiment of the present invention is inserted. For example, the synchronized image encoding flag may be inserted into the VPS (Video Parameter Set) or the like.

이에 따라, 도 8은 본 발명의 실시 예에 따른 공간적 구조 정보 플래그를 VPS(VIDEO PARAMETER SET)에 삽입하는 실시 예를 도시한 것이다.Accordingly, FIG. 8 shows an embodiment of inserting the spatial structure information flag into a VIDEO PARAMETER SET (VPS) according to an embodiment of the present invention.

As shown in FIG. 8, the spatial structure information signaling unit 130 according to an embodiment of the present invention may insert into the VPS a separate flag for identifying the type of the input image. Through the spatial structure information signaling unit 130, the encoding apparatus 100 may use vps_other_type_coding_flag to insert a flag indicating that synchronized multi-view image encoding, such as for VR content, is performed and that spatial structure information is signaled.

In addition, as shown in FIG. 9, the spatial structure information signaling unit 130 according to an embodiment of the present invention may signal on the SPS (Sequence Parameter Set) that the image is an encoded synchronized multi-view image.

예를 들어, 도 9에 도시된 바와 같이 공간적 구조 정보 시그널링부(130)는 입력 영상의 타입(INPUT_IMAGE_TYPE)을 삽입함으로써, 동기화된 다시점 영상의 인덱스 정보가 SPS에 포함되어 전송될 수 있다.For example, as shown in FIG. 9 , the spatial structure information signaling unit 130 inserts the input image type (INPUT_IMAGE_TYPE), so that index information of the synchronized multi-view image may be included in the SPS and transmitted.

여기서, SPS상 INPUT_IMAGE_TYPE_INDEX가 -1이 아닌 경우, 또는 INDEX 값이 -1인 경우, 또는 그 값이 0으로 지정되어 의미적으로 -1에 대응될 경우 INPUT_IMAGE_TYPE이 본 발명의 실시 예에 따른 동기화된 다시점 영상임을 나타낼 수 있다.Here, when INPUT_IMAGE_TYPE_INDEX on SPS is not -1, or when the INDEX value is -1, or when the value is designated as 0 and semantically corresponds to -1, INPUT_IMAGE_TYPE is synchronized again according to an embodiment of the present invention. It can be indicated that it is a point image.

In addition, when the type of the input image is a synchronized multi-view image, the spatial structure information signaling unit 130 may include its viewpoint information (PERSPECTIVE INFORMATION) in the SPS, so that part of the spatial structure information of the synchronized multi-view image is transmitted in the SPS. The viewpoint information signals the image layout for each time period according to the 3D rendering process of the 2D image, and may include order information such as top, bottom, and side.

Accordingly, the decoding apparatus 200 may decode the flag in the VPS or SPS to identify whether the image was encoded using spatial structure information according to an embodiment of the present invention. For example, in the case of the VPS of FIG. 8, the decoding apparatus 200 may extract VPS_OTHER_TYPE_CODING_FLAG to check whether the image is a synchronized multi-view image encoded using spatial structure information.

또한 도 9의 SPS의 경우에는 PERSPECTIVE_INFORMATION_INDEX 정보를 복호화함으로써, 레이아웃과 같은 실제적인 공간적 구조정보를 식별할 수 있다.In addition, in the case of the SPS of FIG. 9, by decoding the PERSPECTIVE_INFORMATION_INDEX information, actual spatial structure information such as a layout can be identified.

이 때, 공간적 구조 정보는 파라미터의 형식으로 구성될 수 있으며, 예를 들어, 공간적 구조 파라미터 정보는 SPS, VPS 등의 HLS 상에 서로 다르게 포함되거나, 별도의 함수와 같은 형태로 Syntax가 구성되거나, SEI 메시지로 정의될 수 있다.At this time, the spatial structure information may be configured in the form of parameters, for example, the spatial structure parameter information is included differently on HLS such as SPS and VPS, or syntax is configured in the form of a separate function, It may be defined as an SEI message.

또한, 일 실시 예에 따르면, 공간적 구조 정보는 PPS(PICTURE PARAMETER SET)에 포함되어 전송될 수 있다. 이 경우, 각 서브 이미지별 속성 정보가 포함될 수 있다. 예를 들어, 서브 이미지의 독립성이 시그널링될 수 있다. 독립성은 해당 영상이 다른 영상을 참조하지 않고 부호화 및 복호화될 수 있음을 나타낼 수 있으며, 동기화된 다시점 영상의 서브 이미지들은 독립적(INDEPENDENT) 서브 이미지와 의존적(DEPENDENT) 서브 이미지를 포함할 수 있다. 의존적 서브 이미지는 독립적 서브 이미지를 참조하여 복호화될 수 있다. 공간적 구조 정보 시그널링부(130)는 PPS 상에 독립적 서브 이미지를 리스트(Independent sub image list) 형태로 시그널링할 수 있다.Also, according to an embodiment, the spatial structure information may be included in a PPS (PICTURE PARAMETER SET) and transmitted. In this case, attribute information for each sub-image may be included. For example, the independence of the sub-image may be signaled. Independence may indicate that a corresponding image can be encoded and decoded without referring to another image, and sub-images of the synchronized multi-view image may include an independent (INDEPENDENT) sub-image and a dependent (DEPENDENT) sub-image. The dependent sub-image may be decoded with reference to the independent sub-image. The spatial structure information signaling unit 130 may signal an independent sub image on the PPS in the form of an independent sub image list.

In addition, the spatial structure information may be defined and signaled as an SEI message. FIG. 10 illustrates an SEI message carrying spatial structure information, in which parameterized spatial structure information may be inserted using a Spatial layout information descriptor.

As shown in FIG. 10, the spatial structure information may include at least one of type index information (INPUT IMAGE TYPE INDEX), which can indicate the spatial layout of the input image, viewpoint information (PERSPECTIVE INFORMATION), camera parameter information (CAMERA PARAMETER), scene angle information (SCENE ANGLE), scene dynamic range information (SCENE DYNAMIC RANGE), independent sub-image information (INDEPENDENT SUB IMAGE), and scene time information (SCENE TIME INFORMATION). Other information needed to encode the synchronized multi-view image efficiently may also be added. These parameters may be defined as an SEI message in the form of a single descriptor, and the decoding apparatus 200 may parse it and use the spatial structure information efficiently in the decoding, post-processing, and rendering steps.
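
As a rough illustration of how a decoder might hold the parsed descriptor, the parameters listed above can be grouped into a single record. The field names mirror the parameter names in the text, but the field types, the class name `SpatialLayoutInfo`, and the helper `present_fields` are assumptions made for this sketch; the actual syntax and value ranges are those of FIG. 10, which are not reproduced here.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

@dataclass
class SpatialLayoutInfo:
    """Hypothetical container for a parsed spatial layout information descriptor.

    Every field is optional because the text only requires that at least one
    of the parameters be present. All types are assumptions.
    """
    input_image_type_index: Optional[int] = None
    perspective_information: Optional[List[str]] = None
    camera_parameter: Optional[dict] = None
    scene_angle: Optional[float] = None
    scene_dynamic_range: Optional[float] = None
    independent_sub_image: Optional[List[int]] = None
    scene_time_information: Optional[int] = None

    def present_fields(self) -> List[str]:
        """Names of the parameters that were actually signaled."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def is_valid(self) -> bool:
        """At least one parameter must be present."""
        return len(self.present_fields()) > 0

# Example: only the layout type and the independent sub-image list are signaled.
info = SpatialLayoutInfo(input_image_type_index=1, independent_sub_image=[0, 5])
signaled = info.present_fields()
```

Keeping every field optional lets the same record represent both the case where the parameters are spread over VPS / SPS / PPS and the case where they arrive together in one SEI message.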

그리고, 상기한 바와 같이 공간적 구조 정보는 SEI 또는 메타데이터의 형식으로 복호화 장치(200)로 전달될수 있다.And, as described above, the spatial structure information may be transmitted to the decoding apparatus 200 in the form of SEI or metadata.

또한, 예를 들어, 공간적 구조 정보는 부호화 단계에서 configuration 과 같은 선택 옵션에 의해 시그널링될 수 있다.Also, for example, spatial structure information may be signaled by a selection option such as configuration in the encoding step.

제1 옵션으로서, 공간적 구조 정보는 신택스상의 부호화 효율에 따라 HLS 상의 VPS / SPS / PPS 또는 Coding unit 신택스에 포함될 수 있다.As a first option, spatial structure information may be included in VPS / SPS / PPS or Coding unit syntax on HLS according to coding efficiency on syntax.

제2 옵션으로서, 공간적 구조 정보는 신택스상 SEI 형태의 메타 데이터로 한번에 시그널링될 수 있다.As a second option, the spatial structure information may be signaled at once as metadata in the form of SEI on syntax.

이하에서는 도 11 내지 도 19를 참조하여, 본 발명의 일 실시 예에 따른 동기화된 다시점 영상 포맷에 따른 효율적인 비디오 부호화 및 복호화 방법에 대하여 보다 구체적으로 설명하도록 한다.Hereinafter, an efficient video encoding and decoding method according to a synchronized multi-view image format according to an embodiment of the present invention will be described in more detail with reference to FIGS. 11 to 19 .

전술한 바와 같이 전처리 단계에서 생성되는 복수의 시점별 영상이 하나의 입력 영상으로 합성되어 부호화될 수 있다. 이 경우, 하나의 입력 영상은 복수의 서브 이미지를 포함할 수 있다. 각각의 서브 이미지들은 동일한 시간시점에 동기화될 수 있으며, 각각 서로 다른 뷰, 시각적 시점(PERSPECTIVE) 또는 장면에 대응될 수 있다. 이는 기존과 같은 별도의 깊이 정보를 이용하지 않고도 동일한 POC(PICTURE ORDER COUNT)에 다양한 VIEW를 지원하게 되는 효과를 가지며, 각 서브 이미지간 중복되는 영역은 바운더리(BOUNDARY) 영역으로 제한되게 된다.As described above, a plurality of images for each viewpoint generated in the preprocessing step may be synthesized and encoded into one input image. In this case, one input image may include a plurality of sub images. Each of the sub-images may be synchronized to the same time point, and may correspond to a different view, a visual point of view (PERSPECTIVE), or a scene, respectively. This has the effect of supporting various views in the same POC (PICTURE ORDER COUNT) without using separate depth information as in the past, and the overlapping area between each sub-image is limited to the boundary area.

In particular, the spatial structure information of the input image may be signaled in the forms described above, and the encoding apparatus 100 and the decoding apparatus 200 may parse it to perform efficient encoding and decoding. That is, the encoding apparatus 100 may perform multi-view image encoding using the spatial structure information in the encoding step, and the decoding apparatus 200 may perform decoding using the spatial structure information in the decoding, pre-processing, and rendering steps.

FIGS. 11 and 12 are diagrams for explaining a type index table of spatial structure information according to an embodiment of the present invention.

As described above, the sub-images of the input image may be arranged in various ways. Accordingly, the spatial structure information may separately include a table index for signaling the arrangement information. For example, as shown in FIG. 11, layouts such as CUBIC LAYOUT 4 X 3, CUBIC LAYOUT 3 X 2, ICOSAHEDRON, and EQUIRECTANGULAR may be used for the synchronized multi-view image, and the table index shown in FIG. 12 corresponding to each layout may be inserted into the spatial structure information.

다만, 도 12에 도시된 테이블은 입력 영상에 따라 임의적으로 배치된 것으로, 부호화 효율 및 시장의 컨텐츠 분포 등에 따라 변경될 수 있다.However, the table shown in FIG. 12 is arbitrarily arranged according to the input image, and may be changed according to encoding efficiency and content distribution in the market.

이에 따라, 복호화 장치(200)는 별도 시그널링되는 테이블 인덱스를 파싱하여, 복호화 처리에 이용할 수 있다.Accordingly, the decoding apparatus 200 may parse the separately signaled table index and use it for the decoding process.

특히, 본 발명의 실시 예에서 상기 각 레이아웃 정보는 영상의 일부 복호화에 유용하게 이용될 수 있다. 즉 CUBIC LAYOUT과 같은 서브 이미지 배치 정보는 독립적 서브 이미지와 의존적 서브 이미지를 구분하는데 이용 수 있으며 이에 따라 효율적인 부호화 및 복호화 스캐닝 순서를 결정하거나, 특정 시점에 대한 일부 복호화를 수행하는데 이용될 수도 있다.In particular, in an embodiment of the present invention, each of the layout information may be usefully used for partial decoding of an image. That is, sub-image arrangement information such as CUBIC LAYOUT can be used to distinguish an independent sub-image from a dependent sub-image, and accordingly, it can be used to determine an efficient encoding and decoding scanning order or to perform partial decoding for a specific viewpoint.

FIG. 13 is a flowchart illustrating a decoding method according to an embodiment of the present invention.

도 13을 참조하면, 먼저 복호화 장치(200)는 영상 비트스트림을 수신한다(S101).Referring to FIG. 13 , first, the decoding apparatus 200 receives an image bitstream ( S101 ).

그리고, 복호화 장치(200)는 영상이 동기화된 다시점 영상인지를 확인한다(S103).Then, the decoding apparatus 200 checks whether the image is a synchronized multi-view image (S103).

Here, the decoding apparatus 200 may identify whether the image is a synchronized multi-view image from a flag in the image bitstream signaled by the spatial structure information signaling unit 130. For example, the decoding apparatus 200 may identify in advance, from the VPS, SPS, or the like described above, whether the image is a synchronized multi-view image.

If the image is not a synchronized multi-view image, general decoding of the entire image is performed (S113).

그리고, 복호화 장치(200)는 동기화된 다시점 영상인 경우, 공간적 구조 정보로부터 테이블 인덱스를 복호화한다(S105).Then, in the case of the synchronized multi-view image, the decoding apparatus 200 decodes the table index from the spatial structure information (S105).

여기서, 복호화 장치(200)는 테이블 인덱스로부터 EQUIRECTANGULAR 영상인지 여부를 식별할 수 있다(S107).Here, the decoding apparatus 200 may identify whether it is an EQUIRECTANGULAR image from the table index (S107).

이는 동기화된 다시점 영상 중 EQUIRECTANGULAR 영상의 경우에는 별도의 서브 이미지로 구분되지 않을 수 있기 때문이며, 복호화 장치(200)는 EQUIRECTANGULAR 영상에 대하여는 전체 영상의 복호화를 수행하게 된다(S113).This is because the EQUIRECTANGULAR image among the synchronized multi-view images may not be divided into separate sub images, and the decoding apparatus 200 decodes the entire image for the EQUIRECTANGULAR image (S113).

EQUIRECTANGULAR 영상이 아닌 경우, 복호화 장치(200)는 나머지 전체 공간적 구조 정보(SPATIAL LAYOUT INFORMATION)를 복호화하며(S109), 상기 공간적 구조정보에 기초한 영상 복호화 처리를 수행한다(S111).If it is not an EQUIRECTANGULAR image, the decoding apparatus 200 decodes the remaining total spatial structure information (SPATIAL LAYOUT INFORMATION) (S109), and performs image decoding processing based on the spatial structure information (S111).
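
The branching of steps S101 to S113 can be summarized in a short sketch. The step labels come from the flowchart description above; the numeric value of `EQUIRECTANGULAR_INDEX` and the function name `decoding_path` are assumptions for illustration, since the actual index values are defined by the table of FIG. 12.

```python
# Hypothetical table index for the EQUIRECTANGULAR layout; the real value is
# defined by the type index table and is not known here.
EQUIRECTANGULAR_INDEX = 3

def decoding_path(is_sync_multiview, table_index=None):
    """Return the sequence of flowchart steps taken for one bitstream."""
    steps = ["S101", "S103"]        # receive bitstream, check the flag
    if not is_sync_multiview:
        return steps + ["S113"]     # ordinary full-image decoding
    steps += ["S105", "S107"]       # decode table index, test for equirectangular
    if table_index == EQUIRECTANGULAR_INDEX:
        return steps + ["S113"]     # no separate sub-images: decode the whole image
    # Decode the remaining spatial layout information, then decode based on it.
    return steps + ["S109", "S111"]

plain_video = decoding_path(False)
equirect_video = decoding_path(True, EQUIRECTANGULAR_INDEX)
cubic_video = decoding_path(True, table_index=0)
```

Both the non-multi-view case and the equirectangular case end in S113, which matches the description that full-image decoding is used whenever the picture is not divided into separate sub-images.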

FIG. 14 is a diagram for explaining a viewpoint information table of spatial structure information according to an embodiment of the present invention.

본 발명의 실시 예에 따른 공간적 구조 정보는 시점 정보(PERSPECTIVE INFORMATION)을 위한 테이블을 포함할 수 있다.The spatial structure information according to an embodiment of the present invention may include a table for viewpoint information (PERSPECTIVE INFORMATION).

부호화 장치(100) 및 복호화 장치(200)는 영상 간 참조, 부호화, 복호화 순서 및 독립적 서브 이미지를 구분하기 위한 정보로서, 상기 시점 정보 테이블을 이용할 수 있다.The encoding apparatus 100 and the decoding apparatus 200 may use the viewpoint information table as information for discriminating reference between images, encoding and decoding orders, and independent sub-images.

또한, 다시점 디스플레이 장치에서의 렌더링시, 시점 정보 테이블은 복호화된 영상과 함께 장치의 시스템 레이어로 전달될 수 있으며, 해당 정보를 이용하여 사용자는 컨텐츠 제공자(contents provider)의 의도에 따른 위상에 맞추어 영상을 시청할 수 있게 된다.In addition, when rendering in the multi-viewpoint display device, the viewpoint information table may be transmitted to the system layer of the device together with the decoded image, and using the information, the user can match the phase according to the intention of the contents provider. You will be able to watch the video.

More specifically, the spatial structure information signaling unit 130 may signal the viewpoint (PERSPECTIVE) of each sub-image. In particular, depending on the image type, the spatial structure information signaling unit 130 may signal only the information on the top, bottom, and front images, and the decoding apparatus 200 may derive the information on the remaining side images from the Top Scene, front Perspective Scene, and Bottom Scene information. Accordingly, only a minimal amount of information needs to be signaled.

FIGS. 15 and 16 are diagrams illustrating how the scanning order at the decoder is determined according to the signaling of spatial structure information according to an embodiment of the present invention.

As shown in FIGS. 15 and 16, the scanning order of the sub-images may be transmitted together according to the type index of the spatial structure information, and effective decoding and rendering may be performed using the transmitted scanning order information.

FIGS. 15 and 16 show the original image being scanned in the order A->B->C->D->E->F.

To signal this, the spatial structure information signaling unit 130 may signal order information only for some sub-images in the scanning order, such as the top view, bottom view, and front view. The decoding apparatus 200 may derive the entire order from the order information of these sub-images.

또한, 전송되는 영상의 종류 및 병렬성 및 참조 구조에 따라 도 15 또는 도 16과 같이 Scanning 순서가 변할 수 있다. In addition, the scanning order may be changed as shown in FIG. 15 or 16 according to the type, parallelism, and reference structure of the transmitted image.

Top을 A, Bottom을 F, Front을 B라고 하면, 도 15의 스캐닝 순서는 A -> F -> B -> C -> D -> E 일 수 있으며, 도 16의 스캐닝 순서는 A->B->C->D->E->F일 수 있다. 이는 부호화 효율을 고려하여 스캐닝 순서를 상이하게 결정하는 경우에 유용하게 이용될 수 있다.Assuming that Top is A, Bottom is F, and Front is B, the scanning order of FIG. 15 may be A -> F -> B -> C -> D -> E, and the scanning order of FIG. 16 is A->B It can be ->C->D->E->F. This can be usefully used when different scanning orders are determined in consideration of encoding efficiency.
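
One possible way for a decoder to complete the scanning order from the partially signaled positions is sketched below. The specification does not state the derivation rule, so this sketch assumes that the signaled sub-images are placed at their signaled positions and the remaining sub-images fill the free positions in their default order; `derive_scan_order` and its arguments are hypothetical names.

```python
def derive_scan_order(faces, signaled_positions):
    """Complete a scanning order from partially signaled positions.

    faces: all sub-images in their default order, e.g. ["A", ..., "F"].
    signaled_positions: mapping of sub-image -> position for the sub-images
        whose order is signaled (top, bottom, and front in the text).
    Assumed rule: unsignaled sub-images fill the free slots in default order.
    """
    order = [None] * len(faces)
    for face, position in signaled_positions.items():
        order[position] = face
    remaining = iter(f for f in faces if f not in signaled_positions)
    return [f if f is not None else next(remaining) for f in order]

faces = ["A", "B", "C", "D", "E", "F"]  # Top = A, Front = B, Bottom = F

# FIG. 15 style: top first, bottom second, front third.
order_fig15 = derive_scan_order(faces, {"A": 0, "F": 1, "B": 2})

# FIG. 16 style: top first, front second, bottom last.
order_fig16 = derive_scan_order(faces, {"A": 0, "B": 1, "F": 5})
```

Under this assumed rule, signaling three positions is enough to reproduce both orders given in the text.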

FIG. 17 is a diagram for describing independent sub-images and dependent sub-images that are distinguished according to the signaling of spatial structure information.

도 17에 도시된 바와 같이, 공간적 구조에 배치되는 각 서브 이미지들은 참조성 및 병렬성을 고려하여, 의존적 서브 이미지(Dependent sub image)와 독립적 서브 이미지(Independent sub image)로 구분될 수 있다. 독립적 서브 이미지는 다른 서브 이미지를 참조하지 않고 복호화 되는 특성을 가지며, 의존적 서브 이미지는 인접한 독립적 서브 이미지 또는 인접한 의존적 서브 이미지를 참조하여 복원할 수 있다.As shown in FIG. 17 , each sub-image disposed in the spatial structure may be divided into a dependent sub image and an independent sub image in consideration of referentiality and parallelism. The independent sub-image has a characteristic of being decoded without referring to other sub-images, and the dependent sub-image can be reconstructed with reference to the adjacent independent sub-image or the adjacent dependent sub-image.

따라서, 독립적 서브 이미지는 의존적 서브 이미지보다 먼저 부호화 또는 복호화 되어야 하는 특성을 가질 수 있다.Accordingly, the independent sub-image may have a characteristic that must be encoded or decoded before the dependent sub-image.

In an embodiment, an independent sub-image may be encoded or decoded with reference to a previously encoded or decoded independent sub-image at a different position on the time axis, and a dependent sub-image may be encoded or decoded with reference to an independent sub-image at the same or a different position on the time axis. In addition, whether a sub-image is independent may be signaled by a separate index according to the spatial structure information.

FIGS. 18 and 19 show that a boundary region between sub-images is decoded with reference to an independent sub-image according to spatial structure information.

전술한 바와 같이, 부호화 장치(100) 또는 복호화 장치(200)는 스캐닝 순서와, 서브 이미지의 의존성(Dependency) 정보를 동시에 고려하여 부호화 및 복호화를 처리할 수 있다.As described above, the encoding apparatus 100 or the decoding apparatus 200 may process encoding and decoding by simultaneously considering a scanning order and dependency information of a sub-image.

As shown in FIG. 18, if A and F are independent sub-images, the scanning order may be transmitted or derived as A -> F -> B -> C -> D -> E, and in this case A and F must be decoded before the dependent sub-images. When the remaining sub-images B, C, D, and E are then decoded, the adjacent boundary region of each sub-image may be decoded with reference to the independent sub-images. Accordingly, the boundary region of a previously decoded independent sub-image or a previously decoded dependent sub-image may be referenced when decoding the remaining sub-images.
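
The ordering constraint described above can be sketched as follows: independent sub-images are moved to the front of the scanning order, and each dependent sub-image may then reference any adjacent sub-image that has already been decoded. The adjacency map below is an assumption for a cube whose side faces B, C, D, and E form a ring between top A and bottom F; the function names are hypothetical.

```python
def decoding_order(scan_order, independent):
    """Decode independent sub-images first, keeping scan order within each group."""
    first = [s for s in scan_order if s in independent]
    later = [s for s in scan_order if s not in independent]
    return first + later

def available_references(order, adjacency, independent):
    """For each sub-image, list the adjacent sub-images decoded before it.

    Independent sub-images reference no other sub-image of the same picture.
    """
    decoded, refs = [], {}
    for s in order:
        refs[s] = [] if s in independent else [d for d in decoded if d in adjacency[s]]
        decoded.append(s)
    return refs

# Assumed cube adjacency: A (top) and F (bottom) touch every side face,
# and the side faces B, C, D, E form a ring.
adjacency = {
    "A": {"B", "C", "D", "E"}, "F": {"B", "C", "D", "E"},
    "B": {"A", "F", "C", "E"}, "C": {"A", "F", "B", "D"},
    "D": {"A", "F", "C", "E"}, "E": {"A", "F", "D", "B"},
}
independent = {"A", "F"}

order = decoding_order(["A", "B", "C", "D", "E", "F"], independent)
refs = available_references(order, adjacency, independent)
```

With A and F independent, the resulting order is the one given in the text, and the last side face E can draw on both independent faces as well as on its already decoded side neighbours.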

또한, 전술한 독립적 서브 이미지는 하나의 영상 프레임뿐만 아니라, 인접한 Picture의 Boundary 영역에서의 인트라/인터 부호화 및 복호화 수행에도 이용될 수 있다.In addition, the above-described independent sub-image may be used not only for one image frame but also for performing intra/inter encoding and decoding in a boundary region of an adjacent picture.

However, as shown in FIG. 19, 1:1 mapping may not be possible because of differing resolutions. (In a typical image, the width is larger than the height.)

이 경우, 해당 인접 면을 참조하기 위하여, 대상 서브 이미지에 대해 로테이션 및 업 샘플링(Up sampling) 과 같은 영상 처리 기법을 통하여 해상도에 따른 스케일(Scale)을 조절하여 바운더리 영역의 부호화 또는 복호화에 참조할 수 있다.In this case, in order to refer to the adjacent surface, a scale according to the resolution is adjusted through an image processing technique such as rotation and up-sampling for the target sub-image to be referenced for encoding or decoding of the boundary region. can

For example, in encoding or decoding the upper boundary region of C in FIG. 19, the side values of A may be referenced. To this end, the encoding apparatus 100 or the decoding apparatus 200 may up-sample the side (Height) of A at a ratio corresponding to the Width of C to generate reference block values at the corresponding positions, and perform encoding and decoding using them.
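
A minimal sketch of this boundary scaling is given below. It assumes linear interpolation, which the specification does not prescribe, and takes the right-hand edge column of A as the side to be referenced; reading a column out as a one-dimensional row of samples plays the role of the rotation mentioned above. The names `resample_linear` and `side_column` are hypothetical.

```python
def resample_linear(samples, target_len):
    """Resample a 1-D list of samples to `target_len` by linear interpolation."""
    n = len(samples)
    if n == 1 or target_len == 1:
        return [float(samples[0])] * target_len
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)  # position in source coordinates
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

def side_column(img, right=True):
    """Extract the left or right edge column of a sub-image, top to bottom."""
    return [row[-1 if right else 0] for row in img]

# Sub-image A has height 3; sub-image C is assumed to have width 5.
sub_image_a = [
    [1, 2, 3, 0],
    [4, 5, 6, 10],
    [7, 8, 9, 20],
]
width_of_c = 5

edge_of_a = side_column(sub_image_a)                    # height-long edge of A
reference_row = resample_linear(edge_of_a, width_of_c)  # scaled to the width of C
```

The resulting row has one reference value for every sample position along the upper boundary of C, which is what makes position-wise prediction possible when the two edges differ in length.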

FIGS. 20 to 24 are diagrams illustrating a decoding system and its operation according to an embodiment of the present invention.

도 20을 참조하면, 본 발명의 실시 예에 따른 복호화 시스템(300)은 전술한 바와 같은 부호화 장치(100) 또는 외부 서버 등으로부터 수신되는 전체 동기화된 다시점 영상 비트스트림 및 공간적 구조 정보를 수신하여, 사용자의 가상현실 디스플레이 장치(400)로 하나 이상의 복호화된 픽쳐를 제공하는 클라이언트 시스템을 구성할 수 있다.Referring to FIG. 20 , the decoding system 300 according to an embodiment of the present invention receives the entire synchronized multi-view image bitstream and spatial structure information received from the encoding apparatus 100 or an external server as described above. , a client system that provides one or more decoded pictures to the user's virtual reality display device 400 may be configured.

To this end, the decoding system 300 includes a decoding processing unit 310, a user motion analysis unit 320, and an interface unit 330. Although the decoding system 300 is described in this specification as a separate system, it may be configured as a combination of all or some of the modules constituting the above-described decoding apparatus 200 and post-processing device 20 for performing the necessary decoding and post-processing, or as an extension of the decoding apparatus 200. It is therefore not limited by its name.

Accordingly, the decoding system 300 according to an embodiment of the present invention may selectively decode part of the entire bitstream based on the spatial structure information received from the encoding apparatus 100 and the user viewpoint information obtained by analyzing the user's motion. In particular, in the selective decoding described with reference to FIG. 20, the decoding system 300 may use the spatial structure information to map input images having a plurality of viewpoints at the same time (POC, Picture Order Count) to the user's viewpoint (PERSPECTIVE) relative to a given direction. On that basis, it may partially decode the region of interest (ROI, Region Of Interest) pictures determined by the user's viewpoint.

To this end, the decoding system 300 may include an interface layer for receiving and analyzing user information, and may selectively perform mapping between the viewpoints supported by the image currently being decoded and the viewpoint of the VR display device 400, as well as post-processing and rendering. More specifically, the interface layer may include one or more processing modules for the post-processing and rendering, the interface unit 330, and the user motion analysis unit 320.

The interface unit 330 may receive motion information from the VR display apparatus 400 worn by the user.

For example, the interface unit 330 may include one or more data communication modules for receiving, by wire or wirelessly, the output of at least one of an environment sensor, a proximity sensor, a motion detection sensor, a position sensor, a gyroscope sensor, an acceleration sensor, and a geomagnetic sensor of the user's VR display apparatus 400.

The user motion analysis unit 320 may analyze the user motion information received from the interface unit 330 to determine the user's viewpoint (PERSPECTIVE), and may pass selection information for adaptively selecting the corresponding group of pictures to be decoded to the decoding processing unit 310.

Accordingly, based on the selection information received from the user motion analysis unit 320, the decoding processing unit 310 may set an ROI mask for selecting ROI (Region Of Interest) pictures and may decode only the picture region corresponding to the set ROI mask. For example, the picture group may correspond to at least one of the plurality of sub-images in the image frame described above, or of the reference images.

For example, as shown in FIG. 20, when sub-images 1 to 8 exist for a particular POC handled by the decoding processing unit 310, the decoding processing unit 310 may decode only sub-image regions 6 and 7, which correspond to the user's visual viewpoint (PERSPECTIVE), thereby improving processing speed and efficiency in real time.
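The ROI mask in the example above can be illustrated with a minimal sketch. The function names build_roi_mask and sub_images_to_decode, and the representation of the mask as a list of booleans, are assumptions made for illustration only; the specification does not define a concrete data structure.

```python
from typing import Iterable, List

NUM_SUB_IMAGES = 8  # sub-images 1..8 of one POC, as in the example above


def build_roi_mask(selected: Iterable[int],
                   num_sub_images: int = NUM_SUB_IMAGES) -> List[bool]:
    """Return a per-sub-image mask; entry i-1 is True if sub-image i is in the ROI.

    Hypothetical helper: the selection is assumed to arrive as 1-based indices.
    """
    chosen = set(selected)
    invalid = sorted(i for i in chosen if not 1 <= i <= num_sub_images)
    if invalid:
        raise ValueError(f"sub-image index out of range: {invalid}")
    return [(i in chosen) for i in range(1, num_sub_images + 1)]


def sub_images_to_decode(mask: List[bool]) -> List[int]:
    """List the 1-based indices of the sub-images that the mask marks for decoding."""
    return [i + 1 for i, keep in enumerate(mask) if keep]


# Selection information for a viewpoint covering sub-images 6 and 7.
roi_mask = build_roi_mask([6, 7])
decode_list = sub_images_to_decode(roi_mask)
```

With this mask only two of the eight sub-images are passed to the decoder, which is the source of the speed gain described in the text.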

FIG. 21 illustrates an entire encoded bitstream and a GOP (Group Of Pictures) according to an embodiment of the present invention, and FIG. 22 shows how the picture group decoded from the entire bitstream changes selectively according to the user's viewpoint.

As described above, the decoding system 300 according to an embodiment of the present invention may receive the entire encoded bitstream of the synchronized multi-view image and decode the sub-bitstream corresponding to the user's perspective view.

Here, the spatial structure information of the synchronized multi-view image may be signaled in the form of SEI, HLS, or the like, as described above. In particular, according to an embodiment, the decoding system 300 may generate and build a reference structure using the independent sub-images, dependent sub-images, and SRAPs (Spatial Random Access Pictures) identified from the signaling information. Using the reference structure, it may then select and decode part of the bitstream as the visual viewpoint (PERSPECTIVE) changes.

To this end, the encoding apparatus 100 may select a picture as a spatial random access picture (SRAP) by means of the NAL type or the PPS, and an SRAP picture may be used as a reference picture in the intra prediction encoding of sub-images of other, non-selected pictures.

Accordingly, for a picture that has not been selected as an SRAP by a particular GOP (Group Of Pictures), NAL type, or PPS, the encoding apparatus 100 and the decoding apparatus 200 may select one or more of the SRAP pictures to perform encoding and decoding.

More specifically, an SRAP picture may mean a picture that can be decoded independently for the purpose of decoding part of the entire bitstream: a single input image composed of multiple viewpoints for the same time instant is decoded using, for example, the data redundancy between the sub-images within the SRAP picture, regardless of the decoding of images for other time instants.

In a picture selected as an SRAP, each sub-image may be encoded and decoded through intra prediction (intra coding) or the like, and at least one SRAP picture may be included in a given GOP so that part of the entire bitstream can be decoded.

Pictures not selected as SRAP pictures may be decoded according to the user's viewpoint through inter prediction or the like, using the pictures selected as SRAP pictures as reference pictures.

Accordingly, the decoding system 300 can decode, through the decoding processing unit 310 and using the reference picture, one or more sub-images selected in accordance with the user's visual viewpoint (PERSPECTIVE).

Meanwhile, a picture selected as an SRAP may be intra-prediction encoded in its entirety as a single image, without reference to other pictures, according to a general encoding and decoding method.

Therefore, the decoding system 300 or the decoding apparatus 200 needs to store at least one SRAP picture in the DPB (Decoded Picture Buffer). Within one GOP, sub-images lying between previously decoded pictures or SRAP pictures may then be decoded using the picture stored as an SRAP as a reference picture.

Accordingly, FIG. 22 shows the pictures designated as SRAPs and how the selectively decoded region changes as the user's viewpoint changes. As shown in FIG. 19, the period at which SRAP pictures are decoded may be a predetermined time period.

With reference to the periodically decoded SRAP pictures, the group of sub-images to be decoded may change according to the time at which a user viewpoint change (USER PERSPECTIVE OF CHANGE) occurs. For example, as shown in FIG. 19, when the user's viewpoint changes, the decoded ROI region may change from a first sub-image group to a second sub-image group.
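The interplay between periodic SRAP pictures and a viewpoint change can be sketched as a simple decoding plan. This is only an illustration under stated assumptions: SRAP pictures are taken to occur at every POC that is a multiple of srap_period and to be decoded in full, and every other picture decodes only the current ROI group while referencing the most recent SRAP. The function name plan_selective_decoding and its parameters are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

PlanEntry = Tuple[int, List[int], Optional[int]]  # (poc, sub-images, reference SRAP poc)


def plan_selective_decoding(num_pictures: int,
                            srap_period: int,
                            all_sub_images: Iterable[int],
                            initial_roi: Iterable[int],
                            roi_changes: Dict[int, List[int]]) -> List[PlanEntry]:
    """Build a per-POC decoding plan (illustrative sketch only).

    roi_changes maps the POC at which a viewpoint change occurs to the new
    sub-image group; the group stays in effect until the next change.
    """
    every_sub_image = list(all_sub_images)
    current_roi = list(initial_roi)
    last_srap: Optional[int] = None
    plan: List[PlanEntry] = []
    for poc in range(num_pictures):
        if poc in roi_changes:
            current_roi = list(roi_changes[poc])  # viewpoint change takes effect here
        if poc % srap_period == 0:
            # Assumed SRAP position: decode everything and keep it in the DPB.
            last_srap = poc
            plan.append((poc, list(every_sub_image), None))
        else:
            # Non-SRAP picture: decode only the ROI group, referencing the stored SRAP.
            plan.append((poc, list(current_roi), last_srap))
    return plan


# First group {1, 2} switches to second group {6, 7} at POC 3; SRAP every 4 pictures.
plan = plan_selective_decoding(6, 4, range(1, 9), [1, 2], {3: [6, 7]})
```

In this plan the change at POC 3 can be served immediately because the SRAP stored at POC 0 already contains every sub-image needed as a reference.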

Meanwhile, FIG. 23 is a diagram for explaining a sub-image decoding method according to an embodiment of the present invention.

As described above, in the method of decoding some pictures according to the user's viewpoint (perspective), the dependency of each sub-image may be taken into account, as shown in FIG. 20.

As described above, the encoding apparatus 100 and the decoding apparatus 200 may assign independence to some pictures (sub-images) according to the encoding efficiency or the scan order of the sub-images. An independent sub-image to which independence has been assigned is encoded and decoded as an intra picture, which removes its dependency on other sub-images.

Meanwhile, the encoding apparatus 100 and the decoding apparatus 200 may designate the remaining sub-images as dependent sub-images. Independent sub-images having the same POC as a dependent sub-image may be added to the reference picture list of that dependent sub-image. The dependent sub-images may then be encoded or decoded in one of three ways: through intra prediction (intra coding) with respect to adjacent independent sub-images; through inter prediction (inter coding) using an independent sub-image having the same POC as a reference picture; or through inter prediction with respect to independent sub-images having a different POC.

More specifically, independent sub-images having a different POC may also be added to the reference picture list for the dependent sub-images. The dependent sub-image may be used as a reference picture in intra prediction encoding, or in inter prediction encoding that refers to the independent sub-images added to the list.
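The construction of a reference picture list for a dependent sub-image can be sketched as follows. The SubImage record, the function build_reference_list, and in particular the ordering (same-POC independent sub-images first, then other POCs by temporal distance) are assumptions chosen for illustration; the specification only states which sub-images may be added to the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubImage:
    poc: int           # picture order count of the frame containing the sub-image
    index: int         # position of the sub-image within the frame
    independent: bool  # True for an independent sub-image


def build_reference_list(current: SubImage, decoded: List[SubImage]) -> List[SubImage]:
    """Collect independent sub-images that a dependent sub-image may reference.

    Illustrative ordering (an assumption): same-POC candidates first, then
    candidates from other POCs sorted by absolute POC distance.
    """
    if current.independent:
        return []  # independent sub-images are coded without such references
    same_poc = [s for s in decoded if s.independent and s.poc == current.poc]
    other_poc = [s for s in decoded if s.independent and s.poc != current.poc]
    other_poc.sort(key=lambda s: abs(s.poc - current.poc))
    return same_poc + other_poc


decoded_so_far = [
    SubImage(poc=0, index=1, independent=True),
    SubImage(poc=3, index=1, independent=True),
    SubImage(poc=4, index=1, independent=True),
    SubImage(poc=4, index=2, independent=False),
]
ref_list = build_reference_list(SubImage(poc=4, index=3, independent=False), decoded_so_far)
```

Here the dependent sub-image at POC 4 references the independent sub-image of the same POC first, followed by those of POC 3 and POC 0; the already decoded dependent sub-image is not added.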

Accordingly, the dependent sub-images may be decoded through intra prediction using an independent sub-image as a reference picture. Decoding may also be performed through inter prediction by designating, as the reference picture, a picture having the same POC as the sub-image currently being decoded.

For example, since a dependent sub-image is highly similar, in the boundary region, to the independent sub-images having the same POC, the encoding apparatus 100 may encode the boundary region of the dependent sub-image using intra prediction with respect to the independent sub-image.

Here, the independent sub-images to be referenced may change in units such as the PPS or SPS in the HLS, and information about them may be signaled separately. The decoding apparatus 200 may derive the dependent sub-images based on the independent sub-images. The decoding apparatus 200 may also receive, in the form of a separate list, whether each of the sub-images is independent.

With the SRAP-based and independence-based encoding and decoding described above, the decoding system 300 can easily select the part of the entire bitstream that corresponds to the user's viewpoint (perspective) and can efficiently perform adaptive and selective decoding.

In addition, the independent or dependent sub-images may be separately signaled or indexed, and a signal identifying them may be delivered to the decoding apparatus 200. In this case, for some or all of the regions of the sub-images to be decoded according to the user's viewpoint, the decoding apparatus 200 may refer to adjacent independent sub-images or previously decoded sub-images, directly infer the coding mode for the regions adjacent between sub-images, and reconstruct those regions.

Meanwhile, as shown in FIG. 24, a coding method that takes encoding efficiency into account may be exemplified for such intra prediction and inter prediction encoding.

Referring to FIG. 24, the encoding apparatus 100 according to an embodiment of the present invention may compute a metric that applies a different weighting factor to each region according to the characteristics of the image, in order to measure distortion such as the PSNR of the image or an error value in the RDO process (encoding process).

As shown in FIG. 24, a user viewing a synchronized multi-view image on a VR display generally has a wide horizontal field of view relative to the front view but may have a narrow vertical field of view. Accordingly, the encoding apparatus 100 may convert the multi-view image into a two-dimensional image and set the top and bottom regions of the field of view as low weight factor regions (LOW WEIGHT FACTOR REGION), thereby increasing encoding efficiency.

FIG. 24 shows the user viewpoint regions of various synchronized multi-view images and, among these, the region to which the low weighting factor is applied when the multi-view image is of the equirectangular type.
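One possible form of a region-weighted distortion metric for an equirectangular frame is sketched below. The band fraction, the weight values, and the function names row_weights and weighted_psnr are all assumptions for illustration; the specification does not give numeric weights or a formula.

```python
import math
from typing import List, Sequence


def row_weights(height: int, band_fraction: float = 0.25,
                low_weight: float = 0.5) -> List[float]:
    """Per-row weights: rows in the top and bottom bands get the low weight.

    band_fraction and low_weight are illustrative values, not taken from the source.
    """
    band = int(height * band_fraction)
    return [low_weight if (r < band or r >= height - band) else 1.0
            for r in range(height)]


def weighted_psnr(original: Sequence[Sequence[float]],
                  reconstructed: Sequence[Sequence[float]],
                  weights: Sequence[float],
                  max_value: float = 255.0) -> float:
    """PSNR computed from a weighted mean squared error, one weight per row."""
    num = 0.0
    den = 0.0
    for row_o, row_r, w in zip(original, reconstructed, weights):
        for o, r in zip(row_o, row_r):
            num += w * (o - r) ** 2
            den += w
    wmse = num / den
    if wmse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / wmse)


weights = row_weights(8)
flat = [[0.0] for _ in range(8)]
err_top = [[10.0] if r == 0 else [0.0] for r in range(8)]     # error in a low weight row
err_middle = [[10.0] if r == 3 else [0.0] for r in range(8)]  # same error near the centre
psnr_top = weighted_psnr(flat, err_top, weights)
psnr_middle = weighted_psnr(flat, err_middle, weights)
```

Under these assumed weights, the same error contributes less to the metric when it falls in the top or bottom band than when it falls near the centre of the frame.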

In particular, to apply the weights, the encoding apparatus 100 may first convert the synchronized multi-view image into a two-dimensional image and encode it, and the weighting region may be determined differently depending on the type of the multi-view image. The weighting region information may be included in the spatial structure information described above or signaled separately through the HLS. The weighting factor may likewise be signaled in a separate HLS or transmitted in the form of a list.

The weighting region according to this embodiment of the present invention may be set on the assumption that, given the characteristics of natural images, refracted 3D images, and the like, the region on which a human can concentrate is limited.

Accordingly, the encoding apparatus 100 may encode a relatively low weight region by applying a relatively low QP as the base QP. When measuring the PSNR, the encoding apparatus 100 may also measure the PSNR for a relatively low weight region as low. Conversely, for a region with a relatively high weight, the encoding apparatus 100 may apply a relatively high QP as the base QP or measure the PSNR as relatively high.

In addition, according to an embodiment of the present invention, decoding for intra block copy prediction may, as described above, refer to images having different viewpoints even within the same POC; an example of the referenced image is an independent sub-image that can be decoded independently.

In particular, intra prediction may be applied efficiently to the boundary region, and the weighting factor may be used to maximize the encoding efficiency for the boundary region. For example, the encoding apparatus 100 may specify, according to distance, the weighting factor used to patch (PATCH) the region referenced from the independent sub-image region, and may apply a corresponding scaling value, thereby efficiently determining the prediction values of neighboring blocks.

In addition, the encoding apparatus 100 may apply a weight to the boundary region when constructing the MPM (Most Probable Mode) for intra coding. Accordingly, the encoding apparatus 100 may derive the MPM prediction direction around the boundary between sub-images in the image, even when they are not structurally adjacent.

Meanwhile, the encoding apparatus 100 may also take independent sub-images into account in inter coding (INTER CODING). For example, the encoding apparatus 100 may refer to a co-located sub-image corresponding to an independent sub-image. Accordingly, when constructing reference pictures for inter coding, only some sub-images of an image may be added to the reference picture list.

FIGS. 25 and 26 are diagrams for explaining encoding and decoding processing according to an embodiment of the present invention.

FIG. 25 is a block diagram showing the configuration of a video encoding apparatus according to an embodiment of the present invention, which may receive and process each sub-image or the entire frame of a synchronized multi-view image according to an embodiment of the present invention as an input video signal.

Referring to FIG. 25, the video encoding apparatus 100 according to the present invention includes a picture division unit 160, a transform unit, a quantization unit, a scanning unit, an entropy encoding unit, an intra prediction unit 169, an inter prediction unit 170, an inverse quantization unit, an inverse transform unit, a post-processing unit 171, a picture storage unit 172, a subtraction unit, and an addition unit 168.

The picture division unit 160 analyzes the input video signal, divides the picture into coding units of a predetermined size for each largest coding unit (LCU), determines the prediction mode, and determines the size of the prediction unit for each coding unit.

The picture division unit 160 then sends the prediction unit to be encoded to the intra prediction unit 169 or the inter prediction unit 170 according to the prediction mode (or prediction method). The picture division unit 160 also sends the prediction unit to be encoded to the subtraction unit.

A picture consists of a plurality of slices, and a slice may consist of a plurality of largest coding units (LCUs).

An LCU may be divided into a plurality of coding units (CUs), and the encoder may add information (a flag) indicating whether it is divided to the bitstream. The decoder may identify the position of an LCU using an address (LcuAddr).

A coding unit (CU) for which division is not allowed is regarded as a prediction unit (PU), and the decoder may identify the position of the PU using a PU index.

A prediction unit (PU) may be divided into a plurality of partitions. A prediction unit (PU) may also consist of a plurality of transform units (TUs).

In this case, the picture division unit 160 may send the image data to the subtraction unit in blocks of a predetermined size (for example, in PU units or TU units) according to the determined encoding mode.

A CTB (Coding Tree Block) is used as the video coding unit, and the CTB is defined in various square shapes. The CTB is referred to as a coding unit (CU).

A coding unit (CU) may take the form of a quad tree resulting from division. In the case of QTBT (quadtree plus binary tree) division, the coding unit may take the form of the quad tree, or of a binary tree produced by binary division at a terminal node, and, depending on the encoder standard, the maximum size may range from 256×256 to 64×64.

For example, when the maximum size is 64×64, the picture division unit 160 starts from the largest coding unit (LCU) at depth 0 and recursively searches for the optimal prediction unit until the depth reaches 3, that is, down to coding units (CUs) of size 8×8, and performs encoding. In addition, for example, for the coding unit of a terminal node divided by QTBT, the PU (Prediction Unit) and TU (Transform Unit) may have the same shape as the divided coding unit or a further divided shape.
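The recursive descent from a 64×64 LCU at depth 0 to 8×8 coding units at depth 3 can be sketched as follows. The should_split callback is a hypothetical stand-in for the encoder's actual decision, which in practice would compare rate-distortion costs; the function name split_quadtree is likewise an assumption.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, size, depth)


def split_quadtree(x: int, y: int, size: int, depth: int,
                   should_split: Callable[[int, int, int, int], bool],
                   max_depth: int = 3) -> List[Block]:
    """Recursively divide a square block into four quadrants.

    should_split is a placeholder for the encoder's split decision
    (an assumption; the real decision is not modelled here).
    """
    if depth < max_depth and should_split(x, y, size, depth):
        half = size // 2
        leaves: List[Block] = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(split_quadtree(x + dx, y + dy, half, depth + 1,
                                             should_split, max_depth))
        return leaves
    return [(x, y, size, depth)]  # leaf coding unit


# Splitting everywhere reaches depth 3, i.e. 8x8 coding units.
full = split_quadtree(0, 0, 64, 0, lambda x, y, size, depth: True)
# Splitting only the block at the origin refines the top-left corner.
corner = split_quadtree(0, 0, 64, 0, lambda x, y, size, depth: x == 0 and y == 0)
```

Splitting everywhere yields 64 leaves of size 8×8, while refining only the top-left corner yields ten leaves of mixed sizes, which shows how the tree adapts block size to local content.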

The unit in which prediction is performed is defined as a PU (Prediction Unit). Each coding unit (CU) is predicted in units obtained by dividing it into a plurality of blocks, which may be square or rectangular.

The transform unit transforms the residual block, which is the residual signal between the original block of the input prediction unit and the prediction block generated by the intra prediction unit 169 or the inter prediction unit 170. The residual block consists of a coding unit or a prediction unit. The residual block consisting of a coding unit or a prediction unit is divided into optimal transform units and transformed. Different transform matrices may be determined according to the prediction mode (intra or inter). In addition, since the residual signal of intra prediction has directionality according to the intra prediction mode, the transform matrix may be determined adaptively according to the intra prediction mode.

A transform unit may be transformed by two one-dimensional transform matrices (horizontal and vertical). For example, in the case of inter prediction, one predetermined transform matrix is determined.

In the case of intra prediction, on the other hand, when the intra prediction mode is horizontal, the residual block is more likely to have directionality in the vertical direction, so a DCT-based integer matrix is applied in the vertical direction and a DST-based or KLT-based integer matrix is applied in the horizontal direction. When the intra prediction mode is vertical, a DST-based or KLT-based integer matrix is applied in the vertical direction and a DCT-based integer matrix is applied in the horizontal direction.

In DC mode, a DCT-based integer matrix is applied in both directions. In the case of intra prediction, the transform matrix may also be determined adaptively depending on the size of the transform unit.
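The mode-dependent choice of one-dimensional transforms described above can be summarized as a lookup. The function select_transforms and the string labels are hypothetical; the text does not name the single predetermined matrix used for inter prediction, so it is represented here by a placeholder, and intra modes other than the three mentioned are left unhandled.

```python
from typing import Optional, Tuple


def select_transforms(prediction_type: str,
                      intra_mode: Optional[str] = None) -> Tuple[str, str]:
    """Return (vertical_transform, horizontal_transform) for a residual block.

    Labels are illustrative: "DST_OR_KLT" stands for a DST-based or KLT-based
    integer matrix, "PREDETERMINED" for the single fixed inter matrix.
    """
    if prediction_type == "inter":
        return ("PREDETERMINED", "PREDETERMINED")
    if prediction_type != "intra":
        raise ValueError(f"unknown prediction type: {prediction_type}")
    table = {
        "horizontal": ("DCT", "DST_OR_KLT"),  # DCT vertically, DST/KLT horizontally
        "vertical": ("DST_OR_KLT", "DCT"),    # DST/KLT vertically, DCT horizontally
        "dc": ("DCT", "DCT"),                 # DCT in both directions
    }
    if intra_mode not in table:
        # The text covers only these three modes; others are not modelled.
        raise ValueError(f"intra mode not covered by this sketch: {intra_mode}")
    return table[intra_mode]


vertical_t, horizontal_t = select_transforms("intra", "horizontal")
```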

The quantization unit determines the quantization step size for quantizing the coefficients of the residual block transformed by the transform matrix. The quantization step size is determined for each coding unit having at least a predetermined size (hereinafter, a quantization unit).

The predetermined size may be 8x8 or 16x16. The coefficients of the transform block are then quantized using a quantization matrix determined according to the determined quantization step size and the prediction mode.

The quantization unit uses the quantization step size of a quantization unit adjacent to the current quantization unit as the quantization step size predictor of the current quantization unit.

The quantization unit may search the left quantization unit, the upper quantization unit, and the upper-left quantization unit of the current quantization unit, in that order, and generate the quantization step size predictor of the current quantization unit using one or two valid quantization step sizes.

For example, the first valid quantization step size found in that order may be determined as the quantization step size predictor. Alternatively, the average of the first two valid quantization step sizes found in that order may be determined as the quantization step size predictor, and if only one is valid, that one may be determined as the quantization step size predictor.

Once the quantization step size predictor is determined, the difference between the quantization step size of the current coding unit and the quantization step size predictor is transmitted to the entropy encoding unit.

Meanwhile, it is possible that none of the left, upper, and upper-left coding units of the current coding unit exists. On the other hand, a coding unit that precedes it in coding order within the largest coding unit may exist.

Therefore, the quantization step sizes of the quantization units adjacent to the current coding unit and of the quantization unit immediately preceding it in coding order within the largest coding unit may be candidates.

In this case, priority may be given in the following order: 1) the left quantization unit of the current coding unit, 2) the upper quantization unit of the current coding unit, 3) the upper-left quantization unit of the current coding unit, and 4) the quantization unit immediately preceding in coding order. The order may be changed, and the upper-left quantization unit may be omitted.
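The candidate search for the quantization step size predictor can be sketched as follows. The function names and the use of None to mark an unavailable neighbor are assumptions for illustration, and the averaging uses plain floating-point division because the text does not specify a rounding rule.

```python
from typing import Optional


def predict_step_size(left: Optional[float],
                      above: Optional[float],
                      above_left: Optional[float],
                      previous: Optional[float] = None,
                      use_average: bool = False) -> Optional[float]:
    """Derive a quantization step size predictor from prioritized candidates.

    Candidates are examined in the order left, above, above-left, previous
    in coding order; None marks a candidate that is not available.
    """
    valid = [c for c in (left, above, above_left, previous) if c is not None]
    if not valid:
        return None  # no candidate available
    if use_average and len(valid) >= 2:
        return (valid[0] + valid[1]) / 2  # average of the first two valid sizes
    return valid[0]  # first valid size, also used when only one is valid


def step_size_delta(current: float, predictor: float) -> float:
    """Difference that would be passed on to the entropy encoding unit."""
    return current - predictor


first_valid = predict_step_size(None, 20, 24)
averaged = predict_step_size(None, 20, 24, use_average=True)
delta = step_size_delta(26, averaged)
```

Transmitting only the difference from the predictor, rather than the step size itself, keeps the signaled value small when neighboring quantization units use similar step sizes.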

The quantized transform block is provided to the inverse quantization unit and the scanning unit.

The scanning unit scans the coefficients of the quantized transform block and converts them into one-dimensional quantized coefficients. Since the coefficient distribution of the transform block after quantization may depend on the intra prediction mode, the scanning method is determined according to the intra prediction mode.

The coefficient scanning method may also be determined differently according to the size of the transform unit. The scan pattern may vary according to the directional intra prediction mode. The quantized coefficients are scanned in the reverse direction.

When the quantized coefficients are divided into a plurality of subsets, the same scan pattern is applied to the quantized coefficients within each subset. A zigzag scan or a diagonal scan is applied as the scan pattern between subsets. The scan preferably proceeds in the forward direction from the main subset containing the DC coefficient to the remaining subsets, but the reverse direction is also possible.

또한, 서브셋 내의 양자화된 계수들의 스캔패턴과 동일하게 서브셋 간의 스캔패턴을 설정할 수도 있다. 이 경우, 서브셋 간의 스캔패턴이 인트라 예측 모드에 따라 결정된다. 한편, 부호기는 상기 변환 유닛내의 0이 아닌 마지막 양자화 계수의 위치를 나타낼 수 있는 정보를 복호기로 전송한다.Also, the scan pattern between the subsets may be set to be the same as the scan pattern of the quantized coefficients in the subset. In this case, the scan pattern between the subsets is determined according to the intra prediction mode. Meanwhile, the encoder transmits information indicating the position of the last non-zero quantization coefficient in the transform unit to the decoder.

각 서브셋 내의 0이 아닌 마지막 양자화 계수의 위치를 나타낼 수 있는 정보도 복호기로 전송할 수 있다.Information that may indicate the position of the last non-zero quantization coefficient in each subset may also be transmitted to the decoder.
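
The scanning step described above can be sketched as follows. This is a minimal illustration, assuming a single square block and a diagonal pattern; the function names `diagonal_scan_order` and `scan_block` are hypothetical and do not come from the patent, and the subset partitioning is not modeled.

```python
def diagonal_scan_order(n):
    """Return the (row, col) positions of an n x n block in diagonal order, starting at DC (0, 0)."""
    order = []
    for s in range(2 * n - 1):      # s = row + col identifies one anti-diagonal
        for row in range(n):
            col = s - row
            if 0 <= col < n:
                order.append((row, col))
    return order


def scan_block(block):
    """Flatten a quantized block to 1-D, reading from the last non-zero coefficient back to DC.

    Returns (last, coeffs): `last` is the forward-scan position of the last
    non-zero coefficient (-1 for an all-zero block), which is the kind of
    position information the encoder would signal to the decoder.
    """
    order = diagonal_scan_order(len(block))
    forward = [block[r][c] for r, c in order]
    last = max((i for i, v in enumerate(forward) if v != 0), default=-1)
    if last < 0:
        return last, []
    return last, forward[last::-1]  # reverse direction, ending at DC
```

Because everything after the last non-zero position is zero, signalling that position lets the decoder skip the trailing zeros entirely.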

역양자화(135)는 상기 양자화된 양자화 계수를 역양자화한다. 역변환부는 역양자화된 변환 계수를 공간 영역의 잔차 블록으로 복원한다. 가산기는 상기 역변환부에 의해 복원된 잔차블록과 인트라 예측부(169) 또는 인터 예측부(170)로부터의 수신된 예측 블록을 합쳐서 복원 블록을 생성한다.The inverse quantization unit 135 inversely quantizes the quantized coefficients. The inverse transform unit restores the inversely quantized transform coefficients to a residual block in the spatial domain. The adder generates a reconstructed block by adding the residual block restored by the inverse transform unit to the prediction block received from the intra prediction unit 169 or the inter prediction unit 170.

후처리부(171)는 복원된 픽쳐에 발생하는 블록킹 효과의 제거하기 위한 디블록킹 필터링 과정, 화소 단위로 원본 영상과의 차이값을 보완하기 위한 적응적 오프셋 적용 과정 및 코딩 유닛으로 원본 영상과의 차이값을 보완하기 위한 적응적 루프 필터링 과정을 수행한다.The post-processing unit 171 performs a deblocking filtering process for removing blocking effects occurring in the reconstructed picture, an adaptive offset application process for compensating for the difference from the original image in units of pixels, and an adaptive loop filtering process for compensating for the difference from the original image in units of coding units.

디블록킹 필터링 과정은 미리 정해진 크기 이상의 크기를 갖는 예측 유닛 및 변환 단위의 경계에 적용하는 것이 바람직하다. 상기 크기는 8x8일 수 있다. 상기 디블록킹 필터링 과정은 필터링할 경계(boundary)를 결정하는 단계, 상기 경계에 적용할 경계 필터링 강도(boundary filtering strength)를 결정하는 단계, 디블록킹 필터의 적용 여부를 결정하는 단계, 상기 디블록킹 필터를 적용할 것으로 결정된 경우, 상기 경계에 적용할 필터를 선택하는 단계를 포함한다.The deblocking filtering process is preferably applied to the boundaries of prediction units and transform units having a size greater than or equal to a predetermined size. The size may be 8x8. The deblocking filtering process includes determining a boundary to be filtered, determining a boundary filtering strength to be applied to the boundary, determining whether to apply the deblocking filter, and, when it is determined that the deblocking filter is to be applied, selecting a filter to be applied to the boundary.

상기 디블록킹 필터의 적용 여부는 i) 상기 경계 필터링 강도가 0보다 큰지 여부 및 ii) 상기 필터링할 경계에 인접한 2개의 블록(P 블록, Q블록) 경계 부분에서의 화소값들이 변화 정도를 나타내는 값이 양자화 파라미터에 의해 결정되는 제1 기준값보다 작은지 여부에 의해 결정된다.Whether the deblocking filter is applied is determined by i) whether the boundary filtering strength is greater than 0, and ii) whether a value indicating the degree of change of the pixel values at the boundary between the two blocks (P block, Q block) adjacent to the boundary to be filtered is smaller than a first reference value determined by the quantization parameter.

상기 필터는 적어도 2개 이상인 것이 바람직하다. 블록 경계에 위치한 2개의 화소들간의 차이값의 절대값이 제2 기준값보다 크거나 같은 경우에는 상대적으로 약한 필터링을 수행하는 필터를 선택한다.Preferably, at least two filters are available. When the absolute value of the difference between the two pixels located at the block boundary is greater than or equal to a second reference value, a filter that performs relatively weak filtering is selected.

상기 제2 기준값은 상기 양자화 파라미터 및 상기 경계 필터링 강도에 의해 결정된다.The second reference value is determined by the quantization parameter and the boundary filtering strength.
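
The decision logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: `beta` and `tc` stand in for the first and second reference values (their derivation from the quantization parameter and boundary filtering strength is not modeled), `activity` is the value indicating the degree of pixel change at the P/Q boundary, and the label "strong" for the other filter is an assumption, since the text only says that at least two filters exist.

```python
def deblock_decision(bs, activity, p0, q0, beta, tc):
    """Return 'none', 'weak' or 'strong' for one boundary segment.

    bs       -- boundary filtering strength
    activity -- measure of pixel variation across the P/Q boundary
    p0, q0   -- the two pixels located directly at the block boundary
    beta     -- first reference value (placeholder, derived from the QP)
    tc       -- second reference value (placeholder, derived from QP and bs)
    """
    # Conditions i) and ii): filter only if bs > 0 and the variation is below beta.
    if bs <= 0 or activity >= beta:
        return "none"
    # A large step across the boundary selects the relatively weak filter.
    if abs(p0 - q0) >= tc:
        return "weak"
    return "strong"  # assumed label for the remaining filter
```

A large step across the boundary is more likely to be a real image edge than a coding artifact, which is a plausible reason for falling back to the weaker filter in that case.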

적응적 오프셋 적용 과정은 디블록킹 필터가 적용된 영상내의 화소와 원본 화소간의 차이값(distortion)을 감소시키기 위한 것이다. 픽쳐 또는 슬라이스 단위로 상기 적응적 오프셋 적용 과정을 수행할지 여부를 결정할 수 있다.The process of applying the adaptive offset is to reduce the distortion between the pixel in the image to which the deblocking filter is applied and the original pixel. It may be determined whether to perform the adaptive offset application process in units of pictures or slices.

픽쳐 또는 슬라이스는 복수개의 오프셋 영역들로 분할될 수 있고, 각 오프셋 영역별로 오프셋 타입이 결정될 수 있다. 오프셋 타입은 미리 정해진 개수(예를 들어, 4개)의 에지 오프셋 타입과 2개의 밴드 오프셋 타입을 포함할 수 있다.A picture or slice may be divided into a plurality of offset regions, and an offset type may be determined for each offset region. The offset type may include a predetermined number (eg, four) of edge offset types and two band offset types.

오프셋 타입이 에지 오프셋 타입일 경우에는 각 화소가 속하는 에지 타입을 결정하여, 이에 대응하는 오프셋을 적용한다. 상기 에지 타입은 현재 화소와 인접하는 2개의 화소값의 분포를 기준으로 결정한다.When the offset type is an edge offset type, an edge type to which each pixel belongs is determined, and an offset corresponding thereto is applied. The edge type is determined based on the distribution of values of two pixels adjacent to the current pixel.
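
A minimal sketch of the edge-type decision follows. The text only states that the edge type is determined from the current pixel and its two neighbours; the five categories used here follow the convention of HEVC's sample adaptive offset and are an assumption, as are the names `edge_category` and `apply_edge_offset`.

```python
def edge_category(left, cur, right):
    """Classify a pixel against its two neighbours along the chosen edge direction."""
    if cur < left and cur < right:
        return 1  # local minimum
    if (cur < left and cur == right) or (cur == left and cur < right):
        return 2  # concave corner
    if (cur > left and cur == right) or (cur == left and cur > right):
        return 3  # convex corner
    if cur > left and cur > right:
        return 4  # local maximum
    return 0      # flat or monotonic: no offset applied


def apply_edge_offset(left, cur, right, offsets):
    """Add the offset of the pixel's category; offsets[0] is expected to be 0."""
    return cur + offsets[edge_category(left, cur, right)]
```

With four edge offset types, the same classification would be run along four different neighbour directions (for example horizontal, vertical and the two diagonals), which is again an assumption about how the types differ.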

적응적 루프 필터링 과정은 디블록킹 필터링 과정 또는 적응적 오프셋 적용 과정을 거친 복원된 영상과 원본 영상을 비교한 값을 기초로 필터링을 수행할 수 있다. 적응적 루프 필터링은 상기 결정된 ALF는 4x4 크기 또는 8x8 크기의 블록에 포함된 화소 전체에 적용될 수 있다.The adaptive loop filtering process may perform filtering based on a value obtained by comparing the original image with the reconstructed image that has undergone the deblocking filtering process or the adaptive offset application process. The determined adaptive loop filter (ALF) may be applied to all pixels included in a 4x4 or 8x8 block.

적응적 루프 필터의 적용 여부는 코딩 유닛별로 결정될 수 있다. 각 코딩 유닛에 따라 적용될 루프 필터의 크기 및 계수는 달라질 수 있다. 코딩 유닛별 상기 적응적 루프 필터의 적용 여부를 나타내는 정보는 각 슬라이스 헤더에 포함될 수 있다.Whether to apply the adaptive loop filter may be determined for each coding unit. The size and coefficient of the loop filter to be applied according to each coding unit may vary. Information indicating whether the adaptive loop filter is applied for each coding unit may be included in each slice header.

색차 신호의 경우에는, 픽쳐 단위로 적응적 루프 필터의 적용 여부를 결정할 수 있다. 루프 필터의 형태도 휘도와 달리 직사각형 형태를 가질 수 있다.In the case of a chrominance signal, whether to apply the adaptive loop filter may be determined in units of pictures. Unlike the filter for luminance, the loop filter for chrominance may have a rectangular shape.

적응적 루프 필터링은 슬라이스별로 적용 여부를 결정할 수 있다. 따라서, 현재 슬라이스에 적응적 루프 필터링이 적용되는지 여부를 나타내는 정보는 슬라이스 헤더 또는 픽쳐 헤더에 포함된다.Whether to apply the adaptive loop filtering for each slice may be determined. Accordingly, information indicating whether adaptive loop filtering is applied to the current slice is included in the slice header or the picture header.

현재 슬라이스에 적응적 루프 필터링이 적용됨을 나타내면, 슬라이스 헤더 또는 픽쳐 헤더는 추가적으로 적응적 루프 필터링 과정에 사용되는 휘도 성분의 수평 및/또는 수직 방향의 필터 길이를 나타내는 정보를 포함한다.If it indicates that adaptive loop filtering is applied to the current slice, the slice header or the picture header additionally includes information indicating the filter length in the horizontal and/or vertical direction of the luminance component used in the adaptive loop filtering process.

슬라이스 헤더 또는 픽쳐 헤더는 필터 세트의 수를 나타내는 정보를 포함할 수 있다. 이때 필터 세트의 수가 2 이상이면, 필터 계수들이 예측 방법을 사용하여 부호화될 수 있다. 따라서, 슬라이스 헤더 또는 픽쳐 헤더는 필터 계수들이 예측 방법으로 부호화되는지 여부를 나타내는 정보를 포함할 수 있으며, 예측 방법이 사용되는 경우에는 예측된 필터 계수를 포함한다.The slice header or the picture header may include information indicating the number of filter sets. In this case, if the number of filter sets is two or more, filter coefficients may be encoded using a prediction method. Accordingly, the slice header or the picture header may include information indicating whether filter coefficients are encoded by the prediction method, and include the predicted filter coefficients when the prediction method is used.

한편, 휘도 뿐만 아니라, 색차 성분들도 적응적으로 필터링될 수 있다. 따라서, 색차 성분 각각이 필터링되는지 여부를 나타내는 정보를 슬라이스 헤더 또는 픽쳐 헤더가 포함할 수 있다. 이 경우, 비트수를 줄이기 위해 Cr과 Cb에 대한 필터링 여부를 나타내는 정보를 조인트 코딩(즉, 다중화 코딩)할 수 있다.Meanwhile, not only luminance but also chrominance components may be adaptively filtered. Accordingly, the slice header or the picture header may include information indicating whether each chrominance component is filtered. In this case, information indicating whether to filter Cr and Cb may be jointly coded (ie, multiplexed coding) in order to reduce the number of bits.

이때, 색차 성분들의 경우에는 복잡도 감소를 위해 Cr과 Cb를 모두 필터링하지 않는 경우가 가장 빈번할 가능성이 높으므로, Cr과 Cb를 모두 필터링하지 않는 경우에 가장 작은 인덱스를 할당하여 엔트로피 부호화를 수행한다.In this case, for the chrominance components, the case in which neither Cr nor Cb is filtered is likely to be the most frequent, in order to reduce complexity. Therefore, the smallest index is assigned to the case in which neither Cr nor Cb is filtered, and entropy encoding is performed.

그리고, Cr 및 Cb를 모두 필터링하는 경우에 가장 큰 인덱스를 할당하여 엔트로피 부호화를 수행한다.Then, when both Cr and Cb are filtered, entropy encoding is performed by allocating the largest index.
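
The joint coding of the two chrominance flags can be sketched as a small lookup table. Only the smallest index (neither component filtered) and the largest index (both filtered) are stated in the text; the ordering of the two mixed cases and the name `chroma_filter_index` are assumptions made for illustration.

```python
def chroma_filter_index(filter_cr, filter_cb):
    """Map the (Cr, Cb) filtering flags to a single jointly coded index."""
    table = {
        (False, False): 0,  # neither filtered: smallest index (stated in the text)
        (True, False): 1,   # assumed position of this mixed case
        (False, True): 2,   # assumed position of this mixed case
        (True, True): 3,    # both filtered: largest index (stated in the text)
    }
    return table[(bool(filter_cr), bool(filter_cb))]
```

An entropy coder typically spends fewer bits on smaller indices, so giving the most frequent combination index 0 is what reduces the bit count.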

픽쳐 저장부(172)는 후처리된 영상 데이터를 후처리부(171)로부터 입력받아 픽쳐(picture) 단위로 영상을 복원하여 저장한다. 픽쳐는 프레임 단위의 영상이거나 필드 단위의 영상일 수 있다. 픽쳐 저장부(172)는 다수의 픽쳐를 저장할 수 있는 버퍼(도시되지 않음)를 구비한다.The picture storage unit 172 receives the post-processed image data from the post-processing unit 171, and restores and stores the image in units of pictures. A picture may be a frame-based image or a field-based image. The picture storage unit 172 includes a buffer (not shown) capable of storing a plurality of pictures.

인터 예측부(170)는 상기 픽쳐 저장부(172)에 저장된 적어도 하나 이상의 참조 픽쳐를 이용하여 움직임 추정을 수행하고, 참조 픽쳐를 나타내는 참조 픽쳐 인덱스 및 움직임 벡터를 결정한다.The inter prediction unit 170 performs motion estimation using at least one reference picture stored in the picture storage unit 172, and determines a reference picture index indicating the reference picture and a motion vector.

그리고, 결정된 참조 픽쳐 인덱스 및 움직임 벡터에 따라, 픽쳐 저장부(172)에 저장된 다수의 참조 픽쳐들 중 움직임 추정에 이용된 참조 픽쳐로부터, 부호화하고자 하는 예측 유닛에 대응하는 예측 블록을 추출하여 출력한다.Then, according to the determined reference picture index and motion vector, a prediction block corresponding to the prediction unit to be encoded is extracted from the reference picture used for motion estimation, among the plurality of reference pictures stored in the picture storage unit 172, and is output.

인트라 예측부(169)는 현재 예측 유닛이 포함되는 픽처 내부의 재구성된 화소값을 이용하여 인트라 예측 부호화를 수행한다.The intra prediction unit 169 performs intra prediction encoding by using the reconstructed pixel values inside the picture including the current prediction unit.

인트라 예측부(169)는 예측 부호화할 현재 예측 유닛을 입력받아 현재 블록의 크기에 따라 미리 설정된 개수의 인트라 예측 모드 중에 하나를 선택하여 인트라 예측을 수행한다.The intra prediction unit 169 receives a current prediction unit to be prediction-encoded, selects one of a preset number of intra prediction modes according to the size of the current block, and performs intra prediction.

인트라 예측부(169)는 인트라 예측 블록을 생성하기 위해 참조 화소를 적응적으로 필터링한다. 참조 화소가 이용 가능하지 않은 경우에는 이용 가능한 참조 화소들을 이용하여 참조 화소들을 생성할 수 있다.The intra prediction unit 169 adaptively filters reference pixels to generate an intra prediction block. When the reference pixel is not available, the reference pixels may be generated using the available reference pixels.

엔트로피 부호화부는 양자화부에 의해 양자화된 양자화 계수, 인트라 예측부(169)로부터 수신된 인트라 예측 정보, 인터 예측부(170)로부터 수신된 움직임 정보 등을 엔트로피 부호화한다.The entropy encoding unit entropy-encodes the quantized coefficients produced by the quantization unit, the intra prediction information received from the intra prediction unit 169, the motion information received from the inter prediction unit 170, and the like.

도시되지는 않았으나, 인터 예측 부호화 장치는 움직임 정보 결정부, 움직임 정보 부호화 모드 결정부, 움직임 정보 부호화부, 예측 블록 생성부, 잔차 블록 생성부, 잔차 블록 부호화부 및 멀티플렉서를 포함하여 구성될 수 있다.Although not shown, the inter prediction encoding apparatus may include a motion information determiner, a motion information encoding mode determiner, a motion information encoder, a prediction block generator, a residual block generator, a residual block encoder, and a multiplexer.

움직임 정보 결정부는 현재 블록의 움직임 정보를 결정한다. 움직임 정보는 참조 픽쳐 인덱스와 움직임 벡터를 포함한다. 참조 픽쳐 인덱스는 이전에 부호화되어 복원된 픽쳐 중 어느 하나를 나타낸다.The motion information determining unit determines motion information of the current block. The motion information includes a reference picture index and a motion vector. The reference picture index indicates any one of previously encoded and reconstructed pictures.

현재 블록이 단방향 인터 예측 부호화되는 경우에는 리스트 0(L0)에 속하는 참조 픽쳐들 중의 어느 하나를 나타낸다. 반면에, 현재 블록이 양방향 예측 부호화되는 경우에는 리스트 0(L0)의 참조 픽쳐들 중 하나를 나타내는 참조픽쳐 인덱스와 리스트 1(L1)의 참조 픽쳐들 중의 하나를 나타내는 참조픽쳐 인덱스를 포함할 수 있다.When the current block is unidirectionally inter-prediction coded, the reference picture index indicates one of the reference pictures belonging to list 0 (L0). On the other hand, when the current block is bi-predictively coded, the motion information may include a reference picture index indicating one of the reference pictures of list 0 (L0) and a reference picture index indicating one of the reference pictures of list 1 (L1).

또한, 현재 블록이 양방향 예측 부호화되는 경우에는 리스트 0과 리스트 1을 결합하여 생성된 복합 리스트(LC)의 참조 픽쳐들 중의 1개 또는 2개의 픽쳐를 나타내는 인덱스를 포함할 수 있다.In addition, when the current block is bi-predictively encoded, an index indicating one or two of the reference pictures of the composite list LC generated by combining the list 0 and the list 1 may be included.

움직임 벡터는 각각의 참조픽쳐 인덱스가 나타내는 픽쳐 내의 예측 블록의 위치를 나타낸다. 움직임 벡터는 화소단위(정수단위)일수도 있으나, 서브화소단위일 수도 있다.The motion vector indicates the position of the prediction block in the picture indicated by each reference picture index. The motion vector may be a pixel unit (integer unit) or a sub-pixel unit.

예를 들어, 1/2, 1/4, 1/8 또는 1/16 화소의 해상도를 가질 수 있다. 움직임 벡터가 정수단위가 아닐 경우에는 예측 블록은 정수 단위의 화소들로부터 생성된다.For example, it may have a resolution of 1/2, 1/4, 1/8, or 1/16 pixels. When the motion vector is not an integer unit, a prediction block is generated from pixels of an integer unit.
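
To illustrate sub-pixel motion vectors, the sketch below splits one motion-vector component stored in fractional units into its integer-pixel and fractional parts. The fixed-point representation and the name `split_motion_vector` are assumptions for illustration; `frac_bits=2` corresponds to 1/4-pixel resolution.

```python
def split_motion_vector(mv, frac_bits=2):
    """Split one motion-vector component into (integer part, fractional part).

    mv is expressed in units of 1 / 2**frac_bits pixel, so frac_bits=1, 2, 3, 4
    correspond to 1/2, 1/4, 1/8 and 1/16 pixel resolution.
    """
    integer = mv >> frac_bits               # floor, also correct for negative values
    fraction = mv & ((1 << frac_bits) - 1)  # 0 means the vector points at an integer pixel
    return integer, fraction
```

For example, `split_motion_vector(9)` yields `(2, 1)`, that is, two whole pixels plus one quarter pixel; a non-zero fraction is what triggers interpolation from integer-position pixels.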

움직임 정보 부호화 모드 결정부는 현재 블록의 움직임 정보를 스킵 모드로 부호화할지, 머지 모드로 부호화할지, AMVP 모드로 부호화할지를 결정한다.The motion information encoding mode determiner determines whether to encode the motion information of the current block in the skip mode, the merge mode, or the AMVP mode.

스킵 모드는 현재 블록의 움직임 정보와 동일한 움직임 정보를 갖는 스킵 후보자가 존재하고, 잔차신호가 0인 경우에 적용된다. 또한, 스킵 모드는 현재 블록이 코딩 유닛과 사이즈가 같을 때 적용된다. 현재 블록은 예측 유닛으로 볼 수 있다.The skip mode is applied when a skip candidate having the same motion information as the motion information of the current block exists and the residual signal is 0. In addition, the skip mode is applied when the current block has the same size as the coding unit. The current block can be viewed as a prediction unit.

머지 모드는 현재 블록의 움직임 정보와 동일한 움직임 정보를 갖는 머지 후보자가 존재할 때 적용된다. 머지 모드는 현재 블록이 코딩 유닛과 사이즈가 다르거나, 사이즈가 같을 경우에는 잔차 신호가 존재하는 경우에 적용된다. 머지 후보자와 스킵 후보자는 동일할 수 있다.The merge mode is applied when there is a merge candidate having the same motion information as the motion information of the current block. The merge mode is applied when a residual signal exists when the size of the current block is different from that of the coding unit or when the size is the same. The merge candidate and the skip candidate may be the same.

AMVP 모드는 스킵 모드 및 머지 모드가 적용되지 않을 때 적용된다. 현재 블록의 움직임 벡터와 가장 유사한 움직임 벡터를 갖는 AMVP 후보자를 AMVP 예측자로 선택한다.AMVP mode is applied when skip mode and merge mode are not applied. An AMVP candidate having a motion vector most similar to the motion vector of the current block is selected as the AMVP predictor.
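
The three-way mode decision described above can be sketched as follows. The boolean inputs and the function name `choose_motion_mode` are hypothetical; how a matching candidate is found is not modeled.

```python
def choose_motion_mode(has_matching_candidate, residual_is_zero, same_size_as_cu):
    """Pick 'skip', 'merge' or 'amvp' for the current block.

    has_matching_candidate -- a candidate with identical motion information exists
    residual_is_zero       -- the residual signal of the block is zero
    same_size_as_cu        -- the block has the same size as its coding unit
    """
    if has_matching_candidate and residual_is_zero and same_size_as_cu:
        return "skip"
    if has_matching_candidate:
        # Either the sizes differ, or the sizes match but a residual exists.
        return "merge"
    return "amvp"
```

The ordering of the checks mirrors the text: skip is the most restrictive case, merge covers the remaining cases with a matching candidate, and AMVP is the fallback.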

움직임 정보 부호화부는 움직임 정보 부호화 모드 결정부에 의해 결정된 방식에 따라 움직임 정보를 부호화한다. 움직임 정보 부호화 모드가 스킵 모드 또는 머지 모드일 경우에는 머지 움직임 벡터 부호화 과정을 수행한다. 움직임 정보 부호화 모드가 AMVP일 경우에는 AMVP 부호화 과정을 수행한다.The motion information encoder encodes the motion information according to a method determined by the motion information encoding mode determiner. When the motion information encoding mode is the skip mode or the merge mode, a merge motion vector encoding process is performed. When the motion information encoding mode is AMVP, an AMVP encoding process is performed.

예측 블록 생성부는 현재 블록의 움직임 정보를 이용하여 예측 블록을 생성한다. 움직임 벡터가 정수 단위일 경우에는, 참조픽쳐 인덱스가 나타내는 픽쳐 내의 움직임 벡터가 나타내는 위치에 대응하는 블록을 복사하여 현재 블록의 예측 블록을 생성한다.The prediction block generator generates a prediction block by using motion information of the current block. When the motion vector is an integer unit, a prediction block of the current block is generated by copying the block corresponding to the position indicated by the motion vector in the picture indicated by the reference picture index.

그러나, 움직임 벡터가 정수 단위가 아닐 경우에는, 참조픽쳐 인덱스가 나타내는 픽쳐내의 정수 단위 화소들로 부터 예측 블록의 화소들을 생성한다.However, when the motion vector is not an integer unit, pixels of the prediction block are generated from integer unit pixels in the picture indicated by the reference picture index.

이 경우, 휘도 화소의 경우에는 8탭의 보간 필터를 사용하여 예측 화소를 생성할 수 있다. 색차 화소의 경우에는 4탭 보간 필터를 사용하여 예측 화소를 생성할 수 있다.In this case, in the case of a luminance pixel, a prediction pixel may be generated using an 8-tap interpolation filter. In the case of a chrominance pixel, a prediction pixel may be generated using a 4-tap interpolation filter.
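
A minimal one-dimensional sketch of 8-tap luminance interpolation follows. The patent does not list filter coefficients; the taps below are the half-sample luma coefficients of HEVC, used here as an assumption, and the edge replication and 8-bit clipping are likewise illustrative choices.

```python
# HEVC half-sample luma taps (assumed here); they sum to 64.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_half_pel(row, x):
    """Predict the sample halfway between row[x] and row[x + 1] from integer pixels."""
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        pos = min(max(x - 3 + k, 0), len(row) - 1)  # replicate edge pixels
        acc += tap * row[pos]
    # Round, normalise by 64 and clip to the 8-bit range.
    return min(max((acc + 32) >> 6, 0), 255)
```

On a linear ramp such as `list(range(0, 100, 10))`, the value interpolated between 40 and 50 comes out as 45, as expected for a midpoint.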

잔차 블록 생성부는 현재 블록과 현재 블록의 예측 블록을 이용하여 잔차 블록을 생성한다. 현재 블록의 크기가 2Nx2N인 경우에는 현재 블록과 현재 블록에 대응하는 2Nx2N 크기의 예측 블록을 이용하여 잔차 블록을 생성한다.The residual block generator generates a residual block by using the current block and the prediction block of the current block. When the size of the current block is 2Nx2N, a residual block is generated using the current block and a prediction block having a size of 2Nx2N corresponding to the current block.

그러나, 예측에 이용되는 현재 블록의 크기가 2NxN 또는 Nx2N인 경우에는 2Nx2N을 구성하는 2개의 2NxN 블록 각각에 대한 예측 블록을 구한 후, 상기 2개의 2NxN 예측 블록을 이용하여 2Nx2N 크기의 최종 예측 블록을 생성할 수 있다.However, when the size of the current block used for prediction is 2NxN or Nx2N, a prediction block may be obtained for each of the two 2NxN blocks constituting the 2Nx2N block, and a final prediction block of size 2Nx2N may then be generated from the two 2NxN prediction blocks.

그리고, 상기 2Nx2N 크기의 예측 블록을 이용하여 2Nx2N 의 잔차 블록을 생성할 수도 있다. 2NxN 크기의 2개의 예측블록들의 경계부분의 불연속성을 해소하기 위해 경계 부분의 픽셀들을 오버랩 스무딩할 수 있다.In addition, a 2Nx2N residual block may be generated using the 2Nx2N prediction block. In order to solve the discontinuity of the boundary between two prediction blocks having a size of 2NxN, the pixels of the boundary portion may be overlapped and smoothed.

잔차 블록 부호화부는 생성된 잔차 블록을 하나 이상의 변환 유닛으로 나눈다. 그리고, 각 변환 유닛을 변환 부호화, 양자화 및 엔트로피 부호화된다. 이때, 변환 유닛의 크기는 잔차 블록의 크기에 따라 쿼드트리 방식으로 결정될 수 있다.The residual block encoder divides the generated residual block into one or more transform units. Each transform unit is then transform-coded, quantized, and entropy-coded. In this case, the size of the transform unit may be determined in a quadtree manner according to the size of the residual block.

잔차 블록 부호화부는 인터 예측 방법에 의해 생성된 잔차 블록을 정수기반 변환 매트릭스를 이용하여 변환한다. 상기 변환 매트릭스는 정수기반 DCT 매트릭스이다.The residual block encoder transforms the residual block generated by the inter prediction method using an integer-based transform matrix. The transformation matrix is an integer-based DCT matrix.

잔차 블록 부호화부는 상기 변환 매트릭스에 의해 변환된 잔차 블록의 계수들을 양자화하기 위해 양자화 매트릭스를 이용한다. 상기 양자화 매트릭스는 양자화 파라미터에 의해 결정된다.The residual block encoder uses a quantization matrix to quantize coefficients of the residual block transformed by the transform matrix. The quantization matrix is determined by a quantization parameter.

상기 양자화 파라미터는 미리 정해진 크기 이상의 코딩 유닛별로 결정된다. 상기 미리 정해진 크기는 8x8 또는 16x16일 수 있다. 따라서, 현재 코딩 유닛이 상기 미리 정해진 크기보다 작은 경우에는 상기 미리 정해진 크기 내의 복수개의 코딩 유닛 중 부호화 순서상 첫번째 코딩 유닛의 양자화 파라미터만을 부호화하고, 나머지 코딩 유닛의 양자화 파라미터는 상기 파라미터와 동일하므로 부호화할 필요가 없다.The quantization parameter is determined for each coding unit having a predetermined size or larger. The predetermined size may be 8x8 or 16x16. Accordingly, when the current coding unit is smaller than the predetermined size, only the quantization parameter of the first coding unit in coding order among the plurality of coding units within the predetermined size is encoded; the quantization parameters of the remaining coding units are the same as that parameter and therefore need not be encoded.
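
The rule above can be sketched as follows, assuming each coding unit is described by its top-left position, size and quantization parameter, and that small coding units are grouped by the aligned `min_qp_size` area that contains them. The tuple layout and the name `qps_to_signal` are hypothetical.

```python
def qps_to_signal(cus, min_qp_size=16):
    """Return the indices of the coding units whose quantization parameter is encoded.

    cus is a list of (x, y, size, qp) tuples in coding order.
    """
    seen_groups = set()
    signalled = []
    for i, (x, y, size, qp) in enumerate(cus):
        if size >= min_qp_size:
            signalled.append(i)  # large CUs always carry their own QP
            continue
        group = (x // min_qp_size, y // min_qp_size)
        if group not in seen_groups:
            seen_groups.add(group)  # first small CU of the area, in coding order
            signalled.append(i)
    return signalled
```

For four 8x8 coding units filling one 16x16 area followed by one 16x16 coding unit, only the first and the last entries would carry a quantization parameter.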

그리고, 결정된 양자화 파라미터 및 예측 모드에 따라 결정되는 양자화 매트릭스를 이용하여 상기 변환 블록의 계수들을 양자화한다.Then, the coefficients of the transform block are quantized using a quantization matrix determined according to the determined quantization parameter and the prediction mode.

상기 미리 정해진 크기 이상의 코딩 유닛별로 결정되는 양자화 파라미터는 현재 코딩 유닛에 인접한 코딩 유닛의 양자화 파라미터를 이용하여 예측 부호화된다. 현재 코딩 유닛의 좌측 코딩 유닛, 상측 코딩 유닛 순서로 검색하여 유효한 1개 또는 2개의 유효한 양자화 파라미터를 이용하여 현재 코딩 유닛의 양자화 파라미터 예측자를 생성할 수 있다.A quantization parameter determined for each coding unit having a size greater than or equal to the predetermined size is predictively encoded using a quantization parameter of a coding unit adjacent to the current coding unit. A quantization parameter predictor of the current coding unit may be generated using one or two valid quantization parameters by searching in the order of the left coding unit and the upper coding unit of the current coding unit.

예를 들어, 상기 순서로 검색된 유효한 첫번째 양자화 파라미터를 양자화 파라미터 예측자로 결정할 수 있다. 또한, 좌측 코딩 유닛, 부호화 순서상 바로 이전의 코딩 유닛 순으로 검색하여 유효한 첫번째 양자화 파라미터를 양자화 파라미터 예측자로 결정할 수 있다.For example, a valid first quantization parameter retrieved in the above order may be determined as a quantization parameter predictor. In addition, a first valid quantization parameter may be determined as a quantization parameter predictor by searching in the order of the left coding unit and the coding unit immediately preceding in the coding order.
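
One way to read the search described above is sketched below: candidates are examined in a fixed order and the first valid one becomes the predictor. Chaining left, upper and previous-in-coding-order into a single search, marking an unavailable candidate with `None`, and the function names are all assumptions for illustration.

```python
def predict_qp(left_qp, above_qp, previous_qp):
    """Return the first valid quantization parameter in the order left, above, previous."""
    for candidate in (left_qp, above_qp, previous_qp):
        if candidate is not None:
            return candidate
    return None  # no valid neighbour at all


def qp_residual(current_qp, left_qp, above_qp, previous_qp):
    """Difference that would be encoded instead of the quantization parameter itself."""
    predictor = predict_qp(left_qp, above_qp, previous_qp)
    return current_qp if predictor is None else current_qp - predictor
```

Neighbouring coding units tend to use similar quantization parameters, so the encoded difference is usually small.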

양자화된 변환 블록의 계수들은 스캐닝되어 1차원의 양자화 계수들로 변환한다. 스캐닝 방식은 엔트로피 부호화 모드에 따라 달리 설정될 수 있다. 예를 들어, CABAC으로 부호화될 경우에는 인터 예측 부호화된 양자화 계수들은 미리 정해진 하나의 방식(지그재그, 또는 대각선 방향으로의 래스터 스캔)으로 스캐닝될 수 있다. 반면에 CAVLC으로 부호화될 경우에는 상기 방식과 다른 방식으로 스캐닝될 수 있다.The coefficients of the quantized transform block are scanned and converted into one-dimensional quantized coefficients. The scanning method may be set differently according to the entropy encoding mode. For example, when encoded with CABAC, inter-prediction-coded quantized coefficients may be scanned in one predetermined manner (zigzag, or raster scan in a diagonal direction). When encoded with CAVLC, on the other hand, they may be scanned in a manner different from the above.

예를 들어, 스캐닝 방식이 인터의 경우에는 지그재그, 인트라의 경우에는 인트라 예측 모드에 따라 결정될 수 있다. 또한, 계수 스캐닝 방식은 변환 단위의 크기에 따라 달리 결정될 수도 있다.For example, the scanning method may be zigzag in the case of inter prediction, and may be determined according to the intra prediction mode in the case of intra prediction. Also, the coefficient scanning method may be determined differently according to the size of the transform unit.

상기 스캔 패턴은 방향성 인트라 예측 모드에 따라 달라질 수 있다. 양자화 계수들의 스캔순서는 역방향으로 스캔한다.The scan pattern may vary according to the directional intra prediction mode. The quantized coefficients are scanned in the reverse direction.

멀티플렉서는 상기 움직임 정보 부호화부에 의해 부호화된 움직임 정보들과 상기 잔차 블록 부호화부에 의해 부호화된 잔차 신호들을 다중화한다. 상기 움직임 정보는 부호화 모드에 따라 달라질 수 있다.The multiplexer multiplexes the motion information encoded by the motion information encoder and the residual signals encoded by the residual block encoder. The motion information may vary according to an encoding mode.

즉, 스킵 또는 머지일 경우에는 예측자를 나타내는 인덱스만을 포함한다. 그러나, AMVP일 경우에는 현재 블록의 참조 픽쳐 인덱스, 차분 움직임 벡터 및 AMVP 인덱스를 포함한다.That is, in the case of skip or merge, only the index indicating the predictor is included. However, in case of AMVP, the reference picture index, differential motion vector, and AMVP index of the current block are included.

이하, 인트라 예측부(169)의 동작에 대한 일실시예를 상세히 설명하기로 한다.Hereinafter, an embodiment of the operation of the intra prediction unit 169 will be described in detail.

먼저, 픽쳐 분할부(160)에 의해 예측 모드 정보 및 예측 블록의 크기를 수신하며, 예측 모드 정보는 인트라 모드를 나타낸다. 예측 블록의 크기는 64x64, 32x32, 16x16, 8x8, 4x4등의 정방형일 수 있으나, 이에 한정하지 않는다. 즉, 상기 예측 블록의 크기가 정방형이 아닌 비정방형일 수도 있다.First, prediction mode information and the size of the prediction block are received from the picture division unit 160, and the prediction mode information indicates an intra mode. The prediction block may be a square of size 64x64, 32x32, 16x16, 8x8, 4x4, or the like, but is not limited thereto. That is, the prediction block may be non-square rather than square.

다음으로, 예측 블록의 인트라 예측 모드를 결정하기 위해 참조 화소를 픽쳐 저장부(172)로부터 읽어 들인다.Next, the reference pixel is read from the picture storage unit 172 to determine the intra prediction mode of the prediction block.

상기 이용 가능하지 않은 참조화소가 존재하는지 여부를 검토하여 참조 화소 생성 여부를 판단한다. 상기 참조 화소들은 현재 블록의 인트라 예측 모드를 결정하는데 사용된다.Whether reference pixels need to be generated is determined by checking whether any unavailable reference pixel exists. The reference pixels are used to determine the intra prediction mode of the current block.

현재 블록이 현재 픽쳐의 상측 경계에 위치하는 경우에는 현재 블록의 상측에 인접한 화소들이 정의되지 않는다. 또한, 현재 블록이 현재 픽쳐의 좌측 경계에 위치하는 경우에는 현재 블록의 좌측에 인접한 화소들이 정의되지 않는다.When the current block is located at the upper boundary of the current picture, pixels adjacent to the upper side of the current block are not defined. Also, when the current block is located at the left boundary of the current picture, pixels adjacent to the left of the current block are not defined.

이러한 화소들은 이용 가능한 화소들이 아닌 것으로 판단한다. 또한, 현재 블록이 슬라이스 경계에 위치하여 슬라이스의 상측 또는 좌측에 인접하는 화소들이 먼저 부호화되어 복원되는 화소들이 아닌 경우에도 이용 가능한 화소들이 아닌 것으로 판단한다.It is determined that these pixels are not available pixels. Also, when the current block is located at the slice boundary and pixels adjacent to the upper or left side of the slice are not pixels that are first encoded and reconstructed, it is determined that they are not available pixels.

상기와 같이 현재 블록의 좌측 또는 상측에 인접한 화소들이 존재하지 않거나, 미리 부호화되어 복원된 화소들이 존재하지 않는 경우에는 이용 가능한 화소들만을 이용하여 현재 블록의 인트라 예측 모드를 결정할 수도 있다.As described above, when pixels adjacent to the left or upper side of the current block do not exist or there are no pre-encoded and reconstructed pixels, the intra prediction mode of the current block may be determined using only available pixels.

그러나, 현재 블록의 이용 가능한 참조화소들을 이용하여 이용 가능하지 않은 위치의 참조화소들을 생성할 수도 있다. 예를 들어, 상측 블록의 화소들이 이용 가능하지 않은 경우에는 좌측 화소들의 일부 또는 전부를 이용하여 상측 화소들을 생성할 수 있고, 그 역으로도 가능하다.However, reference pixels at positions that are not available may be generated using the available reference pixels of the current block. For example, when the pixels of the upper block are not available, the upper pixels may be generated using some or all of the pixels on the left, and vice versa.

즉, 이용 가능하지 않은 위치의 참조화소로부터 미리 정해진 방향으로 가장 가까운 위치의 이용 가능한 참조화소를 복사하여 참조화소로 생성할 수 있다. 미리 정해진 방향에 이용 가능한 참조화소가 존재하지 않는 경우에는 반대 방향의 가장 가까운 위치의 이용 가능한 참조화소를 복사하여 참조화소로 생성할 수 있다.That is, a reference pixel at an unavailable position may be generated by copying the nearest available reference pixel in a predetermined direction from that position. When no available reference pixel exists in the predetermined direction, the reference pixel may be generated by copying the nearest available reference pixel in the opposite direction.
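
The substitution rule above can be sketched on a one-dimensional array of reference pixels. Treating the reference pixels as a single list, marking unavailable positions with `None`, and taking "toward lower indices" as the predetermined direction are assumptions for illustration; the name `fill_reference_pixels` is hypothetical.

```python
def fill_reference_pixels(ref):
    """Return a copy of ref in which every None is replaced by the nearest available pixel.

    The predetermined search direction is toward lower indices; if nothing is
    available there, the opposite direction is searched instead.
    """
    filled = list(ref)
    n = len(ref)
    for i in range(n):
        if ref[i] is not None:
            continue
        # Nearest available pixel in the predetermined direction.
        src = next((ref[j] for j in range(i - 1, -1, -1) if ref[j] is not None), None)
        if src is None:
            # Nothing available that way: use the nearest one in the opposite direction.
            src = next((ref[j] for j in range(i + 1, n) if ref[j] is not None), None)
        filled[i] = src
    return filled
```

Only originally available pixels are copied, so a filled value is never propagated from another filled value; if no pixel is available at all, the list stays unchanged and a default value would have to be supplied by other means.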

한편, 현재 블록의 상측 또는 좌측 화소들이 존재하는 경우에도 상기 화소들이 속하는 블록의 부호화 모드에 따라 이용 가능하지 않은 참조 화소로 결정될 수 있다.Meanwhile, even when pixels above or to the left of the current block exist, they may be determined as unavailable reference pixels according to the encoding mode of the block to which the pixels belong.

예를 들어, 현재 블록의 상측에 인접한 참조 화소가 속하는 블록이 인터 부호화되어 복원된 블록일 경우에는 상기 화소들을 이용 가능하지 않은 화소들로 판단할 수 있다.For example, when a block to which a reference pixel adjacent to the upper side of the current block belongs is an inter-encoded and reconstructed block, the pixels may be determined as unavailable pixels.

이 경우에는 현재 블록에 인접한 블록이 인트라 부호화되어 복원된 블록에 속하는 화소들을 이용하여 이용 가능한 참조 화소들을 생성할 수 있다. 이 경우에는 부호기에서 부호화 모드에 따라 이용 가능한 참조 화소를 판단한다는 정보를 복호기로 전송해야 한다.In this case, available reference pixels may be generated using pixels belonging to a block adjacent to the current block that has been intra-encoded and reconstructed. In this case, the encoder must transmit to the decoder information indicating that available reference pixels are determined according to the encoding mode.

다음으로, 상기 참조 화소들을 이용하여 현재 블록의 인트라 예측 모드를 결정한다. 현재 블록에 허용 가능한 인트라 예측 모드의 수는 블록의 크기에 따라 달라질 수 있다. 예를 들어, 현재 블록의 크기가 8x8, 16x16, 32x32인 경우에는 34개의 인트라 예측 모드가 존재할 수 있고, 현재 블록의 크기가 4x4인 경우에는 17개의 인트라 예측 모드가 존재할 수 있다.Next, an intra prediction mode of the current block is determined using the reference pixels. The number of intra prediction modes permissible for the current block may vary according to the size of the block. For example, when the size of the current block is 8x8, 16x16, or 32x32, 34 intra prediction modes may exist, and when the size of the current block is 4x4, 17 intra prediction modes may exist.

상기 34개 또는 17개의 인트라 예측 모드는 적어도 하나 이상의 비방향성 모드(non-directional mode)와 복수개의 방향성 모드들(directional modes)로 구성될 수 있다.The 34 or 17 intra prediction modes may include at least one non-directional mode and a plurality of directional modes.

하나 이상의 비방향성 모드는 DC 모드 및/또는 플래너(planar) 모드일수 있다. DC 모드 및 플래너모드가 비방향성 모드로 포함되는 경우에는, 현재 블록의 크기에 관계없이 35개의 인트라 예측 모드가 존재할 수도 있다.The one or more non-directional modes may be DC mode and/or planar mode. When the DC mode and the planar mode are included as the non-directional mode, 35 intra prediction modes may exist regardless of the size of the current block.

이 때에는 2개의 비방향성 모드(DC 모드 및 플래너 모드)와 33개의 방향성 모드를 포함할 수 있다.In this case, two non-directional modes (DC mode and planar mode) and 33 directional modes may be included.

플래너 모드는 현재 블록의 우하측(bottom-right)에 위치하는 적어도 하나의 화소값(또는 상기 화소값의 예측값, 이하 제1 참조값이라 함)과 참조화소들을 이용하여 현재 블록의 예측 블록을 생성한다.In the planar mode, a prediction block of the current block is generated using the reference pixels and at least one pixel value located at the bottom-right of the current block (or a predicted value of that pixel value, hereinafter referred to as a first reference value).

상기한 바와 같이, 본 발명의 일실시예에 따른 동영상 복호화 장치의 구성은 도 1, 2 및 도 25를 참조하여 설명한 동영상 부호화 장치의 구성으로부터 도출될 수 있으며, 예를 들어 도 2 및 도 25를 참조하여 설명한 바와 같은 부호화 과정의 역과정을 수행함으로써 영상을 복호화할 수 있다.As described above, the configuration of a video decoding apparatus according to an embodiment of the present invention may be derived from the configuration of the video encoding apparatus described with reference to FIGS. 1, 2 and 25. For example, an image may be decoded by performing the reverse of the encoding process described with reference to FIGS. 2 and 25.

도 26은 본 발명의 일실시예에 따른 동영상 복호화 장치의 구성을 블록도로 도시한 것이다.26 is a block diagram showing the configuration of a video decoding apparatus according to an embodiment of the present invention.

도 26을 참조하면, 본 발명에 따른 동영상 복호화 장치는, 엔트로피 복호화부(210), 역양자화/역변환부(220), 가산기(270), 디블록킹 필터(250), 픽쳐 저장부(260), 인트라 예측부(230), 움직임 보상 예측부(240) 및 인트라/인터전환 스위치(280)를 구비한다.Referring to FIG. 26, the video decoding apparatus according to the present invention includes an entropy decoding unit 210, an inverse quantization/inverse transform unit 220, an adder 270, a deblocking filter 250, a picture storage unit 260, an intra prediction unit 230, a motion compensation prediction unit 240, and an intra/inter switch 280.

엔트로피 복호화부(210)는, 동영상 부호화 장치로부터 전송되는 부호화 비트 스트림을 복호하여, 인트라 예측 모드 인덱스, 움직임 정보, 양자화 계수 시퀀스 등으로 분리한다. 엔트로피 복호화부(210)는 복호된 움직임 정보를 움직임 보상 예측부(240)에 공급한다.The entropy decoding unit 210 decodes the encoded bitstream transmitted from the video encoding apparatus and separates it into an intra prediction mode index, motion information, a quantization coefficient sequence, and the like. The entropy decoding unit 210 supplies the decoded motion information to the motion compensation prediction unit 240.

엔트로피 복호화부(210)는 상기 인트라 예측 모드 인덱스를 상기 인트라 예측부(230), 역양자화/역변환부(220)로 공급한다. 또한, 상기 엔트로피 복호화부(210)는 상기 역양자화 계수 시퀀스를 역양자화/역변환부(220)로 공급한다.The entropy decoding unit 210 supplies the intra prediction mode index to the intra prediction unit 230 and the inverse quantization/inverse transform unit 220. Also, the entropy decoding unit 210 supplies the inverse quantization coefficient sequence to the inverse quantization/inverse transform unit 220.

역양자화/역변환부(220)는 상기 양자화 계수 시퀀스를 2차원 배열의 역양자화 계수로 변환한다. 상기 변환을 위해 복수개의 스캐닝 패턴 중에 하나를 선택한다. 현재 블록의 예측모드(즉, 인트라 예측 및 인터 예측 중의 어느 하나)와 인트라 예측 모드 중 적어도 하나에 기초하여 복수개의 스캐닝 패턴 중 하나를 선택한다.The inverse quantization/inverse transform unit 220 transforms the quantization coefficient sequence into inverse quantization coefficients of a two-dimensional array. One of a plurality of scanning patterns is selected for the transformation. One of a plurality of scanning patterns is selected based on at least one of a prediction mode (ie, any one of intra prediction and inter prediction) of the current block and an intra prediction mode.

The intra prediction mode is received from the intra prediction unit 230 or the entropy decoding unit 210.

The inverse quantization/inverse transform unit 220 restores the quantization coefficients by applying a quantization matrix, selected from among a plurality of quantization matrices, to the two-dimensional array of inverse quantization coefficients. Different quantization matrices are applied depending on the size of the current block to be reconstructed, and even for blocks of the same size, the quantization matrix is selected based on at least one of the prediction mode and the intra prediction mode of the current block.

The restored quantization coefficients are then inverse-transformed to reconstruct the residual block.
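
The matrix selection and the inverse transform can be sketched as follows. This is an illustration under stated assumptions: the `QUANT_MATRICES` table and its values are hypothetical, the scaling is simplified to a plain product with a quantization step, and an orthonormal inverse DCT stands in for the inverse transform, which the description does not specify.

```python
import math
from typing import Dict, List, Tuple

Matrix = List[List[float]]

# Hypothetical table: one quantization matrix per (block size, prediction mode).
QUANT_MATRICES: Dict[Tuple[int, str], Matrix] = {
    (2, "intra"): [[2, 3], [3, 4]],
    (2, "inter"): [[2, 2], [2, 2]],
}


def dequantize(levels: Matrix, block_mode: str, qstep: float) -> Matrix:
    """Scale each level by the matrix chosen from block size and mode."""
    n = len(levels)
    qm = QUANT_MATRICES[(n, block_mode)]
    return [[levels[r][c] * qm[r][c] * qstep for c in range(n)]
            for r in range(n)]


def inverse_dct_2d(coeffs: Matrix) -> Matrix:
    """Orthonormal 2-D inverse DCT (an assumed choice of transform)."""
    n = len(coeffs)

    def alpha(k: int) -> float:
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (alpha(u) * alpha(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[x][y] = total
    return out


def decode_residual(levels: Matrix, block_mode: str, qstep: float) -> Matrix:
    """Dequantize, then inverse-transform, to obtain the residual block."""
    return inverse_dct_2d(dequantize(levels, block_mode, qstep))
```

Keying the table on both block size and prediction mode mirrors the statement that blocks of the same size may still use different matrices.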

The adder 270 reconstructs an image block by adding the residual block restored by the inverse quantization/inverse transform unit 220 to the prediction block generated by the intra prediction unit 230 or the motion compensation prediction unit 240.
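
The addition itself can be shown in a few lines. The clipping to the valid sample range and the `bit_depth` parameter are assumptions added for illustration; the description only states that the two blocks are added.

```python
from typing import List

Block = List[List[float]]


def reconstruct_block(residual: Block, prediction: Block,
                      bit_depth: int = 8) -> List[List[int]]:
    """Add residual and prediction sample by sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption, not a step
    stated in the description.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(int(round(p + r)), 0), max_val)
         for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```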

The deblocking filter 250 applies deblocking filtering to the reconstructed image generated by the adder 270. This reduces the blocking artifacts caused by image loss in the quantization process.

The picture storage unit 260 is a frame memory that holds the locally decoded image to which the deblocking filter 250 has applied deblocking filtering.

The intra prediction unit 230 restores the intra prediction mode of the current block based on the intra prediction mode index received from the entropy decoding unit 210. It then generates a prediction block according to the restored intra prediction mode.
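
A minimal sketch of mode restoration followed by prediction is given below. The index-to-mode table `INTRA_MODES` and the three modes (DC, vertical, horizontal) are hypothetical; the description does not list the modes or how an index maps to them.

```python
from typing import List

# Hypothetical mapping from a decoded mode index to a prediction mode.
INTRA_MODES = {0: "dc", 1: "vertical", 2: "horizontal"}


def intra_predict(mode_index: int, top: List[int],
                  left: List[int]) -> List[List[int]]:
    """Build an n x n prediction block from neighbouring samples.

    top  holds the n reconstructed samples above the block,
    left holds the n reconstructed samples to its left.
    """
    n = len(top)
    mode = INTRA_MODES[mode_index]
    if mode == "vertical":
        # Each column repeats the sample directly above it.
        return [list(top) for _ in range(n)]
    if mode == "horizontal":
        # Each row repeats the sample directly to its left.
        return [[left[r]] * n for r in range(n)]
    # DC: rounded mean of all neighbouring samples.
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]
```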

The motion compensation prediction unit 240 generates a prediction block for the current block from a picture stored in the picture storage unit 260, based on the motion vector information. When motion compensation with fractional precision is applied, the prediction block is generated by applying the selected interpolation filter.
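
Fractional-precision motion compensation can be sketched as follows. A bilinear filter and quarter-sample motion vector units are substituted here purely for illustration; the description refers only to "the selected interpolation filter", and practical codecs generally use longer filters. Clamping at the picture border is likewise an assumption.

```python
from typing import List


def motion_compensate(ref: List[List[int]], x0: int, y0: int, size: int,
                      mv_x4: int, mv_y4: int) -> List[List[int]]:
    """Predict a size x size block at (x0, y0) from reference picture ref.

    mv_x4 and mv_y4 are motion vector components in quarter-sample
    units (an assumed precision). Bilinear interpolation is used.
    """
    h, w = len(ref), len(ref[0])

    def sample(y: int, x: int) -> int:
        # Clamp coordinates so references outside the picture stay valid.
        return ref[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # Split each component into an integer offset and a fraction in [0, 3].
    ix, fx = mv_x4 >> 2, mv_x4 & 3
    iy, fy = mv_y4 >> 2, mv_y4 & 3

    pred = []
    for r in range(size):
        row = []
        for c in range(size):
            y, x = y0 + r + iy, x0 + c + ix
            a, b = sample(y, x), sample(y, x + 1)
            d, e = sample(y + 1, x), sample(y + 1, x + 1)
            # Weights sum to 16; add 8 for rounding before the shift.
            val = ((4 - fx) * (4 - fy) * a + fx * (4 - fy) * b
                   + (4 - fx) * fy * d + fx * fy * e + 8) >> 4
            row.append(val)
        pred.append(row)
    return pred
```

With a zero fractional part the weights reduce to a plain copy of the reference samples, so integer motion vectors need no separate code path.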

The intra/inter switch 280 provides the adder 270 with the prediction block generated by either the intra prediction unit 230 or the motion compensation prediction unit 240, based on the encoding mode.
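
The switching behaviour amounts to a dispatch on the encoding mode. In the sketch below, the mode strings and the callable parameters are hypothetical names chosen for illustration.

```python
from typing import Callable, List

Block = List[List[int]]


def select_prediction(coding_mode: str,
                      intra_fn: Callable[[], Block],
                      inter_fn: Callable[[], Block]) -> Block:
    """Return the prediction block from the predictor named by coding_mode.

    Only the selected predictor is invoked; the other is never run.
    """
    if coding_mode == "intra":
        return intra_fn()
    if coding_mode == "inter":
        return inter_fn()
    raise ValueError(f"unknown coding mode: {coding_mode!r}")
```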

The current block is reconstructed using the prediction block of the current block restored in this manner and the decoded residual block of the current block.

A video bitstream according to an embodiment of the present invention is a unit used to store the encoded data of one picture, and may include parameter sets (PS) and slice data.

The parameter sets (PS) are divided into a picture parameter set (hereinafter simply PPS), which is data corresponding to the head of each picture, and a sequence parameter set (hereinafter simply SPS). The PPS and SPS may include initialization information required to initialize each encoding, and may include the spatial structure information (SPATIAL LAYOUT INFORMATION) according to an embodiment of the present invention.

The SPS is common reference information for decoding all pictures encoded in a random access unit (RAU), and may include the profile, the maximum number of pictures available for reference, the picture size, and the like.

The PPS is reference information for decoding each picture encoded in a random access unit (RAU), and may include the type of variable length coding method, the initial value of the quantization step, and a plurality of reference pictures.

Meanwhile, the slice header (SH) contains information about the corresponding slice when coding is performed in slice units.
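
The relationship between these structures can be sketched with simple data classes. All field names are hypothetical, as are the layout values. Two further assumptions go beyond the description: that a slice header refers to a PPS by identifier and a PPS refers to an SPS by identifier, and that spatial structure information in a PPS takes precedence over that in the SPS.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SpatialLayoutInfo:
    layout_type: str  # hypothetical values, e.g. "cube_map"
    num_views: int


@dataclass
class SPS:
    sps_id: int
    profile: str
    max_ref_pictures: int
    width: int
    height: int
    spatial_layout: Optional[SpatialLayoutInfo] = None


@dataclass
class PPS:
    pps_id: int
    sps_id: int      # assumed link to the governing SPS
    vlc_type: str    # type of variable length coding method
    init_qstep: int  # initial value of the quantization step
    spatial_layout: Optional[SpatialLayoutInfo] = None


@dataclass
class SliceHeader:
    pps_id: int      # assumed link to the governing PPS
    slice_type: str


def resolve_layout(sh: SliceHeader, pps_table: Dict[int, PPS],
                   sps_table: Dict[int, SPS]) -> Optional[SpatialLayoutInfo]:
    """Find the spatial structure information that applies to a slice.

    The picture-level value, if present, overrides the sequence-level
    value (an assumed precedence rule).
    """
    pps = pps_table[sh.pps_id]
    sps = sps_table[pps.sps_id]
    if pps.spatial_layout is not None:
        return pps.spatial_layout
    return sps.spatial_layout
```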

The above-described method according to the present invention may be implemented as a program to be executed on a computer and stored in a computer-readable recording medium. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices.

The computer-readable recording medium may be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. In addition, functional programs, code, and code segments for implementing the method can be easily inferred by programmers in the technical field to which the present invention pertains.

In addition, although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications may be made by those of ordinary skill in the technical field to which the invention pertains without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or perspective of the present invention.

Claims

An image processing method comprising:
disposing a plurality of images constituting an omnidirectional image of a three-dimensional space on a two-dimensional image;
transforming the arrangement of at least some of a plurality of regions included in the two-dimensional image;
encoding the transformed two-dimensional image; and
transmitting the encoded two-dimensional image and an SEI message,
wherein the SEI message includes arrangement information including an arrangement method of the omnidirectional image with respect to the two-dimensional image, information on the transformation, and viewpoint information according to the intention of a content provider,
wherein the arrangement information includes SEI message syntax indicating at least one of a Cube Map method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Cube Map format and an Equirectangular method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Equirectangular format,
wherein the SEI message includes information that allows a region of the two-dimensional image to be mapped to a three-dimensional space and reproduced when decoding of the bitstream of the two-dimensional image, in which the omnidirectional image is pre-encoded, is performed according to at least a part of the SEI message and user viewpoint information, and
wherein the viewpoint information according to the intention of the content provider includes viewpoint information of the omnidirectional image that allows a user viewing the omnidirectional image restored from the encoded two-dimensional image to view the omnidirectional image from a viewpoint according to the intention of the content provider or image producer.

An image processing device comprising:
an image acquisition unit that arranges a plurality of images constituting an omnidirectional image of a three-dimensional space on a two-dimensional image, and transforms the arrangement of at least some of a plurality of regions included in the two-dimensional image;
an image encoder that encodes the transformed two-dimensional image; and
a transmission processing unit that transmits the encoded two-dimensional image and an SEI message,
wherein the SEI message includes arrangement information including an arrangement method of the omnidirectional image with respect to the two-dimensional image, information on the transformation, and viewpoint information according to the intention of a content provider,
wherein the arrangement information includes SEI message syntax indicating at least one of a Cube Map method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Cube Map format and an Equirectangular method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Equirectangular format,
wherein the SEI message includes information that allows a region of the two-dimensional image to be mapped to a three-dimensional space and reproduced when decoding of the bitstream of the two-dimensional image, in which the omnidirectional image is pre-encoded, is performed according to at least a part of the SEI message and user viewpoint information, and
wherein the viewpoint information according to the intention of the content provider includes viewpoint information of the omnidirectional image that allows a user viewing the omnidirectional image restored from the encoded two-dimensional image to view the omnidirectional image from a viewpoint according to the intention of the content provider or image producer.

An image processing method comprising:
obtaining an encoded two-dimensional image and an SEI message;
decoding the encoded two-dimensional image; and
reproducing a three-dimensional spatial image according to a user's viewpoint from the decoded two-dimensional image,
wherein the SEI message includes arrangement information including an arrangement method of an omnidirectional image with respect to the two-dimensional image, transformation information, and viewpoint information according to the intention of a content provider,
wherein the arrangement information includes SEI message syntax indicating at least one of a Cube Map method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Cube Map format and an Equirectangular method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Equirectangular format,
wherein the transformation information indicates that the arrangement of at least some of a plurality of regions included in the two-dimensional image has been transformed,
wherein the decoding includes mapping a region of the two-dimensional image to a three-dimensional space and reproducing it when decoding of the bitstream of the two-dimensional image, in which the omnidirectional image is pre-encoded, is performed according to at least a part of the SEI message and user viewpoint information, and
wherein the viewpoint information according to the intention of the content provider includes viewpoint information of the omnidirectional image that allows a user viewing the omnidirectional image restored from the encoded two-dimensional image to view the omnidirectional image from a viewpoint according to the intention of the content provider or image producer.

An image processing device comprising:
a decoding unit that obtains an encoded two-dimensional image and an SEI message, decodes the encoded two-dimensional image, and processes coordinate mapping for reproducing a three-dimensional spatial image according to a user's viewpoint from the decoded two-dimensional image,
wherein the SEI message includes arrangement information including an arrangement method of an omnidirectional image with respect to the two-dimensional image, transformation information, and viewpoint information according to the intention of a content provider,
wherein the arrangement information includes SEI message syntax indicating at least one of a Cube Map method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Cube Map format and an Equirectangular method of arranging the sub-images of the omnidirectional image in the two-dimensional image according to the Equirectangular format,
wherein the transformation information indicates that the arrangement of at least some of a plurality of regions included in the two-dimensional image has been transformed,
wherein the decoding unit maps a region of the two-dimensional image to a three-dimensional space and reproduces it when decoding of the bitstream of the two-dimensional image, in which the omnidirectional image is pre-encoded, is performed according to at least a part of the SEI message and user viewpoint information, and
wherein the viewpoint information according to the intention of the content provider includes viewpoint information of the omnidirectional image that allows a user viewing the omnidirectional image restored from the encoded two-dimensional image to view the omnidirectional image from a viewpoint according to the intention of the content provider or image producer.