KR102475334B1 - Video encoding/decoding method and apparatus

Info

Publication number: KR102475334B1
Application number: KR1020210003220A
Authority: KR
Inventors: 신홍창; 음호민; 이광순; 이진환; 정준영; 윤국진; 박종일; 윤준영
Original assignee: 한국전자통신연구원; 한양대학교 산학협력단
Priority date: 2020-01-13
Filing date: 2021-01-11
Publication date: 2022-12-07
Also published as: KR20210091058A

Abstract

An image encoding/decoding method and apparatus are provided. An image decoding method according to the present disclosure includes: obtaining image data for images of a plurality of viewpoints; determining a basic viewpoint and a plurality of reference viewpoints from among the plurality of viewpoints; determining a pruning order of the plurality of reference viewpoints; and parsing the image data based on the pruning order to decode the image of the basic viewpoint and the images of the plurality of reference viewpoints.

Description

Video encoding/decoding method and apparatus {VIDEO ENCODING/DECODING METHOD AND APPARATUS}

The present disclosure relates to an image encoding/decoding method and apparatus. Specifically, the present disclosure relates to an image encoding/decoding method and apparatus that, when images of a plurality of reference viewpoints are used, determine a pruning order based on the importance of those images, and to a recording medium storing a bitstream generated by the image encoding method or apparatus of the present disclosure.

Recently, demand for high-resolution, high-quality images such as high definition (HD) and ultra high definition (UHD) images has been increasing in various application fields. As image data becomes higher in resolution and quality, the amount of data increases relative to existing image data. Therefore, when image data is transmitted over a medium such as an existing wired/wireless broadband line or stored on an existing storage medium, transmission and storage costs increase. To solve these problems, high-efficiency image encoding/decoding technology for images with higher resolution and quality is required.

An object of the present disclosure is to provide an image encoding/decoding method and apparatus for determining a pruning order of a plurality of view images.

Another object of the present disclosure is to provide an image encoding/decoding method and apparatus for determining the importance of a plurality of view images.

Another object of the present disclosure is to provide an image encoding/decoding method and apparatus for efficiently managing image data.

Another object of the present disclosure is to provide an image encoding/decoding method and apparatus for providing a natural omnidirectional image.

Another object of the present disclosure is to provide an image encoding/decoding method and apparatus for improving image compression efficiency and image synthesis quality.

Another object of the present disclosure is to provide a recording medium storing a bitstream generated by the image encoding/decoding method or apparatus of the present disclosure.

According to the present disclosure, an image encoding method may be provided, the method including: obtaining images of a plurality of viewpoints; determining a basic viewpoint and a plurality of reference viewpoints from among the plurality of viewpoints; determining a pruning order of the plurality of reference viewpoints; and performing pruning between the image of the basic viewpoint and the images of the plurality of reference viewpoints based on the pruning order.

The image encoding method may further include determining a protection region mask based on important regions of the images of the plurality of viewpoints.

A region for which the protection region mask is determined may not be pruned.

The protection region mask may have an arbitrary shape.

The determining of the pruning order of the plurality of reference viewpoints may include: determining the importance of the plurality of reference viewpoints; and determining the pruning order of the plurality of reference viewpoints based on the importance.

The determining of the importance of the plurality of reference viewpoints may include: assigning a weight to each pixel in the images of the plurality of reference viewpoints; and determining the importance of the plurality of reference viewpoints based on the weights.

The assigning of the weight may include assigning the weight to each pixel in the image based on at least one of a position of an object, a distance from a camera, a width of an occluded area, and a depth value.

The determining of the importance of the plurality of reference viewpoints may include determining the importance based on a target position indicating the position of a user's virtual camera.

The determining of the basic viewpoint and the plurality of reference viewpoints from among the plurality of viewpoints may include determining the basic viewpoint based on a change in a target position indicating the position of a user's virtual camera.

An image decoding method may also be provided, the method including: obtaining image data for images of a plurality of viewpoints; determining a basic viewpoint and a plurality of reference viewpoints from among the plurality of viewpoints; determining a pruning order of the plurality of reference viewpoints; and parsing the image data based on the pruning order to decode the image of the basic viewpoint and the images of the plurality of reference viewpoints.

The image decoding method may further include determining a protection region mask based on important regions of the images of the plurality of viewpoints.

A region for which the protection region mask is determined may not be pruned.

The assigning of the weight may include assigning the weight to each pixel based on at least one of a position of an object, a distance from a camera, a width of an occluded area, and a depth value.

Among the plurality of viewpoints, there may be a plurality of basic viewpoints.

A computer-readable recording medium storing a bitstream including encoded image data to be decoded according to an image decoding method may be provided, wherein the image decoding method includes: obtaining image data for images of a plurality of viewpoints; determining a basic viewpoint and a plurality of reference viewpoints from among the plurality of viewpoints; determining a pruning order of the plurality of reference viewpoints; and parsing the image data based on the pruning order to decode the image of the basic viewpoint and the images of the plurality of reference viewpoints.

According to the present disclosure, an image encoding/decoding method and apparatus for determining a pruning order of a plurality of view images may be provided.

In addition, according to the present disclosure, an image encoding/decoding method and apparatus for determining the importance of a plurality of view images may be provided.

In addition, according to the present disclosure, an image encoding/decoding method and apparatus for efficiently managing image data may be provided.

In addition, according to the present disclosure, an image encoding/decoding method and apparatus for providing a natural omnidirectional image may be provided.

In addition, according to the present disclosure, an image encoding/decoding method and apparatus for improving image compression efficiency and image synthesis quality may be provided.

FIG. 1 illustrates images acquired from a plurality of viewpoints according to an embodiment of the present disclosure.
FIG. 2 illustrates a process of removing overlapping regions between images of a plurality of viewpoints according to an embodiment of the present disclosure.
FIG. 3 illustrates a pruning process between images of a plurality of viewpoints according to an embodiment of the present disclosure.
FIGS. 4A and 4B illustrate primary and secondary pruning processes between images of a plurality of viewpoints according to an embodiment of the present disclosure.
FIG. 5 illustrates weight factors used when calculating the importance of images of a plurality of viewpoints according to an embodiment of the present disclosure.
FIG. 6 illustrates a change in a target position when calculating the importance of images of a plurality of viewpoints according to an embodiment of the present disclosure.
FIG. 7 illustrates a method of determining a protection region mask in a process of pruning images of a plurality of viewpoints according to an embodiment of the present disclosure.
FIG. 8 illustrates a protection region mask determined based on an object, such as a person, according to an embodiment of the present disclosure.
FIG. 9 illustrates an image encoding method according to an embodiment of the present disclosure.
FIG. 10 illustrates an image decoding method according to an embodiment of the present disclosure.

The present disclosure may be variously modified and may have various embodiments, and specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present disclosure to specific embodiments, and it should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present disclosure. Like reference numerals in the drawings refer to the same or similar functions throughout the various aspects. The shapes and sizes of elements in the drawings may be exaggerated for clarity. The detailed description of exemplary embodiments below refers to the accompanying drawings, which show specific embodiments by way of example. These embodiments are described in sufficient detail to enable those skilled in the art to practice them. It should be understood that the various embodiments, although different, are not necessarily mutually exclusive. For example, specific shapes, structures, and characteristics described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the present disclosure. In addition, it should be understood that the position or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the embodiment. Accordingly, the following detailed description is not to be taken in a limiting sense, and the scope of the exemplary embodiments, if properly described, is limited only by the appended claims, along with the full range of equivalents to which those claims are entitled.

In the present disclosure, terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present disclosure, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component. The term "and/or" includes a combination of a plurality of related listed items or any one of the plurality of related listed items.

When a component of the present disclosure is referred to as being "connected" or "coupled" to another component, it may be directly connected or coupled to the other component, but it should be understood that other components may exist in between. In contrast, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

The components shown in the embodiments of the present disclosure are illustrated independently to represent different characteristic functions, which does not mean that each component consists of separate hardware or a single software unit. That is, each component is listed separately for convenience of description; at least two of the components may be combined into one component, or one component may be divided into a plurality of components to perform its function, and such integrated and separated embodiments are also included in the scope of the present disclosure as long as they do not depart from its essence.

The terms used in the present disclosure are used only to describe specific embodiments and are not intended to limit the present disclosure. A singular expression includes the plural unless the context clearly indicates otherwise. In the present disclosure, terms such as "include" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. That is, a statement in the present disclosure that a specific configuration is "included" does not exclude other configurations, and means that additional configurations may fall within the practice of the present disclosure or the scope of its technical idea.

Some components of the present disclosure may not be essential components that perform essential functions, but merely optional components for improving performance. The present disclosure may be implemented with only the components essential to realizing its essence, excluding components used merely for performance improvement, and a structure including only the essential components, excluding the optional components used merely for performance improvement, is also included in the scope of the present disclosure.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In describing the embodiments of this specification, where a detailed description of a related known configuration or function is judged to obscure the gist of this specification, that description is omitted; the same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.

FIG. 1 illustrates images acquired from a plurality of viewpoints according to an embodiment of the present disclosure. A view image may correspond to an image acquired at the corresponding viewpoint, and may also be referred to as the image of that viewpoint.

Referring to FIG. 1, images may be acquired by a plurality of cameras having different viewpoints. View C1 (104) may correspond to an image acquired at a central viewpoint. View L1 (102) may correspond to an image acquired at a left viewpoint. View R1 (105) may correspond to an image acquired at a right viewpoint. View V (103) may correspond to an image of a virtual viewpoint located between View L1 (102) and View C1 (104), synthesized using the reference viewpoints. View V (103) may include an occluded area that is not visible in View C1 (104). Because the occluded area is partially visible in View L1 (102), View L1 (102) may be referenced when synthesizing the image.

FIG. 2 illustrates a process of removing overlapping regions between images of a plurality of viewpoints according to an embodiment of the present disclosure. When a basic viewpoint and reference viewpoints exist, the amount of image data may be reduced by removing the overlapping parts.

Referring to FIG. 2, View C1 (203) may correspond to an image acquired at the central viewpoint. The remaining View L2 (201), View L1 (202), and View R1 (204) may correspond to images acquired at reference viewpoints. View C1 (203) may be warped to each reference viewpoint by 3D view warping using the 3D geometric relationship and depth information, and may thereby be mapped onto the image of each reference viewpoint.
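The depth-based warping described above can be sketched for a single image row. This is only an illustrative simplification, not the warping of the disclosure: it assumes two rectified cameras separated horizontally, so that a pixel shifts by a disparity of focal * baseline / depth, and the function name `warp_row` and its parameters are hypothetical. Target positions that receive no pixel remain `None`, which corresponds to a hole.

```python
def warp_row(texture, depth, focal, baseline):
    """Forward-warp one image row to a horizontally shifted camera.

    Assumption: rectified cameras, so each pixel moves by
    disparity = focal * baseline / depth (rounded to whole pixels).
    Returns (warped_texture, warped_depth); None marks a hole.
    """
    width = len(texture)
    warped_tex = [None] * width
    warped_depth = [None] * width
    for x in range(width):
        disparity = round(focal * baseline / depth[x])
        tx = x + disparity
        if 0 <= tx < width:
            # Keep the pixel closest to the camera when several land on tx.
            if warped_depth[tx] is None or depth[x] < warped_depth[tx]:
                warped_tex[tx] = texture[x]
                warped_depth[tx] = depth[x]
    return warped_tex, warped_depth


# A near object (depth 2) in front of a far background (depth 10).
tex, dep = warp_row([1, 2, 9, 4, 5], [10, 10, 2, 10, 10], focal=2, baseline=1)
print(tex)  # the position the near object moved away from becomes a hole
```

In this toy row the near pixel shifts by one position and covers a background pixel, leaving a hole where it used to be.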

Warped-view C1 onto L2 (211) may correspond to an image generated by warping View C1 (203) to the reference viewpoint View L2 (201). Warped-view C1 onto L1 (212) may correspond to an image generated by warping View C1 (203) to the reference viewpoint View L1 (202). In this case, areas that are not visible in View C1 (203) (occluded areas) may appear as black regions in Warped-view C1 onto L2 (211) and Warped-view C1 onto L1 (212). Such an invisible area may correspond to a hole region containing no data. The remaining area outside the hole region may correspond to the area visible in View C1 (203). The overlapping region between View L2 (201) and Warped-view C1 onto L2 (211) may then be removed after a procedure that checks whether the region is in fact redundant. Likewise, the overlapping region between View L1 (202) and Warped-view C1 onto L1 (212) may be removed after the same check. As a method of removing the overlapping region, texture data and depth information may be compared pixel by pixel, for pixels mapped to the same coordinates or within a certain range of those coordinates. Through this comparison, residual image 1 (221) and residual image 2 (222) may be generated.
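The pixel-wise comparison of texture and depth can be sketched as follows. This is a minimal illustration under stated assumptions: images are single-channel lists of rows, a warped pixel of `None` marks a hole, only identical coordinates are compared (the neighborhood search mentioned above is left out), and the tolerances `tex_tol` and `depth_tol` are hypothetical values.

```python
def prune_against_warped(ref_tex, ref_depth, warped_tex, warped_depth,
                         tex_tol=10, depth_tol=2):
    """Return a mask for the reference view.

    1 = pixel kept in the residual image, 0 = pixel removed as redundant.
    A pixel is treated as redundant only when both its texture and its
    depth agree with the warped basic view within the given tolerances.
    """
    height, width = len(ref_tex), len(ref_tex[0])
    mask = [[1] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if warped_tex[y][x] is None:
                continue  # hole: only the reference view sees this pixel
            same_tex = abs(ref_tex[y][x] - warped_tex[y][x]) <= tex_tol
            same_depth = abs(ref_depth[y][x] - warped_depth[y][x]) <= depth_tol
            if same_tex and same_depth:
                mask[y][x] = 0
    return mask


mask = prune_against_warped(
    ref_tex=[[100, 100, 50]], ref_depth=[[10, 10, 30]],
    warped_tex=[[102, None, 50]], warped_depth=[[10, None, 10]],
)
print(mask)
```

The pixels left at 1 in the mask form the residual image for that reference view.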

Residual image 1 (221) may correspond to an image that is not visible in View C1 (203) and is visible only in the reference viewpoint View L2 (201). Residual image 2 (222) may correspond to an image that is not visible in View C1 (203) and is visible only in the reference viewpoint View L1 (202).

FIG. 3 illustrates a pruning process between images of a plurality of viewpoints according to an embodiment of the present disclosure. Pruning may correspond to a process of removing residual images between images of a plurality of viewpoints.

Referring to FIG. 3, images may be acquired at five viewpoints. View images v0 and v1 may correspond to basic views. The remaining view images v2, v3, and v4 may correspond to additional views. Because v0 and v1 are basic views, their image information may be used as is. For the remaining additional views (v2 to v4), redundancy may be removed by comparison with the basic views. Depth information and camera information may be used when removing the redundancy: using the depth information and camera information, an additional view may be warped to a basic view, and the duplicated pixels may be removed from the additional view.

For view image v2, the regions that overlap with the basic views v0 and v1 may be removed, and a mask image of the pruned v2, representing only the remaining residual information, may be generated. Next, for view image v3, the regions that overlap with the basic views v0 and v1 may be removed, and a mask image of the pruned v3 may be generated. Because v3 may additionally overlap with v2, pruning may also be performed against v2. Next, for view image v4, the regions that overlap with the basic views v0 and v1 may be removed, and a mask image of the pruned v4 may be generated. Because v4 may additionally overlap with v3, pruning may also be performed against v3.
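The sequential structure of this process can be sketched with a toy model in which each view is reduced to the set of scene points it sees. This is an assumption made purely for illustration: real pruning operates on warped pixels, and this sketch prunes each additional view against all previously processed views, which simplifies the description above. The function name `prune_in_order` is hypothetical.

```python
def prune_in_order(basic_views, additional_views, order):
    """Prune additional views one by one in the given order.

    Each view is modeled as a set of scene-point identifiers.
    Returns a dict mapping view name to its residual point set.
    """
    # Everything visible in the basic views is already available.
    covered = set().union(*basic_views.values())
    residuals = {}
    for name in order:
        residuals[name] = additional_views[name] - covered
        covered |= residuals[name]  # later views are pruned against this one too
    return residuals


basic = {"v0": {1, 2, 3}, "v1": {3, 4}}
additional = {"v2": {3, 4, 5, 6}, "v3": {5, 6, 7}, "v4": {1, 7, 8}}
print(prune_in_order(basic, additional, ["v2", "v3", "v4"]))
```

Changing the `order` argument changes which additional view ends up holding the shared points.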

The content remaining in an additional view may correspond to information that is not visible in the basic views because of occlusion. Such information may overlap between additional views, and the overlap may be removed in a secondary pruning process. In the secondary pruning process, the outcome may differ depending on the order in which the additional views are pruned. Accordingly, the final synthesis quality and compression efficiency may vary with how the pruning is carried out.

FIGS. 4A and 4B illustrate primary and secondary pruning processes between images of a plurality of viewpoints according to an embodiment of the present disclosure.

Referring to FIGS. 4A and 4B, View 0 (400) and View 1 (401) may correspond to basic views. View 2 (402) and View 3 (403) may correspond to reference views. Warping view 0 onto view 2 (412) may correspond to an image obtained by warping View 0 (400) to the position of View 2 (402) using depth. Warping view 0 onto view 3 (413) may correspond to an image obtained by warping View 0 (400) to the position of View 3 (403) using depth. In Warping view 0 onto view 2 (412) and Warping view 0 onto view 3 (413), areas not visible in View 0 (400) may be detected. After the pruning process, residual image 1 (422) and residual image 2 (423) may be generated. This pruning process may correspond to the primary pruning process.

After the primary pruning process, a secondary pruning process may be performed to remove overlapping regions between the reference views. In the secondary pruning process, the outcome may differ depending on the pruning order between residual image 1 (422) and residual image 2 (423). When residual image 2 (423) is pruned with residual image 1 (422) as the reference (440), the overlapping region is removed and residual image 3 (433) may be generated. The pruned images may then correspond to residual image 1 (422) and residual image 3 (433). When residual image 1 (422) is pruned with residual image 2 (423) as the reference (441), the overlapping region is removed and residual image 4 (432) may be generated. The pruned images may then correspond to residual image 2 (423) and residual image 4 (432).

When residual image 2 (423) is pruned with residual image 1 (422) as the reference (440), the pruned content may be concentrated in a single image, residual image 1 (422). In contrast, when residual image 1 (422) is pruned with residual image 2 (423) as the reference (441), the object of residual image 1 (422) may be split in half, with the halves residing separately in residual image 2 (423) and residual image 4 (432). The pruned images may then be packed in image units, and the packed images may be compressed and transmitted to a terminal. The terminal may receive them and perform decoding or image synthesis. When image synthesis is performed, having one object reside in one patch or a small number of patches may be more advantageous in terms of compression efficiency and image quality than having the object divided across several patches.
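The effect of the secondary pruning order on how an object is distributed can be sketched with the same kind of toy model, in which each residual image is a set of scene points. The point labels and the helper names `second_pruning` and `images_holding` are hypothetical and serve only to mirror cases (440) and (441).

```python
def second_pruning(reference, target):
    """Remove from `target` every point that `reference` already holds."""
    return target - reference


def images_holding(obj, residuals):
    """Count how many residual images contain at least one point of `obj`."""
    return sum(1 for points in residuals if points & obj)


obj = {"a", "b", "c", "d"}             # one object seen after primary pruning
residual_1 = {"a", "b", "c", "d"}      # holds the whole object
residual_2 = {"c", "d", "e"}           # holds half of the object plus background

# Case (440): residual 1 is the reference, residual 2 is pruned.
residual_3 = second_pruning(residual_1, residual_2)
print(images_holding(obj, [residual_1, residual_3]))

# Case (441): residual 2 is the reference, residual 1 is pruned.
residual_4 = second_pruning(residual_2, residual_1)
print(images_holding(obj, [residual_2, residual_4]))
```

In the first order the object stays in one image, while in the second it is spread over two, which is the situation the paragraph above describes as less favorable.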

FIG. 5 illustrates weight factors used when calculating the importance of images of a plurality of viewpoints, according to an embodiment of the present disclosure. In the present disclosure, when the pruning order is determined, a viewpoint image of high importance may be identified among the plurality of viewpoint images. A viewpoint image of high importance may be given a high priority. When calculating the importance of the plurality of viewpoint images, the position of an object, the distance from the camera, the width of the occluded area generated in image synthesis, and the like may be considered. The importance of each viewpoint image may then be calculated by assigning a weight to each of these factors according to an importance calculation formula (the formula image is not reproduced in this text). In the formula, i may correspond to the index of the individual viewpoint image whose importance is calculated, and x and y may correspond to the coordinates of a pixel in that viewpoint image. Once a weight is calculated for each pixel, the importance of the viewpoint image may be calculated by summing the weights of all its pixels.

Referring to FIG. 5, in image 1 (501), a higher weight may be assigned as the position of an object is closer to the center of the image. The weight may be calculated by a formula (the formula image is not reproduced in this text) in which x and y may correspond to the two-dimensional coordinates of each pixel. Based on the assumption that a pixel belonging to a region of interest or an object of interest in a scene is likely to be located near the center of the camera's view, a weight may be calculated for each pixel in consideration of its distance from the center, as in image 1 (501). Image 2 (502) may correspond to the depth image of image 1 (501). In image 2 (502), the width of the occluded area or the degree of parallax that arises when the foreground or background is warped along the x-axis and y-axis may vary with depth. Accordingly, the depth value may be considered when calculating the importance of a viewpoint image. This is because the more important an object is in a scene, the more likely it is to be close to the camera. When an object belonging to the foreground is warped, a region that had been occluded by the object may be exposed. This region is one that was hidden by the foreground, and its extent may be proportional to the difference in depth values with respect to the adjacent background pixels. That is, the greater the depth difference between the foreground and the background, the wider or longer the occluded area may be. In image 3 (503), the depth difference at the boundary may be calculated through a Sobel operation in order to represent the depth difference between the foreground and the background. The coefficients in the importance calculation formula (their symbols are not reproduced in this text) may correspond to the weights applied to each term of the importance calculation.
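Because the formula images are not available in this text, the following is only a hedged sketch of how a per-pixel weighted sum of the three factors described above (distance from the image center, depth, and a Sobel-based depth difference) might be computed. The function names, the linear form of each term, and the coefficients `alpha`, `beta`, and `gamma` are assumptions for illustration and are not the formula of the disclosure.

```python
import math

def center_weight(x, y, width, height):
    """Weight in [0, 1] that grows as pixel (x, y) gets closer to the image center."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    max_dist = math.hypot(cx, cy) or 1.0   # avoid dividing by zero for a 1x1 image
    return 1.0 - math.hypot(x - cx, y - cy) / max_dist

def sobel_magnitude(depth, x, y):
    """Sobel gradient magnitude of the depth map at (x, y), clamping at the borders."""
    h, w = len(depth), len(depth[0])

    def d(ix, iy):
        return depth[min(max(iy, 0), h - 1)][min(max(ix, 0), w - 1)]

    gx = (d(x + 1, y - 1) + 2 * d(x + 1, y) + d(x + 1, y + 1)
          - d(x - 1, y - 1) - 2 * d(x - 1, y) - d(x - 1, y + 1))
    gy = (d(x - 1, y + 1) + 2 * d(x, y + 1) + d(x + 1, y + 1)
          - d(x - 1, y - 1) - 2 * d(x, y - 1) - d(x + 1, y - 1))
    return math.hypot(gx, gy)

def view_importance(depth, alpha=1.0, beta=1.0, gamma=1.0):
    """Sum an assumed per-pixel weighted score over one viewpoint image.

    depth[y][x] is assumed to be normalized so that larger values mean
    closer to the camera.
    """
    h, w = len(depth), len(depth[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += (alpha * center_weight(x, y, w, h)      # position term
                      + beta * depth[y][x]                   # depth term
                      + gamma * sobel_magnitude(depth, x, y))  # depth-edge term
    return total

flat_background = [[0.0] * 3 for _ in range(3)]
near_object = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(view_importance(near_object) > view_importance(flat_background))
```

Under these assumptions, a view containing a near object at the image center scores higher than a view of flat background, which is the qualitative behavior the description attributes to the weighting.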

The importance calculation formula may be used in various cases. In the process of pruning a plurality of viewpoint images in FIG. 3, a viewpoint image determined to be of high importance may be given a high priority. Accordingly, among the viewpoint images, those of high importance may be pruned less. Being pruned less may mean that fewer pixels are removed as overlapping regions through the pruning process. Conversely, a viewpoint image determined to be of low importance may be given a low priority. Accordingly, among the viewpoint images, those of low importance may be pruned more. Being pruned more may mean that more pixels are removed as overlapping regions through the pruning process.
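The relationship between importance, priority, and the amount of pruning can be sketched as follows. This is an illustrative simplification rather than the method of the disclosure: views are modeled as sets of scene-point IDs, the basic viewpoint is omitted, and `pruning_order` and `prune_in_order` are hypothetical names.

```python
def pruning_order(importance_by_view):
    """Return view IDs from most to least important (ties broken by view ID)."""
    return sorted(importance_by_view, key=lambda v: (-importance_by_view[v], v))

def prune_in_order(points_by_view, order):
    """Prune each view against every view that precedes it in `order`."""
    kept, seen = {}, set()
    for view in order:
        kept[view] = points_by_view[view] - seen   # drop points already covered
        seen |= points_by_view[view]
    return kept

points = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {3, 4, 5}}
importance = {"a": 1.0, "b": 5.0, "c": 3.0}

order = pruning_order(importance)
kept = prune_in_order(points, order)
removed = {view: len(points[view]) - len(kept[view]) for view in order}
print(order, removed)
```

In this sketch the highest-priority view loses no pixels, and lower-priority views lose the points that earlier views already cover.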

FIG. 6 illustrates a change in the target position when calculating the importance of images of a plurality of viewpoints, according to an embodiment of the present disclosure.

Referring to FIG. 6, nine cameras, from camera 0 to camera 8, may capture viewpoint images at their respective positions. An image at a target virtual viewpoint may be synthesized using the viewpoint images captured simultaneously. In this case, the target position may change along the time axis from position p0 through p4 and p6 to p15. The target position may indicate the position of the virtual camera through which the user views the scene. The tracked coordinates of the virtual camera moving along the time axis may be defined as a pose trace. The pose trace may correspond to the coordinates of the target virtual viewpoint image that the user moves freely while viewing the synthesized viewpoint image on the terminal. These coordinates may be obtained from user movement captured at the terminal and transmitted to the encoder on the server side, or may correspond to expected user movement coordinates predefined according to the producer's intent.

At this time, the three-dimensional x, y, and z coordinates of the virtual camera at the target position may be translated. In addition, the pose, which is the rotation angle at which the virtual camera views the scene, may also change. When the target position changes, the number of reference viewpoint images or the importance of each viewpoint image may differ. Here, a high priority may be given to a viewpoint image that is highly correlated with the target position. A viewpoint image given a high priority may be pruned less.

Also, when selecting the basic viewpoint image, a viewpoint that is highly correlated with the target position may be selected as the basic viewpoint. Information on the basic viewpoint may be transmitted as metadata. The basic viewpoint image may correspond to the image of the viewpoint that serves as the reference for pruning. Overlapping regions may be removed from the reference viewpoint images through a pruning process that refers to the pixels of the basic viewpoint image.
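The disclosure does not specify how the correlation between a viewpoint and the target position is measured. The sketch below assumes, purely for illustration, that correlation is the inverse of the Euclidean distance between camera position and target position, and it ignores the rotation component of the pose. The names `rank_views_by_target` and `select_views`, and the camera layout, are hypothetical.

```python
import math

def rank_views_by_target(camera_positions, target_position):
    """Sort camera IDs by distance to the target position, nearest first."""
    return sorted(
        camera_positions,
        key=lambda cam: (math.dist(camera_positions[cam], target_position), cam),
    )

def select_views(camera_positions, target_position):
    """Pick the nearest camera as the basic viewpoint; the rest are reference
    viewpoints listed in descending priority."""
    ranked = rank_views_by_target(camera_positions, target_position)
    return ranked[0], ranked[1:]

# Assumed layout: cameras 0..8 spaced one unit apart along the x-axis.
cameras = {i: (float(i), 0.0, 0.0) for i in range(9)}

# Two samples of an assumed pose trace: the selection follows the target.
base_early, refs_early = select_views(cameras, (2.2, 0.0, 0.0))
base_late, refs_late = select_views(cameras, (6.9, 0.0, 0.0))
print(base_early, refs_early[:2])
print(base_late, refs_late[:2])
```

As the target position moves along the pose trace, both the basic viewpoint and the priority of the reference viewpoints change in this sketch, which mirrors the behavior described above.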

FIG. 7 illustrates a method of determining a protection region mask in a process of pruning images of a plurality of viewpoints, according to an embodiment of the present disclosure.

Referring to FIG. 7, 701 may include two viewpoint images of a single object captured from different positions (the symbols denoting the two images are not reproduced in this text). If one of the two viewpoint images has a higher priority than the other, pruning may be performed as in 702. Conversely, if it has a lower priority than the other, pruning may be performed as in 703. In 701, one viewpoint image may have little texture information for the number face 4 and much texture information for the side face, while the other viewpoint image may have little texture information for the side face and much texture information for the number face 4. In 704, the texture information that each of the two viewpoint images holds in abundance is preserved, and the texture information of less important regions may be removed. When pruning is performed as in 704, good results may be obtained in image synthesis. To perform pruning as in 704, a protection region mask (protection mask) may be designated in advance, as in 705, on the regions of the two viewpoint images judged to be important. When a protection region mask is designated as in 705, the corresponding region may be preserved without being pruned in the pruning process, regardless of the priority of the viewpoint image. That is, in 705, if a protection region mask is designated on the side face of the viewpoint image rich in side-face texture and on the number face 4 of the viewpoint image rich in number-face texture, pruning may be performed as in 704.

FIG. 8 illustrates determining a protection region mask based on an object such as a person, according to an embodiment of the present disclosure. For example, the protection region mask may have the shape of a polygon such as a triangle or a quadrangle, or of a figure such as a circle. For example, a user may designate a protection region mask in any free-form shape. For example, the protection region mask may be designated automatically by a computer vision algorithm or the like. For example, the protection region mask may be designated from a region that the user gazes at or that attracts the user's interest while viewing the synthesized image on the terminal, with the region transmitted to the server as separate metadata. For example, the protection region mask may correspond to an object mask image of an object, such as a person or a thing, separated from the background of the scene, or to a second mask image generated based on the object mask image. However, the present disclosure is not limited to these embodiments.

Referring to FIG. 8, the region of a person in an image may correspond to a region of high interest or an important region. Accordingly, a protection region mask may be designated on the region of the person, and in the pruning process that region may be excluded from pruning. A region synthesized from pixels excluded from pruning may have improved synthesis quality relative to other regions.
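The effect of a protection region mask on pruning can be sketched as below. The model is an assumption for illustration only: each viewpoint image and each mask is a set of pixel labels, the two images are pruned against each other, and `prune_with_protection` is a hypothetical helper rather than a function defined by the disclosure.

```python
def prune_with_protection(target, reference, protected):
    """Drop pixels of `target` that `reference` also contains, unless protected."""
    return {p for p in target if p not in reference or p in protected}

# Assumed labels for the two faces of the object in FIG. 7.
side_face = {"s1", "s2"}
number_face = {"n1", "n2"}

# Both views see both faces, so without masks everything is redundant.
view_a = side_face | number_face     # assumed rich in side-face texture
view_b = side_face | number_face     # assumed rich in number-face texture

# Masks as in 705: each view protects the face it captures best.
kept_a = prune_with_protection(view_a, view_b, protected=side_face)
kept_b = prune_with_protection(view_b, view_a, protected=number_face)

print(sorted(kept_a), sorted(kept_b))
```

With the masks in place each view retains the face it protects, independently of which view has the higher priority, corresponding to the outcome shown as 704.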

FIG. 9 illustrates an image encoding method according to an embodiment of the present disclosure.

Referring to FIG. 9, images of a plurality of viewpoints may be acquired (S910).

Then, a basic viewpoint and a plurality of reference viewpoints may be determined from among the plurality of viewpoints (S920).

According to an embodiment, the basic viewpoint may be determined based on a change in a target position indicating the position of the user's virtual camera.

According to an embodiment, a plurality of basic viewpoints may be determined from among the plurality of viewpoints.

According to an embodiment, information on the basic viewpoint may be transmitted as metadata, as separate information.

Then, a pruning order of the plurality of reference viewpoints may be determined (S930).

According to an embodiment, the importance of the plurality of reference viewpoints may be determined, and the pruning order of the plurality of reference viewpoints may be determined based on the importance.

According to an embodiment, a weight may be assigned to each pixel in the images of the plurality of reference viewpoints, and the importance of the plurality of reference viewpoints may be determined based on the weights.

According to an embodiment, the weight may be assigned to each pixel in an image based on at least one of the position of an object, the distance from the camera, the width of an occluded area, and a depth value.

According to an embodiment, the importance of the plurality of reference viewpoints may be determined based on a change in a target position indicating the position of the user's virtual camera.

Then, pruning may be performed between the image of the basic viewpoint and the images of the plurality of reference viewpoints based on the pruning order (S940).

According to an embodiment, a protection region mask may be determined based on important regions of the images of the plurality of viewpoints.

According to an embodiment, a region for which the protection region mask is determined may not be pruned.

According to an embodiment, the protection region mask may have an arbitrary shape.
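Steps S920 to S940 can be put together in a compact sketch. It is an illustration under stated assumptions, not the encoder of the disclosure: views are sets of scene-point IDs, the basic viewpoint and the importance values are given as inputs, packing and compression are omitted, and `encode_views` is a hypothetical name.

```python
def encode_views(views, base_id, importance, protected=None):
    """Prune reference views in importance order against the base view and
    against earlier reference views, keeping protected pixels.

    views:      dict of view ID -> set of scene-point IDs (assumed model)
    importance: dict of reference view ID -> importance score
    protected:  optional dict of view ID -> set of pixels that must survive
    """
    protected = protected or {}
    reference_ids = [v for v in views if v != base_id]                 # S920
    order = sorted(reference_ids, key=lambda v: (-importance[v], v))   # S930

    pruned = {base_id: set(views[base_id])}   # the base view is kept whole
    seen = set(views[base_id])
    for view in order:                                                 # S940
        keep_mask = protected.get(view, set())
        pruned[view] = {p for p in views[view] if p not in seen or p in keep_mask}
        seen |= views[view]
    return order, pruned

views = {"base": {1, 2}, "r1": {2, 3, 4}, "r2": {3, 4, 5}}
order, pruned = encode_views(
    views,
    base_id="base",
    importance={"r1": 1.0, "r2": 2.0},
    protected={"r1": {4}},
)
print(order, pruned)
```

Here the more important reference view is pruned only against the base view, while the less important one keeps just its protected pixel.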

FIG. 10 illustrates an image decoding method according to an embodiment of the present disclosure.

Referring to FIG. 10, image data for images of a plurality of viewpoints may be obtained (S1010).

Then, a basic viewpoint and a plurality of reference viewpoints may be determined from among the plurality of viewpoints (S1020).

According to an embodiment, a plurality of basic viewpoints may be determined from among the plurality of viewpoints.

Then, a pruning order of the plurality of reference viewpoints may be determined (S1030).

According to an embodiment, a weight may be assigned to each pixel based on at least one of the position of an object, the distance from the camera, the width of an occluded area, and a depth value.

Then, the image data may be parsed based on the pruning order, and the image of the basic viewpoint and the images of the plurality of reference viewpoints may be decoded (S1040).

According to an embodiment, a computer-readable recording medium storing a bitstream including encoded image data that is decoded according to the image decoding method may be provided.

In the embodiments described above, the methods are described on the basis of flowcharts as a series of steps or units, but the present disclosure is not limited to the order of the steps, and some steps may occur in a different order from, or simultaneously with, other steps described above. In addition, those of ordinary skill in the art will understand that the steps shown in the flowcharts are not exclusive, that other steps may be included, and that one or more steps of a flowchart may be deleted without affecting the scope of the present disclosure.

The embodiments described above include examples of various aspects. Although not every possible combination for representing the various aspects can be described, those of ordinary skill in the art will recognize that other combinations are possible. Accordingly, the present disclosure is intended to embrace all other replacements, modifications, and changes that fall within the scope of the following claims.

The embodiments according to the present disclosure described above may be implemented in the form of program instructions that can be executed by various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present disclosure, or may be known to and usable by those skilled in the field of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present disclosure, and vice versa.

Although the present disclosure has been described above with reference to specific details such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the present disclosure. The present disclosure is not limited to the above embodiments, and those of ordinary skill in the art to which the present disclosure pertains may make various modifications and variations from this description.

Therefore, the spirit of the present disclosure should not be limited to the embodiments described above, and not only the claims set out below but also everything modified equally or equivalently to those claims falls within the scope of the spirit of the present disclosure.

Claims

An image encoding method comprising:
acquiring images of a plurality of viewpoints;
determining a basic viewpoint and a plurality of reference viewpoints among the plurality of viewpoints;
determining a pruning order of the plurality of reference viewpoints;
determining a protection region mask based on important regions of the images of the plurality of viewpoints; and
performing pruning between an image of the basic viewpoint and images of the plurality of reference viewpoints based on the pruning order,
wherein a region for which the protection region mask is determined is not pruned.

delete

The image encoding method of claim 1,
wherein the protection region mask has an arbitrary shape.

The image encoding method of claim 1,
wherein determining the pruning order of the plurality of reference viewpoints comprises:
determining importance of the plurality of reference viewpoints; and
determining the pruning order of the plurality of reference viewpoints based on the importance.

The image encoding method of claim 5,
wherein determining the importance of the plurality of reference viewpoints comprises:
assigning a weight to each pixel in the images of the plurality of reference viewpoints; and
determining the importance of the plurality of reference viewpoints based on the weights.

The image encoding method of claim 6,
wherein, in assigning the weight,
the weight is assigned to each pixel in the image based on at least one of a position of an object, a distance from a camera, a width of an occluded area, and a depth value.

The image encoding method of claim 5,
wherein, in determining the importance of the plurality of reference viewpoints,
the importance of the plurality of reference viewpoints is determined based on a target position indicating a position of a user's virtual camera.

The image encoding method of claim 1,
wherein, in determining the basic viewpoint and the plurality of reference viewpoints among the plurality of viewpoints,
the basic viewpoint is determined based on a change in a target position indicating a position of a user's virtual camera.

An image decoding method comprising:
acquiring image data for images of a plurality of viewpoints;
determining a basic viewpoint and a plurality of reference viewpoints among the plurality of viewpoints;
determining a pruning order of the plurality of reference viewpoints;
determining a protection region mask based on important regions of the images of the plurality of viewpoints; and
parsing the image data based on the pruning order and decoding an image of the basic viewpoint and images of the plurality of reference viewpoints,
wherein a region for which the protection region mask is determined is not pruned.

delete

The image decoding method of claim 10,
wherein the protection region mask has an arbitrary shape.

The image decoding method of claim 10,
wherein determining the pruning order of the plurality of reference viewpoints comprises:
determining importance of the plurality of reference viewpoints; and
determining the pruning order of the plurality of reference viewpoints based on the importance.

The image decoding method of claim 14,
wherein determining the importance of the plurality of reference viewpoints comprises:
assigning a weight to each pixel in the images of the plurality of reference viewpoints; and
determining the importance of the plurality of reference viewpoints based on the weights.

The image decoding method of claim 15,
wherein, in assigning the weight,
the weight is assigned to each pixel based on at least one of a position of an object, a distance from a camera, a width of an occluded area, and a depth value.

The image decoding method of claim 14,
wherein, in determining the importance of the plurality of reference viewpoints,
the importance of the plurality of reference viewpoints is determined based on a change in a target position indicating a position of a user's virtual camera.

The image decoding method of claim 10,
wherein, in determining the basic viewpoint and the plurality of reference viewpoints among the plurality of viewpoints,
the basic viewpoint is determined based on a target position indicating a position of a user's virtual camera.

The image decoding method of claim 10,
wherein a plurality of basic viewpoints are determined among the plurality of viewpoints.

A computer-readable recording medium storing a bitstream including encoded image data that is decoded according to an image decoding method,
wherein the image decoding method comprises:
acquiring image data for images of a plurality of viewpoints;
determining a basic viewpoint and a plurality of reference viewpoints among the plurality of viewpoints;
determining a pruning order of the plurality of reference viewpoints;
determining a protection region mask based on important regions of the images of the plurality of viewpoints; and
parsing the image data based on the pruning order and decoding an image of the basic viewpoint and images of the plurality of reference viewpoints,
wherein a region for which the protection region mask is determined is not pruned.