KR102417959B1

KR102417959B1 - Apparatus and method for providing three dimensional volumetric contents

Info

Publication number: KR102417959B1
Application number: KR1020200093601A
Authority: KR
Inventors: 이재희; 변수진; 이정숙; 최지희
Original assignee: 주식회사 엘지유플러스
Priority date: 2020-07-28
Filing date: 2020-07-28
Publication date: 2022-07-06
Also published as: KR20220014037A

Abstract

According to an embodiment, a method for providing three-dimensional (3D) volumetric content may include: extracting a 3D point cloud based on two-dimensional (2D) object images captured by a plurality of cameras; and decomposing the 3D point cloud into a plurality of subframes, differentially compressing the plurality of subframes, converting them into 2D frames representing a geometry image and a texture image, and outputting an encoded bitstream.

Description

Apparatus and method for providing three-dimensional volumetric content {APPARATUS AND METHOD FOR PROVIDING THREE DIMENSIONAL VOLUMETRIC CONTENTS}

The present invention relates to an apparatus for providing 3D volumetric content, and more particularly to an apparatus and method for providing 3D volumetric content that can implement an efficient streaming service over a mobile communication network by decomposing a 3D point cloud into 2D frames and outputting them using a differential compression technique based on the user's quality of experience.

The media content that people encounter today has evolved from black-and-white to color video and from low resolution to high resolution. In addition, to provide lifelike content, user-centered 360-degree virtual reality (VR) content and ultra-wide vision (UWV) content, an immersive medium with a wide viewing angle, have appeared, and curved displays, head-mounted displays (HMDs), and similar devices have come into use to make such content more immersive.

Media technology has thus continued to evolve to deliver lifelike experiences, and attention has turned to 3D media that gives users a free viewing angle and a sense of depth. Among these, the 3D point cloud is attracting attention as a next-generation medium in the AR/VR and autonomous-vehicle fields.

However, representing a 3D point cloud requires tens of thousands to hundreds of thousands of points and far more storage than conventional 2D video. Accordingly, various technologies need to be developed to serve the massive data of a 3D point cloud efficiently.

An embodiment aims to provide an apparatus and method for providing 3D volumetric content that decompose a large 3D point cloud into 2D frames and compress them differentially according to importance factors assigned in advance in consideration of the user's quality of experience, thereby implementing an efficient streaming service over a mobile communication network.

The technical problems to be solved by the embodiments are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

An embodiment may provide a method for providing 3D volumetric content, the method including: extracting a 3D point cloud based on 2D object images captured by a plurality of cameras; and decomposing the 3D point cloud into a plurality of subframes, differentially compressing the plurality of subframes, converting them into 2D frames representing a geometry image and a texture image, and outputting an encoded bitstream.

The extracting may include obtaining parameters related to the transformation between 3D spatial coordinates and 2D image coordinates based on matching points between the object images.

The method may further include: generating a bounding box corresponding to the 3D point cloud; and dividing the bounding box into a voxel grid of a predetermined size.

The bounding box may be generated for every frame.

The outputting may include orthogonally projecting the points of the 3D point cloud located inside the voxel grid onto each face of the bounding box, and each of the plurality of subframes may include a face of the bounding box and a plurality of patches, each patch being a cluster of points projected onto that face.

The outputting may include adjusting the compression ratios of the plurality of subframes based on importance factors assigned in advance, and differentially compressing the plurality of subframes according to the adjusted compression ratios.

The bounding box may have the form of a polyhedron, and the plurality of subframes may include: a first subframe corresponding to the front face of the bounding box; a second subframe corresponding to the back face and the left and right side faces of the bounding box; and a third subframe corresponding to the top and bottom faces of the bounding box.

Here, the compressing may include setting the first subframe to a first compression ratio and the second subframe to a second compression ratio, where the first compression ratio may be smaller than the second compression ratio.

In addition, the compressing may include setting the third subframe to a third compression ratio greater than the second compression ratio.

The geometry image may include position information for the points of the 3D point cloud, and the texture image may include color information for the points of the 3D point cloud.

The method may further include receiving the output bitstream and decoding and rendering the 2D frames to generate the 3D volumetric content.

According to at least one embodiment of the present invention, a large 3D point cloud is decomposed into 2D frames, and the 2D frames are compressed differentially in consideration of the user's quality of experience (QoE) for each view. This can substantially reduce the bandwidth of the transmitted data while keeping the quality of the rendered 3D volumetric content at a certain level. Accordingly, the user can receive an efficient streaming service over a mobile communication network with an adequate level of data communication quality.

The effects obtainable from the embodiments are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a schematic block diagram of a system for providing 3D volumetric content according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the multi-view image acquisition apparatus shown in FIG. 1.
FIG. 3 is a schematic configuration diagram of a server according to an embodiment of the present invention.
FIG. 4 shows a 3D point cloud according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram for explaining a bounding box and a voxel grid according to an embodiment of the present invention.
FIG. 6 shows a configuration example of an encoder according to an embodiment of the present invention.
FIG. 7 is an exemplary diagram for explaining how the encoder projects a 3D point cloud onto each face of the bounding box according to an embodiment of the present invention.
FIG. 8 is an exemplary diagram for explaining how the patches shown in FIG. 7 are decomposed into a plurality of subframes.
FIG. 9 is an exemplary diagram for explaining how the encoder differentially compresses a plurality of subframes according to an embodiment of the present invention.
FIG. 10 shows 2D frames representing a 3D point cloud according to an embodiment of the present invention.
FIG. 11 is a schematic configuration diagram of a client according to an embodiment of the present invention.
FIG. 12 is a schematic flowchart of a method for providing 3D volumetric content according to an embodiment of the present invention.
FIG. 13 is a flowchart explaining in more detail the procedure of step S40 shown in FIG. 12.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. Because the embodiments may be modified in various ways and take various forms, specific embodiments are illustrated in the drawings and described in detail in the text. This is not intended to limit the embodiments to the specific forms disclosed, and the embodiments should be understood to include all modifications, equivalents, and substitutes falling within their spirit and technical scope.

Terms such as "first" and "second" may be used to describe various elements, but the elements should not be limited by these terms. The terms are used only to distinguish one element from another. In addition, terms specifically defined in consideration of the configuration and operation of the embodiments are only for describing the embodiments and do not limit their scope.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present application, terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by those of ordinary skill in the art to which the present invention pertains. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless explicitly so defined in the present application.

FIG. 1 is a schematic block diagram of a system for providing 3D volumetric content according to an embodiment of the present invention.

As shown in FIG. 1, the system may include a multi-view image acquisition apparatus 100, a server 200, and a client 300.

The multi-view image acquisition apparatus 100 acquires 2D multi-view images of an object (hereinafter, "object images") through a plurality of cameras. For example, the multi-view image acquisition apparatus 100 may synchronize the cameras, capture the object simultaneously from different directions to obtain multiple images, and then process the multiple images frame by frame in real time to obtain the object images.

The server 200 may extract a 3D point cloud based on the object images received from the multi-view image acquisition apparatus 100, convert the 3D point cloud into 2D frames representing its geometry and texture attributes, and output an encoded bitstream.

The client 300 may receive the encoded bitstream from the server 200, decode and render the 2D frames to generate 3D volumetric content, and play it back.

The client 300 is a terminal subscribed to and registered with a particular mobile communication network service, and may communicate with the server 200 through a network (NW) operated by a network operator. The network (NW) may include, for example, Wi-Fi, Bluetooth, cellular, Long-Term Evolution (LTE), LTE-Advanced (LTE-A), fifth-generation (5G), WiMAX, or any other type of wireless network.

The client 300 may also include, for example, a mobile terminal such as a smartphone, notebook computer, digital broadcasting terminal, personal digital assistant (PDA), portable multimedia player (PMP), navigation device, slate PC, tablet PC, ultrabook, wearable device (for example, a smart watch or smart glasses), or head-mounted display (HMD), and/or a fixed terminal such as a digital TV, desktop computer, or digital signage.

FIG. 2 is a diagram for explaining the multi-view image acquisition apparatus shown in FIG. 1.

Referring to FIG. 2, the multi-view image acquisition apparatus 100 includes a plurality of cameras 110 arranged at regular intervals in a chroma-key studio space.

For example, the cameras 110 may be arranged at regular intervals along a circumference so as to face the object 120, and the multi-view image acquisition apparatus 100 may perform calibration in advance so that the output timing of the cameras 110 is synchronized. However, the arrangement of the cameras 110 is not limited to the illustrated form and may vary with the dynamic characteristics of the object 120, including, for example, parallel or divergent arrangements.

After calibrating the cameras 110, the multi-view image acquisition apparatus 100 may capture the object 120 simultaneously from all directions to collect multiple images, and process the multiple images frame by frame in real time to obtain 2D object images.

The multi-view image acquisition apparatus 100 may also extract the 3D geometric coordinates and colors of the object 120 from the obtained 2D object images and transmit them to the server 200. Each of the cameras 110 may be a camera with at least HD resolution and 24-bit color, but the scope of the present invention is not limited thereto.

Hereinafter, the server 200, which extracts a 3D point cloud from the object images and decomposes and compresses it into 2D frames, is described in more detail with reference to FIG. 3.

FIG. 3 is a schematic configuration diagram of a server according to an embodiment of the present invention.

Referring to FIG. 3, the server 200 may include a point cloud extraction unit 210, a voxel grid generation unit 220, and an encoder 230. This is only an example, and the server 200 may additionally include a memory that stores instructions and data and a network interface that supports communication over a network.

The point cloud extraction unit 210 extracts a 3D point cloud based on the object images input from the multi-view image acquisition apparatus 100. For example, the point cloud extraction unit 210 may obtain parameters related to the transformation between 3D spatial coordinates and 2D image coordinates based on matching points between the object images, and extract the 3D point cloud corresponding to the object 120 based on those parameters. Here, the 3D point cloud is defined as digitized data that visually defines the object 120 in 3D space, and is illustrated in more detail in FIG. 4.
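The patent does not say which camera model the "parameters related to the transformation between 3D spatial coordinates and 2D image coordinates" belong to. As a minimal sketch, assuming an ideal pinhole camera with no rotation and no lens distortion, those parameters can be collected in a 3x4 projection matrix that maps a 3D point to a pixel. The function names and numeric values below are illustrative, not taken from the patent, and estimating the matrix from matching points is not shown.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def projection_matrix(fx: float, fy: float, cx: float, cy: float,
                      t: Sequence[float] = (0.0, 0.0, 0.0)) -> Matrix:
    """Build P = K [I | t] for an unrotated pinhole camera (assumed model)."""
    tx, ty, tz = t
    return [
        [fx, 0.0, cx, fx * tx + cx * tz],
        [0.0, fy, cy, fy * ty + cy * tz],
        [0.0, 0.0, 1.0, tz],
    ]


def project(P: Matrix, point: Sequence[float]) -> Tuple[float, float]:
    """Map a 3D point to 2D image coordinates via homogeneous coordinates."""
    h = (point[0], point[1], point[2], 1.0)
    u, v, w = (sum(P[i][j] * h[j] for j in range(4)) for i in range(3))
    if w <= 0:
        raise ValueError("point is not in front of the camera")
    return (u / w, v / w)


# Hypothetical intrinsics: 1000-pixel focal length, principal point (640, 360).
P = projection_matrix(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
pixel = project(P, (0.5, -0.25, 2.0))
```

With one such matrix per calibrated camera, a point seen in two or more object images can in principle be triangulated back into 3D, which is the direction the extraction unit works in.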

FIG. 4 shows a 3D point cloud according to an embodiment of the present invention.

Referring to FIG. 4, the 3D point cloud 40 consists of a plurality of points 42, and each point 42 may have one or more attributes. The attributes may include geometry, such as the geometric position of each point 42, as well as color, brightness, motion, material, reflectance, intensity, and the like. Attributes other than geometry may be referred to as texture, which represents the various aspects and properties associated with each point 42 of the 3D point cloud 40.

That is, the 3D point cloud 40 is defined as a set of data points in a coordinate system, and each point 42 represents the outer surface of the object. For example, in a Cartesian coordinate system, each point 42 of the 3D point cloud 40 is identified by three coordinates X, Y, and Z, and the color of a point may be a combination of red (R), green (G), and blue (B).
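The point representation described above can be sketched as a small record type. The field layout, the 8-bit color channels, and the 4-byte coordinates used in the size estimate are assumptions made for illustration; the patent does not fix a storage format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Point:
    """One point of a 3D point cloud: Cartesian position plus RGB color."""
    x: float
    y: float
    z: float
    r: int  # 0-255; 8 bits per channel is an assumption
    g: int
    b: int

    def position(self) -> Tuple[float, float, float]:
        return (self.x, self.y, self.z)

    def color(self) -> Tuple[int, int, int]:
        return (self.r, self.g, self.b)


def raw_size_bytes(cloud: List[Point], bytes_per_coord: int = 4) -> int:
    """Uncompressed size: three coordinates plus three color bytes per point."""
    return len(cloud) * (3 * bytes_per_coord + 3)


cloud = [Point(0.0, 0.0, 0.0, 255, 0, 0), Point(1.0, 2.0, 3.0, 0, 255, 0)]
```

Under these assumptions each point costs 15 bytes before any compression, so the per-frame size grows linearly with the number of points.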

Because of the large number of points 42 and their associated attributes, such a 3D point cloud 40 requires considerable bandwidth to transmit. Therefore, to transmit the 3D point cloud 40 efficiently from the server 200 to the client 300, it must be decomposed into 2D frames and compressed, which is preceded by generating a bounding box for projecting the 3D point cloud 40 onto the 2D frames.

The voxel grid generation unit 220 generates a bounding box for decomposing the 3D point cloud 40 into 2D frames, and divides the bounding box into a voxel grid of a predetermined size. This is described below with reference to FIG. 5.

FIG. 5 is an exemplary diagram for explaining a bounding box and a voxel grid according to an embodiment of the present invention.

Referring to FIG. 5(a), the voxel grid generation unit 220 generates a bounding box 50 corresponding to the 3D point cloud 40 (FIG. 4). The bounding box 50 may be a cube that completely encloses the 3D point cloud 40. The bounding box 50 is formed as a 3D structure such as a hexahedron to prevent distortion or noise at the boundaries when the 3D point cloud 40 is projected onto 2D planes. However, the scope of the present invention is not limited thereto, and depending on the embodiment, the bounding box 50 may be extended to other polyhedra such as an octahedron or an icosahedron.
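A cube that encloses the point cloud can be computed from the per-axis extremes of the points. The sketch below assumes an axis-aligned cube centered on the tight box; the patent does not state how the cube is oriented or positioned, and the function name is hypothetical.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def cubic_bounding_box(points: Iterable[Vec3]) -> Tuple[Vec3, float]:
    """Return (origin, side) of an axis-aligned cube enclosing all points.

    `origin` is the corner with the smallest coordinates.
    """
    pts = list(points)
    if not pts:
        raise ValueError("point cloud is empty")
    mins = [min(p[i] for p in pts) for i in range(3)]
    maxs = [max(p[i] for p in pts) for i in range(3)]
    # The longest extent becomes the cube's side length.
    side = max(maxs[i] - mins[i] for i in range(3))
    # Center the cube on the tight box so the cloud sits in the middle.
    origin = tuple((mins[i] + maxs[i]) / 2 - side / 2 for i in range(3))
    return origin, side


origin, side = cubic_bounding_box([(0, 0, 0), (2, 4, 1)])
```

Calling this once per captured frame matches the statement that the bounding box may be generated for every frame.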

Depending on the position of the 3D point cloud 40 it contains, the bounding box 50 has six faces: front, back, right, left, top, and bottom. In this specification, each face is referred to as a 'subframe', a kind of 2D plane onto which the 3D point cloud 40 is projected.

Referring to FIG. 5(b), the voxel grid generation unit 220 voxelizes the bounding box 50, dividing it into a plurality of voxels 52 and generating a voxel grid 54 of a predetermined size.

A voxel 52 is a pixel with volume, and the size of each voxel 52 may be preset in the system by the user. The smaller the voxels 52, the more accurate the rendering result but the more memory the texture requires. Each point 42 of the 3D point cloud 40 belongs to exactly one voxel 52.
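Assigning each point to exactly one voxel can be done by integer division of its offset from the bounding box origin. This is a minimal sketch assuming cubic voxels of a single edge length; the names `voxelize` and `voxel_size` are illustrative.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
VoxelIndex = Tuple[int, int, int]


def voxelize(points: Sequence[Vec3], origin: Vec3,
             voxel_size: float) -> Dict[VoxelIndex, List[Vec3]]:
    """Group points by the index of the voxel that contains them."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    grid: Dict[VoxelIndex, List[Vec3]] = defaultdict(list)
    for p in points:
        # floor() puts every point in exactly one cell of the grid.
        key = tuple(math.floor((p[i] - origin[i]) / voxel_size)
                    for i in range(3))
        grid[key].append(p)
    return dict(grid)


grid = voxelize(
    [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.0, 0.9)],
    origin=(0.0, 0.0, 0.0),
    voxel_size=0.5,
)
```

Halving `voxel_size` multiplies the number of possible cells by eight, which illustrates the accuracy versus memory trade-off described above.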

The encoder 230 may project the 3D point cloud 40 onto each face of the bounding box 50 to decompose it into a plurality of subframes, differentially compress the plurality of subframes, convert them into 2D frames representing geometry and texture attributes, and output an encoded bitstream. This is described in more detail below with reference to FIG. 6.

FIG. 6 shows a configuration example of an encoder according to an embodiment of the present invention.

Referring to FIG. 6, the encoder 230 may include a patch generation unit 231, a differential compression unit 233, a frame packing unit 235, an encoding engine 237, and a multiplexer 239.

The patch generation unit 231 projects the 3D point cloud 40 onto each face of the bounding box 50 to generate a plurality of patches 231a. The patches 231a are generated by segmenting the points 42 of the 3D point cloud 40. Specifically, the 3D point cloud 40 is segmented by clustering the points 42 based on their normal vectors. The clustered points are projected from 3D space onto the 2D subframes, and each projected cluster is defined as a patch 231a. The patches 231a are arranged and collected within the individual subframes S1 to S6.
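The normal-based clustering and projection can be sketched as follows. This simplified version assumes precomputed point normals, an axis convention in which +Z is the front face, and a single cluster per face; a practical patch generator would further split each group into connected patches, which is not shown here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Sample = Tuple[Tuple[float, float], float]  # ((u, v), depth)

# Outward normals of the six bounding box faces (axis convention assumed).
FACES: Dict[str, Vec3] = {
    "front": (0.0, 0.0, 1.0),
    "back": (0.0, 0.0, -1.0),
    "right": (1.0, 0.0, 0.0),
    "left": (-1.0, 0.0, 0.0),
    "top": (0.0, 1.0, 0.0),
    "bottom": (0.0, -1.0, 0.0),
}


def nearest_face(normal: Vec3) -> str:
    """Pick the face whose normal has the largest dot product with `normal`."""
    return max(FACES, key=lambda name: sum(
        n * f for n, f in zip(normal, FACES[name])))


def project_to_face(point: Vec3, face: str) -> Sample:
    """Orthographic projection: drop the axis along the face normal,
    keeping the dropped coordinate as depth."""
    x, y, z = point
    nx, ny, _ = FACES[face]
    if nx != 0:
        return (y, z), x
    if ny != 0:
        return (x, z), y
    return (x, y), z


def build_patches(points: Sequence[Vec3],
                  normals: Sequence[Vec3]) -> Dict[str, List[Sample]]:
    """Cluster points by dominant normal direction and project each cluster."""
    patches: Dict[str, List[Sample]] = defaultdict(list)
    for p, n in zip(points, normals):
        face = nearest_face(n)
        patches[face].append(project_to_face(p, face))
    return dict(patches)


patches = build_patches(
    [(1, 2, 3), (4, 5, 6)],
    [(0.1, 0.2, 0.9), (-0.8, 0.1, 0.1)],
)
```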

The patch generation unit 231 also generates metadata related to the generated patches 231a. The metadata is associated with auxiliary information such as the position of each patch 231a within its subframe S1 to S6, the size of the patch 231a, and the type of projection plane of the patch 231a.

The differential compression unit 233 differentially compresses the plurality of subframes 231b representing the 3D point cloud 40 according to importance factors assigned in advance. Here, an importance factor represents the user's quality of experience (QoE) related to the characteristics of the 3D point cloud 40, and the QoE importance value and the compression ratio are inversely proportional: the more important a subframe, the lower its compression ratio.
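The inverse relationship between importance and compression ratio can be illustrated with a one-line rule, ratio = k / importance. The importance values, the constant `k`, and the grouping of faces into three subframes (front; back, left, and right; top and bottom) are placeholder assumptions chosen only so that the first ratio is smaller than the second and the second smaller than the third.

```python
from typing import Dict

# Grouping of bounding box faces into three subframes (assumed mapping).
FACE_TO_SUBFRAME: Dict[str, int] = {
    "front": 1,
    "back": 2, "left": 2, "right": 2,
    "top": 3, "bottom": 3,
}

# Hypothetical importance factors: larger means more important to the viewer.
IMPORTANCE: Dict[int, float] = {1: 4.0, 2: 2.0, 3: 1.0}


def compression_ratio(importance: float, k: float = 8.0) -> float:
    """Inverse proportionality: ratio = k / importance (k is an assumed constant)."""
    if importance <= 0:
        raise ValueError("importance must be positive")
    return k / importance


ratios = {subframe: compression_ratio(weight)
          for subframe, weight in IMPORTANCE.items()}
```

How a ratio is then realized (for example, through a codec quantization setting or a downscaling factor) is left open by the text and is not modeled here.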

The frame packing unit 235 packs the differentially compressed subframes 231b into a single 2D frame, and decomposes the patches contained in the 2D frame into a geometry image 235a and a texture image 235b.

The geometry image 235a contains geometric coordinate information for each point 42 of the 3D point cloud 40, based on its X, Y, and Z position in 3D space. Each pixel in the geometry image 235a has a corresponding pixel in the texture image 235b, and the texture image 235b represents the color of each corresponding point in the geometry image 235a.

The texture image 235b contains texture information for each point 42 of the 3D point cloud 40. The texture information of a point may include color information consisting of a combination of red (R), green (G), and blue (B) values. Additional texture information may include brightness, motion, material, reflectance, intensity, and the like.

In addition, the frame packing unit 235 generates an occupancy image 235c. The occupancy image indicates the pixel positions in the geometry image 235a and the texture image 235b that contain valid points of the 3D point cloud 40 projected or mapped onto the 2D frame. For example, the occupancy image 235c indicates whether each pixel of the geometry image 235a and the texture image 235b is a valid pixel (see FIG. 7(b)). A valid pixel in the occupancy image 235c is a pixel onto which a point 42 of the 3D point cloud 40 has been projected in the 2D frame.

The occupancy image 235c includes a plurality of occupancy images, each of which corresponds to one image such as the geometry image 235a or the texture image 235b.

The geometry image 235a, the texture image 235b, and the occupancy image 235c generated by the frame packing unit 235 are each encoded by the encoding engine 237.

The encoding engine 237 recompresses the geometry image 235a, the texture image 235b, and the occupancy image 235c to generate an encoded bitstream, thereby reducing the transmission bandwidth of the 3D point cloud 40. For example, once the 3D point cloud 40 has been processed to fit into 2D frames, a video or image codec such as HEVC, AVC, VP9, VP8, or JVNET may be used to recompress the geometry image 235a, the texture image 235b, and the occupancy image 235c. The metadata and the encoded geometry image 235a, texture image 235b, and occupancy image 235c are then multiplexed by the multiplexer 239.

The multiplexer 239 combines the metadata and the encoded geometry image 235a, texture image 235b, and occupancy image 235c into a single bitstream and outputs it to the client 300.

Hereinafter, the patch generation unit 231 shown in FIG. 6 is described in more detail with reference to FIGS. 7 and 8.

FIG. 7 is an exemplary diagram illustrating an operation in which an encoder according to an embodiment of the present invention projects a 3D point cloud onto each face of a bounding box.

The patch generation unit 231 classifies and processes the 3D point cloud 40 into 2D frames to improve coding efficiency.

Referring to FIGS. 7(a) and 7(b) together, the patch generation unit 231 determines a bounding box 50 corresponding to the 3D point cloud 40 for every frame. It then generates patches 231a by orthogonally projecting the points closest to each face of the bounding box 50 onto that face.

For example, each point of the 3D point cloud 40 is projected along one axis of the bounding box 50 (e.g., the Y axis) and mapped onto a plane perpendicular to that axis (e.g., the XZ plane). In doing so, the patch generation unit 231 may classify points that carry mutually adjacent, spatially correlated data so that they are placed in the same patch 231a.

FIG. 8 is an exemplary diagram illustrating an operation of decomposing the patches shown in FIG. 7 into a plurality of subframes.

Referring to FIGS. 8(a) and 8(b) together, the patch generation unit 231 forms the plurality of subframes 231b by combining each face of the bounding box 50 with the patches 231a projected onto that face. The plurality of subframes 231b includes first to sixth subframes S1 to S6 corresponding to the respective faces of the bounding box 50, and the number of subframes S1 to S6 corresponds to the number of faces of the bounding box 50.

The first subframe S1 includes a first patch P1 obtained by orthogonally projecting the points 42 of the 3D point cloud 40 along the positive Y axis onto the XZ plane (the front face of the bounding box 50).

The second subframe S2 includes a second patch P2 obtained by orthogonally projecting the points 42 of the 3D point cloud 40 along the negative Y axis onto the XZ plane (the back face of the bounding box 50).

The third subframe S3 includes a third patch P3 obtained by orthogonally projecting the points 42 of the 3D point cloud 40 along the positive X axis onto the YZ plane (the right face of the bounding box 50).

The fourth subframe S4 includes a fourth patch P4 obtained by orthogonally projecting the points 42 of the 3D point cloud 40 along the negative X axis onto the YZ plane (the left face of the bounding box 50).

The fifth subframe S5 includes a fifth patch P5 obtained by orthogonally projecting the points 42 of the 3D point cloud 40 along the positive Z axis onto the XY plane (the top face of the bounding box 50).

The sixth subframe S6 includes a sixth patch P6 obtained by orthogonally projecting the points 42 of the 3D point cloud 40 along the negative Z axis onto the XY plane (the bottom face of the bounding box 50).
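As an illustration of the six projections described above, the following Python sketch maps a toy point set onto six subframes. The table `FACES`, the function `project`, the integer coordinates, and the rule for which point counts as nearest to a face are assumptions made for this example; they are not taken from the embodiment.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int, int]  # integer (x, y, z); assumed to be pre-quantized

# Subframe -> (projection axis, direction sign, in-plane axes).
# Axis indices: 0 = X, 1 = Y, 2 = Z. This encoding is hypothetical.
FACES: Dict[str, Tuple[int, int, Tuple[int, int]]] = {
    "S1": (1, +1, (0, 2)),  # +Y onto the XZ plane (front)
    "S2": (1, -1, (0, 2)),  # -Y onto the XZ plane (back)
    "S3": (0, +1, (1, 2)),  # +X onto the YZ plane (right)
    "S4": (0, -1, (1, 2)),  # -X onto the YZ plane (left)
    "S5": (2, +1, (0, 1)),  # +Z onto the XY plane (top)
    "S6": (2, -1, (0, 1)),  # -Z onto the XY plane (bottom)
}


def project(points: List[Point], face: str) -> Dict[Tuple[int, int], int]:
    """Orthogonally project points onto one face, one depth value per pixel."""
    axis, sign, (u, v) = FACES[face]
    patch: Dict[Tuple[int, int], int] = {}
    for p in points:
        key = (p[u], p[v])
        # Assumed convention: sign +1 keeps the smallest coordinate along the
        # projection axis, sign -1 keeps the largest.
        if key not in patch or sign * p[axis] < sign * patch[key]:
            patch[key] = p[axis]
    return patch


cloud = [(0, 0, 0), (0, 3, 0), (1, 2, 1)]
subframes = {name: project(cloud, name) for name in FACES}
```

In this toy cloud, two points share the same (x, z) pixel, so S1 and S2 keep different points for that pixel while the top and bottom projections keep all three.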

Hereinafter, an operation of differentially compressing the plurality of subframes 231b according to pre-allocated importance factors is described with reference to FIG. 9.

FIG. 9 is an exemplary diagram illustrating an operation in which an encoder according to an embodiment of the present invention differentially compresses a plurality of subframes.

Referring to FIG. 9, the differential compression unit 233 may adjust the compression ratio of each of the subframes S1 to S6 based on a pre-allocated importance factor and compress each subframe S1 to S6 differentially according to the adjusted compression ratio. Here, the importance factor represents the user's quality of experience (QoE) associated with the characteristics of the 3D point cloud 40.

Depending on the characteristics of the 3D point cloud 40 being projected, the QoE required for each view may differ among the plurality of subframes 231b.

For example, assuming that the 3D point cloud 40 is a dynamic subject such as a human, an animal, a plant, or a character, the front image of the 3D point cloud 40 may be assigned a relatively higher QoE importance value than the images of other views. This is because the front image of the 3D point cloud 40 contains the facial form, body movements, and appearance that are distinctive to each subject within the same category.

In contrast, the top and/or bottom image of the 3D point cloud 40 may be assigned a relatively lower QoE importance value than the images of other views. This is because the crown of the head or the soles of the feet, which appear in the top and/or bottom image, are highly similar among subjects of the same category, which limits how well distinguishing features can be derived from them.

Accordingly, to guarantee an appropriate level of QoE in view of the communication quality of the network NW, the differential compression unit 233 may differentially adjust the compression ratio for each view of the plurality of subframes 231b, as shown in Table 1 below.

Referring to Table 1, the differential compression unit 233 may set the first subframe S1, corresponding to the front face of the bounding box 50, to a first compression ratio, and the second to fourth subframes S2 to S4, corresponding to the back, right, and left faces of the bounding box 50, to a second compression ratio. The fifth and sixth subframes S5 and S6, corresponding to the top and bottom faces of the bounding box 50, may be set to a third compression ratio.

Here, the first to third compression ratios may differ from one another. For example, the first compression ratio may be set smaller than the second compression ratio, and the third compression ratio larger than the second. That is, the QoE importance value and the compression ratio are inversely related.
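The ordering of the three compression ratios can be sketched in Python as follows. The numeric importance values, the ratios 2, 4, and 8, and the use of uniform subsampling as a stand-in for compression are all hypothetical; the values of Table 1 and the actual compression technique are not reproduced here.

```python
from typing import Dict, List

# Hypothetical importance value per view (higher means more important for QoE).
IMPORTANCE: Dict[str, int] = {
    "front": 3,
    "back": 2, "right": 2, "left": 2,
    "top": 1, "bottom": 1,
}

# Hypothetical ratios: the higher the importance, the smaller the ratio.
RATIO_BY_IMPORTANCE: Dict[int, int] = {3: 2, 2: 4, 1: 8}


def compression_ratio(view: str) -> int:
    """Look up the compression ratio assigned to a view."""
    return RATIO_BY_IMPORTANCE[IMPORTANCE[view]]


def compress(samples: List[int], ratio: int) -> List[int]:
    """Toy stand-in for compression: keep every ratio-th sample."""
    return samples[::ratio]


samples = list(range(16))
kept = {view: len(compress(samples, compression_ratio(view))) for view in IMPORTANCE}
```

With these assumed values, the front view keeps 8 of 16 samples, the side and back views keep 4, and the top and bottom views keep 2.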

As described above, when the plurality of subframes 231b are differentially compressed in consideration of the QoE of each view, the bandwidth of the data transmitted to the client 300 is substantially reduced while the quality of the rendered 3D stereoscopic content is maintained at a certain level, so the user can be provided with an appropriate level of data communication quality.

FIG. 10 shows 2D frames representing a 3D point cloud according to an embodiment of the present invention.

Referring to FIG. 10(a), the geometry image 235a of the 2D frame contains a plurality of patches obtained by decomposing the points 42 of the 3D point cloud 40 into 2D, and the patches are encoded at different compression ratios for each view of the subframes S1 to S6 described above. For example, the patch 1010 depicting the face of the 3D point cloud 40 is compressed to a lower point density than the patch 1020 depicting the crown of the head.

In addition, the geometry image 235a contains positional coordinate information (x, y, z) for each point 42 of the 3D point cloud 40, based on its X, Y, and Z position in 3D space.

Referring to FIG. 10(b), the texture image 235b of the 2D frame contains a plurality of patches obtained by decomposing the points 42 of the 3D point cloud 40 into 2D, and the patches are encoded at different compression ratios for each view of the subframes S1 to S6 described above. For example, the patch 1010 depicting the face of the 3D point cloud 40 is compressed to a lower point density than the patch 1020 depicting the crown of the head.

In addition, the texture image 235b contains texture information for each point 42 of the 3D point cloud 40. The texture information of a point may include color information composed of a combination of red (R), green (G), and blue (B) values. Additional texture information may include brightness, motion, material, reflectance, intensity, and the like.

Referring to FIGS. 10(a) and 10(b) together, each pixel in the geometry image 235a has a corresponding pixel in the texture image 235b, and the texture image 235b represents the color of the corresponding point in the geometry image 235a.

Referring to FIG. 10(c), the occupancy image 235c of the 2D frame shows the pixel positions in the geometry image 235a and the texture image 235b that contain valid points of the 3D point cloud 40. For example, the occupancy image 235c expresses, in the form of a binary map, whether each pixel of the geometry image 235a and the texture image 235b is a valid pixel. A valid pixel in the occupancy image 235c is a pixel onto which a point 42 of the 3D point cloud 40 has been projected in the 2D frame.
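The relationship among the three images can be sketched in Python as follows. The function `build_images`, the dictionary-based patch representation, and the choice of 0 as the fill value for empty pixels are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]       # (u, v) position in the 2D frame
Color = Tuple[int, int, int]  # (R, G, B)


def build_images(
    width: int,
    height: int,
    depth: Dict[Pixel, int],
    color: Dict[Pixel, Color],
) -> Tuple[List[List[int]], List[List[Color]], List[List[int]]]:
    """Rasterize projected points into geometry, texture, and occupancy images."""
    geometry = [[0] * width for _ in range(height)]
    texture = [[(0, 0, 0)] * width for _ in range(height)]
    occupancy = [[0] * width for _ in range(height)]
    for (u, v), d in depth.items():
        geometry[v][u] = d            # depth of the projected point
        texture[v][u] = color[(u, v)]  # color of the same point
        occupancy[v][u] = 1           # mark the pixel as valid
    return geometry, texture, occupancy


depth = {(0, 0): 0, (1, 1): 5}
color = {(0, 0): (255, 0, 0), (1, 1): (0, 0, 255)}
geometry, texture, occupancy = build_images(2, 2, depth, color)
```

The pixel at (0, 0) holds a valid point whose depth happens to be 0, which the geometry image alone cannot distinguish from an empty pixel; the binary occupancy map resolves this ambiguity.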

FIG. 11 is a schematic configuration diagram of a client according to an embodiment of the present invention.

Referring to FIG. 11, the client 300 may include a decoder 310, which receives the bitstream from the encoder 230 and performs decoding and rendering on the 2D frames to generate 3D stereoscopic content (3D volumetric contents), and a display unit 320, which plays back the 3D stereoscopic content in response to a user command. This configuration is only an example, however, and the client 300 may omit at least one of the above components or include additional components.

The decoder 310 includes a demultiplexer 311, a decoding engine 313, and a rendering unit 315.

The demultiplexer 311 receives the encoded bitstream generated by the encoder 230 and demultiplexes it into separate compressed bitstreams for the geometry, texture, occupancy, and metadata of the 3D point cloud 40.

That is, the demultiplexer 311 separates the various data bitstreams from the encoded bitstream, such as those of the geometry image 235a, the texture image 235b, the occupancy image 235c, and the metadata.
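The multiplexing performed at the encoder and the demultiplexing described above can be pictured as a round trip through a simple container. The following Python sketch uses a hypothetical length-prefixed layout (a 1-byte stream identifier followed by a 4-byte big-endian length); the embodiment does not specify a container format, so `STREAM_IDS`, `multiplex`, and `demultiplex` are assumptions.

```python
import struct
from typing import Dict

# Hypothetical stream identifiers; not defined in the embodiment.
STREAM_IDS: Dict[str, int] = {"geometry": 0, "texture": 1, "occupancy": 2, "metadata": 3}
HEADER = ">BI"  # 1-byte stream id + 4-byte big-endian payload length


def multiplex(streams: Dict[str, bytes]) -> bytes:
    """Concatenate the sub-bitstreams, each preceded by a small header."""
    out = bytearray()
    for name, payload in streams.items():
        out += struct.pack(HEADER, STREAM_IDS[name], len(payload))
        out += payload
    return bytes(out)


def demultiplex(data: bytes) -> Dict[str, bytes]:
    """Split a multiplexed bitstream back into its named sub-bitstreams."""
    names = {stream_id: name for name, stream_id in STREAM_IDS.items()}
    streams: Dict[str, bytes] = {}
    offset = 0
    while offset < len(data):
        stream_id, length = struct.unpack_from(HEADER, data, offset)
        offset += struct.calcsize(HEADER)
        streams[names[stream_id]] = data[offset:offset + length]
        offset += length
    return streams


encoded = {"geometry": b"\x01\x02", "texture": b"\x03", "occupancy": b"", "metadata": b"meta"}
bitstream = multiplex(encoded)
recovered = demultiplex(bitstream)
```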

The decoding engine 313 decodes the geometry bitstream and the texture bitstream to generate 2D frames for the geometry and the texture. That is, it decompresses the bitstreams representing the geometry and the texture to generate the 2D frames of the geometry image 235a and the texture image 235b corresponding to FIGS. 10(a) and 10(b).

In addition, the decoding engine 313 uses the occupancy image 235c to identify the valid points in each 2D frame, such as the geometry image 235a and the texture image 235b. The occupancy image 235c indicates the valid pixel positions in the 2D frames used to reconstruct the 3D point cloud 40.

Based on the data received from the demultiplexer 311 and the decoding engine 313, the rendering unit 315 reconstructs the 2D frames (including the decompressed geometry image 235a, texture image 235b, and occupancy image 235c) to generate the 3D stereoscopic content.
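A minimal Python sketch of this reconstruction is given below for a single front-face patch. It assumes that pixel coordinates (u, v) correspond to (x, z) and that the geometry image stores a depth offset along the Y axis from a known `y_min`; the function `reconstruct` and these conventions are hypothetical.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
ColoredPoint = Tuple[Tuple[int, int, int], Color]


def reconstruct(
    geometry: List[List[int]],
    texture: List[List[Color]],
    occupancy: List[List[int]],
    y_min: int = 0,
) -> List[ColoredPoint]:
    """Rebuild colored 3D points from the valid pixels of a front-face patch."""
    points: List[ColoredPoint] = []
    for v, row in enumerate(occupancy):
        for u, valid in enumerate(row):
            if valid:  # pixels not marked in the occupancy map are skipped
                position = (u, y_min + geometry[v][u], v)
                points.append((position, texture[v][u]))
    return points


geometry = [[0, 0], [0, 5]]
texture = [[(255, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 255)]]
occupancy = [[1, 0], [0, 1]]
points = reconstruct(geometry, texture, occupancy)
```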

The display unit 320 may output the 3D stereoscopic content generated by the decoder 310 in response to a user command. The display unit 320 may be configured as a touch screen by forming a mutual layer structure with a touch pad that receives the user's touch input, and may output an interface related to the system settings of the client 300.

In addition, the display unit 320 is preferably formed of a transparent material for visibility and may be implemented as, for example, at least one of a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT LCD), an organic light-emitting diode (OLED) display, a flexible display, a 3D display, a transparent display, a head-up display (HUD), and a touch screen.

FIG. 12 is a schematic flowchart of a method of providing 3D stereoscopic content according to an embodiment of the present invention.

Referring to FIG. 12, the method acquires 2D object images captured by a plurality of cameras (S10) and extracts a 3D point cloud by registering the object images with one another (S20).

Next, a bounding box corresponding to the 3D point cloud is generated, and the bounding box is divided into a voxel grid of a predetermined size (S30).
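Step S30 can be sketched in Python as follows. The axis-aligned bounding box, the cubic voxels, and the function names `bounding_box` and `voxelize` are assumptions for illustration; the embodiment does not state how the voxel size is chosen.

```python
from typing import List, Set, Tuple

Point = Tuple[float, float, float]


def bounding_box(points: List[Point]) -> Tuple[Point, Point]:
    """Return the (min corner, max corner) of the axis-aligned bounding box."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def voxelize(points: List[Point], voxel_size: float) -> Set[Tuple[int, ...]]:
    """Return the indices of the voxels that contain at least one point."""
    lo, _ = bounding_box(points)
    occupied: Set[Tuple[int, ...]] = set()
    for p in points:
        # Index of the voxel along each axis, measured from the min corner.
        index = tuple(int((c - l) // voxel_size) for c, l in zip(p, lo))
        occupied.add(index)
    return occupied


cloud = [(0.0, 0.0, 0.0), (0.25, 0.125, 0.0), (1.25, 0.0, 2.5)]
box = bounding_box(cloud)
voxels = voxelize(cloud, voxel_size=0.5)
```

The first two points fall into the same voxel, so three input points occupy only two voxels.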

Then, the 3D point cloud is projected onto each face of the bounding box and converted into a 2D frame representing geometry and texture attributes, which is encoded and transmitted as a compressed bitstream (S40).

Next, the received bitstream is decompressed to decode the 2D frame, which is then reconstructed and rendered to generate 3D stereoscopic content (3D volumetric contents) (S50).

The generated 3D stereoscopic content is then played back in response to a user command (S60).

FIG. 13 is a flowchart illustrating in more detail the procedure for performing step S40 shown in FIG. 12.

Referring to FIG. 13, step S40 may include: decomposing the 3D point cloud into a plurality of 2D subframes (S41); differentially compressing the plurality of subframes in consideration of the user's quality of experience (QoE) (S43); packing the differentially compressed subframes into a single 2D frame (S45); decomposing the 2D frame to generate a geometry image, a texture image, and an occupancy image (S47); and outputting a bitstream through encoding and multiplexing (S49).

Step S41 may include projecting the 3D point cloud 40 onto each face of the bounding box 50 (S411), and segmenting the projected points to generate a plurality of patches composed of clusters (S413). Here, the 3D point cloud is segmented by clustering its points based on their normal vectors. The clustered points are projected from 3D space onto 2D subframes to generate patches, and the patches are arranged and collected within the individual subframes.
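One plausible way to cluster points by normal vector is to assign each point to the face whose normal is best aligned with the point's own normal. The Python sketch below illustrates that idea only; the outward normals in `FACE_NORMALS`, their association with subframes S1 to S6, and the functions `classify` and `cluster` are assumptions and are not stated in the embodiment.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, float, float]

# Assumed outward normal of the bounding-box face behind each subframe.
FACE_NORMALS: Dict[str, Vector] = {
    "S1": (0.0, 1.0, 0.0), "S2": (0.0, -1.0, 0.0),
    "S3": (1.0, 0.0, 0.0), "S4": (-1.0, 0.0, 0.0),
    "S5": (0.0, 0.0, 1.0), "S6": (0.0, 0.0, -1.0),
}


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def classify(normal: Vector) -> str:
    """Pick the subframe whose face normal is best aligned with the point normal."""
    return max(FACE_NORMALS, key=lambda face: dot(normal, FACE_NORMALS[face]))


def cluster(normals: List[Vector]) -> Dict[str, List[int]]:
    """Group point indices by the subframe each point is assigned to."""
    clusters: Dict[str, List[int]] = {face: [] for face in FACE_NORMALS}
    for index, normal in enumerate(normals):
        clusters[classify(normal)].append(index)
    return clusters


normals = [(0.1, 0.9, 0.2), (-0.8, 0.1, 0.3), (0.0, 0.0, -1.0), (0.2, 0.7, 0.1)]
clusters = cluster(normals)
```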

In step S43, the plurality of subframes representing the 3D point cloud are differentially compressed according to pre-allocated importance factors. Here, an importance factor represents the user's quality of experience (QoE) associated with the characteristics of the 3D point cloud 40, and the QoE importance value and the compression ratio are inversely related.

In steps S45 and S47, the differentially compressed subframes are packed into a single 2D frame, and the patches contained in the 2D frame are decomposed into a geometry image and a texture image. The geometry image contains positional coordinate information for each point of the 3D point cloud, based on its X, Y, and Z position in 3D space. The texture image contains texture information for each point of the 3D point cloud. The texture information of a point may include color information composed of a combination of red (R), green (G), and blue (B) values. Additional texture information may include brightness, motion, material, reflectance, intensity, and the like. Each pixel in the geometry image has a corresponding pixel in the texture image, and the texture image represents the color of the corresponding point in the geometry image.

In addition, an occupancy image is generated in step S47. The occupancy image indicates the pixel positions in the geometry image and the texture image that contain valid points of the 3D point cloud projected or mapped onto the 2D frame. The occupancy image is used in step S50 to identify the valid pixels in the 2D frame for reconstruction.

In step S49, the geometry image, the texture image, and the occupancy image are each encoded with a predetermined codec and then combined through multiplexing to generate a single bitstream. The generated bitstream serves as the input to step S50.

The method of providing 3D stereoscopic content according to the above-described embodiments may be implemented as a program to be executed on a computer and stored on a computer-readable recording medium. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices.

The computer-readable recording medium may also be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the above-described method can be readily inferred by programmers in the technical field to which the embodiments belong.

Although only a few embodiments have been described above, various other forms of implementation are possible. The technical content of the embodiments described above may be combined in various forms unless the techniques are mutually incompatible, and new embodiments may be implemented thereby.

Meanwhile, the apparatus and method for providing 3D stereoscopic content according to the above-described embodiments may be used in augmented reality (AR) services, virtual reality (VR) services, and the like.

It will be apparent to those skilled in the art that the present invention may be embodied in other specific forms without departing from its spirit and essential characteristics. Accordingly, the above detailed description should not be construed as restrictive in any respect but should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all modifications within the equivalent scope of the present invention fall within its scope.

Claims

1. A method of providing 3D stereoscopic content, the method comprising:
extracting a 3D point cloud based on 2D object images captured by a plurality of cameras; and
outputting an encoded bitstream by decomposing the 3D point cloud into a plurality of subframes, differentially compressing the plurality of subframes at compression ratios adjusted based on a pre-allocated importance factor, and then converting the subframes into a 2D frame representing a geometric image and a texture image.

2. The method of claim 1,
wherein the extracting comprises obtaining a parameter related to a transformation relationship between 3D spatial coordinates and 2D image coordinates based on matching points between the object images.

3. The method of claim 1, further comprising:
generating a bounding box corresponding to the 3D point cloud; and
dividing the bounding box into a voxel grid of a predetermined size.

4. The method of claim 3,
wherein the bounding box is generated for each frame.

5. The method of claim 3,
wherein the outputting comprises orthogonally projecting the points of the 3D point cloud present inside the voxel grid onto each face of the bounding box, and
wherein each of the plurality of subframes includes a face of the bounding box and a plurality of patches composed of clusters of the points projected onto that face.

6. The method of claim 5,
wherein the pre-allocated importance factor is a user quality of experience associated with characteristics of the 3D point cloud.

7. The method of claim 6,
wherein the bounding box is configured as a polyhedron, and
wherein the plurality of subframes includes:
a first subframe corresponding to the front face of the bounding box;
a second subframe corresponding to the back, left, and right faces of the bounding box; and
a third subframe corresponding to the top and bottom faces of the bounding box.

8. The method of claim 7,
wherein the compressing comprises adjusting the first subframe to a first compression ratio and adjusting the second subframe to a second compression ratio, and
wherein the first compression ratio is smaller than the second compression ratio.

9. The method of claim 8,
wherein the compressing comprises adjusting the third subframe to a third compression ratio greater than the second compression ratio.

10. The method of claim 1,
wherein the geometric image includes location information for points of the three-dimensional point cloud, and
the texture image includes color information for points of the three-dimensional point cloud.
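
To make the split between the two images concrete, the sketch below rasterizes projected points into a geometry image holding depth and a texture image holding RGB color. The occupancy mask is an addition borrowed from common video-based point cloud coding practice and is not recited in the claims; the function name, array types, and sample values are likewise assumptions.

```python
import numpy as np

def rasterize_patch(pixels, depths, colors, height, width):
    """Write projected points into a geometry image and a texture image.

    pixels: (row, col) positions on the face; depths: distance from the face;
    colors: RGB triples. The nearest point wins when several share a pixel.
    """
    geometry = np.zeros((height, width), dtype=np.uint16)   # location (depth)
    texture = np.zeros((height, width, 3), dtype=np.uint8)  # color
    occupancy = np.zeros((height, width), dtype=bool)       # assumed helper mask
    for (row, col), depth, color in zip(pixels, depths, colors):
        if not occupancy[row, col] or depth < geometry[row, col]:
            geometry[row, col] = depth
            texture[row, col] = color
            occupancy[row, col] = True
    return geometry, texture, occupancy

geometry, texture, occupancy = rasterize_patch(
    pixels=[(0, 0), (0, 0), (1, 2)],
    depths=[5, 3, 7],
    colors=[(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    height=2, width=3,
)
```

Because both outputs are ordinary 2D arrays, they can be handed to a conventional 2D video encoder, which is the point of converting the subframes into a two-dimensional frame.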

11. The method of claim 1,
further comprising receiving the output bitstream and performing decoding and rendering on the 2D frame to generate 3D stereoscopic content.
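
On the receiving side, the step from a decoded 2D frame back to 3D points might look like the sketch below. It assumes the video bitstream has already been decoded into a geometry image and a texture image, that an occupancy mask is available (an assumption not found in the claims), and that the frame came from a face whose pixel column, pixel row, and depth map directly to x, y, and z. Rendering is not shown.

```python
import numpy as np

def reconstruct_points(geometry, texture, occupancy):
    """Rebuild colored 3D points from one decoded face.

    Each occupied pixel (row, col) with depth d becomes the point (col, row, d).
    """
    rows, cols = np.nonzero(occupancy)
    depths = geometry[rows, cols].astype(float)
    points = np.stack([cols.astype(float), rows.astype(float), depths], axis=1)
    colors = texture[rows, cols]
    return points, colors

# Hypothetical decoded images for a 2 x 3 face.
decoded_geometry = np.array([[3, 0, 0], [0, 0, 7]], dtype=np.uint16)
decoded_occupancy = decoded_geometry > 0
decoded_texture = np.zeros((2, 3, 3), dtype=np.uint8)
decoded_texture[0, 0] = (0, 255, 0)
decoded_texture[1, 2] = (0, 0, 255)

points, colors = reconstruct_points(
    decoded_geometry, decoded_texture, decoded_occupancy)
```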

12. A computer-readable recording medium on which is recorded an application program that, when executed by a processor, implements the method for providing three-dimensional stereoscopic content according to any one of claims 1 to 11.

13. An apparatus for providing 3D stereoscopic content, comprising:
a point cloud extractor that extracts a three-dimensional point cloud based on two-dimensional object images captured by a plurality of cameras; and
an encoder that decomposes the three-dimensional point cloud into a plurality of subframes, differentially compresses the plurality of subframes at compression ratios adjusted based on a pre-allocated importance factor, then converts the subframes into a two-dimensional frame representing a geometric image and a texture image, and outputs an encoded bitstream.

14. The apparatus of claim 13,
wherein the point cloud extractor
obtains a parameter related to a transformation relationship between three-dimensional spatial coordinates and two-dimensional image coordinates, based on matching points between the object images.

15. The apparatus of claim 13,
further comprising a voxel grid generator that generates a bounding box corresponding to the 3D point cloud and divides the bounding box into a voxel grid of a predetermined size.

16. The apparatus of claim 15,
wherein the bounding box is generated for each frame.

17. The apparatus of claim 15,
wherein the encoder comprises:
a patch generation unit that generates a plurality of patches by orthogonally projecting points of the 3D point cloud located inside the voxel grid onto each face of the bounding box; and
a differential compression unit that adjusts the compression ratios of the plurality of subframes based on the pre-allocated importance factor and differentially compresses the plurality of subframes according to the adjusted compression ratios,
wherein each of the plurality of subframes comprises
the plurality of patches, each consisting of a face of the bounding box and a cluster of points projected onto that face.

18. The apparatus of claim 17,
wherein the pre-allocated importance factor is a user-perceived quality (quality of experience) related to characteristics of the 3D point cloud.

19. The apparatus of claim 18,
wherein the bounding box is configured as a polyhedron, and
the plurality of subframes comprise:
a first subframe corresponding to the front face of the bounding box;
a second subframe corresponding to the rear, left, and right faces of the bounding box; and
a third subframe corresponding to the top and bottom faces of the bounding box.

20. The apparatus of claim 19,
wherein the differential compression unit
adjusts the first subframe to a first compression ratio and adjusts the second subframe to a second compression ratio,
the first compression ratio being smaller than the second compression ratio.

21. The apparatus of claim 20,
wherein the differential compression unit
adjusts the third subframe to a third compression ratio greater than the second compression ratio.

22. The apparatus of claim 13,
wherein the geometric image includes location information for points of the three-dimensional point cloud, and
the texture image includes color information for points of the three-dimensional point cloud.

23. The apparatus of claim 13,
further comprising a decoder that receives the output bitstream and performs decoding and rendering on the 2D frame to generate 3D stereoscopic content.