KR102681496B1

KR102681496B1 - Method and apparatus for converting image

Info

Publication number: KR102681496B1
Application number: KR1020220115402A
Authority: KR
Inventors: 배성준; 강정원; 김수웅; 방건; 이진호; 이하현; 임성창
Original assignee: 한국전자통신연구원
Priority date: 2021-11-30
Filing date: 2022-09-14
Publication date: 2024-07-05

Abstract

An image conversion method according to an embodiment of the present invention may include generating multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data, combining the layers of the multi-plane 3D data into at least one layer to generate a combined layer, and converting the multi-plane 3D data including the combined layer into a 2D atlas image.

Description

Image conversion method and device {METHOD AND APPARATUS FOR CONVERTING IMAGE}

The present invention relates to a method and device for converting multi-plane 3D data into a 2D atlas image.

In general, multi-plane 3D data (Multi Plane Image, MPI) is a 3D space representation in which 3D space is reconstructed into layers along the depth direction and the pixels of the space are placed on planes at those depths.

An MPI-based spatial representation yields relatively high image quality when the space is freely rendered from an arbitrary viewpoint. It also does not require high-quality depth information, which is the most important factor in maintaining image quality when representing photorealistic spatial information, and it is therefore widely used as a new way of representing real-world 3D space.

Conventionally, to convert multi-plane 3D data into a 2D atlas image, the patch areas (areas where actual pixels exist) in each layer image of the multi-plane 3D data are gathered as they are into a single 2D atlas image. As a result, the 2D atlas image becomes very large and contains a great deal of empty space that holds no patch area, which lowers compression efficiency.

An object of the present invention is to provide an image conversion method and device for reducing the size of the 2D atlas image when multi-plane 3D data is converted into a 2D atlas image.

Another object of the present invention is to provide an image conversion method and device for improving the compression efficiency of the 2D atlas image.

To achieve the above objects, an image conversion method according to the present invention may include generating multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data, generating a combined layer by combining the layers of the multi-plane 3D data into at least one layer, and converting the multi-plane 3D data including the combined layer into a 2D atlas image.

Generating the combined layer may include generating, as a first combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the deepest of the layers, and generating, as a second combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the second-deepest of the layers, excluding the pixels already included in the first combined layer.

If no pixel exists in the layer, the step of generating the combined layer may be terminated.

The 2D atlas image may include a transparency image and a color image generated based on the pixel information of each combined layer of the multi-plane 3D data, and a layer index image holding, for each pixel, the position information of the layer.

The method may further include compressing the 2D atlas image.

The 2D atlas image may be generated as bitstream data using at least one of HEVC, H.263, and VVC.

The plurality of multi-view image data may include image data in a two-dimensional array.

The multi-plane 3D data may be generated by creating planes at each distance in the depth direction and rearranging, plane by plane, the corresponding pixel values and the transparency of those pixels.

In addition, an image conversion device according to an embodiment includes a memory storing a control program for image conversion and a processor that executes the control program stored in the memory, wherein the processor may generate multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data, generate a combined layer by combining the layers of the multi-plane 3D data into at least one layer, and convert the multi-plane 3D data including the combined layer into a 2D atlas image.

The processor may generate, as a first combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the deepest of the layers, and may generate, as a second combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the second-deepest of the layers, excluding the pixels already included in the first combined layer.

The processor may terminate the process of generating the combined layer if no pixel exists in the layer.

The processor may compress the 2D atlas image.

The processor may generate the 2D atlas image as bitstream data using at least one of HEVC, H.263, and VVC.

The processor may generate the multi-plane 3D data by creating planes at each distance in the depth direction and rearranging, plane by plane, the corresponding pixel values and the transparency of those pixels.

According to the present invention, the size of the 2D atlas image can be reduced by combining the layers of the multi-plane 3D data.

In addition, the present invention can improve compression efficiency by minimizing the empty space in the 2D atlas image that holds no patch area.

Figure 1 is a block diagram showing an image conversion device according to an embodiment.
Figure 2 is a flowchart showing an image conversion method according to an embodiment.
Figure 3 is a diagram showing multi-view image data according to an embodiment.
Figure 4 is a diagram showing conversion into multi-plane 3D data according to an embodiment.
Figure 5 is a diagram showing a comparative example between conventional multi-plane 3D data and multi-plane 3D data according to an embodiment.
Figure 6 is a diagram illustrating a process for creating a combined layer of multi-plane 3D data according to an embodiment.
Figure 7 is a diagram showing conversion of multi-plane 3D data into a 2D atlas image according to an embodiment.
Figure 8 is a block diagram showing the configuration of a computer system according to an embodiment.

The advantages and features of the present invention, and the methods for achieving them, will become clear by referring to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that a person having ordinary knowledge in the technical field to which the present invention pertains is fully informed of the scope of the invention, and the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Although terms such as "first" or "second" are used to describe various components, these components are not limited by such terms. Such terms may be used only to distinguish one component from another. Accordingly, a first component mentioned below may also be a second component within the technical spirit of the present invention.

The terms used in this specification are for describing the embodiments and are not intended to limit the present invention. In this specification, singular forms also include plural forms unless the context specifically states otherwise. As used in the specification, "comprises" or "comprising" means that the mentioned component or step does not exclude the presence or addition of one or more other components or steps.

Unless otherwise defined, all terms used in this specification may be interpreted with the meanings commonly understood by a person having ordinary knowledge in the technical field to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are clearly and specifically defined.

In this document, each of the phrases “A or B”, “at least one of A and B”, “at least one of A or B”, and “at least one of A, B, or C” may include any one of the items listed together in the corresponding phrase, or all possible combinations thereof.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description with reference to the drawings, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

Figure 1 is a block diagram showing an image conversion device according to an embodiment.

Referring to Figure 1, the image conversion device according to the embodiment may include a multi-plane 3D data generation unit 100, a multi-plane 3D data layer combining unit 200, a 2D atlas image conversion unit 300, and a 2D image compression unit 400.

The multi-plane 3D data generation unit 100 may receive multi-view image data and generate multi-plane 3D data. The multi-view image data may be image data in a two-dimensional array of M × N images or image data in a one-dimensional array of N images. In the embodiment, image data in a two-dimensional array may be used.

The multi-plane 3D data generation unit 100 may generate the multi-plane 3D data by reconstructing the multi-view image data into layers in the depth direction.

The multi-plane 3D data layer combining unit 200 may generate a combined layer by combining the layers of the multi-plane 3D data into at least one layer.

The 2D atlas image conversion unit 300 may convert the multi-plane 3D data including the combined layer into a 2D atlas image. The 2D atlas image conversion unit 300 may generate the 2D atlas image by rearranging the areas where pixels exist in all layers of the multi-plane 3D data into a single image.

The 2D image compression unit 400 may generate the 2D atlas image as bitstream data using one of compression methods such as HEVC, H.263, and VVC.

Figure 2 is a flowchart showing an image conversion method according to an embodiment. Here, the image conversion method according to the embodiment may be performed by the image conversion device.

Referring to Figure 2, the image conversion device according to the embodiment may generate multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data (S100).

Figure 3 is a diagram showing multi-view image data according to an embodiment.

As shown in Figure 3, the multi-view image data (M1) may be captured by a 2D camera array and may be images obtained from various view positions.

Figure 4 is a diagram showing conversion into multi-plane 3D data according to an embodiment.

As shown in Figure 4, the multi-plane 3D data (M2) may be represented in 3D space by creating planes at each distance in the depth direction from a specific camera position (reference cam position) and then rearranging, plane by plane, the corresponding pixel values (RGB) and the transparency (alpha) of those pixels. In other words, an image at an arbitrary camera position can be freely generated from the multi-plane 3D data.
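As an illustration of why such layered data can be rendered directly, the following minimal Python sketch blends the planes at the reference camera position from the farthest plane to the nearest using the standard alpha "over" operator. The specification does not describe the rendering step, so the blending rule, the function name composite_back_to_front, and the pixel format (RGB and alpha as floats in [0, 1]) are assumptions made for this example only.

```python
from typing import List, Tuple

RGBA = Tuple[float, float, float, float]  # (r, g, b, alpha), assumed to lie in [0, 1]
Layer = List[List[RGBA]]                  # layer[y][x]
RGB = Tuple[float, float, float]


def composite_back_to_front(layers: List[Layer]) -> List[List[RGB]]:
    """Blend MPI planes with the 'over' operator; layers[0] is the farthest plane."""
    height, width = len(layers[0]), len(layers[0][0])
    out = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for layer in layers:  # iterate from far to near
        for y in range(height):
            for x in range(width):
                r, g, b, a = layer[y][x]
                pr, pg, pb = out[y][x]
                # The nearer pixel covers the accumulated color in proportion to its alpha.
                out[y][x] = (r * a + pr * (1 - a),
                             g * a + pg * (1 - a),
                             b * a + pb * (1 - a))
    return out
```

Rendering from a camera position other than the reference one would additionally require warping each plane according to its depth before blending, which this sketch leaves out.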

Returning to Figure 2, the image conversion device according to the embodiment may generate a combined layer by combining the layers of the multi-plane 3D data into at least one layer (S200).

Figure 5 is a diagram showing a comparative example between conventional multi-plane 3D data and multi-plane 3D data according to an embodiment.

As shown in Figure 5, the multi-plane 3D data (M3) in which combined layers have been generated can be packed into fewer layers than the conventional multi-plane 3D data. The process of generating the combined layers is described in detail with reference to Figure 6.

Figure 6 is a diagram illustrating a process for creating a combined layer of multi-plane 3D data according to an embodiment.

As shown in Figure 6, combining may proceed from the deepest of the plurality of layers (MPI Layer#0, MPI Layer#1, MPI Layer#2, etc.) of the multi-plane 3D data (M2), that is, the layer farthest from the camera (for example, MPI Layer#0), toward the layers closer to the camera (for example, MPI Layer#2).

That is, the first combined layer (Aggregated Layer#0) may be generated by combining the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the deepest layer (MPI Layer#0).

In the layer images of the multi-plane 3D data (M2), the same pixel coordinate (x, y) in each image is a point on the same ray from the camera origin. For each ray (r_A to r_L), the pixel that is seen first when looking toward the camera origin from the farthest layer may be gathered into a single layer to generate a combined layer.

For example, on r_A the first pixel encountered is pixel 1 of MPI Layer#2, on r_B it is pixel 2 of MPI Layer#2, on r_D it is pixel 1 of MPI Layer#1, and on r_F it is pixel 1 of MPI Layer#0.

As described above, gathering the first pixel encountered from behind on each ray into a single layer image yields the first combined layer (Aggregated Layer#0), the 0th layer shown on the right. Here, the multi-plane 3D data (M3) in which the combined layers have been generated may consist of a color image (Color Image Plane) and a layer index image (Layer Index Plane). The color image holds the color values of the gathered pixels, and the layer index image preserves the layer index indicating which layer of the original multi-plane 3D data each pixel came from. The color image also includes a transparency image, which holds the transparency value of each pixel.

In the same way, the second combined layer (Aggregated Layer#1) may be generated from the pixels that are seen first at each pixel coordinate when looking toward the origin from the second-deepest layer, excluding the pixels already included in the first combined layer (Aggregated Layer#0).

In this manner, combined layers may be generated for the layers in order of proximity in the depth direction. Generation of combined layers may be repeated until no pixels remain in the multi-plane 3D data (M2).

In Figure 6, following this process, three combined layers are enough to hold every pixel of the multi-plane 3D data.
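The combining process can be sketched as follows. This is one possible reading of the description, not the implementation of the embodiment: on every ray, the deepest pixel that has not yet been used goes into the current combined layer, and the pass is repeated until no pixels remain. The function name aggregate_layers, the use of None for an empty position, and the integer RGBA pixel format are assumptions introduced for illustration.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int, int]        # (r, g, b, alpha); hypothetical pixel format
Layer = List[List[Optional[Pixel]]]      # layer[y][x]; None marks an empty position
IndexPlane = List[List[Optional[int]]]   # original MPI layer index of each gathered pixel


def aggregate_layers(mpi: List[Layer]) -> Tuple[List[Layer], List[IndexPlane]]:
    """Combine MPI layers; mpi[0] is the deepest layer (farthest from the camera)."""
    height, width = len(mpi[0]), len(mpi[0][0])
    # Work on a copy so that the caller's layers are left untouched.
    remaining = [[[layer[y][x] for x in range(width)] for y in range(height)]
                 for layer in mpi]
    colors: List[Layer] = []
    indices: List[IndexPlane] = []
    while any(p is not None for layer in remaining for row in layer for p in row):
        color: Layer = [[None] * width for _ in range(height)]
        index: IndexPlane = [[None] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                # Walk the ray from the deepest layer toward the camera origin.
                for k, layer in enumerate(remaining):
                    if layer[y][x] is not None:
                        color[y][x], index[y][x] = layer[y][x], k
                        layer[y][x] = None  # excluded from later combined layers
                        break
        colors.append(color)
        indices.append(index)
    return colors, indices
```

With this reading, the number of combined layers equals the largest number of pixels found on any single ray, which is why a handful of combined layers can hold every pixel of a much deeper stack.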

Returning to Figure 2, the image conversion device according to the embodiment may convert the multi-plane 3D data having the combined layers into a 2D atlas image (S300).

Figure 7 is a diagram showing conversion of multi-plane 3D data into a 2D atlas image according to an embodiment.

As shown in Figure 7, the image conversion device according to the embodiment may produce the 2D atlas image by rearranging, into a single image, the areas where pixels exist in all combined layers of the multi-plane 3D data.

The transparency image and the color image of the multi-plane 3D data (M3) having the combined layers may each be condensed into a single image and thereby converted into the 2D atlas image. At this time, the layer index image may be carried over as it is. Accordingly, the 2D atlas image may include a transparency image (M4-1), a color image (M4-2), and a layer index image (M4-3).
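A minimal sketch of this conversion is given below. The description does not specify how patches are located or arranged, so the example makes simplifying assumptions: each combined layer contributes one patch (the bounding box of its occupied pixels), patches are stacked vertically, and a hypothetical patch list records where each patch came from. The names bounding_box and pack_atlas are illustrative only.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, alpha); hypothetical pixel format
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), bottom/right exclusive


def bounding_box(plane: List[List[Optional[Pixel]]]) -> Optional[Box]:
    """Return the bounding box of the occupied pixels, or None for an empty plane."""
    ys = [y for y, row in enumerate(plane) for p in row if p is not None]
    xs = [x for row in plane for x, p in enumerate(row) if p is not None]
    if not ys:
        return None
    return min(ys), min(xs), max(ys) + 1, max(xs) + 1


def pack_atlas(colors, indices):
    """Stack the cropped combined layers into transparency, color and layer index planes."""
    boxes = [bounding_box(c) for c in colors]
    atlas_width = max((b[3] - b[1] for b in boxes if b), default=0)
    alpha_rows, color_rows, index_rows = [], [], []
    patches = []  # (combined layer, source top, source left, atlas row, height, width)
    for n, box in enumerate(boxes):
        if box is None:
            continue
        top, left, bottom, right = box
        patches.append((n, top, left, len(color_rows), bottom - top, right - left))
        pad = [None] * (atlas_width - (right - left))  # fill narrower patches to atlas width
        for y in range(top, bottom):
            row = colors[n][y][left:right]
            color_rows.append([p[:3] if p else None for p in row] + pad)
            alpha_rows.append([p[3] if p else None for p in row] + pad)
            index_rows.append(indices[n][y][left:right] + pad)
    return alpha_rows, color_rows, index_rows, patches
```

A practical packer would split each combined layer into several patches and place them with a 2D bin-packing strategy; the vertical stacking here only keeps the example short.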

In the embodiment, the pixel presence areas (patch areas) of the layers of the initial multi-plane 3D data (M2) are gathered into a single layer as far as possible without overlapping. This reduces the number of combined layers of the multi-plane 3D data and enlarges the pixel presence areas within each combined layer, so that the empty area in the 2D atlas image is greatly reduced when the 2D atlas image is constructed.
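The effect on empty space can be made concrete with a small calculation. The sketch below compares the fraction of occupied positions before and after combining, under the assumptions that every layer is stored at full frame size and that the number of combined layers equals the largest number of pixels on any ray. The function name fill_ratios and the boolean occupancy format are hypothetical.

```python
from typing import List, Tuple

Occupancy = List[List[List[bool]]]  # occupancy[layer][y][x]; True where a pixel exists


def fill_ratios(occupancy: Occupancy) -> Tuple[float, float]:
    """Return (fill ratio of the original layers, fill ratio of the combined layers)."""
    num_layers = len(occupancy)
    height, width = len(occupancy[0]), len(occupancy[0][0])
    # Count the occupied layers along every ray (x, y).
    per_ray = [sum(occupancy[k][y][x] for k in range(num_layers))
               for y in range(height) for x in range(width)]
    total = sum(per_ray)
    combined_layers = max(per_ray)  # one combined layer per pixel on the busiest ray
    before = total / (num_layers * height * width)
    after = total / (combined_layers * height * width) if combined_layers else 0.0
    return before, after
```

For four layers of size 1 × 2 holding three pixels in total, with at most two pixels on one ray, the fill ratio rises from 3/8 to 3/4 because only two combined layers are needed.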

Returning to Figure 2, the image conversion device according to the embodiment may compress the 2D atlas image (S400).

The image conversion device according to the embodiment may compress the 2D atlas image using one of compression methods such as HEVC, H.263, and VVC, and may generate bitstream data as a result.

The image conversion device according to the embodiment may restore multi-view images using the compressed bitstream data.

For example, the image conversion device according to the embodiment may receive the compressed bitstream data, restore it to a 2D atlas image, and reconstruct the multi-plane 3D data from the restored 2D atlas image. An image from an arbitrary viewpoint can be freely generated using the reconstructed multi-plane 3D data.
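The reconstruction of multi-plane 3D data from combined layers can be sketched as the inverse of the combining step: the layer index image tells each pixel which original layer it belongs to. The sketch assumes the combined layers have already been recovered from the decoded atlas at full frame size and that the original number of layers is known, for example from side information. The function name restore_mpi and the parameter num_layers are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int, int]       # (r, g, b, alpha); hypothetical pixel format
Layer = List[List[Optional[Pixel]]]     # layer[y][x]; None marks an empty position
IndexPlane = List[List[Optional[int]]]  # original MPI layer index of each pixel


def restore_mpi(colors: List[Layer], indices: List[IndexPlane],
                num_layers: int) -> List[Layer]:
    """Scatter the pixels of the combined layers back to their original MPI layers."""
    height, width = len(colors[0]), len(colors[0][0])
    mpi: List[Layer] = [[[None] * width for _ in range(height)]
                        for _ in range(num_layers)]
    for color, index in zip(colors, indices):
        for y in range(height):
            for x in range(width):
                if index[y][x] is not None:
                    # The pixel keeps its (x, y) position; only its depth layer changes.
                    mpi[index[y][x]][y][x] = color[y][x]
    return mpi
```

Because combining moves pixels only along the depth direction, the pixel coordinate and the stored layer index are sufficient to put each pixel back in place.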

The image conversion device according to the embodiment may be implemented in a computer system, such as a computer-readable recording medium.

Figure 8 is a block diagram showing the configuration of a computer system according to an embodiment.

Referring to Figure 8, the computer system 1000 according to the embodiment may include one or more processors 1010, a memory 1030, a user interface input device 1040, a user interface output device 1050, and storage 1060, which communicate with one another via a bus 1020. In addition, the computer system 1000 may further include a network interface 1070 connected to a network.

The processor 1010 may be a central processing unit or a semiconductor device that executes programs or processing instructions stored in the memory or the storage. The processor 1010, as a kind of central processing unit, may control the overall operation of the image conversion device.

The processor 1010 may include any type of device capable of processing data. Here, a 'processor' may refer to, for example, a data processing device embedded in hardware that has physically structured circuitry for performing functions expressed as code or instructions contained in a program. Examples of such a hardware-embedded data processing device include, but are not limited to, a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA).

The memory 1030 may store various data for overall operation, such as a control program for performing the image conversion method according to the embodiment. Specifically, the memory may store a number of application programs run on the image conversion device, as well as data and instructions for the operation of the image conversion device.

The memory 1030 and the storage 1060 may be storage media including at least one of volatile media, non-volatile media, removable media, non-removable media, communication media, or information transfer media. For example, the memory 1030 may include a ROM 1031 or a RAM 1032.

According to one embodiment, a computer-readable recording medium storing a computer program may include instructions for causing a processor to perform a method including generating multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data, generating a combined layer by combining the layers of the multi-plane 3D data into at least one layer, and converting the multi-plane 3D data including the combined layer into a 2D atlas image.

According to one embodiment, a computer program stored in a computer-readable recording medium may include instructions for causing a processor to perform operations including generating multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data, generating a combined layer by combining the layers of the multi-plane 3D data into at least one layer, and converting the multi-plane 3D data including the combined layer into a 2D atlas image.

The specific implementations described in the present invention are examples and do not limit the scope of the present invention in any way. For brevity of the specification, descriptions of conventional electronic components, control systems, software, and other functional aspects of the systems may be omitted. In addition, the connecting lines or connecting members between the components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example, and in an actual device they may be represented as various replaceable or additional functional, physical, or circuit connections. Furthermore, unless a component is specifically described with terms such as "essential" or "important", it may not be a component that is necessarily required for the application of the present invention.

Therefore, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set out below but also all scopes equivalent to, or equivalently changed from, those claims fall within the scope of the spirit of the present invention.

100: Multi-plane 3D data generation unit
200: Multi-plane 3D data layer combining unit
300: 2D atlas image conversion unit
400: 2D image compression unit

Claims

An image conversion method comprising:
generating multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data;
generating a combined layer by combining the layers of the multi-plane 3D data into at least one layer; and
converting the multi-plane 3D data including the combined layer into a 2D atlas image,
wherein generating the combined layer comprises:
generating, as a first combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the deepest of the layers; and
generating, as a second combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the second-deepest of the layers, excluding the pixels included in the first combined layer.

delete

The image conversion method of claim 1,
wherein the step of generating the combined layer is terminated when no pixel exists in the layer.

The image conversion method of claim 2,
wherein the 2D atlas image includes a transparency image and a color image generated based on pixel information of each combined layer of the multi-plane 3D data, and a layer index image holding, for each pixel, position information of the layer.

The image conversion method of claim 1,
further comprising compressing the 2D atlas image.

The image conversion method of claim 5,
wherein the 2D atlas image is generated as bitstream data using HEVC, H.263, or VVC.

The image conversion method of claim 1,
wherein the plurality of multi-view image data includes image data in a two-dimensional array.

The image conversion method of claim 1,
wherein the multi-plane 3D data is generated by creating planes at each distance in the depth direction and rearranging, plane by plane, the corresponding pixel values and the transparency of those pixels.

An image conversion device comprising:
a memory storing a control program for image conversion; and
a processor that executes the control program stored in the memory,
wherein the processor
generates multi-plane 3D data reconstructed into layers in the depth direction based on a plurality of multi-view image data, generates a combined layer by combining the layers of the multi-plane 3D data into at least one layer, and converts the multi-plane 3D data including the combined layer into a 2D atlas image, and
wherein the processor
generates, as a first combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the deepest of the layers, and generates, as a second combined layer, the pixels that are seen first at each pixel coordinate when looking toward the camera origin from the second-deepest of the layers, excluding the pixels included in the first combined layer.

delete

The image conversion device of claim 9,
wherein the processor
terminates the process of generating the combined layer when no pixel exists in the layer.

The image conversion device of claim 10,
wherein the 2D atlas image includes a transparency image and a color image generated based on pixel information of each combined layer of the multi-plane 3D data, and a layer index image holding, for each pixel, position information of the layer.

The image conversion device of claim 9,
wherein the processor
compresses the 2D atlas image.

The image conversion device of claim 13,
wherein the processor
generates the 2D atlas image as bitstream data using HEVC, H.263, or VVC.

The image conversion device of claim 9,
wherein the plurality of multi-view image data includes image data in a two-dimensional array.

The image conversion device of claim 9,
wherein the processor
generates the multi-plane 3D data by creating planes at each distance in the depth direction and rearranging, plane by plane, the corresponding pixel values and the transparency of those pixels.