KR102220769B1

KR102220769B1 - Depth map creating method, depth map creating device, image converting method and image converting device

Info

Publication number: KR102220769B1
Application number: KR1020190076386A
Authority: KR
Inventors: 이성원; 김원진
Original assignee: 광운대학교 산학협력단
Priority date: 2019-06-26
Filing date: 2019-06-26
Publication date: 2021-02-25
Also published as: KR20210001005A

Abstract

The present invention relates to a depth map generation method, a depth map generation device, an image conversion method, and an image conversion device. A depth map generation method according to an embodiment of the present invention is a method for reconstructing a two-dimensional monocular image in three dimensions, and may include: a vanishing point estimation step of estimating a vanishing point in the image; an initial depth map generation step of generating, based on the estimated vanishing point, an initial depth map of the image by linear perspective; a depth information allocation step of generating divided regions by dividing the image into a plurality of regions in a way that reflects geometric information detected from the image, and allocating depth information to each divided region; and a corrected depth map generation step of generating a corrected depth map by correcting the initial depth map based on the depth information for each divided region.

Description

Depth map generation method, depth map generation device, image conversion method, and image conversion device {DEPTH MAP CREATING METHOD, DEPTH MAP CREATING DEVICE, IMAGE CONVERTING METHOD AND IMAGE CONVERTING DEVICE}

The present invention relates to image processing, and more particularly to a depth map generation method, a depth map generation device, an image conversion method, and an image conversion device.

Recently, technologies that require three-dimensional (3D) information, such as augmented reality, virtual reality, robots, and autonomous vehicles, have come to prominence. Accordingly, there is a need for image processing that performs 3D reconstruction using cameras, which offer a low-cost solution. 3D reconstruction builds depth information by extracting and combining binocular disparity cues and monocular image cues from 2D image data obtained from a camera. The stereo method, which uses binocular disparity, can obtain pixel-level depth information relatively accurately by exploiting the visual differences between images. In this method, disparity is computed by matching pixel by pixel; when a matching error occurs, it can affect neighboring pixels, so the method may fail to operate properly. To compensate for such errors, depth information from the monocular image is also needed when the overall depth information is constructed. Because a monocular image carries less information for determining depth than stereo images, relative depth information is first constructed using depth cues within the image. Absolute depth information is then calculated either by combining it with depth information generated from stereo images or by using camera parameters.

However, a 2D monocular image carries relatively little information, and because only certain fixed information is used, severe errors occur in many regions. In addition, conventional techniques require prior information or involve a very large amount of computation, so processing takes a long time.

The technical problem to be solved by the present invention is to provide a depth map generation method and an image conversion method that minimize error to achieve high accuracy while improving processing speed through a small amount of computation.

According to another embodiment of the present invention, a depth map generation device and an image conversion device having the above advantages are provided.

A depth map generation method according to an embodiment of the present invention is a method for reconstructing a two-dimensional monocular image in three dimensions, and may include: a vanishing point estimation step of estimating a vanishing point in the image; an initial depth map generation step of generating, based on the estimated vanishing point, an initial depth map of the image by linear perspective; a depth information allocation step of generating divided regions by dividing the image into a plurality of regions in a way that reflects geometric information detected from the image, and allocating depth information to each divided region; and a corrected depth map generation step of generating a corrected depth map by correcting the initial depth map based on the depth information for each divided region.

In one embodiment, in the initial depth map generation step, a rectangular structure unrelated to linear perspective may be detected, and the same depth value may be assigned to the detected rectangular structure.

In one embodiment, the geometric information may include classification information indicating whether a divided region of the image corresponds to sky, a vertical component, or a horizontal component.

In one embodiment, in the depth information allocation step, the depth value of a divided region corresponding to sky may be set to 0, the depth value of a divided region corresponding to a horizontal component may be set to the same value as the depth value given by linear perspective, and the depth value of a divided region corresponding to a vertical component may be set to the average depth value of that divided region.
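The per-class rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `assign_region_depth`, the list-of-rows data layout, and the reading of "average depth value" as the mean of the initial (linear-perspective) depth inside the region are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical class labels for the three geometric categories.
SKY, HORIZONTAL, VERTICAL = "sky", "horizontal", "vertical"


def assign_region_depth(initial_depth, labels, region_class):
    """Return a corrected depth map as a list of rows.

    initial_depth: 2D list of linear-perspective depth values.
    labels:        2D list of region ids with the same shape.
    region_class:  dict mapping region id -> SKY / HORIZONTAL / VERTICAL.
    """
    # Collect the initial depth values that fall inside each region.
    values = {}
    for depth_row, label_row in zip(initial_depth, labels):
        for d, r in zip(depth_row, label_row):
            values.setdefault(r, []).append(d)
    region_mean = {r: mean(v) for r, v in values.items()}

    corrected = []
    for depth_row, label_row in zip(initial_depth, labels):
        out_row = []
        for d, r in zip(depth_row, label_row):
            cls = region_class[r]
            if cls == SKY:
                out_row.append(0.0)                    # sky: depth value 0
            elif cls == HORIZONTAL:
                out_row.append(float(d))               # keep linear-perspective depth
            else:
                out_row.append(float(region_mean[r]))  # vertical: region mean (assumed)
        corrected.append(out_row)
    return corrected
```

With a three-row image whose rows are labelled sky, horizontal, and vertical, the first row becomes all zeros, the second row is unchanged, and the third row is flattened to its mean.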

In one embodiment, the classification information indicating whether a divided region of the image corresponds to sky, a vertical component, or a horizontal component may be determined using decision tree learning, and errors of the decision tree learning may be compensated by a learned model based on a shallow convolutional neural network (CNN).

In one embodiment, the geometric information may include gradient occurrence information indicating whether a divided region of the image corresponds to a rectangular structure unrelated to linear perspective.

In one embodiment, in the depth information allocation step, the divided regions may be generated, and depth information allocated to each divided region, by reflecting the strength of straight lines in the image in addition to the geometric information.

In one embodiment, the method may further include a smoothing step of correcting discontinuous regions in the corrected depth map.

In one embodiment, in the vanishing point estimation step, the vanishing point in the image may be estimated using a RANSAC-based algorithm that exploits the characteristics of the Manhattan World (MW), and errors of the RANSAC-based algorithm may be compensated by a learned model based on a shallow convolutional neural network.

A depth map generation device according to an embodiment of the present invention is a device for reconstructing a two-dimensional monocular image in three dimensions, and may include: a vanishing point estimation unit that estimates a vanishing point in the image; an initial depth map generation unit that generates, based on the estimated vanishing point, an initial depth map of the image by linear perspective; a depth information allocation unit that generates divided regions by dividing the image into a plurality of regions in a way that reflects geometric information detected from the image, and allocates depth information to each divided region; and a corrected depth map generation unit that generates a corrected depth map by correcting the initial depth map based on the depth information for each divided region.

In one embodiment, the initial depth map generation unit may detect a rectangular structure unrelated to linear perspective and assign the same depth value to the detected rectangular structure.

In one embodiment, the geometric information may include classification information indicating whether a divided region of the image corresponds to sky, a vertical component, or a horizontal component. In addition, in one embodiment, the depth information allocation unit may set the depth value of a divided region corresponding to sky to 0, set the depth value of a divided region corresponding to a horizontal component to the same value as the depth value given by linear perspective, and set the depth value of a divided region corresponding to a vertical component to the average depth value of that divided region.

In one embodiment, the classification information indicating whether a divided region of the image corresponds to sky, a vertical component, or a horizontal component may be determined using decision tree learning, and errors of the decision tree learning may be compensated by a learned model based on a shallow convolutional neural network.

In one embodiment, the depth information allocation unit may generate the divided regions, and allocate depth information to each divided region, by reflecting the strength of straight lines in the image in addition to the geometric information.

In one embodiment, the device may further include a smoothing unit that corrects discontinuous regions in the corrected depth map.

In one embodiment, the vanishing point estimation unit may estimate the vanishing point in the image using a RANSAC-based algorithm that exploits the characteristics of the Manhattan World (MW), and may compensate for errors of the RANSAC-based algorithm with a learned model based on a shallow convolutional neural network.

An image conversion method according to another embodiment of the present invention is a method for reconstructing a two-dimensional monocular image in three dimensions, and may include: a vanishing point estimation step of estimating a vanishing point in the image; an initial depth map generation step of generating, based on the estimated vanishing point, an initial depth map of the image by linear perspective; a depth information allocation step of generating divided regions by dividing the image into a plurality of regions in a way that reflects geometric information detected from the image, and allocating depth information to each divided region; a corrected depth map generation step of generating a corrected depth map by correcting the initial depth map based on the depth information for each divided region; and a stereoscopic image generation step of generating a three-dimensional image based on the corrected depth map.

An image conversion device according to another embodiment of the present invention is a device for reconstructing a two-dimensional monocular image in three dimensions, and may include: a vanishing point estimation unit that estimates a vanishing point in the image; an initial depth map generation unit that generates, based on the estimated vanishing point, an initial depth map of the image by linear perspective; a depth information allocation unit that generates divided regions by dividing the image into a plurality of regions in a way that reflects geometric information detected from the image, and allocates depth information to each divided region; a corrected depth map generation unit that generates a corrected depth map by correcting the initial depth map based on the depth information for each divided region; and a stereoscopic image generation unit that generates a three-dimensional image based on the corrected depth map.

According to an embodiment of the present invention, the characteristics of the Manhattan World (MW), an artificial environment that exhibits strong geometric regularity, are extracted from a 2D monocular image, and a depth map reflecting those characteristics is generated from the image. This makes it possible to provide a depth map generation method and a depth map generation device that minimize error to achieve high accuracy while improving processing speed through a small amount of computation.

According to another embodiment of the present invention, the characteristics of the Manhattan World (MW), an artificial environment that exhibits strong geometric regularity, are extracted from a 2D monocular image, a depth map reflecting those characteristics is generated from the image, and a three-dimensional image is generated from that depth map. This makes it possible to provide an image conversion method and an image conversion device that minimize error to achieve high accuracy while improving processing speed through a small amount of computation.

FIG. 1 shows the configuration of a depth map generation device according to an embodiment of the present invention.
FIG. 2 illustrates a method of detecting a rectangular structure unrelated to linear perspective according to an embodiment of the present invention.
FIG. 3 shows an image divided into regions related to the depth gradient and regions unrelated to the depth gradient according to an embodiment of the present invention.
FIG. 4 shows a final depth map reflecting geometric information according to an embodiment of the present invention.
FIG. 5 shows the configuration of a convolutional neural network for vanishing point estimation according to an embodiment of the present invention.
FIG. 6 is a graph showing the loss rate of a vanishing point estimation model according to an embodiment of the present invention.
FIG. 7 shows the configuration of a convolutional neural network for classifying geometric information according to an embodiment of the present invention.
FIG. 8 is a graph showing the loss rate of image geometry segmentation according to an embodiment of the present invention.
FIG. 9A is the original image before the geometry segmentation algorithm is applied, FIG. 9B is the actual depth map of the original image, FIG. 9C is a depth map reflecting geometric information based on the classification into Sky, Vertical, and Horizontal components, and FIG. 9D is a depth map obtained using a convolutional neural network in addition to the geometric information.
FIG. 10 is a flowchart of a depth map generation method according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The embodiments of the present invention are provided to describe the invention more completely to those of ordinary skill in the art. The following embodiments may be modified in various other forms, and the scope of the present invention is not limited to them. Rather, these embodiments are provided to make this disclosure more thorough and complete and to fully convey the spirit of the invention to those skilled in the art.

In the drawings, the same reference numerals refer to the same elements. As used herein, the term "and/or" includes any one of, and all combinations of one or more of, the listed items.

The terms used in this specification are used to describe the embodiments and are not intended to limit the scope of the present invention. A term written in the singular may include the plural unless the context clearly indicates the singular. The terms "comprise" and/or "comprising" as used herein specify the presence of the stated shapes, numbers, steps, operations, members, elements, and/or groups thereof, and do not exclude the presence or addition of other shapes, numbers, operations, members, elements, and/or groups.

Hereinafter, the embodiments are described in detail with reference to the accompanying drawings. Where a detailed description of a related known function or configuration would unnecessarily obscure the subject matter of the invention, that description is omitted.

In embodiments of the present invention, man-made elements in an image may be analyzed so that 3D reconstruction elements are considered together with geometric information. To address the problem of 3D reconstruction from monocular images, the vanishing point may be estimated using image information obtainable from Manhattan World (MW) images, that is, images containing building and road information. In addition, the relative position and direction of the vanishing point and the straight lines in the image may be used to estimate 3D geometric information, such as the Sky, Vertical, and Horizontal geometric classification and the detection of regions unrelated to the vanishing point, and a more accurate depth map may be generated according to that 3D geometric information. Furthermore, depth values may be assigned differently to each region of the image based on its 3D geometric information, and artificial intelligence techniques may be incorporated to cope with unexpected errors in the algorithms. Artificial intelligence techniques may be applied to vanishing point detection and to estimating the 3D geometric information of the image.

FIG. 1 shows the configuration of a depth map generation device according to an embodiment of the present invention.

Referring to FIG. 1, the depth map generation device may include a vanishing point estimation unit (10), an initial depth map generation unit (20), a depth information allocation unit (30), a corrected depth map generation unit (40), and a smoothing unit (50).

The vanishing point estimation unit (10) may estimate a vanishing point present in the 2D monocular image input to the depth map generation device. The vanishing point may be estimated from the intersections of multiple straight lines after the lines are extracted from the image. A line extraction method such as the Hough transform or the Line Segment Detector may be used. The Hough transform is sensitive to texture and noise, so it is preferable to use the Line Segment Detector, which is insensitive to texture and noise and is relatively faster than the Hough transform.

Line-based vanishing point estimation algorithms include the Gaussian Sphere algorithm, Expectation-Maximization (EM)-based algorithms, and RANSAC-based algorithms. The Gaussian Sphere algorithm estimates the vanishing point using great circles on the Gaussian sphere, but it is very sensitive to noise. EM-based algorithms also use image regions, so their performance depends on how the initial regions are set. RANSAC-based algorithms use hypotheses set in advance by a heuristic criterion together with the intersections of straight lines. The RANSAC method repeatedly evaluates the confidence of the hypotheses, but the algorithm terminates once a threshold confidence is satisfied. In an embodiment of the present invention, the Triplet-RANSAC algorithm may be used, which applies the characteristics of the MW to the RANSAC method and shows high vanishing point estimation performance on MW images.
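As a rough illustration of RANSAC-style vanishing point estimation from line intersections, the sketch below uses plain two-line RANSAC. It is not Triplet-RANSAC, which additionally exploits the Manhattan World structure; the function names, the pixel tolerance `tol`, and the fixed iteration count are assumptions chosen for the example.

```python
import math
import random


def line_through(p, q):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersect(l1, l2, eps=1e-9):
    """Intersection point of two homogeneous lines, or None if (nearly) parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < eps:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


def point_line_distance(pt, line):
    """Perpendicular distance from pt to the line (segment endpoints must differ)."""
    a, b, c = line
    return abs(a * pt[0] + b * pt[1] + c) / math.hypot(a, b)


def ransac_vanishing_point(segments, iters=200, tol=2.0, seed=0):
    """segments: list of ((x1, y1), (x2, y2)). Returns (point, inlier_count)."""
    rng = random.Random(seed)
    lines = [line_through(p, q) for p, q in segments]
    best_pt, best_count = None, 0
    for _ in range(iters):
        l1, l2 = rng.sample(lines, 2)      # hypothesis: intersection of two lines
        candidate = intersect(l1, l2)
        if candidate is None:
            continue
        # Count the lines that pass within `tol` pixels of the candidate point.
        count = sum(point_line_distance(candidate, l) <= tol for l in lines)
        if count > best_count:
            best_pt, best_count = candidate, count
    return best_pt, best_count
```

Each hypothesis is scored by how many extended line segments pass close to it, and the best-supported intersection is kept. An early-exit test on a threshold confidence, as the description mentions, could be added inside the loop.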

The initial depth map generation unit (20) may generate an initial depth map of the 2D monocular image by linear perspective, based on the vanishing point estimated by the vanishing point estimation unit (10).

To generate the initial depth map using the vanishing point, the main convergence lines are detected among the convergence lines used to estimate the vanishing point, so that the vertical and horizontal regions can be separated. After the main convergence lines are detected, initial depth information is assigned by computing a depth gradient row-wise for the horizontal region and column-wise for the vertical region. In addition, the initial depth map may be recomposed with depth information that takes into account the geometric property that, in an image, lower parts are relatively near and upper parts are far. Depth information that accounts for this geometric property may be constructed by generating a depth map from geometric information as in [Equation 1] below and combining it with the initial depth map.

[Equation 1]

Here, D_h(x, y) denotes the geometric depth information at the given coordinates, and H is the height of the image. The geometric depth information is therefore determined by the image height and the y coordinate.
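The body of Equation 1 is not reproduced in this text, so the following is only a sketch under an assumed form. It takes D_h(x, y) to grow linearly with the row index y, from 0 at the top row (far, consistent with sky being assigned depth 0) to a maximum at the bottom row (near), and it combines the prior with the initial depth map by a weighted average. The linear form, the `max_depth` scale, and the blending `weight` are all hypothetical.

```python
def height_prior(width, height, max_depth=255.0):
    """Assumed linear D_h(x, y): 0 at the top row, max_depth at the bottom row."""
    denom = max(height - 1, 1)  # avoid division by zero for a one-row image
    return [[max_depth * y / denom for _ in range(width)] for y in range(height)]


def blend(initial, prior, weight=0.5):
    """Weighted average of the initial depth map and the height prior (assumed rule)."""
    return [
        [(1.0 - weight) * d + weight * p for d, p in zip(d_row, p_row)]
        for d_row, p_row in zip(initial, prior)
    ]
```

Only the dependence on y and H comes from the description; any monotonic function of y / H would fit the stated property equally well.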

The initial depth map generation unit (20) may detect a rectangular structure unrelated to linear perspective and assign the same depth value to the detected rectangular structure.

When the initial depth map is generated using the vanishing point, a gradient arises naturally. In regions unrelated to linear perspective, however, no gradient can appear. In the MW such regions mostly appear as man-made rectangular shapes, so the 3D structure can be identified by determining whether or not a given shape is oriented toward the vanishing point. Accordingly, in an embodiment of the present invention, the 3D structure was identified using the straight lines detected by the Line Segment Detector (LSD).

FIG. 2 illustrates a method of detecting a rectangular structure unrelated to linear perspective according to an embodiment of the present invention.

Referring to FIG. 2, first, (a) the horizontal and vertical line segments, which are the main components of a rectangle, are selected from the detected straight lines. Next, (b) the vertical line segments are connected into single straight lines, and the horizontal line segments at the smallest distance that do not intersect the connected vertical lines are detected as base lines, which serve as the reference for 3D structure estimation. Finally, (c) only the base lines, the vertical lines, and the remaining horizontal line segments are retained, and the 3D structure is estimated using the base lines and vertical lines. The region corresponding to the estimated 3D structure may be set as a region unrelated to the depth gradient.
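Step (a) above, selecting horizontal and vertical segments from the detected lines, can be sketched as a simple angle test. The function name and the 10-degree tolerance are assumptions for illustration; the text does not specify how the selection is made.

```python
import math


def split_by_orientation(segments, tol_deg=10.0):
    """Split segments into near-horizontal and near-vertical groups.

    segments: list of ((x1, y1), (x2, y2)) in image coordinates.
    Segments that are neither within tol_deg of horizontal nor of vertical are dropped.
    """
    horizontal, vertical = [], []
    for (x1, y1), (x2, y2) in segments:
        # Undirected orientation in [0, 180) degrees.
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        if angle <= tol_deg or angle >= 180.0 - tol_deg:
            horizontal.append(((x1, y1), (x2, y2)))
        elif abs(angle - 90.0) <= tol_deg:
            vertical.append(((x1, y1), (x2, y2)))
    return horizontal, vertical
```

Steps (b) and (c), connecting vertical segments and choosing base lines, depend on details not given here and are left out of the sketch.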

FIG. 3 shows an image divided into regions related to the depth gradient and regions unrelated to the depth gradient according to an embodiment of the present invention. Once 3D structure estimation is complete for the vertical components, the image can be divided into regions where a gradient occurs and regions where it does not, as shown in FIG. 3.

The depth information allocating unit 30 may generate divided regions by dividing the 2D monocular image into a plurality of regions in a manner that reflects geometric information detected from the image, and may allocate depth information to each divided region.

The depth information obtained from the vanishing point contains no information about the objects in the image. It is therefore necessary to express the discontinuities between the key regions by which objects can be distinguished. To express these discontinuities, the image is segmented into regions. Image segmentation is divided into top-down methods, which successively split large regions, and bottom-up methods, which merge finely divided small regions. The stepwise bottom-up algorithm shows higher performance than the top-down approach. A bottom-up method first over-segments the image and then computes geometric information and feature values for the resulting regions in order to merge them.

If the over-segmentation step divides the image into too many small regions, each region has too few features and it becomes difficult to obtain geometric information for it. An over-segmentation method that takes geometric information into account should therefore be used.

Available over-segmentation methods include Efficient Graph-Based (EGB) segmentation, which is graph-based, and Simple Linear Iterative Clustering (SLIC) and Quickshift, which segment by per-pixel features. Unlike SLIC and Quickshift, which divide the image relatively finely, the EGB method yields regions from which geometric information can be obtained.

The Surface Layout method proposed by Hoiem can be used as the algorithm for assigning geometric information to regions. This method uses a boosted decision tree classifier, trained in advance on various image features, to divide the image into Horizontal, Vertical, and Sky.

Regions over-segmented by the EGB method may be merged with regions that have similar feature values. For this purpose, the embodiment of the present invention may use an extended image segmentation method (Modified HFS; M-HFS) that adds geometric information and straight-line information to the features used in the existing HFS method. In the merging step, an EGB merging method may be used in which the segmented regions are reconstructed as a graph and the feature values of each pair of adjacent regions are compared before merging. Merging may be repeated until adjacent regions are no longer similar.
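The repeat-until-dissimilar merging loop can be illustrated with the sketch below. It is a simplified stand-in, not the EGB merging criterion or the M-HFS feature set: it uses plain Euclidean distance between mean feature vectors and a fixed `threshold`, both of which are assumptions introduced for illustration, as are the function and parameter names.

```python
def merge_regions(features, adjacency, threshold):
    """Greedy iterative merging of adjacent regions.

    features:  dict region_id -> list of floats (mean feature vector)
    adjacency: iterable of (a, b) pairs of neighbouring region ids
    threshold: assumed similarity cut-off on Euclidean distance
    Returns a dict mapping each original region id to its merged label.
    """
    feats = {r: list(f) for r, f in features.items()}
    sizes = {r: 1 for r in features}
    label = {r: r for r in features}
    edges = {tuple(sorted(e)) for e in adjacency}

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(feats[a], feats[b])) ** 0.5

    while edges:
        # Most similar adjacent pair; the edge itself breaks ties deterministically.
        a, b = min(edges, key=lambda e: (dist(*e), e))
        if dist(a, b) > threshold:
            break  # adjacent regions are no longer similar
        # Merge b into a using a size-weighted mean of the feature vectors.
        n = sizes[a] + sizes[b]
        feats[a] = [(x * sizes[a] + y * sizes[b]) / n
                    for x, y in zip(feats[a], feats[b])]
        sizes[a] = n
        for r in label:
            if label[r] == b:
                label[r] = a
        # Redirect b's edges to a and drop the resulting self-loops.
        edges = {tuple(sorted((a if u == b else u, a if v == b else v)))
                 for u, v in edges}
        edges = {(u, v) for u, v in edges if u != v}
    return label
```

For example, with features `{0: [0.0], 1: [0.1], 2: [5.0]}`, adjacency `[(0, 1), (1, 2)]`, and a threshold of 1.0, regions 0 and 1 are merged and region 2 stays separate.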

The M-HFS technique proposed in the present invention adds straight line strength and geometric features to the existing HFS. The straight line strength used as a feature in M-HFS is a feature value of the straight lines that characterize MW scenes; to express it, the average boundary gradient, Canny edge detection, and the LSD algorithm are combined. The geometric features used in M-HFS comprise feature values for the three kinds of geometric information from Hoiem's Surface Layout technique (SKY, HORIZONTAL, VERTICAL) together with two feature values from the rectangular 3D structure estimation, namely feature values for regions related to the vanishing point and for regions that are not (gradient occurrence information), for a total of five geometric feature values.

[Table 1]

[Table 1] above compares the image segmentation algorithm proposed in the present invention (Modified HFS) with other algorithms (gPb and HFS) using MW images from the NYU Depth V2 dataset. The F-measure, the evaluation metric for boundaries, is calculated as in [Equation 2] below.

[Equation 2]
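The equation itself is not reproduced in this text. The F-measure conventionally used for boundary evaluation is the harmonic mean of precision and recall; assuming the patent follows that convention, with P (precision) and R (recall) being symbols introduced here, it would read:

```latex
F = \frac{2 \cdot P \cdot R}{P + R}
```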

The performance of the segmented regions was evaluated using the optimal value over the dataset (Optimal Dataset Scale; ODS), the optimal value per image (Optimal Image Scale; OIS), and the metrics used for gPb: the Probabilistic Rand Index (PRI), Covering, which measures the degree of overlap, and Variation of Information (VI), which measures the distance between two segmentations in terms of conditional entropy. In terms of boundaries, the M-HFS algorithm performed markedly better than the other algorithms; in terms of regions, M-HFS performed on a par with gPb. When time complexity is taken into account, M-HFS can be seen to perform far better than gPb.

The geometric information reflected when the depth information allocating unit 30 allocates depth information to each divided region may include classification information indicating whether the divided region corresponds to sky (SKY), a vertical component (VERTICAL), or a horizontal component (HORIZONTAL). The depth information allocating unit 30 may set the depth value of a divided region corresponding to sky to 0, set the depth value of a divided region corresponding to a horizontal component to the same value as the depth value given by linear perspective, and set the depth value of a divided region corresponding to a vertical component to the average depth value of that divided region.
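The three allocation rules can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name, the string class labels, and the nested-list image representation are introduced here, and the vertical-region average is taken directly over the initial depth map, whereas the description computes a column-based depth for vertical regions before averaging.

```python
def assign_geometric_depth(d_vp, labels, regions):
    """Per-pixel depth derived from region class labels.

    d_vp:    2D list, initial depth map from the vanishing point
    labels:  2D list of 'SKY', 'HORIZONTAL' or 'VERTICAL' per pixel
    regions: 2D list of region ids per pixel
    """
    h, w = len(d_vp), len(d_vp[0])
    # First pass: accumulate the mean initial depth of each VERTICAL region.
    sums, counts = {}, {}
    for y in range(h):
        for x in range(w):
            if labels[y][x] == "VERTICAL":
                r = regions[y][x]
                sums[r] = sums.get(r, 0.0) + d_vp[y][x]
                counts[r] = counts.get(r, 0) + 1
    # Second pass: apply the per-class rule; SKY pixels keep the default 0.0.
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if labels[y][x] == "HORIZONTAL":
                out[y][x] = d_vp[y][x]
            elif labels[y][x] == "VERTICAL":
                r = regions[y][x]
                out[y][x] = sums[r] / counts[r]
    return out
```

Assigning one value per vertical region is what produces the depth discontinuity at the region boundary, while horizontal regions retain the perspective gradient.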

After image segmentation is complete, depth values are allocated differently according to the geometric information finally obtained, so that the discontinuities between regions can be expressed. The most important component for expressing discontinuity is the vertical component: all pixels of a region classified as Vertical have the same depth value. The depth allocation rule for each kind of geometric information is given by [Equation 3] below.

[Equation 3]

Here, p denotes a pixel in the image, and D_vp denotes the initial depth map generated using the vanishing point. D_building, the depth value of the vertical component, can be calculated by [Equation 4] below.

[Equation 4]

First, a depth value (D_cols) is calculated on a per-column basis. When a building, which is a vertical component, is occluded by another building, the geometry implies that the upper building is farther away than the lower one. Therefore, if a column is divided into different regions, different depth levels must be assigned to them. The depth level (d_level) is an arbitrary value that expresses the discontinuity between regions. It is preferably in the range of 3 to 10 and was set to 5 in the embodiment of the present invention. Here, R(x,y) denotes the region value at the given image coordinates. The overall D_building can then be assigned as the average depth value of the region, as in [Equation 5] below.

[Equation 5]

Here, pixel_num(R) denotes the total number of pixels in region R.

The corrected depth map generation unit 40 may generate a corrected depth map by correcting the initial depth map based on the depth information allocated by the depth information allocating unit 30 to each divided region of the 2D monocular image.

The corrected depth map can be obtained by summing the initial depth map obtained from the vanishing point and the depth map obtained from the geometric information, as in [Equation 6] below. Using fixed weights allows both the discontinuities and the gradient of the depth map to be expressed together. The higher the weight on the geometry-based depth map, the more sharply the vertical components appear. The weights satisfy ω_Geo + ω_vp = 1; in the embodiment of the present invention, ω_Geo is set to 0.6 and ω_vp to 0.4. In other embodiments, however, a different ω_vp value may be set as needed.

[Equation 6]
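The equation image is not reproduced in this text, but the description specifies a weighted sum of the two maps with weights that add up to 1. A minimal sketch under that reading follows; the function name and the nested-list representation are assumptions, and the default weight is the 0.6 value quoted for the embodiment.

```python
def blend_depth(d_geo, d_vp, w_geo=0.6):
    """Weighted sum of the geometry-based map and the vanishing-point map.

    d_geo, d_vp: 2D lists of equal shape.
    w_geo: weight on the geometry-based map; w_vp is derived so that
    the two weights sum to 1.
    """
    w_vp = 1.0 - w_geo
    return [[w_geo * g + w_vp * v for g, v in zip(row_g, row_v)]
            for row_g, row_v in zip(d_geo, d_vp)]
```

Raising `w_geo` toward 1 emphasizes the per-region constant depths (sharper vertical structures), while lowering it emphasizes the smooth perspective gradient.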

The smoothing processing unit 50 may correct discontinuous regions in the corrected depth map generated by the corrected depth map generation unit 40.

A single region in the actual image may nevertheless have different depth values in the depth map. This occurs when what is originally one region is divided more finely, and it is influenced by man-made elements. Therefore, the final depth value can be corrected as in [Equation 7] using a cross bilateral filter, which considers the original image and the depth value of each pixel simultaneously.

[Equation 7]

Here, p denotes a pixel in the image and q denotes the pixels adjacent to p. W_p is a normalization coefficient and can be obtained from [Equation 8] below.

[Equation 8]

Ω_p denotes the pixels adjacent to p, and the kernel term in the equations denotes a Gaussian function with a given scale factor (the symbols for both appear only in the equation images). I denotes the original image, D_init denotes the finally generated depth map, and D_final denotes the depth map after smoothing with the cross bilateral filter. This smoothing prevents discontinuities in depth values from appearing within regions of similar intensity.
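Because the filter equations are not reproduced in this text, the sketch below follows the standard cross (joint) bilateral filter rather than the patent's exact formula: each output depth is a normalized sum of neighbouring depths, weighted by a spatial Gaussian and by a Gaussian on the intensity difference in the original image. The window radius and the two scale factors `sigma_s` and `sigma_r` are assumed values.

```python
import math

def cross_bilateral(depth, guide, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Smooth `depth` using `guide` (the original image) for the range weight.

    depth, guide: 2D lists of equal shape; guide holds intensities.
    radius, sigma_s, sigma_r: assumed window size and Gaussian scale factors.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = norm = 0.0
            # Neighbourhood of p = (x, y), clipped at the image border.
            for qy in range(max(0, y - radius), min(h, y + radius + 1)):
                for qx in range(max(0, x - radius), min(w, x + radius + 1)):
                    ds2 = (qy - y) ** 2 + (qx - x) ** 2
                    dr2 = (guide[qy][qx] - guide[y][x]) ** 2
                    wgt = (math.exp(-ds2 / (2 * sigma_s ** 2))
                           * math.exp(-dr2 / (2 * sigma_r ** 2)))
                    acc += wgt * depth[qy][qx]
                    norm += wgt  # running normalization coefficient
            out[y][x] = acc / norm  # norm >= 1 because the centre pixel has weight 1
    return out
```

Where the guide image is uniform, neighbouring depths are averaged; where the guide has a strong edge, the range weight suppresses averaging across it, so depth discontinuities that coincide with image edges are kept.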

FIG. 4 shows a final depth map reflecting geometric information according to an embodiment of the present invention. FIG. 4(a) is the original image, and FIG. 4(b) shows the final depth map exhibiting all of the gradient, discontinuity, and vertical-structure characteristics described above.

In the prior art, the same depth value was assigned to a region without using geometric information, so large errors occurred in regions close to the vanishing point. In the embodiment of the present invention, the gradient is assigned in consideration of geometric information, so the errors seen in the prior art can be significantly reduced. The fact that the depth map shown in FIG. 4(b), which takes geometric information into account, is close to the actual depth map of the original image shown in FIG. 4(a) confirms that the embodiment minimizes error and achieves high accuracy.

FIG. 5 shows the configuration of a convolutional neural network for vanishing point estimation according to an embodiment of the present invention.

In an embodiment of the present invention, the vanishing point estimating unit 10 may use learning by a shallow convolutional neural network (CNN) to estimate the vanishing point in the image.

Because a deep learning architecture takes a long time to train, the main features of the image are extracted first and the model is trained on those extracted features, which makes it possible to use a shallow convolutional neural network rather than a deep one.

Vanishing point estimation in an image relies heavily on straight lines. Feature values of straight lines, which can be computed directly, can therefore be used as the inputs of the model. The lines can be detected using the LSD algorithm described above, and [Table 2] below lists the feature values of each line.

[Table 2]

Because the features of an individual line are too ambiguous to account for the lines in neighboring regions, the learning model may be configured as a convolutional neural network (CNN) that includes a convolution layer, and the convolution layer may use the supplied information on each line to detect features specific to vanishing point estimation.

Because the vanishing point estimation model computes continuous values such as coordinates, it may be designed to solve a regression problem rather than a classification problem. Because it computes real values, it may not converge easily. The input values can therefore be normalized by taking the upper-left corner of the image as [0, 0] and the upper-right corner as [1, 0].
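A minimal sketch of this coordinate normalization follows. The text only fixes the two top corners; scaling y by the image height in the same way, mapping the last pixel index to exactly 1, and the function name are assumptions made here for illustration.

```python
def normalize_point(x, y, width, height):
    """Map pixel coordinates to the unit square.

    The top-left pixel maps to (0, 0) and the top-right pixel to (1, 0).
    Scaling y by the image height is an assumption; the description only
    states the two top corners. Requires width > 1 and height > 1.
    """
    return x / (width - 1), y / (height - 1)
```

Applying the same mapping to both endpoints of every detected line segment keeps all model inputs, and the regressed vanishing point, in a comparable numeric range.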

Referring to FIG. 5, the vanishing point estimation model may consist of one convolution layer and three fully connected (FC) layers. To compute the x and y coordinates of the vanishing point, the number of neurons in the last FC layer may be set to 2. The number of neurons in each of the other FC layers may be set to 2,048, and the convolution may be configured with 4 x 4 filters producing 128 feature maps. The hyperparameters of the model were set as shown in [Table 3] below, and Tanh may be used as the activation function.

[Table 3]

FIG. 6 is a graph showing the loss of the vanishing point estimation model according to an embodiment of the present invention.

Referring to FIG. 6, the loss value after training fell as low as 0.075. This means that the maximum error in vanishing point estimation is 11. Whereas Triplet-RANSAC yielded a NormDist error of 0.143, vanishing point estimation using artificial intelligence yielded a NormDist error of 0.028, a marked improvement.

FIG. 7 shows the configuration of a convolutional neural network for classifying geometric information according to an embodiment of the present invention.

In an embodiment of the present invention, the classification information indicating whether a divided region of the image corresponds to sky, a vertical component, or a horizontal component may be determined using learning by a shallow convolutional neural network (CNN).

The geometric segmentation learning model follows an algorithm that over-segments the image and then classifies each region as Horizontal, Vertical, or Sky. Unlike vanishing point estimation, therefore, the geometric segmentation learning model may be designed to solve a classification problem. The inputs of the model may be the feature values of the over-segmented regions. [Table 4] below lists the feature values within a region.

[Table 4]

Referring to FIG. 7, the geometric segmentation learning model may be implemented as a CNN that includes a convolution layer so that detailed features can be extracted. The model can be trained with the 15 feature values of a single region as its input.

Using two FC layers and one Softmax layer, classification into a total of four classes can be performed: Horizontal, Vertical, Sky, and unclassified. The numbers of neurons in the FC layers may be set to 2,048 and 1,024, respectively, and the convolution may be configured with 1 x 15 filters producing 256 feature maps. The hyperparameters of the model are set as shown in [Table 5] below, and ReLU may be used as the activation function.

[Table 5]

FIG. 8 is a graph showing the loss of the image geometric segmentation according to an embodiment of the present invention.

Referring to FIG. 8, the loss value after training fell as low as 0.55, and the classification accuracy is 86%. On the KITTI SEMANTIC data, the geometry classification algorithm in M-HFS showed a similar accuracy of 86%, but a difference in geometry classification can be observed in the parts that are critical to depth allocation.

FIG. 9a is the original image before the geometric segmentation algorithm is applied, FIG. 9b is the actual depth map of the original image, FIG. 9c is a depth map reflecting geometric information based on the classification into Sky, Vertical, and Horizontal components, and FIG. 9d shows a depth map obtained by using a convolutional neural network in addition to the geometric information. It can be seen that the depth map of FIG. 9d, obtained using the convolutional neural network, is closer to the actual depth map of FIG. 9b than the depth map of FIG. 9c is.

An image conversion device according to another embodiment of the present invention may include, in addition to the vanishing point estimating unit 10, the initial depth map generation unit 20, the depth information allocating unit 30, and the corrected depth map generation unit 40 of the depth map generating device shown in FIG. 1, a stereoscopic image generation unit 60 that generates a 3D image based on the corrected depth map generated by the corrected depth map generation unit 40. The stereoscopic image generation unit 60 may generate a 3D stereoscopic image by combining the input 2D monocular image with the generated corrected depth map. The image conversion device may further include the smoothing processing unit 50 shown in FIG. 1.

For details of the vanishing point estimating unit 10, the initial depth map generation unit 20, the depth information allocating unit 30, the corrected depth map generation unit 40, and the smoothing processing unit 50 of the image conversion device, reference may be made to the description of the corresponding units of the depth map generating device shown in FIG. 1.

FIG. 10 is a flowchart illustrating a depth map generation method according to an embodiment of the present invention.

Referring to FIG. 10, a depth map generation method according to an embodiment of the present invention may include: a vanishing point estimation step (S10) in which the vanishing point estimating unit 10 estimates a vanishing point in the image; an initial depth map generation step (S20) in which the initial depth map generation unit 20 generates, based on the estimated vanishing point, an initial depth map of the image by linear perspective; a depth information allocation step (S30) in which the depth information allocating unit 30 generates divided regions by dividing the image into a plurality of regions in a manner that reflects geometric information detected from the image, and allocates depth information to each divided region; a corrected depth map generation step (S40) in which the corrected depth map generation unit 40 generates a corrected depth map by correcting the initial depth map based on the depth information for each divided region; and a smoothing step (S50) in which the smoothing processing unit 50 corrects discontinuous regions in the corrected depth map. For details of steps S10 to S50, reference may be made to the description above of the vanishing point estimating unit 10, the initial depth map generation unit 20, the depth information allocating unit 30, the corrected depth map generation unit 40, and the smoothing processing unit 50 of the depth map generating device.

An image conversion method for reconstructing a two-dimensional monocular image in three dimensions according to another embodiment of the present invention may include, in addition to the vanishing point estimation step (S10), the initial depth map generation step (S20), the depth information allocation step (S30), and the corrected depth map generation step (S40) of the depth map generation method shown in FIG. 10, a stereoscopic image generation step (S60) of generating a 3D image based on the corrected depth map generated in the corrected depth map generation step (S40). In the stereoscopic image generation step (S60), a 3D stereoscopic image may be generated by combining the input 2D monocular image with the generated corrected depth map. The image conversion method may further include the smoothing step (S50) shown in FIG. 10.

For details of the vanishing point estimation step (S10), the initial depth map generation step (S20), the depth information allocation step (S30), the corrected depth map generation step (S40), and the smoothing step (S50) of the image conversion method, reference may be made to the description of the corresponding steps of the depth map generation method shown in FIG. 10.

The present invention described above is not limited to the foregoing embodiments and the accompanying drawings, and it will be apparent to those of ordinary skill in the art to which the present invention pertains that various substitutions, modifications, and changes are possible without departing from the technical spirit of the present invention.

Claims

As a depth map generation method for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating step of estimating a vanishing point in the image;
An initial depth map generation step of generating an initial depth map using a linear perspective method for the image based on the estimated vanishing point;
A depth information allocation step of generating a divided area obtained by dividing the image into a plurality of pieces by reflecting geometric information detected from the image, and allocating depth information to each divided area; And
A correction depth map generation step of generating a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area,
The geometric information includes classification information indicating whether the divided area of the image corresponds to a vertical component, a horizontal component, or other components other than the vertical component and the horizontal component,
In the depth information allocation step,
The depth value of the divided region corresponding to the other component is set to 0,
The depth value of the divided area corresponding to the horizontal component is set to the same value as the depth value according to the linear perspective method,
A depth map generation method in which a depth value of a divided area corresponding to the vertical component is set as an average depth value of the divided area.

The method of claim 1,
In the initial depth map generation step, a rectangular shape structure irrelevant to a linear perspective is detected, and the same depth value is assigned to the detected rectangular shape structure.

The method of claim 1,
A depth map generation method in which the other component is sky.

delete

The method of claim 1,
The classification information indicating whether the segmented region of the image corresponds to the other component, the vertical component, or the horizontal component is determined using a decision tree learning method, and the error due to the decision tree learning method is convolution of a shallow structure. Depth map generation method complemented by a learning model using a convolutional neural network (CNN).

The method of claim 1,
The geometric information includes gradient generation information indicating whether the divided area of the image corresponds to a quadrangular shape structure not related to a linear perspective.

As a depth map generation method for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating step of estimating a vanishing point in the image;
An initial depth map generation step of generating an initial depth map using a linear perspective method for the image based on the estimated vanishing point;
A depth information allocation step of generating divided areas obtained by dividing the image into a plurality of pieces by reflecting geometric information detected from the image, and allocating depth information to each divided area; and
A corrected depth map generation step of generating a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area,
In the depth information allocation step, in addition to the geometric information, the divided area is generated by reflecting the strength of a straight line in the image, and depth information is assigned to each divided area.
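
The initial depth map generation step recited in this claim can be pictured with a small sketch. The claim does not give a formula, so the following is only one plausible reading of "linear perspective": the depth value grows linearly with the distance from the vanishing point. The function name `initial_depth_map`, the parameter `max_depth`, and the convention that 0 marks the farthest point (chosen to agree with the sky being assigned 0 in claim 1) are assumptions.

```python
import math


def initial_depth_map(height, width, vp, max_depth=255):
    """Build a depth map that is 0 at the vanishing point vp = (row, col)
    and rises linearly to max_depth at the image pixel farthest from vp."""
    vr, vc = vp
    corners = [(0, 0), (0, width - 1), (height - 1, 0), (height - 1, width - 1)]
    # The farthest image pixel from any point is always one of the corners.
    far = max(math.hypot(r - vr, c - vc) for r, c in corners)
    if far == 0:  # 1x1 image with the vanishing point on its only pixel
        return [[0] * width for _ in range(height)]
    return [
        [round(max_depth * math.hypot(r - vr, c - vc) / far) for c in range(width)]
        for r in range(height)
    ]


depth = initial_depth_map(3, 3, vp=(1, 1))
```

The vanishing point may lie outside the image; the normalization by the farthest corner still keeps every value within 0 to `max_depth`.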

The method of claim 1,
A depth map generation method further comprising a smoothing step of correcting a discontinuous area in the corrected depth map.
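
The claim does not name a particular smoothing filter. As a hedged illustration only, the sketch below uses a plain box (mean) filter, which softens abrupt jumps between neighboring divided areas; the function name `smooth_depth_map` and the `radius` parameter are hypothetical, and an edge-preserving filter could equally be substituted.

```python
def smooth_depth_map(depth, radius=1):
    """Replace each depth value by the mean of its (2*radius+1)^2 window.

    Windows are clipped at the image border, so border pixels are averaged
    over fewer neighbors.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rows = range(max(0, r - radius), min(h, r + radius + 1))
            cols = range(max(0, c - radius), min(w, c + radius + 1))
            window = [depth[i][j] for i in rows for j in cols]
            out[r][c] = sum(window) / len(window)
    return out


# A depth map with a hard step between two areas (0 on the left, 90 on the right).
step = [[0, 0, 90, 90] for _ in range(3)]
smoothed = smooth_depth_map(step)
```

After filtering, the step from 0 to 90 is spread over the two columns next to the boundary instead of occurring between adjacent pixels.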

The method of claim 1,
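A depth map generation method in which, in the vanishing point estimating step, the vanishing point in the image is estimated using a RANSAC-based algorithm that uses the characteristics of the Manhattan World (MW), and the error of the RANSAC-based algorithm is compensated by a learning model using a shallow convolutional neural network.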
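
A generic RANSAC vanishing point search can be sketched as follows. This is a simplified illustration, not the claimed algorithm: it estimates a single vanishing point from line segments and does not model the Manhattan World constraint or the CNN-based correction. The function names, the tolerance `tol`, the iteration count, and the fixed random seed are assumptions, and every segment is assumed to have two distinct endpoints.

```python
import math
import random


def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(segment):
    """Homogeneous line (a, b, c) through the two endpoints of a segment."""
    (x1, y1), (x2, y2) = segment
    return cross((x1, y1, 1.0), (x2, y2, 1.0))


def point_line_distance(point, line):
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) / math.hypot(a, b)


def ransac_vanishing_point(segments, iterations=200, tol=2.0, seed=0):
    """Return (point, inlier_count) for the intersection supported by most lines."""
    rng = random.Random(seed)
    lines = [line_through(s) for s in segments]
    best_point, best_inliers = None, -1
    for _ in range(iterations):
        l1, l2 = rng.sample(lines, 2)
        x, y, w = cross(l1, l2)  # homogeneous intersection of the two lines
        if abs(w) < 1e-9:
            continue  # parallel or identical lines have no finite intersection
        point = (x / w, y / w)
        inliers = sum(point_line_distance(point, l) <= tol for l in lines)
        if inliers > best_inliers:
            best_point, best_inliers = point, inliers
    return best_point, best_inliers


# Four segments whose extensions meet at (50, 20), plus one outlier.
segments = [
    ((0, 0), (25, 10)),
    ((0, 40), (25, 30)),
    ((100, 10), (75, 15)),
    ((100, 60), (75, 40)),
    ((0, 5), (10, 90)),
]
vp, support = ransac_vanishing_point(segments)
```

Each iteration hypothesizes a vanishing point from two random lines and scores it by how many lines pass within `tol` pixels of it, so the outlier segment does not pull the estimate away from the common intersection.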

As a depth map generating device for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating unit for estimating a vanishing point in the image;
An initial depth map generator for generating an initial depth map based on the estimated vanishing point using a linear perspective method with respect to the image;
A depth information allocation unit configured to generate divided areas obtained by dividing the image into a plurality of pieces by reflecting the geometric information detected from the image, and to allocate depth information to each divided area; and
A corrected depth map generator configured to generate a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area,
The geometric information includes classification information indicating whether each divided area of the image corresponds to a vertical component, a horizontal component, or another component that is neither the vertical component nor the horizontal component,
The depth information allocation unit,
Sets the depth value of the divided area corresponding to the other component to 0,
The depth value of the divided area corresponding to the horizontal component is set to the same value as the depth value according to the linear perspective method,
A depth map generating apparatus configured to set a depth value of a divided region corresponding to the vertical component as an average depth value of the divided region.

The device of claim 10,
The initial depth map generation unit detects a quadrangular shape structure irrelevant to a linear perspective, and assigns the same depth value to the detected quadrangular shape structure.

As an image conversion method for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating step of estimating a vanishing point in the image;
An initial depth map generation step of generating an initial depth map using a linear perspective method for the image based on the estimated vanishing point;
A depth information allocation step of generating a divided area obtained by dividing the image into a plurality of pieces by reflecting geometric information detected from the image, and allocating depth information to each divided area;
A corrected depth map generation step of generating a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area; and
A three-dimensional image generating step of generating a three-dimensional image based on the corrected depth map,
In the step of allocating depth information, in addition to the geometric information, the image conversion method generates the divided regions by reflecting the intensity of a straight line in the image and assigns depth information to each divided region.
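
The claim states only that a three-dimensional image is generated from the corrected depth map. One common way to do this, shown here purely as an assumption, is depth-image-based rendering: each pixel is shifted horizontally by a disparity proportional to its depth value to synthesize a second view of a stereo pair. The function name `render_shifted_view`, the parameters `max_disparity` and `max_depth`, and the convention that a larger depth value means a nearer surface (so that the value 0 is the farthest) are hypothetical.

```python
def render_shifted_view(image, depth, max_disparity=2, max_depth=255):
    """Synthesize a second view by shifting pixels left in proportion to depth.

    image, depth: 2D lists of the same size. Larger depth values are treated
    as nearer to the camera and therefore receive a larger shift.
    """
    h, w = len(image), len(image[0])
    view = [[None] * w for _ in range(h)]
    for r in range(h):
        # Draw far pixels first so that nearer pixels overwrite them.
        for c in sorted(range(w), key=lambda col: depth[r][col]):
            shift = round(max_disparity * depth[r][c] / max_depth)
            target = c - shift
            if 0 <= target < w:
                view[r][target] = image[r][c]
        # Fill disocclusion holes with the nearest filled pixel to the left,
        # falling back to the source pixel at the left border.
        for c in range(w):
            if view[r][c] is None:
                view[r][c] = view[r][c - 1] if c > 0 else image[r][c]
    return view


image = [[1, 2, 3, 4]]
depth = [[0, 0, 255, 0]]
right_view = render_shifted_view(image, depth, max_disparity=1)
```

In the toy row, the single near pixel (value 3) moves one column to the left and covers the far pixel there, and the hole it leaves behind is filled from its left neighbor.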

As an image conversion device for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating unit for estimating a vanishing point in the image;
An initial depth map generator for generating an initial depth map based on the estimated vanishing point using a linear perspective method with respect to the image;
A depth information allocator configured to generate a divided area obtained by dividing the image into a plurality of pieces by reflecting the geometric information detected from the image, and allocating depth information to each divided area;
A corrected depth map generator configured to generate a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area; and
A three-dimensional image generator for generating a three-dimensional image based on the corrected depth map,
The depth information allocating unit generates the divided regions by reflecting the strength of a straight line in the image in addition to the geometric information and allocates depth information to each divided region.

The device of claim 10,
A depth map generating device in which the classification information indicating whether the divided area of the image corresponds to the other component, the vertical component, or the horizontal component is determined using a decision tree learning method, and the error of the decision tree learning method is compensated by a learning model using a shallow convolutional neural network.

The device of claim 10,
The geometric information includes gradient generation information indicating whether the divided area of the image corresponds to a quadrangular shape structure not related to a linear perspective.

As a depth map generating device for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating unit for estimating a vanishing point in the image;
An initial depth map generator for generating an initial depth map based on the estimated vanishing point using a linear perspective method with respect to the image;
A depth information allocation unit configured to generate divided areas obtained by dividing the image into a plurality of pieces by reflecting the geometric information detected from the image, and to allocate depth information to each divided area; and
A corrected depth map generator configured to generate a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area,
The depth information allocation unit generates the divided regions by reflecting the strength of a straight line in the image in addition to the geometric information and allocates depth information to each divided region.

The device of claim 10,
A depth map generating apparatus further comprising a smoothing processing unit for correcting a discontinuous area in the corrected depth map.

The device of claim 10,
A depth map generating device in which the vanishing point estimation unit estimates the vanishing point in the image using a RANSAC-based algorithm that uses the characteristics of the Manhattan World (MW), and compensates for the error of the RANSAC-based algorithm with a learning model using a shallow convolutional neural network.

As an image conversion method for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating step of estimating a vanishing point in the image;
An initial depth map generation step of generating an initial depth map using a linear perspective method for the image based on the estimated vanishing point;
A depth information allocation step of generating a divided area obtained by dividing the image into a plurality of pieces by reflecting geometric information detected from the image, and allocating depth information to each divided area;
A corrected depth map generation step of generating a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area; and
A three-dimensional image generating step of generating a three-dimensional image based on the corrected depth map,
The geometric information includes classification information indicating whether each divided area of the image corresponds to a vertical component, a horizontal component, or another component that is neither the vertical component nor the horizontal component,
In the depth information allocation step,
The depth value of the divided region corresponding to the other component is set to 0,
The depth value of the divided area corresponding to the horizontal component is set to the same value as the depth value according to the linear perspective method,
An image conversion method in which a depth value of a divided area corresponding to the vertical component is set as an average depth value of the divided area.

As an image conversion device for reconstructing a two-dimensional monocular image into three dimensions,
A vanishing point estimating unit for estimating a vanishing point in the image;
An initial depth map generator for generating an initial depth map based on the estimated vanishing point using a linear perspective method with respect to the image;
A depth information allocator configured to generate a divided area obtained by dividing the image into a plurality of pieces by reflecting the geometric information detected from the image, and allocating depth information to each divided area;
A corrected depth map generator configured to generate a corrected depth map obtained by correcting the initial depth map based on the depth information for each divided area; and
A three-dimensional image generator for generating a three-dimensional image based on the corrected depth map,
The geometric information includes classification information indicating whether each divided area of the image corresponds to a vertical component, a horizontal component, or another component that is neither the vertical component nor the horizontal component,
The depth information allocation unit,
Sets the depth value of the divided area corresponding to the other component to 0,
The depth value of the divided area corresponding to the horizontal component is set to the same value as the depth value according to the linear perspective method,
An image conversion device that sets a depth value of a divided area corresponding to the vertical component as an average depth value of the divided area.