KR20220050253A

KR20220050253A - Apparatus for constructing a 3D map for providing augmented reality based on pose information and depth information, and a method therefor

Info

Publication number: KR20220050253A
Application number: KR1020200133110A
Authority: KR
Inventors: 장준환; 박우출; 양진욱; 윤상필; 최민수; 이준석; 송수호; 구본재
Original assignee: 한국전자기술연구원
Priority date: 2020-10-15
Filing date: 2020-10-15
Publication date: 2022-04-25
Also published as: WO2022080553A1; KR102472568B1

Abstract

The present invention provides an apparatus for constructing a three-dimensional (3D) map for providing augmented reality. The apparatus includes: a pose acquisition unit that, when two frames comprising a first frame and a second frame of an image captured while the capture position moves are input, obtains pose information from the two input frames; a depth map calculation unit that derives a depth map from the two frames using a deep learning model; and a 3D map generation unit that generates a 3D map based on the pose information and the depth map.

Description

Apparatus for constructing a 3D map for providing augmented reality based on pose information and depth information, and a method therefor

The present invention relates to 3D map construction technology and, more particularly, to an apparatus and a method for constructing a 3D map for providing augmented reality based on pose information and depth map information.

Virtual reality (VR) refers to a specific environment or situation, or the technology itself, that is created by artificial means such as a computer and resembles reality but is not real. Augmented reality (AR) is a field of VR: a computer graphics technique that composites virtual objects or information into a real environment so that they appear to be objects present in the original environment. In other words, AR superimposes virtual objects on the real world that the user sees. Because it combines the real world with a virtual world carrying additional information in real time and presents them as a single image, it is also called mixed reality (MR). AR, a concept that complements the real world with a virtual world, uses a virtual environment created with computer graphics, but the real environment plays the leading role. Computer graphics serve to supply additional information needed for the real environment. That is, a 3D virtual image is overlapped on the live image the user is viewing so that the boundary between the real environment and the virtual screen becomes blurred.

Virtual reality technology immerses the user in a virtual environment, so the user cannot see the real environment. Augmented reality technology, in which the real environment and virtual objects are mixed, lets the user see the real environment and thus provides a better sense of reality along with additional information.

Korean Patent Publication No. 2020-0108484, published September 18, 2020 (Title: Augmented reality providing apparatus for recognizing a situation using a neural network, providing method, and computer program stored in a medium to execute the method)

An object of the present invention is to provide an apparatus and a method for constructing a 3D map for providing augmented reality based on pose information and depth map information.

To achieve the above object, an apparatus for constructing a 3D map for providing augmented reality according to a preferred embodiment of the present invention includes: a pose acquisition unit that, when two frames comprising a first frame and a second frame of an image captured while the capture position moves are input, obtains pose information from the two input frames; a depth map calculation unit that derives a depth map from the two frames using a deep learning model; and a 3D map generation unit that generates a 3D map based on the pose information and the depth map.

The pose acquisition unit extracts feature points representing the same object from each of the two frames, and sequentially derives pose information and a pose matrix based on the change in coordinates of the extracted feature points.

The depth calculation unit derives a transformation matrix using a known camera matrix and the pose matrix; uses the transformation matrix to generate, from the first frame, a simulated second frame that simulates the second frame; and derives, through the deep learning model, a depth map according to the coordinate difference between pixels of the second frame and pixels of the simulated second frame.

The depth calculation unit derives a transformation matrix using a known camera matrix and a pose matrix derived from a first training frame and a second training frame; uses the transformation matrix to generate, from the first training frame, a simulated second training frame that simulates the second training frame; and generates the deep learning model by training a model prototype on the correlation between depth and the coordinate difference between pixels of the second training frame and pixels of the simulated second training frame that represent the same real-world object, so that the resulting model derives a depth map according to that coordinate difference.

To achieve the above object, a method for constructing a 3D map for providing augmented reality according to a preferred embodiment of the present invention includes: obtaining, by a pose acquisition unit, pose information from two input frames when the two frames, comprising a first frame and a second frame of an image captured while the capture position moves, are input; deriving, by a depth map calculation unit, a depth map from the two frames using a deep learning model; and generating, by a 3D map generation unit, a 3D map based on the pose information and the depth map.

Obtaining the pose information from the two frames includes: extracting, by the pose acquisition unit, feature points representing the same object from each of the two frames; and sequentially deriving, by the pose acquisition unit, pose information and a pose matrix based on the change in coordinates of the extracted feature points.

Deriving the depth map from the two frames includes: deriving, by the depth calculation unit, a transformation matrix using a known camera matrix and the pose matrix; generating, by the depth calculation unit using the transformation matrix, a simulated second frame that simulates the second frame from the first frame of the two frames; and deriving, by the depth calculation unit through the deep learning model, a depth map according to the coordinate difference between pixels of the second frame and pixels of the simulated second frame.

Before obtaining the pose information from the two frames, the method further includes: deriving, by the depth calculation unit, a transformation matrix using a known camera matrix and a pose matrix derived from a first training frame and a second training frame; generating, by the depth calculation unit using the transformation matrix, a simulated second training frame that simulates the second training frame from the first training frame; and generating, by the depth calculation unit, the deep learning model by training a model prototype on the correlation between depth and the coordinate difference between pixels of the second training frame and pixels of the simulated second training frame that represent the same real-world object, so that the resulting model derives a depth map according to that coordinate difference.

According to the present invention, a 3D map for providing augmented reality can be constructed based on pose information and depth map information. The 3D map can then be used to register a virtual object to a captured image. Because the 3D map of the present invention provides precise 3D coordinates, a virtual object can be registered to the image precisely, which makes it possible to provide augmented reality with a higher sense of realism.

FIG. 1 is a diagram illustrating the configuration of an apparatus for constructing a 3D map for providing augmented reality based on pose information and depth information according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the detailed configuration of a control unit according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a method of deriving a pose matrix according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a method of deriving a transformation matrix using a pose matrix and a camera matrix according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a method of generating a simulated frame using a transformation matrix according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a method of learning a correlation from the pixel differences between two frames according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating a method for constructing a 3D map for providing augmented reality based on pose information and depth information according to an embodiment of the present invention.

Prior to the detailed description of the present invention, the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, based on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas; it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Note that the same components are denoted by the same reference numerals wherever possible in the drawings. Detailed descriptions of well-known functions and configurations that could obscure the gist of the present invention are omitted. For the same reason, some components in the drawings are exaggerated, omitted, or shown schematically, and the size of each component does not fully reflect its actual size.

First, an apparatus for constructing a 3D map for providing augmented reality based on pose information and depth information according to an embodiment of the present invention is described. FIG. 1 is a diagram illustrating the configuration of this apparatus. Referring to FIG. 1, the apparatus 10 (hereinafter abbreviated as the "augmented reality device") includes a camera unit 11, a communication unit 12, a sensor unit 13, an audio unit 14, an input unit 15, a display unit 16, a storage unit 17, and a control unit 18.

The camera unit 11 captures images. In particular, the camera unit 11 according to an embodiment of the present invention may be a stereo camera. To this end, the camera unit 11 may include two lenses and two image sensors. Each image sensor receives light reflected from a subject and converts it into an electrical signal, and may be implemented based on a CCD (Charge-Coupled Device), a CMOS (Complementary Metal-Oxide Semiconductor) sensor, or the like. The camera unit 11 may further include one or more analog-to-digital converters, and may convert the electrical signal output from the image sensor into a digital sequence and output it to the control unit 18.

The communication unit 12 communicates with other devices. The communication unit 12 may include an RF (radio frequency) transmitter (Tx) that up-converts and amplifies the frequency of a signal to be transmitted, and an RF receiver (Rx) that low-noise amplifies a received signal and down-converts its frequency. The communication unit 12 may also include a modem that modulates the signal to be transmitted and demodulates the received signal.

The sensor unit 13 measures inertia. The sensor unit 13 includes an inertial measurement unit (IMU), a Doppler velocity log (DVL), an attitude and heading reference system (AHRS), and the like. The sensor unit 13 measures inertial information of the augmented reality device 10, including the position and velocity of its rotation and movement, and provides the measured inertial information to the control unit 18.

The audio unit 14 includes a speaker (SPK) for outputting an audio signal and a microphone (MIKE) for receiving an audio signal. Under the control of the control unit 18, the audio unit 14 may output an audio signal through the speaker (SPK) or pass an audio signal input through the microphone (MIKE) to the control unit 18.

The input unit 15 receives the user's key operations for controlling the augmented reality device 10, generates an input signal, and passes it to the control unit 18. The input unit 15 may include various keys for controlling the augmented reality device 10. When the display unit 16 is a touch screen, the functions of these keys may be performed on the display unit 16, and when all functions can be performed with the touch screen alone, the input unit 15 may be omitted.

The display unit 16 visually presents the menus of the augmented reality device 10, input data, function setting information, and various other information to the user. The display unit 16 outputs screens of the augmented reality device 10 such as a boot screen, a standby screen, and a menu screen. In particular, the display unit 16 outputs the 3D map according to an embodiment of the present invention to the screen. The display unit 16 may be formed of a liquid crystal display (LCD), organic light emitting diodes (OLED), active matrix organic light emitting diodes (AMOLED), or the like. The display unit 16 may also be implemented as a touch screen, in which case it includes a touch sensor that detects the user's touch input. The touch sensor may be a touch-sensing sensor of the capacitive overlay, pressure, resistive overlay, or infrared beam type, or may be a pressure sensor. Beyond these, any type of sensor device capable of sensing contact or pressure from an object may be used as the touch sensor of the present invention. The touch sensor may detect the user's touch input, generate a detection signal including input coordinates indicating the touched position, and transmit it to the control unit 18. In particular, when the display unit 16 is a touch screen, some or all of the functions of the input unit 15 may be performed through the display unit 16.

The storage unit 17 stores the programs and data required for the operation of the augmented reality device 10. In particular, the storage unit 17 may store a camera matrix, a pose matrix, and the like. The data stored in the storage unit 17 may be deleted, changed, or added according to the operations of the user of the augmented reality device 10.

The control unit 18 controls the overall operation of the augmented reality device 10 and the signal flow between its internal blocks, and may perform a data processing function. The control unit 18 also controls the various functions of the augmented reality device 10. Examples of the control unit 18 include a CPU (central processing unit), a BP (baseband processor), an AP (application processor), a GPU (graphics processing unit), and a DSP (digital signal processor).

Next, the detailed configuration of the control unit 18 is described. FIG. 2 is a diagram illustrating the detailed configuration of the control unit according to an embodiment of the present invention. FIG. 3 is a diagram illustrating a method of deriving a pose matrix. FIG. 4 is a diagram illustrating a method of deriving a transformation matrix using a pose matrix and a camera matrix. FIG. 5 is a diagram illustrating a method of generating a simulated frame using the transformation matrix. FIG. 6 is a diagram illustrating a method of learning a correlation from the pixel differences between two frames.

Referring to FIG. 2, the control unit 18 includes a pose acquisition unit 110 (Device Pose Acquisition), a depth calculation unit 120 (Depth Map Acquisition), and a 3D map generation unit 130 (3D Map Reconstruction).

The pose acquisition unit 110 acquires the pose of the augmented reality device 10. The pose acquisition unit 110 extracts feature points representing the same object from each of a plurality of frames of an image captured through the camera unit 11 while the position moves, and calculates the change in coordinates of the extracted feature points to obtain pose information. For example, the first frame F1 and the second frame F2 of FIG. 3 represent images captured while the position moves. The feature point P of FIG. 3 has moved from position P(t-1) in the first frame F1 to position P(t) in the second frame F2. It follows that the augmented reality device 10 has moved in the opposite direction by the amount that the feature point P moved. In this way, the pose acquisition unit 110 calculates the change in the feature points to derive pose information (position and rotation information), and expresses the pose information as a matrix to derive the pose matrix.
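As an illustration of what "expressing the pose information as a matrix" can look like, the following minimal sketch packs a rotation and a translation into a 4x4 homogeneous matrix. The specification does not state the layout of the pose matrix; the `[R | t]` form, the single yaw angle, and the names `pose_matrix` and `apply_pose` are assumptions made only for this example.

```python
import math


def pose_matrix(yaw_rad, translation):
    """Build a 4x4 homogeneous pose matrix [R | t; 0 0 0 1].

    Assumption: the rotation is a single yaw angle about the y-axis.
    A real device pose would carry a full 3D rotation.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [
        [c, 0.0, s, tx],
        [0.0, 1.0, 0.0, ty],
        [-s, 0.0, c, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_pose(pm, point):
    """Transform a 3D point by the pose matrix (rotate, then translate)."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    return tuple(sum(pm[r][k] * vec[k] for k in range(4)) for r in range(3))


# Hypothetical pose: no rotation, 0.1 units of sideways movement.
pm = pose_matrix(0.0, (0.1, 0.0, 0.0))
moved = apply_pose(pm, (1.0, 2.0, 3.0))
```

Keeping rotation and translation in one matrix lets a single multiplication carry a point from one frame's coordinates to the other's, which is convenient when the matrix is later combined with the camera matrix.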

The depth calculation unit 120 acquires a depth map using a deep learning model (DLM). The DLM is generated as follows, using training data that includes a first training frame and a second training frame. First, as shown in FIG. 4, the depth calculation unit 120 derives a transformation matrix (TM: Transition Matrix) using a pose matrix (PM) and a camera matrix (CM). Here, the pose matrix PM is the matrix representation of the pose information derived by the pose acquisition unit 110 from the first training frame F1 and the second training frame F2. The camera matrix CM is known in advance as the intrinsic parameters of the camera unit 11. Next, as shown in FIG. 5, the depth calculation unit 120 transforms the first training frame F1 using the transformation matrix TM to generate a simulated second training frame F2' that simulates the second training frame F2. Then, as shown in FIG. 6, the depth calculation unit 120 trains a model prototype (deep learning) on the correlation between depth and the coordinate difference between pixels of the second training frame F2 and pixels of the simulated second training frame F2' that represent the same real-world object. This yields a DLM that derives a depth map according to the coordinate difference between pixels of the second training frame F2 and pixels of the simulated second training frame F2'.
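The specification does not give a formula for the transformation matrix TM. One common way to move a pixel from one view to another with a camera matrix and a pose is the pinhole reprojection sketched below. This is an assumption for illustration, not the method of FIG. 4 or FIG. 5: it needs a nominal depth for each pixel, and the names `warp_pixel`, `fx`, `fy`, `cx`, `cy`, and `depth` are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m[r][k] * v[k] for k in range(len(v))) for r in range(len(m))]


def warp_pixel(pixel, fx, fy, cx, cy, rotation, translation, depth):
    """Reproject a pixel of the first frame into the second frame.

    Assumptions: pinhole intrinsics (fx, fy, cx, cy); the pose maps
    first-camera coordinates to second-camera coordinates as
    X2 = R * X1 + t; `depth` is a nominal depth chosen by the caller.
    """
    u, v = pixel
    # Back-project the pixel to a 3D point in the first camera's frame.
    point = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Move the point into the second camera's frame.
    point = mat_vec(rotation, point)
    point = [point[i] + translation[i] for i in range(3)]
    # Project back to pixel coordinates with the same intrinsics.
    return (fx * point[0] / point[2] + cx, fy * point[1] / point[2] + cy)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
same = warp_pixel((100.0, 50.0), 500.0, 500.0, 320.0, 240.0,
                  identity, [0.0, 0.0, 0.0], 2.0)
shifted = warp_pixel((100.0, 50.0), 500.0, 500.0, 320.0, 240.0,
                     identity, [0.1, 0.0, 0.0], 2.0)
```

With an identity pose the pixel stays where it is; with a small sideways translation it shifts horizontally by roughly `fx * tx / depth` pixels. Where the nominal depth is wrong, the warped pixel lands away from its true position in F2, and that residual offset is the kind of coordinate difference the model is trained on.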

After the DLM has been generated as described above, when a first frame F1 and a second frame F2 of an image captured through the camera unit 11 while the position moves are input, the depth calculation unit 120 derives the transformation matrix TM using the pose matrix PM and the camera matrix CM, as shown in FIG. 4. Then, as shown in FIG. 5, the depth calculation unit 120 transforms the first frame F1 using the transformation matrix TM to generate a simulated second frame F2' that simulates the second frame F2. Next, the depth calculation unit 120 inputs the second frame F2 and the simulated second frame F2' to the DLM. The DLM then derives a depth map according to the coordinate difference between pixels of the second frame F2 and pixels of the simulated second frame F2'.
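For intuition about why a pixel coordinate difference carries depth information, the sketch below uses the classical parallax relation `Z = f * b / d` for a purely sideways camera movement. It is a closed-form stand-in, not the deep learning model of the specification, which learns the correlation from data. The names `depth_from_offset`, `focal_px`, and `baseline_m` are hypothetical.

```python
def depth_from_offset(offset_px, focal_px, baseline_m, eps=1e-6):
    """Depth from a horizontal pixel offset, assuming sideways-only motion.

    Assumption: the camera moved `baseline_m` sideways between frames and
    any rotation has already been compensated. A near-zero offset means
    the point is effectively at infinity.
    """
    if abs(offset_px) < eps:
        return float("inf")
    return focal_px * baseline_m / abs(offset_px)


def depth_map_from_offsets(offsets, focal_px, baseline_m):
    """Apply the relation to every entry of a 2D grid of pixel offsets."""
    return [
        [depth_from_offset(d, focal_px, baseline_m) for d in row]
        for row in offsets
    ]


# Hypothetical 2x2 grid of pixel offsets between F2 and F2'.
offsets = [[25.0, 50.0], [10.0, 0.0]]
depths = depth_map_from_offsets(offsets, focal_px=500.0, baseline_m=0.1)
```

Larger offsets map to nearer points and smaller offsets to farther ones. A learned model can capture the same inverse trend without requiring the motion to be purely sideways.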

The 3D map generation unit 130 generates a 3D map for the frames of the captured image using the pose information obtained by the pose acquisition unit 110 and the depth map derived by the depth calculation unit 120. That is, the position and rotation of the augmented reality device 10 between the first frame F1 and the second frame F2 are known from the pose information, and the depth between the first frame F1 and the second frame F2 is known from the depth map. The 3D map generation unit 130 can therefore use the position and rotation information together with the depth to convert the 2D coordinates of the pixels of the frame into 3D coordinates.
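The conversion from 2D pixel coordinates to 3D coordinates can be sketched with the standard pinhole back-projection. The specification does not spell out this computation, so the camera-to-world pose convention and the names `pixel_to_world`, `fx`, `fy`, `cx`, and `cy` are assumptions for illustration.

```python
def pixel_to_world(pixel, depth, fx, fy, cx, cy, rotation, translation):
    """Lift a pixel with known depth to a 3D point in map coordinates.

    Assumptions: pinhole intrinsics (fx, fy, cx, cy); `rotation` (3x3)
    and `translation` (3-vector) map camera coordinates to map
    coordinates as X_map = R * X_cam + t.
    """
    u, v = pixel
    # Back-project: scale the normalized viewing ray by the depth.
    cam = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Apply the camera pose to express the point in map coordinates.
    return tuple(
        sum(rotation[r][k] * cam[k] for k in range(3)) + translation[r]
        for r in range(3)
    )


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical values: the principal-point pixel seen at depth 2.0 by a
# camera located at x = 1.0 in the map.
point = pixel_to_world((320.0, 240.0), 2.0, 500.0, 500.0, 320.0, 240.0,
                       identity, [1.0, 0.0, 0.0])
```

Repeating this for every pixel of a frame, and for every frame with its own pose, accumulates the points into a single map coordinate system.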

Next, a method for constructing a 3D map for providing augmented reality based on pose information and depth information according to an embodiment of the present invention is described. FIG. 7 is a flowchart illustrating this method. In the embodiment of FIG. 7, it is assumed that, as described above, the model prototype has already been trained (deep learning) using training data including a first training frame and a second training frame, so that a DLM that derives a depth map according to the difference in pixel coordinates between two frames has been generated.

Referring to FIG. 7, in step S110 the pose acquisition unit 110 receives the first and second frames F1 and F2 of an image captured through the camera unit 11 while the position is moved. Then, in step S120, the pose acquisition unit 110 extracts a feature point P from each of the first and second frames F1 and F2 and calculates the change in the extracted feature point P to derive the pose information and a pose matrix. For example, the feature point P of FIG. 3 moves from the position P(t-1) in the first frame F1 to the position P(t) in the second frame F2.

Next, in step S130, as shown in FIG. 4, the depth calculation unit 120 derives a transformation matrix (TM) using the pose matrix (PM), which the pose acquisition unit 110 derived from the first frame F1 and the second frame F2, and the camera matrix (CM) known for the camera unit 11.
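The text does not state the formula that combines the pose matrix (PM) and the camera matrix (CM) into the transformation matrix (TM). As a minimal sketch of one common construction, and purely as an assumption, the Python below composes a rotation-compensating homography TM = K·R·K⁻¹, where K is a pinhole camera matrix with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and R is the rotation part of the pose. All function names are illustrative. This form ignores the translation part of the pose, so any pixel difference that remains after applying it depends on the translation and on scene depth.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def camera_matrix(fx, fy, cx, cy):
    """Pinhole camera matrix K (assumption: zero skew)."""
    return [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]


def camera_matrix_inverse(fx, fy, cx, cy):
    """Closed-form inverse of the zero-skew pinhole camera matrix."""
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]


def rotation_y(theta):
    """Rotation about the camera's vertical axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def transformation_matrix(fx, fy, cx, cy, rotation):
    """Assumed form TM = K * R * K^-1 (not specified by the source)."""
    k = camera_matrix(fx, fy, cx, cy)
    k_inv = camera_matrix_inverse(fx, fy, cx, cy)
    return matmul3(matmul3(k, rotation), k_inv)


def apply_homography(h, u, v):
    """Map pixel (u, v) through the 3x3 matrix h in homogeneous coordinates."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return x / w, y / w
```

With no rotation the matrix reduces to the identity, and a small rotation about the vertical axis shifts the principal point horizontally by `fx * tan(theta)`.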

Subsequently, in step S140, as shown in FIG. 5, the depth calculation unit 120 transforms the first frame F1 using the transformation matrix (TM) to generate a simulated second frame F2' that simulates the second frame F2.
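The warping step can be illustrated with a small sketch. It assumes, hypothetically, that the transformation matrix is a 3x3 matrix acting on homogeneous pixel coordinates and that a frame is a grayscale image stored as a list of rows; the source does not specify either. The function name `warp_frame` and the `fill` parameter are illustrative.

```python
def warp_frame(frame, tm, fill=0):
    """Forward-warp a grayscale frame (list of rows) with a 3x3 matrix.

    Each source pixel (u, v) is sent to the rounded position given by tm.
    Destination pixels that receive no source pixel keep the fill value.
    """
    height, width = len(frame), len(frame[0])
    warped = [[fill] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            x = tm[0][0] * u + tm[0][1] * v + tm[0][2]
            y = tm[1][0] * u + tm[1][1] * v + tm[1][2]
            w = tm[2][0] * u + tm[2][1] * v + tm[2][2]
            if w == 0:
                continue  # point maps to infinity; skip it
            u2, v2 = round(x / w), round(y / w)
            if 0 <= u2 < width and 0 <= v2 < height:
                warped[v2][u2] = frame[v][u]
    return warped
```

Forward warping is used here only for brevity. It can leave holes in the output, so an implementation would more likely use inverse mapping with interpolation.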

Next, in step S150, the depth calculation unit 120 inputs the second frame F2 and the simulated second frame F2' into the deep learning model (DLM). Then, in step S160, the deep learning model (DLM) derives a depth map according to the coordinate difference between the pixels of the second frame F2 and the pixels of the simulated second frame F2'.
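In the described method the mapping from coordinate difference to depth is learned by the DLM, and the source gives no closed-form relation. To illustrate why a pixel displacement carries depth information at all, the sketch below uses the classical triangulation relation Z = f·B/d, which holds for a camera that shifts purely sideways by a baseline B. It is a stand-in for intuition, not the trained model, and the names `pixel_shift`, `depth_from_disparity`, `focal_px`, and `baseline_m` are hypothetical.

```python
import math


def pixel_shift(p_real, p_simulated):
    """Euclidean distance in pixels between two (u, v) positions."""
    return math.hypot(p_real[0] - p_simulated[0], p_real[1] - p_simulated[1])


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Classical relation Z = f * B / d for a sideways camera shift.

    Assumption: this replaces the learned model for illustration only.
    A zero or negative disparity is treated as a point at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px
```

Under this relation, a larger coordinate difference corresponds to a nearer point, which is the kind of correlation the deep learning model is trained to capture.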

Next, in step S170, the 3D map generation unit 130 derives a 3D map for the frames of the captured image using the pose information obtained by the pose acquisition unit 110 and the depth map derived by the depth calculation unit 120. That is, the pose information gives the position and rotation of the augmented reality device 10 between the first frame F1 and the second frame F2, and the depth map gives the depth between the first frame F1 and the second frame F2. The 3D map generation unit 130 can therefore convert the 2D coordinates of the pixels of the corresponding frame into 3D coordinates using the position and rotation information and the depth.
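The conversion from 2D pixel coordinates to 3D coordinates can be sketched as a standard pinhole back-projection followed by a rigid transform. The sketch assumes a zero-skew pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and assumes the pose is given as a camera-to-world rotation and translation; the source does not fix these conventions, and the function names are illustrative.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to camera-frame 3D coordinates.

    Assumption: zero-skew pinhole model, depth measured along the optical axis.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def camera_to_world(point, rotation, translation):
    """Apply world = R * point + t (assumed camera-to-world pose)."""
    return tuple(
        sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
        for i in range(3)
    )
```

For example, a pixel at the principal point with depth 2 lies on the optical axis at (0, 0, 2) in the camera frame, and a pose with identity rotation and translation (1, 0, 0) places it at (1, 0, 2) in the world frame.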

Using the 3D map derived as described above, the present invention can register a virtual object with the image being captured. Because the 3D map of the present invention provides precise 3D coordinates, the virtual object can be registered precisely. Accordingly, augmented reality with a higher sense of realism can be provided.

Meanwhile, the method according to the embodiment of the present invention described above may be implemented in the form of a program readable by various computer means and recorded on a computer-readable recording medium. Here, the recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. For example, the recording medium includes magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. Such hardware devices may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

Although the present invention has been described above using several preferred embodiments, these embodiments are illustrative and not restrictive. Those of ordinary skill in the art to which the present invention pertains will understand that various changes and modifications can be made in accordance with the doctrine of equivalents without departing from the spirit of the present invention and the scope of rights set forth in the appended claims.

10: augmented reality device 11: camera unit
12: communication unit 13: sensor unit
14: audio unit 15: input unit
16: display unit 17: storage unit
18: control unit 110: pose acquisition unit
120: depth calculation unit 130: 3D map generation unit

Claims

1. An apparatus for constructing a three-dimensional map for providing augmented reality, the apparatus comprising:
a pose acquisition unit for acquiring pose information from two frames when the two frames, including a first frame and a second frame of an image captured while the position is moved, are input;
a depth calculation unit for deriving a depth map from the two frames using a deep learning model; and
a three-dimensional map generation unit for generating a three-dimensional map based on the pose information and the depth map.

2. The apparatus of claim 1,
wherein the pose acquisition unit extracts, from each of the two frames, feature points representing the same object, and sequentially derives the pose information and a pose matrix based on the coordinate change of the extracted feature points.

3. The apparatus of claim 2,
wherein the depth calculation unit
derives a transformation matrix using a known camera matrix and the pose matrix,
generates, from the first frame, a simulated second frame that simulates the second frame using the transformation matrix, and
derives, through the deep learning model, the depth map according to the coordinate difference between pixels of the second frame and pixels of the simulated second frame.

4. The apparatus of claim 1,
wherein the depth calculation unit
derives a transformation matrix using a known camera matrix and a pose matrix derived from a first training frame and a second training frame,
generates, from the first training frame, a simulated second training frame that simulates the second training frame using the transformation matrix, and
generates the deep learning model, which derives a depth map according to the coordinate difference between pixels of the second training frame and pixels of the simulated second training frame, by training a model prototype on the correlation between depth and the coordinate difference between a pixel of the second training frame and a pixel of the simulated second training frame that represent the same object in the real world.

5. A method for constructing a three-dimensional map for providing augmented reality, the method comprising:
acquiring, by a pose acquisition unit, pose information from two frames when the two frames, including a first frame and a second frame of an image captured while the position is moved, are input;
deriving, by a depth calculation unit, a depth map from the two frames using a deep learning model; and
generating, by a three-dimensional map generation unit, a three-dimensional map based on the pose information and the depth map.

6. The method of claim 5,
wherein acquiring the pose information from the two frames comprises:
extracting, by the pose acquisition unit, feature points representing the same object from each of the two frames; and
sequentially deriving, by the pose acquisition unit, the pose information and a pose matrix based on the coordinate change of the extracted feature points.

7. The method of claim 6,
wherein deriving the depth map from the two frames comprises:
deriving, by the depth calculation unit, a transformation matrix using a known camera matrix and the pose matrix;
generating, by the depth calculation unit, a simulated second frame that simulates the second frame from the first frame of the two frames using the transformation matrix; and
deriving, by the depth calculation unit, through the deep learning model, the depth map according to the coordinate difference between pixels of the second frame and pixels of the simulated second frame.

8. The method of claim 1,
further comprising, before acquiring the pose information from the two frames:
deriving, by the depth calculation unit, a transformation matrix using a known camera matrix and a pose matrix derived from a first training frame and a second training frame;
generating, by the depth calculation unit, a simulated second training frame that simulates the second training frame from the first training frame using the transformation matrix; and
generating, by the depth calculation unit, the deep learning model, which derives a depth map according to the coordinate difference between pixels of the second training frame and pixels of the simulated second training frame, by training a model prototype on the correlation between depth and the coordinate difference between a pixel of the second training frame and a pixel of the simulated second training frame that represent the same object in the real world.