KR20230161309A

KR20230161309A - An augmented reality device for obtaining depth information and a method for operating the same

Info

Publication number: KR20230161309A
Application number: KR1020220101583A
Authority: KR
Inventors: 신승학; 정재윤; 김송현; 김효각; 원승재; 지서원
Original assignee: 삼성전자주식회사
Priority date: 2022-05-18
Filing date: 2022-08-12
Publication date: 2023-11-27

Abstract

Provided are an augmented reality device that corrects depth values based on the direction of gravity measured by an IMU sensor, in order to obtain a highly accurate depth map without an additional hardware module, and a method of operating the same. An augmented reality device according to an embodiment of the present disclosure may obtain a depth map from an image acquired using a camera, obtain a normal vector of at least one pixel from the depth map, correct the direction of the normal vector of the at least one pixel based on the direction of gravity measured by an inertial measurement unit (IMU) sensor, and correct the depth values of the depth map based on the corrected direction of the normal vector.

Description

Augmented reality device for acquiring depth information and method of operating the same {AN AUGMENTED REALITY DEVICE FOR OBTAINING DEPTH INFORMATION AND A METHOD FOR OPERATING THE SAME}

The present disclosure relates to an augmented reality device that acquires depth information and a method of operating the same. Specifically, the present disclosure relates to an augmented reality device that corrects the depth value of each pixel of a depth map based on the direction of gravity, and a method of operating the same.

Augmented reality is a technology that overlays virtual images on the physical environment or on real-world objects and displays them together. Augmented reality devices that use this technology (for example, smart glasses) are useful in everyday life for tasks such as information retrieval, route guidance, and photography. In particular, smart glasses are also worn as fashion items and are mainly used for outdoor activities.

Recently, to give users a sense of immersion, devices including a depth sensor have come into wide use. A depth sensor acquires depth information representing the spatial extent of objects in real, three-dimensional space. Conventional techniques for acquiring depth information with a depth sensor include, for example, structured light, stereo vision, and time of flight (ToF). The structured light, stereo vision, and ToF methods are camera-based depth estimation methods, and the accuracy of the depth value decreases as the distance from the camera increases. The structured light and ToF methods, whose depth accuracy over distance is relatively high, additionally require hardware modules such as an illuminator, and these modules consume power and add cost.

In addition, most augmented reality applications executed by an augmented reality device require depth information at all times, which increases power consumption. Because an augmented reality device is a portable device with a small form factor, heat generation and power consumption strongly affect how long the device can be used.

To solve the technical problem described above, the present disclosure provides an augmented reality device that corrects depth values based on the direction of gravity. An augmented reality device according to an embodiment of the present disclosure may include a camera, an inertial measurement unit (IMU) sensor, at least one processor, and a memory. By executing at least one instruction stored in the memory, the at least one processor may obtain a depth map from an image acquired using the camera. The at least one processor may obtain a normal vector of at least one pixel included in the depth map. The at least one processor may correct the direction of the normal vector of the at least one pixel based on the direction of gravity measured by the IMU sensor 120. The at least one processor may correct the depth value of the at least one pixel based on the corrected direction of the normal vector.

To solve the technical problem described above, the present disclosure provides a method by which an augmented reality device corrects depth values. In an embodiment of the present disclosure, the method may include obtaining a depth map from an image acquired using a camera. The method may include obtaining a normal vector of at least one pixel included in the depth map. The method may include correcting the direction of the normal vector of the at least one pixel based on the direction of gravity measured by an IMU sensor. The method may include correcting the depth value of the at least one pixel based on the corrected direction of the normal vector.

To solve the technical problem described above, the present disclosure provides a computer program product including a computer-readable storage medium. The storage medium may store instructions related to an operation of obtaining a depth map from an image acquired using a camera. The storage medium may store instructions related to an operation of obtaining a normal vector of at least one pixel included in the depth map. The storage medium may store instructions related to an operation of correcting the direction of the normal vector of the at least one pixel based on the direction of gravity measured by an IMU sensor. The storage medium may store instructions related to an operation of correcting the depth value of the at least one pixel of the depth map based on the corrected direction of the normal vector.

The present disclosure may be readily understood from the following detailed description taken together with the accompanying drawings, in which reference numerals denote structural elements.
FIG. 1A is a conceptual diagram illustrating how the augmented reality device of the present disclosure corrects a depth value.
FIG. 1B is a conceptual diagram illustrating how the augmented reality device of the present disclosure corrects the depth value according to the direction of gravity measured by the IMU sensor.
FIG. 2 is a block diagram showing the components of an augmented reality device according to an embodiment of the present disclosure.
FIG. 3 is a flowchart showing a method of operating an augmented reality device according to an embodiment of the present disclosure.
FIG. 4 is a flowchart illustrating a method for an augmented reality device to obtain a normal vector for each pixel according to an embodiment of the present disclosure.
FIG. 5 is a diagram illustrating an operation of an augmented reality device acquiring a normal vector for each pixel from a depth map according to an embodiment of the present disclosure.
FIG. 6A is a diagram illustrating an operation of an augmented reality device correcting the direction of a normal vector according to the direction of gravity according to an embodiment of the present disclosure.
FIG. 6B is a diagram illustrating an operation of an augmented reality device correcting the direction of a normal vector in a direction perpendicular to the direction of gravity according to an embodiment of the present disclosure.
FIG. 7 is a diagram illustrating a training operation in which an augmented reality device acquires a depth map using an artificial intelligence model according to an embodiment of the present disclosure.
FIG. 8 is a flowchart illustrating a method by which an augmented reality device corrects the depth value for each pixel of a depth map according to an embodiment of the present disclosure.
FIG. 9 is a flowchart illustrating a method for an augmented reality device to calculate a loss value according to an embodiment of the present disclosure.
FIG. 10 is a diagram illustrating an operation of an augmented reality device calculating a loss value according to an embodiment of the present disclosure.
FIG. 11 is a diagram illustrating an operation of an augmented reality device acquiring a depth map using an artificial intelligence model according to an embodiment of the present disclosure.
FIG. 12 is a flowchart illustrating a method by which an augmented reality device corrects the depth value for each pixel of a depth map according to an embodiment of the present disclosure.
FIG. 13 is a diagram illustrating an operation in which an augmented reality device corrects the depth value for each pixel of a depth map through a plane sweep method according to an embodiment of the present disclosure.
FIG. 14 is a flowchart illustrating a method by which an augmented reality device corrects the depth value for each pixel of a depth map according to an embodiment of the present disclosure.
FIG. 15 is a diagram illustrating an operation of an augmented reality device correcting the depth value for each pixel of a depth map obtained through a time-of-flight (ToF) method according to an embodiment of the present disclosure.
FIG. 16 is a diagram illustrating a spatial model restored with a depth map obtained in a general manner and a spatial model restored with a depth map acquired by an augmented reality device according to an embodiment of the present disclosure.

The terms used in the embodiments of this specification are, as far as possible, general terms currently in wide use, selected in consideration of the functions of the present disclosure. They may nevertheless vary depending on the intention of a person skilled in the art, legal precedent, the emergence of new technology, and the like. In certain cases, a term has been arbitrarily selected by the applicant, and in that case its meaning is described in detail in the description of the relevant embodiment. Therefore, the terms used in this specification should be defined based on their meaning and the overall content of the present disclosure, not simply on their names.

Singular expressions may include plural expressions unless the context clearly indicates otherwise. Terms used herein, including technical and scientific terms, may have the same meaning as commonly understood by a person of ordinary skill in the technical field described in this specification.

Throughout the present disclosure, when a part is said to "include" an element, this means that it may further include other elements rather than excluding them, unless specifically stated to the contrary. In addition, terms such as "...unit" and "...module" used in this specification refer to a unit that processes at least one function or operation, and such a unit may be implemented in hardware, in software, or in a combination of hardware and software.

The expression "configured to (or set to)" used in the present disclosure may be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of," depending on the situation. The term "configured to (or set to)" does not necessarily mean only "specifically designed to" in hardware. Instead, in some situations, the expression "a system configured to" may mean that the system is "capable of" operating together with other devices or components. For example, the phrase "a processor configured (or set) to perform A, B, and C" may refer to a dedicated processor (e.g., an embedded processor) for performing the operations, or to a generic-purpose processor (e.g., a CPU or an application processor) capable of performing the operations by executing one or more software programs stored in memory.

In addition, in the present disclosure, when an element is referred to as being "connected" or "coupled" to another element, the element may be directly connected or coupled to the other element, but it should be understood that, unless specifically stated to the contrary, the element may also be connected or coupled through yet another element in between.

Below, embodiments of the present disclosure are described in detail with reference to the attached drawings so that a person of ordinary skill in the technical field to which the present disclosure belongs can easily practice them. However, the present disclosure may be implemented in many different forms and is not limited to the embodiments described herein.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the drawings.

FIG. 1A is a conceptual diagram illustrating a method by which an augmented reality device corrects a depth value according to an embodiment of the present disclosure.

Referring to FIG. 1A, the augmented reality device may estimate the depth value 20 of the actual floor surface 10 from an image acquired using the camera 110. Depending on the positional relationship between the actual floor surface 10 and the camera 110, such as their physical distance or height difference, the estimated depth value 20 may differ from the depth value of the actual floor surface 10. For example, when the camera 110 includes a left-eye camera and a right-eye camera, and the augmented reality device estimates the depth value 20 by a stereo vision method using a left-eye image acquired from the left-eye camera and a right-eye image acquired from the right-eye camera, the error in the disparity, that is, the spacing between corresponding points in the left-eye image and the right-eye image, may grow as the distance from the camera 110 increases. Likewise, when the augmented reality device estimates the depth value 20 by a time-of-flight (ToF) method, the accuracy of the estimated depth value 20 decreases as the distance from the camera 110 increases.
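The distance dependence described above can be illustrated with the standard pinhole stereo relation Z = f·B/d, which the source does not state explicitly. The following is a minimal sketch under that assumption; the function names and the numeric values (focal length, baseline, disparity error) are hypothetical and chosen only for illustration.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair (pinhole model assumed)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_err_px):
    """First-order depth error caused by a disparity error.

    Differentiating Z = f * B / d gives |dZ| = Z**2 / (f * B) * |dd|,
    so the error grows with the square of the distance.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


if __name__ == "__main__":
    f_px, base_m = 500.0, 0.06  # hypothetical camera parameters
    for z in (1.0, 2.0, 4.0):
        err = depth_error(f_px, base_m, z, 0.5)
        print(f"depth {z:.1f} m -> approx. error {err:.3f} m")
```

Under this model, the same half-pixel disparity error produces a depth error sixteen times larger at 4 m than at 1 m, which is consistent with the far region of the estimated floor drifting away from the actual floor surface 10.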

In the embodiment shown in FIG. 1A, the estimated depth value 20 is similar to the depth value of the actual floor surface 10 in the region close to the camera 110. As the distance from the camera 110 increases, however, it may be obtained as an upward-sloping depth value that departs from the depth value of the actual floor surface 10.

The augmented reality device may obtain information about the direction of gravity (G) based on values measured by the inertial measurement unit (IMU) sensor 120. In an embodiment of the present disclosure, the IMU sensor 120 includes a gyroscope, and the augmented reality device may obtain the gravity direction (G) information using the gyroscope included in the IMU sensor 120. The augmented reality device may obtain the corrected depth value 30 by correcting the estimated depth value 20 based on the direction of gravity (G). In the embodiment shown in FIG. 1A, the corrected depth value 30 may be flat like the actual floor surface 10 and may equal the depth value of the actual floor surface 10. A specific method by which the augmented reality device corrects the depth value based on the gravity direction (G) information obtained through the IMU sensor 120 is described with reference to FIG. 1B.

FIG. 1B is a conceptual diagram illustrating a method by which the augmented reality device of the present disclosure corrects the depth value according to the direction of gravity (G) measured by the IMU sensor.

Referring to FIG. 1B, the augmented reality device obtains the depth values of pixels (operation ①). Based on an image acquired using the camera 110, the augmented reality device may obtain a depth map including a plurality of pixels (p₁ to p_n), each having a depth value. In an embodiment of the present disclosure, the camera 110 includes a left-eye camera and a right-eye camera, and the augmented reality device may obtain the depth map by a stereo vision method using a left-eye image acquired from the left-eye camera and a right-eye image acquired from the right-eye camera. In an embodiment of the present disclosure, the camera 110 is configured as a time-of-flight (ToF) camera, and the augmented reality device may obtain the depth map using the ToF camera. However, the present disclosure is not limited thereto, and in another embodiment the augmented reality device may obtain the depth map by a structured light method.

Depending on the distance from the camera 110, the depth values of the pixels included in the depth map may differ from those of the actual floor surface 10. In the embodiment shown in FIG. 1B, among the plurality of pixels (p₁ to p_n), the depth values of the first pixel (p₁) and the second pixel (p₂), which are located in the region relatively close to the camera 110 (near), are the same as or similar to the depth value of the actual floor surface 10, whereas the depth values of the pixels (p_n-1, p_n) located in the region far from the camera 110 (far) may differ from the depth value of the actual floor surface 10.

The augmented reality device obtains the normal vectors (N₁ to N_n) of the pixels (operation ②). In an embodiment of the present disclosure, the augmented reality device may convert the plurality of pixels (p₁ to p_n) included in the depth map into three-dimensional coordinate values based on the direction vector and the depth value of each pixel. The augmented reality device may obtain the normal vector (N₁ to N_n) of each of the plurality of pixels (p₁ to p_n) by computing the cross product of the three-dimensional coordinate values of the adjacent pixels located above, below, to the left of, and to the right of that pixel.
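Operation ② can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, treats the stored depth as the z coordinate along the optical axis, and forms the normal from the left-to-right and up-to-down difference vectors of the four neighbors. All function names are hypothetical.

```python
import math


def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with depth Z -> 3D point in the camera frame (pinhole model)."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if length == 0.0:
        raise ValueError("zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)


def pixel_normal(depth_map, u, v, fx, fy, cx, cy):
    """Unit normal at an interior pixel (u, v); depth_map is indexed [row][column]."""
    left = backproject(u - 1, v, depth_map[v][u - 1], fx, fy, cx, cy)
    right = backproject(u + 1, v, depth_map[v][u + 1], fx, fy, cx, cy)
    up = backproject(u, v - 1, depth_map[v - 1][u], fx, fy, cx, cy)
    down = backproject(u, v + 1, depth_map[v + 1][u], fx, fy, cx, cy)
    # Cross product of the horizontal and vertical tangent vectors.
    return normalize(cross(sub(right, left), sub(down, up)))
```

With this ordering of the cross product, a surface facing the camera head-on yields a normal along +z, that is, pointing away from the camera. Whether normals should point toward or away from the camera is a convention the source does not specify.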

The augmented reality device corrects the normal vectors (N₁ to N_n) based on the direction of gravity (operation ③). The augmented reality device includes the IMU sensor 120 (see FIG. 1A) and may obtain information about the direction of gravity (G) based on values measured using the gyroscope of the IMU sensor 120. The augmented reality device may correct the direction of each pixel's normal vector (N₁ to N_n) to the direction of gravity (G) or to a direction perpendicular to the direction of gravity (G). As a result of the correction, the augmented reality device may obtain the corrected normal vectors (N₁' to N_n').
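One way to picture operation ③ is the sketch below. The source states only that the normal is corrected to the gravity direction or to a direction perpendicular to it; the angular threshold `snap_deg`, the decision rule, and the function names are assumptions introduced here for illustration.

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)


def correct_normal(normal, gravity, snap_deg=20.0):
    """Snap a normal to the gravity axis or to the plane perpendicular to it.

    snap_deg is a hypothetical tolerance (must be below 45 degrees);
    normals outside both tolerance bands are returned unchanged.
    """
    n = normalize(normal)
    g = normalize(gravity)
    c = dot(n, g)  # cosine of the angle between the normal and gravity
    if abs(c) >= math.cos(math.radians(snap_deg)):
        # Nearly parallel to gravity (floor- or ceiling-like): align with +g or -g.
        sign = 1.0 if c > 0 else -1.0
        return (sign * g[0], sign * g[1], sign * g[2])
    if abs(c) <= math.sin(math.radians(snap_deg)):
        # Nearly perpendicular to gravity (wall-like): remove the component along g.
        return normalize((n[0] - c * g[0], n[1] - c * g[1], n[2] - c * g[2]))
    return n
```

For example, with gravity along the camera's +y axis, a slightly tilted floor normal such as (0.1, -0.99, 0.05) is snapped to (0, -1, 0), while a slightly tilted wall normal such as (0.1, 0.1, 0.99) is projected so that it has no component along gravity.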

The augmented reality device corrects the depth value of each pixel based on the corrected normal vectors (N₁' to N_n') (operation ④). The augmented reality device may define a plane for each pixel according to the direction of the corrected normal vector (N₁' to N_n') and correct the depth value of the pixel based on that plane. In an embodiment of the present disclosure, the augmented reality device may obtain the depth map using an artificial intelligence (AI) model, calculate a loss value based on the depth values of a pixel and its adjacent pixels on the plane defined by the corrected normal vector (N₁' to N_n'), and correct the depth value of each pixel by performing training in which the calculated loss value is applied to the AI model. In an embodiment of the present disclosure, the augmented reality device may correct the depth value of each pixel by performing a plane sweep along the per-pixel plane defined based on the direction of the corrected normal vector (N₁' to N_n'). In an embodiment of the present disclosure, the augmented reality device may identify a planar region in the plane defined based on the corrected normal vector (N₁' to N_n') by using the color information of an RGB image, and correct the depth value of each pixel based on the depth values of adjacent pixels within the identified planar region.
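The idea of correcting a depth value from a plane defined by the corrected normal can be sketched geometrically. This is a simplified illustration and not any of the three embodiments above: it assumes a pinhole camera with hypothetical intrinsics, assumes the given neighbors lie on the same plane as the pixel, estimates the plane offset from those neighbors, and re-intersects the pixel's viewing ray with the plane. All names are hypothetical.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def ray(u, v, fx, fy, cx, cy):
    """Viewing ray of pixel (u, v), scaled so that its z component is 1."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def corrected_depth(pixel, neighbors, normal, fx, fy, cx, cy):
    """Depth of `pixel` on the plane normal . P = d.

    pixel:     (u, v) of the pixel to correct
    neighbors: list of (u, v, depth) assumed to lie on the same plane
    normal:    corrected unit normal of that plane
    """
    if not neighbors:
        raise ValueError("at least one neighbor is required")
    offsets = []
    for nu, nv, nz in neighbors:
        r = ray(nu, nv, fx, fy, cx, cy)
        point = (r[0] * nz, r[1] * nz, nz)
        offsets.append(dot(normal, point))
    d = sum(offsets) / len(offsets)  # averaged plane offset
    denom = dot(normal, ray(pixel[0], pixel[1], fx, fy, cx, cy))
    if abs(denom) < 1e-9:
        raise ValueError("viewing ray is parallel to the plane")
    return d / denom
```

As a usage example, with gravity along +y, a floor 1.5 m below the camera, `fx = fy = 500`, and `cx = cy = 0`, two reliable near pixels at rows 300 and 250 (depths 2.5 m and 3.0 m) place a far pixel at row 100 at a depth of 7.5 m, regardless of the noisy value originally estimated for it.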

The augmented reality device may obtain a plurality of pixels (p₁' to p_n') having corrected depth values. Each of the plurality of pixels (p₁' to p_n') may have a depth value that is the same as or similar to the depth of the actual floor surface 10.

Image-based methods of obtaining depth values from an image acquired through the camera 110, such as the structured light method, the stereo vision method, or the time-of-flight (ToF) method, have the problem that the accuracy of the depth value decreases as the distance from the camera 110 increases. The structured light and ToF methods, whose depth accuracy over distance is relatively high, additionally require hardware modules such as an illuminator, and these modules may consume power and add cost. In addition, because most augmented reality applications require depth information at all times, the power consumption of the augmented reality device increases. Because an augmented reality device is a portable device with a small form factor, heat generation and power consumption can strongly affect how long the device can be used.

An object of the present disclosure is to provide an augmented reality device, and a method of operating the same, that use the gravity direction (G) information obtained through the IMU sensor 120 (see FIG. 1A) to obtain highly accurate depth values without the unnecessary power consumption of an additional hardware module (for example, an illuminator module).

In the embodiment shown in FIGS. 1A and 1B, the augmented reality device corrects the direction of each pixel's normal vector (N₁ to N_n) based on the gravity direction (G) information obtained using the IMU sensor 120, and corrects the depth value of each pixel based on the corrected normal vectors (N₁' to N_n'). It may thereby obtain a highly accurate depth map and reduce power consumption. In addition, because an augmented reality device generally includes the IMU sensor 120 as an essential component, the augmented reality device according to an embodiment of the present disclosure can achieve low power consumption while maintaining a small form factor, which provides the technical effect of improving portability and extending device use time.

FIG. 2 is a block diagram showing components of an augmented reality device 100 according to an embodiment of the present disclosure.

The augmented reality device 100 may be augmented reality glasses, in the form of eyeglasses, worn on the user's face. By executing an application, the augmented reality device 100 can provide virtual image content displayed on a waveguide in addition to the real objects within the field of view (FOV). For example, the augmented reality device 100 may execute a movie application, a music application, a photo application, a gallery application, a web browser application, an e-book reader application, a game application, an augmented reality application, an SNS application, a messenger application, an object recognition application, or the like, and provide the user with the virtual image content displayed by each application.

However, the present disclosure is not limited thereto, and the augmented reality device 100 may also be implemented as a head mounted display (HMD) apparatus or an augmented reality helmet worn on the user's head. Nor is the augmented reality device 100 limited to these examples. In one embodiment of the present disclosure, the augmented reality device 100 may be implemented as any of various devices such as a mobile device, a smartphone, a laptop computer, a tablet PC, an e-book reader, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation device, an MP3 player, a camcorder, an Internet Protocol television (IPTV), a digital television (DTV), or a wearable device.

Referring to FIG. 2, the augmented reality device 100 may include a camera 110, an IMU sensor 120, a processor 130, and a memory 140. The camera 110, the IMU sensor 120, the processor 130, and the memory 140 may be electrically and/or physically connected to one another. FIG. 2 shows only the components essential for explaining the operation of the augmented reality device 100, and the components of the augmented reality device 100 are not limited to those shown in FIG. 2. In one embodiment of the present disclosure, the augmented reality device 100 may further include an eye tracking sensor, a display engine, or the like.

The camera 110 is configured to acquire an image of an object by photographing the object in real space. In one embodiment of the present disclosure, the camera 110 may include a lens module, an image sensor, and an image processing module. The camera 110 may acquire a still image or a video captured by the image sensor (for example, a CMOS or CCD sensor). The image processing module may process the still image or video acquired through the image sensor, extract the necessary information, and transmit the extracted information to the processor 130.

In one embodiment of the present disclosure, the camera 110 may be a stereo camera that includes a left-eye camera and a right-eye camera and uses the two cameras to acquire a three-dimensional stereoscopic image of an object. However, the present disclosure is not limited thereto, and the camera 110 may include a single camera or three or more cameras.

In one embodiment of the present disclosure, the camera 110 may be a ToF camera that emits light toward an object, detects the light reflected from the object, and obtains the depth value of the object based on the time of flight, which is the time difference between the time at which the light was emitted and the time at which the reflected light was detected. When implemented as a ToF camera, the camera 110 may acquire an RGB image (1500, see FIG. 15) and a depth map image (1510, see FIG. 15) of the object together.

The IMU (Inertial Measurement Unit) sensor 120 is a sensor configured to measure the movement speed, direction, angle, and gravitational acceleration of the augmented reality device 100 through a combination of an accelerometer, a gyroscope, and a magnetometer. In one embodiment of the present disclosure, the IMU sensor 120 may include a three-axis accelerometer that measures acceleration in the longitudinal (travel), lateral, and height directions, and a three-axis gyroscope that measures roll, pitch, and yaw angular velocities. In one embodiment of the present disclosure, the IMU sensor 120 may measure angular velocity using the gyro sensor and detect the direction of gravity based on the measured angular velocity. The IMU sensor 120 may provide information about the direction of gravity to the processor 130.

The processor 130 may execute one or more instructions of a program stored in the memory 140. The processor 130 may be composed of hardware components that perform arithmetic, logic, and input/output operations and signal processing. The processor 130 may be composed of, for example, at least one of a central processing unit (CPU), a microprocessor, a graphics processing unit (GPU), application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), and field programmable gate arrays (FPGAs), but is not limited thereto.

In one embodiment of the present disclosure, the processor 130 may include an AI processor that performs artificial intelligence (AI) learning. The AI processor may be manufactured as a dedicated hardware chip for AI, or may be manufactured as part of an existing general-purpose processor (for example, a CPU or an application processor) or a dedicated graphics processor (for example, a GPU) and mounted in the augmented reality device 100.

Although the processor 130 is shown as a single element in FIG. 2, the present disclosure is not limited thereto. In one embodiment of the present disclosure, the processor 130 may be composed of one or more elements.

The memory 140 may be composed of at least one type of storage medium among, for example, a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), and an optical disk.

The memory 140 may store instructions related to the functions or operations by which the augmented reality device 100 acquires depth value information of an object. In one embodiment of the present disclosure, the memory 140 may store at least one of instructions, an algorithm, a data structure, program code, and an application program readable by the processor 130. The instructions, algorithms, data structures, and program code stored in the memory 140 may be implemented in a programming or scripting language such as C, C++, Java, or assembler.

The memory 140 may store instructions, algorithms, data structures, or program code for the depth map acquisition module 142, the normal vector correction module 144, and the depth value correction module 146. A 'module' included in the memory 140 means a unit that processes a function or operation performed by the processor 130, and may be implemented as software such as instructions, an algorithm, a data structure, or program code.

In the following embodiments, the operations of the processor 130 may be implemented by executing instructions or program code stored in the memory 140.

The depth map acquisition module 142 is composed of instructions or program code related to the function and/or operation of acquiring a depth map of an object based on an image acquired using the camera 110. In one embodiment of the present disclosure, the depth map acquisition module 142 may be configured to acquire the depth map using an artificial intelligence model. The processor 130 may acquire the depth map by executing the instructions or program code of the depth map acquisition module 142. In one embodiment of the present disclosure, the camera 110 may be a stereo camera including a left-eye camera and a right-eye camera. In this case, the processor 130 may input the left-eye image acquired using the left-eye camera and the right-eye image acquired using the right-eye camera into the artificial intelligence model, and acquire the depth map by using the model to calculate the disparity according to the similarity of pixel luminance values between the left-eye image and the right-eye image. The artificial intelligence model may be implemented as a deep neural network model. The deep neural network model may be a known artificial intelligence model such as a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), or a deep Q-network. The deep neural network model used to acquire the depth map may be, for example, DispNet, but is not limited thereto.

In one embodiment of the present disclosure, the processor 130 may acquire the depth map by calculating the disparity between the left-eye image and the right-eye image using a plane sweep method.
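The step from disparity to depth can be illustrated with the standard relation for a rectified stereo pair, Z = f·B/d. The following is a minimal sketch under that assumption; the function names and the parameters `focal_px` and `baseline_m` are hypothetical and do not appear in the disclosure.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert one disparity (pixels) into a depth (metres).

    Assumption (not from the disclosure): a rectified pinhole stereo pair
    with focal length `focal_px` and horizontal baseline `baseline_m`.
    """
    if disparity_px <= 0:
        return None  # no valid match, so the depth is undefined
    return focal_px * baseline_m / disparity_px


def disparity_map_to_depth_map(disparity_map, focal_px, baseline_m):
    """Apply the conversion to every pixel of a row-major disparity map."""
    return [[disparity_to_depth(d, focal_px, baseline_m) for d in row]
            for row in disparity_map]
```

Because depth is inversely proportional to disparity, a fixed disparity error causes a larger depth error for distant points, which is consistent with the accuracy problem described for image-based methods.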

In one embodiment of the present disclosure, the camera 110 is configured as a ToF camera, and the processor 130 may calculate the time of flight of the light, which is the time difference between the time at which the reflected light is detected by the ToF camera and the time at which the clock signal is input, and may calculate the distance between the position where the reflected light was detected and the augmented reality device 100, that is, the depth value, through an operation on the time of flight and the speed of light. The processor 130 may obtain information about the position at which the light was emitted and the reflected light was detected, and acquire the depth map by mapping the calculated depth value to the obtained position information.
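The time-of-flight calculation can be sketched as follows. The disclosure only states that the depth value results from an operation on the time of flight and the speed of light; halving the round-trip distance is the usual ToF convention and is an assumption here, as are the function names and the `(row, col, clock_time_s, detect_time_s)` sample layout.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_depth_m(clock_time_s, detect_time_s):
    """Depth from the delay between the clock signal and the detected reflection.

    Assumption: the light travels to the object and back, so the one-way
    distance is half of (speed of light x time of flight).
    """
    time_of_flight_s = detect_time_s - clock_time_s
    if time_of_flight_s < 0:
        raise ValueError("detection cannot precede the clock signal")
    return SPEED_OF_LIGHT_M_PER_S * time_of_flight_s / 2.0


def build_depth_map(samples, rows, cols):
    """Map each computed depth to the pixel position where light was detected.

    `samples` is a hypothetical iterable of
    (row, col, clock_time_s, detect_time_s) tuples; pixels without a sample
    stay None.
    """
    depth_map = [[None] * cols for _ in range(rows)]
    for row, col, clock_time_s, detect_time_s in samples:
        depth_map[row][col] = tof_depth_m(clock_time_s, detect_time_s)
    return depth_map
```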

The normal vector correction module 144 is composed of instructions or program code related to the function and/or operation of acquiring a normal vector for each of the plurality of pixels included in the depth map and correcting the normal vector according to the direction of gravity obtained by the IMU sensor 120. The processor 130 may acquire the normal vector of at least one pixel and correct the direction of the acquired normal vector by executing the instructions or program code of the normal vector correction module 144. In one embodiment of the present disclosure, the processor 130 may convert each of the plurality of pixels into three-dimensional coordinate values based on the direction vectors and the depth values of the pixels included in the depth map, and may obtain a per-pixel normal vector by calculating, for each converted pixel, the cross product of the three-dimensional coordinate values of the pixels adjacent to it in the up, down, left, and right directions.
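A minimal sketch of this normal estimation is shown below. It assumes a pinhole camera whose per-pixel direction vector is ((col - cx)/fx, (row - cy)/fy, 1) and takes the cross product of the left-to-right and up-to-down difference vectors; the intrinsics `fx`, `fy`, `cx`, `cy`, the use of difference vectors, and the choice to orient normals toward the camera are illustrative assumptions rather than details given in the disclosure.

```python
import math


def backproject(col, row, depth, fx, fy, cx, cy):
    """Scale the pixel's direction vector by its depth to get a 3D point."""
    direction = ((col - cx) / fx, (row - cy) / fy, 1.0)
    return tuple(depth * d for d in direction)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pixel_normal(depth_map, row, col, fx, fy, cx, cy):
    """Unit normal at an interior pixel from its four adjacent pixels."""
    def point(r, c):
        return backproject(c, r, depth_map[r][c], fx, fy, cx, cy)

    left, right = point(row, col - 1), point(row, col + 1)
    up, down = point(row - 1, col), point(row + 1, col)
    horizontal = tuple(r - l for r, l in zip(right, left))
    vertical = tuple(d - u for d, u in zip(down, up))
    n = cross(horizontal, vertical)
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        return None  # degenerate neighbourhood, no normal available
    n = tuple(x / length for x in n)
    # Orient the normal toward the camera, which sits at the origin.
    centre = point(row, col)
    if sum(a * b for a, b in zip(n, centre)) > 0.0:
        n = tuple(-x for x in n)
    return n
```

For a fronto-parallel surface (constant depth), the function returns a normal pointing along the negative optical axis, that is, back toward the camera.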

The processor 130 may obtain information about the direction of gravity based on the values measured by the gyro sensor of the IMU sensor 120, and correct the direction of the per-pixel normal vectors based on the direction of gravity. In one embodiment of the present disclosure, the processor 130 may correct the direction of a per-pixel normal vector not only along the direction of gravity but also along a direction perpendicular to the direction of gravity. For example, for a pixel whose normal vector is close to the vertical direction, the processor 130 may correct the normal vector to a direction parallel to the direction of gravity. As another example, for a pixel whose normal vector is close to horizontal with respect to the ground (for example, a pixel representing an object such as a wall or a pillar), the processor 130 may correct the normal vector to a direction perpendicular to the direction of gravity.
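The correction rule can be sketched as a snapping operation. The disclosure does not specify how "close to" vertical or horizontal is decided, so the angular tolerance `tolerance_deg` below, and the choice to leave all other normals unchanged, are assumptions made for illustration.

```python
import math


def _normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def correct_normal(normal, gravity, tolerance_deg=15.0):
    """Snap a normal to be parallel or perpendicular to gravity.

    `tolerance_deg` is a hypothetical threshold; both input vectors must be
    non-zero.
    """
    n = _normalize(normal)
    g = _normalize(gravity)
    cos_angle = sum(a * b for a, b in zip(n, g))
    tol = math.radians(tolerance_deg)
    if abs(cos_angle) >= math.cos(tol):
        # Near-vertical normal (e.g. floor or ceiling): make it parallel
        # to gravity while keeping its original up/down sense.
        sign = 1.0 if cos_angle > 0 else -1.0
        return tuple(sign * x for x in g)
    if abs(cos_angle) <= math.sin(tol):
        # Near-horizontal normal (e.g. wall or pillar): remove the gravity
        # component so the result is perpendicular to gravity.
        return _normalize(tuple(a - cos_angle * b for a, b in zip(n, g)))
    return n  # neither case applies; leave the normal unchanged
```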

A specific embodiment in which the processor 130 acquires a normal vector for each pixel and corrects the direction of the acquired normal vector will be described in detail with reference to FIGS. 4, 5, 6A, and 6B.

The depth value correction module 146 is composed of instructions or program code related to the function and/or operation of correcting the depth value of at least one pixel of the depth map based on the direction of the corrected normal vector. The processor 130 may correct the depth value of at least one pixel by executing the instructions or program code of the depth value correction module 146. In one embodiment of the present disclosure, the processor 130 may calculate a loss value for the depth map output during training of the artificial intelligence model, and correct the depth value of at least one pixel of the depth map by performing training in which the calculated loss value is applied to the artificial intelligence model. The processor 130 may calculate the loss value based on the depth value of a pixel on the plane defined by the corrected normal vector and the depth values of the positionally adjacent pixels. In one embodiment of the present disclosure, the processor 130 may obtain the depth values of the adjacent pixels based on the plurality of points at which the defined plane meets the ray vectors of the camera 110, calculate the difference between the depth value of the pixel and the depth value of each adjacent pixel, and calculate the loss value through a weighted sum operation that applies weights to the differences calculated for the adjacent pixels.
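The depth at which a camera ray meets the plane defined by a corrected normal vector follows from basic ray-plane geometry. The sketch below assumes the camera sits at the origin and that the ray vector has a z-component of 1, so the returned scale factor equals the depth along the optical axis; the function and parameter names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def depth_on_plane(plane_point, plane_normal, ray_direction, eps=1e-9):
    """Depth at which a ray from the camera origin meets a plane.

    The plane passes through `plane_point` (the 3D position of the centre
    pixel) with normal `plane_normal` (the corrected normal). Points on the
    ray are t * ray_direction, so solving n.(t*r - P) = 0 gives
    t = (n.P) / (n.r).
    """
    denom = dot(plane_normal, ray_direction)
    if abs(denom) < eps:
        return None  # the ray is parallel to the plane
    t = dot(plane_normal, plane_point) / denom
    return t if t > 0.0 else None  # ignore intersections behind the camera
```

For example, with a floor 1.5 m below the camera (y pointing down), a ray of direction (0, 0.5, 1) meets the floor at depth 3, while a flatter ray of direction (0, 0.25, 1) meets it at depth 6.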

In one embodiment of the present disclosure, the weights applied in the weighted sum operation may include a first weight determined based on the distance between the position of each adjacent pixel in the depth map and the position of the camera 110, and a second weight determined based on the difference in luminance value between the pixel and each adjacent pixel in the depth map. A specific embodiment in which the processor 130 calculates the loss value according to the positional relationship between the camera 110 and the plane defined by the corrected normal vector, and corrects the depth value of at least one pixel by training the artificial intelligence model using the calculated loss value, will be described in detail with reference to FIGS. 7 to 10.
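The weighted sum can be sketched as below. The disclosure states only what each weight depends on; the exponential forms, the multiplication of the two weights, and the scale parameters `sigma_distance` and `sigma_luminance` are assumptions chosen for illustration, not the weighting actually used.

```python
import math


def plane_consistency_loss(differences, camera_distances, luminance_diffs,
                           sigma_distance=5.0, sigma_luminance=10.0):
    """Weighted sum of per-adjacent-pixel depth differences.

    Each list holds one entry per adjacent pixel:
      differences      - depth difference between the pixel and the neighbour
      camera_distances - distance between the neighbour and the camera
      luminance_diffs  - luminance difference between the pixel and neighbour
    """
    if not (len(differences) == len(camera_distances) == len(luminance_diffs)):
        raise ValueError("one entry per adjacent pixel is required")
    loss = 0.0
    for diff, dist, lum in zip(differences, camera_distances, luminance_diffs):
        first_weight = math.exp(-dist / sigma_distance)          # assumed form
        second_weight = math.exp(-abs(lum) / sigma_luminance)    # assumed form
        loss += first_weight * second_weight * abs(diff)
    return loss
```

With these assumed forms, a neighbour whose luminance differs strongly from the pixel, and is therefore more likely to belong to a different surface, contributes less to the loss.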

In one embodiment of the present disclosure, the processor 130 may form a plane hypothesis for each pixel along the direction of the corrected normal vector or along a direction perpendicular to the direction of the corrected normal vector, and obtain the depth value of at least one pixel by performing a plane sweep according to the hypothesized plane. The processor 130 may correct the depth value of at least one pixel of the depth map using the obtained depth value. A specific embodiment in which the processor 130 corrects the depth value of at least one pixel by performing a plane sweep along the plane defined based on the corrected normal vector will be described in detail with reference to FIGS. 12 and 13.

In one embodiment of the present disclosure, the processor 130 may acquire an RGB image and a depth map image using a ToF camera. The processor 130 may define a plane for each pixel in the depth map image based on the corrected normal vector, and identify a planar region in the depth map image based on the regions into which the RGB image is segmented according to its color information. Within the identified planar region, the processor 130 may correct the depth value of a pixel to be corrected based on the depth values of the adjacent pixels in the same planar region. Here, a 'pixel to be corrected' is a pixel whose depth value requires correction, meaning either a pixel in the planar region that has no depth value (a pixel for which no depth value was acquired) or a pixel whose depth value differs from those of the adjacent pixels in the planar region by more than a preset threshold. A specific embodiment in which the processor 130 identifies a planar region in the depth map image acquired using the ToF camera by means of the RGB image, and corrects the depth value of at least one pixel within the identified planar region, will be described in detail with reference to FIGS. 14 and 15.
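The correction within a planar region can be sketched as follows. The sketch takes the region labels as given (the segmentation of the RGB image is outside its scope) and replaces a pixel to be corrected with the median depth of its valid adjacent pixels in the same region; the 8-pixel neighbourhood, the use of the median, and all names are assumptions, since the disclosure does not fix how the adjacent depth values are combined.

```python
import statistics


def correct_planar_depth(depth, region, threshold):
    """Fill missing or outlying depth values inside each planar region.

    depth     - row-major depth map, None where no depth was acquired
    region    - same shape; planar-region label per pixel, None if not planar
    threshold - preset limit on the difference from adjacent pixels
    """
    rows, cols = len(depth), len(depth[0])
    corrected = [row[:] for row in depth]
    for r in range(rows):
        for c in range(cols):
            if region[r][c] is None:
                continue  # pixel does not belong to any planar region
            neighbours = [
                depth[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
                and region[rr][cc] == region[r][c]
                and depth[rr][cc] is not None
            ]
            if not neighbours:
                continue  # nothing in the same region to correct against
            reference = statistics.median(neighbours)
            if depth[r][c] is None or abs(depth[r][c] - reference) > threshold:
                corrected[r][c] = reference
    return corrected
```

The function reads only from the original map and writes to a copy, so the result does not depend on the order in which pixels are visited.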

FIG. 3 is a flowchart illustrating a method of operating the augmented reality device 100 according to an embodiment of the present disclosure.

In step S310, the augmented reality device 100 acquires a depth map from an image acquired using a camera. In one embodiment of the present disclosure, the camera may be a stereo camera including a left-eye camera and a right-eye camera, and the augmented reality device 100 may input the left-eye image acquired using the left-eye camera and the right-eye image acquired using the right-eye camera into an artificial intelligence model, and acquire the depth map by using the model to calculate the disparity according to the similarity of pixel luminance values between the left-eye image and the right-eye image. The artificial intelligence model may be implemented as a deep neural network model. The deep neural network model used to acquire the depth map may be, for example, DispNet, but is not limited thereto.

In one embodiment of the present disclosure, the augmented reality device 100 may acquire the depth map by calculating the disparity between the left-eye image and the right-eye image using a plane sweep method.

In one embodiment of the present disclosure, the camera is configured as a ToF camera, and the augmented reality device 100 may acquire the depth map based on the time of flight of light obtained using the ToF camera.

In step S320, the augmented reality device 100 acquires the normal vector of at least one pixel included in the depth map. In one embodiment of the present disclosure, the augmented reality device 100 may convert the plurality of pixels included in the depth map into three-dimensional coordinate values based on the direction vector and the depth value of each pixel. The augmented reality device 100 may obtain a per-pixel normal vector using the converted three-dimensional coordinate values.

In step S330, the augmented reality device 100 corrects the direction of the normal vector of at least one pixel based on the direction of gravity measured by the IMU sensor. For a pixel whose normal vector is close to the vertical direction, the augmented reality device 100 may correct the normal vector to a direction parallel to the direction of gravity. For a pixel whose normal vector is close to horizontal with respect to the ground, the augmented reality device 100 may correct the normal vector to a direction perpendicular to the direction of gravity.

In step S340, the augmented reality device 100 corrects the depth value of at least one pixel of the depth map based on the direction of the corrected normal vector. When the depth map was acquired using an artificial intelligence model in step S310, the augmented reality device 100 may calculate a loss value for the depth map obtained during training of the artificial intelligence model, and correct the depth value of at least one pixel included in the depth map by performing training in which the calculated loss value is applied to the artificial intelligence model. The augmented reality device 100 may calculate the loss value based on the depth value of a pixel on the plane defined by the corrected normal vector and the depth values of the positionally adjacent pixels. In one embodiment of the present disclosure, the augmented reality device 100 may obtain the depth values of the adjacent pixels based on the plurality of points at which the defined plane meets the ray vectors of the camera, calculate the difference between the depth value of the pixel and the depth value of each adjacent pixel, and calculate the loss value through a weighted sum operation that applies weights to the differences calculated for the adjacent pixels. In one embodiment of the present disclosure, the augmented reality device 100 may obtain the loss value through a weighted sum operation that uses a first weight, determined based on the distance between the position of each adjacent pixel in the depth map and the position of the camera, and a second weight, determined based on the difference in luminance value between the pixel and each adjacent pixel in the depth map.

In one embodiment of the present disclosure, the augmented reality device 100 may form a plane hypothesis that defines a plane for each pixel along the direction of the corrected normal vector or along a direction perpendicular to the direction of the corrected normal vector. The augmented reality device 100 may obtain the depth value of at least one pixel by performing a plane sweep along the plane defined by the plane hypothesis. The augmented reality device 100 may correct the depth value of at least one pixel included in the depth map using the obtained depth value.

When the depth map was acquired using a ToF camera in step S310, the augmented reality device 100 may also acquire an RGB image using the ToF camera. In one embodiment of the present disclosure, the augmented reality device 100 may define a plane for each pixel in the depth map based on the corrected normal vector, and identify a planar region in the depth map based on the regions into which the RGB image is segmented according to its color information. The augmented reality device 100 may correct the depth value of a pixel to be corrected within the identified planar region based on the depth values of the adjacent pixels in the same planar region.

FIG. 4 is a flowchart illustrating a method by which the augmented reality device 100 obtains a normal vector for each pixel according to an embodiment of the present disclosure.

Steps S410 and S420 shown in FIG. 4 are steps that embody step S320 shown in FIG. 3. Step S410 shown in FIG. 4 may be performed after step S310 of FIG. 3 is performed. After step S420 shown in FIG. 4 is performed, step S330 of FIG. 3 may be performed.

FIG. 5 is a diagram illustrating an operation in which the augmented reality device 100 obtains a normal vector for each pixel from a depth map according to an embodiment of the present disclosure.

Hereinafter, the operation of the augmented reality device 100 will be described with reference to FIGS. 4 and 5 together.

Referring to step S410 of FIG. 4, the augmented reality device 100 converts the pixels included in the depth map into 3D coordinate values based on the direction vectors and the depth values of the pixels. Referring to FIG. 5 together, the processor 130 (see FIG. 2) of the augmented reality device 100 may obtain, from the depth map 500, the direction vector and depth value (D_{i,j}) of the first pixel 501 and the direction vector and depth value (D_{i-1,j}, D_{i+1,j}, D_{i,j-1}, D_{i,j+1}) of each of the plurality of adjacent pixels 502 to 505. In the embodiment shown in FIG. 5, the first pixel 501 is the pixel arranged in the i-th column and the j-th row among the plurality of pixels included in the depth map 500, and may have a depth value of D_{i,j}. The plurality of adjacent pixels 502 to 505 are pixels arranged adjacent to the first pixel 501 in the up, down, left, and right directions, and may have depth values of D_{i-1,j}, D_{i+1,j}, D_{i,j-1}, and D_{i,j+1}, respectively. For example, the second pixel 502 is the pixel arranged in column i-1 and row j and may have a depth value of D_{i-1,j}, and the third pixel 503 is the pixel arranged in column i+1 and row j and may have a depth value of D_{i+1,j}. The fourth pixel 504 and the fifth pixel 505 may have depth values of D_{i,j-1} and D_{i,j+1}, respectively.

The processor 130 may convert the first pixel 501 and the plurality of adjacent pixels 502 to 505 into 3D coordinate values (P_{i,j}, P_{i-1,j}, P_{i+1,j}, P_{i,j-1}, P_{i,j+1}) in the 3D space 510, based on the direction vector and depth value (D_{i,j}) of the first pixel 501 and the direction vector and depth value (D_{i-1,j}, D_{i+1,j}, D_{i,j-1}, D_{i,j+1}) of each of the plurality of adjacent pixels 502 to 505. In one embodiment of the present disclosure, the processor 130 may convert the first pixel 501 and the plurality of adjacent pixels 502 to 505 into 3D coordinate values in the 3D space 510 through Equation 1 below.

In the embodiment shown in FIG. 5, the processor 130 may, through the operation of Equation 1, convert the first pixel 501 into P_{i,j} and convert the plurality of adjacent pixels 502 to 505 into P_{i-1,j}, P_{i+1,j}, P_{i,j-1}, and P_{i,j+1}, respectively.
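Equation 1 itself is not reproduced in this text, so the following is only a minimal sketch of the conversion described above, assuming a pinhole camera whose direction vector for a pixel has a unit z-component, so that the 3D coordinate value is the direction vector scaled by the depth value. The intrinsic parameters `fx`, `fy`, `cx`, `cy` and the function names are hypothetical and do not appear in the disclosure.

```python
def pixel_direction(i, j, fx, fy, cx, cy):
    """Direction vector of the pixel at column i, row j.

    Assumes a pinhole model with hypothetical intrinsics (fx, fy, cx, cy);
    the z-component is 1 so that depth is measured along the optical axis.
    """
    return ((i - cx) / fx, (j - cy) / fy, 1.0)


def to_point(direction, depth):
    """3D coordinate value P = D * direction for one pixel."""
    return tuple(depth * c for c in direction)


def neighborhood_points(depth_map, i, j, fx, fy, cx, cy):
    """Return P_{i,j} and its left, right, up, and down neighbors.

    depth_map is indexed as depth_map[row][column], i.e. depth_map[j][i].
    """
    offsets = {"center": (0, 0), "left": (-1, 0), "right": (1, 0),
               "up": (0, -1), "down": (0, 1)}
    points = {}
    for name, (di, dj) in offsets.items():
        col, row = i + di, j + dj
        direction = pixel_direction(col, row, fx, fy, cx, cy)
        points[name] = to_point(direction, depth_map[row][col])
    return points
```

For a depth map in which every pixel has the same depth, the five returned points lie on a plane perpendicular to the optical axis.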

Referring to step S420 of FIG. 4, the augmented reality device 100 obtains a normal vector for each pixel by calculating the cross product of the 3D coordinate values of a plurality of adjacent pixels that are adjacent in the up, down, left, and right directions. Referring to FIG. 5 together, the processor 130 of the augmented reality device 100 may calculate the cross product of the 3D coordinate values (P_{i-1,j}, P_{i+1,j}, P_{i,j-1}, P_{i,j+1}) of the plurality of adjacent pixels that are adjacent, in the up, down, left, and right directions, to the 3D coordinate value (P_{i,j}) of the first pixel. In one embodiment of the present disclosure, the processor 130 may obtain the normal vector (N_{i,j}) of the first pixel through the operation of Equation 2.

In the embodiment shown in FIG. 5, the processor 130 may, through the operation of Equation 2, calculate the difference between the 3D coordinate values (P_{i-1,j}, P_{i+1,j}) of the second and third pixels, which are adjacent to the first pixel in the left and right directions, and the difference between the 3D coordinate values (P_{i,j-1}, P_{i,j+1}) of the fourth and fifth pixels, which are adjacent to the first pixel in the up and down directions, and may obtain the normal vector (N_{i,j}) of the first pixel by computing the vector product of the two differences. The normal vector (N_{i,j}) may be a vector that defines a plane composed of the first to fifth pixels.
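The computation described for Equation 2 can be sketched as follows. The text states only that the two differences are taken and their vector product is computed; the operand order (which fixes the sign of the normal) and the normalization to unit length are assumptions made for this illustration, and the function names are hypothetical.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pixel_normal(p_left, p_right, p_up, p_down):
    """Normal vector N_{i,j} from the four neighboring 3D coordinate values.

    horizontal = P_{i-1,j} - P_{i+1,j}, vertical = P_{i,j-1} - P_{i,j+1}.
    The result is normalized to unit length (an assumption, for convenience).
    """
    horizontal = sub(p_left, p_right)
    vertical = sub(p_up, p_down)
    n = cross(horizontal, vertical)
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("neighbors are collinear; the normal is undefined")
    return tuple(c / length for c in n)
```

When the four neighbors lie on a plane of constant depth, the resulting normal is parallel to the optical axis of the camera.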

FIG. 6A is a diagram illustrating an operation in which the augmented reality device 100 corrects the direction of the normal vector (N_{i,j}) according to the direction of gravity (G), according to an embodiment of the present disclosure.

Referring to FIG. 6A, the augmented reality device 100 may include an IMU sensor 120, a processor 130, and a normal vector correction module 144. FIG. 6A shows only the components needed to explain the operation in which the augmented reality device 100 corrects the direction of the normal vector (N_{i,j}); the components included in the augmented reality device 100 are not limited to those shown in FIG. 6A.

The IMU sensor 120 may acquire information about the direction of gravity (G) based on measurement values obtained using a gyroscope, and may provide the information about the direction of gravity (G) to the processor 130. By executing instructions or program code of the normal vector correction module 144, the processor 130 may correct the direction of the normal vector (N_{i,j}) based on the information about the direction of gravity (G) obtained from the IMU sensor 120. In one embodiment of the present disclosure, the processor 130 may determine whether the direction of the normal vector (N_{i,j}) is closer to the vertical direction or to the horizontal direction and, when it is closer to the vertical direction, may correct the direction of the normal vector (N_{i,j}) according to a gravity normal vector (-N_G) whose direction is parallel to the direction of gravity (G). In the embodiment shown in FIG. 6A, because the direction of the normal vector (N_{i,j}) is close to the vertical direction, the processor 130 may correct the direction of the normal vector (N_{i,j}) to be parallel to the gravity normal vector (-N_G). As a result of the correction, the processor 130 may obtain the corrected normal vector (N'_{i,j}).

The processor 130 may correct the direction of the plane defined by the 3D coordinate value (P_{i,j}) of the first pixel according to the direction of the corrected normal vector (N'_{i,j}). In one embodiment of the present disclosure, the processor 130 may correct the direction of the plane by the angular difference between the corrected normal vector (N'_{i,j}) and the normal vector (N_{i,j}) before correction.

FIG. 6B is a diagram illustrating an operation in which the augmented reality device 100 corrects the direction of the normal vector (N_{i,j}) to a direction perpendicular to the direction of gravity (G), according to an embodiment of the present disclosure.

The embodiment shown in FIG. 6B is the same as the embodiment shown in FIG. 6A except for the direction of the normal vector (N_{i,j}) of the first pixel and the direction of the corrected normal vector (N'_{i,j}), so redundant descriptions are omitted.

By executing instructions or program code of the normal vector correction module 144, the processor 130 may correct the direction of the normal vector (N_{i,j}) based on the information about the direction of gravity (G) obtained from the IMU sensor 120. In one embodiment of the present disclosure, the processor 130 may determine whether the direction of the normal vector (N_{i,j}) is closer to the vertical direction or to the horizontal direction. A pixel whose normal vector (N_{i,j}) is close to the horizontal direction may be, for example, a pixel representing a wall, a pillar, or the like, or a pixel representing an object arranged in the vertical direction. When it is determined that the direction of the normal vector (N_{i,j}) is close to the horizontal direction, the processor 130 may correct the direction of the normal vector (N_{i,j}) according to a direction perpendicular to the direction of gravity (G). In the embodiment shown in FIG. 6B, because the direction of the normal vector (N_{i,j}) is close to the horizontal direction, the processor 130 may correct the direction of the normal vector (N_{i,j}) to a direction perpendicular to the direction of gravity (G). As a result of the correction, the processor 130 may obtain the corrected normal vector (N'_{i,j}).

The processor 130 may correct the direction of the plane defined by the 3D coordinate value (P_{i,j}) of the first pixel according to the direction of the corrected normal vector (N'_{i,j}). In one embodiment of the present disclosure, the processor 130 may correct the direction of the plane by the angular difference between the corrected normal vector (N'_{i,j}) and the normal vector (N_{i,j}) before correction.
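The two correction cases of FIGS. 6A and 6B can be sketched together as follows. The disclosure does not state how "close to the vertical direction" is decided, so the 45-degree threshold used here is a hypothetical choice, as are the function names. In the near-horizontal case the sketch removes the component of the normal along gravity, which is one way (assumed here) of obtaining a direction perpendicular to the direction of gravity.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / length for x in v)


def correct_normal(normal, gravity, threshold_deg=45.0):
    """Return a corrected normal N' given the gravity direction G.

    threshold_deg is a hypothetical parameter, not taken from the source.
    """
    n = normalize(normal)
    g = normalize(gravity)
    c = dot(n, g)
    angle_to_vertical = math.degrees(math.acos(min(1.0, abs(c))))
    if angle_to_vertical <= threshold_deg:
        # Near-vertical normal (e.g. floor, ceiling): align with gravity,
        # keeping the side of the surface the original normal pointed to.
        sign = 1.0 if c >= 0.0 else -1.0
        return tuple(sign * x for x in g)
    # Near-horizontal normal (e.g. wall, pillar): drop the gravity component
    # so the corrected normal is perpendicular to gravity.
    return normalize(tuple(x - c * y for x, y in zip(n, g)))


def angle_between_deg(a, b):
    """Angular difference between two vectors, in degrees."""
    c = dot(normalize(a), normalize(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))
```

The value returned by `angle_between_deg(normal, correct_normal(normal, gravity))` corresponds to the angular difference by which the plane direction is corrected in the paragraph above.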

FIG. 7 is a diagram illustrating a training operation in which the augmented reality device 100 acquires a depth map 730 using an artificial intelligence model 700, according to an embodiment of the present disclosure.

Referring to FIG. 7, the augmented reality device 100 may include a stereo camera consisting of a left-eye camera and a right-eye camera. The augmented reality device 100 may input a left-eye image 710L acquired using the left-eye camera and a right-eye image 710R acquired using the right-eye camera to the artificial intelligence model 700. Using the artificial intelligence model 700, the augmented reality device 100 may obtain the depth map 730 by calculating the disparity according to the similarity of luminance values between pixels of the left-eye image 710L and the right-eye image 710R.
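The paragraph above states only that the depth map 730 is obtained by calculating disparity. For a rectified stereo pair, the standard relation between disparity and depth is Z = f * B / d, where f is the focal length in pixels and B is the baseline between the two cameras. The sketch below illustrates that general relation and is not taken from the disclosure; `focal_px` and `baseline_m` are hypothetical parameters.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters from disparity in pixels for a rectified stereo pair.

    A non-positive disparity has no finite depth, so infinity is returned.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


def disparity_map_to_depth_map(disparity_map, focal_px, baseline_m):
    """Apply disparity_to_depth to every pixel of a 2D list of disparities."""
    return [[disparity_to_depth(d, focal_px, baseline_m) for d in row]
            for row in disparity_map]
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error at long range, which is consistent with the later remark that stereo depth accuracy decreases with distance.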

In one embodiment of the present disclosure, the artificial intelligence model 700 may be implemented as a deep neural network model. The deep neural network model may be implemented as a known artificial intelligence model such as a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), or deep Q-networks. For example, the deep neural network model used to obtain the depth map 730 may be DispNet, but is not limited thereto.

For example, when the artificial intelligence model 700 is implemented as a convolutional neural network model such as DispNet, the artificial intelligence model 700 may extract a feature map by convolving each of the left-eye image 710L and the right-eye image 710R. In the process of extracting the feature maps of the left-eye image 710L and the right-eye image 710R, the weights may be shared with the same values. The artificial intelligence model 700 may be trained to output the depth map 730 by accumulating the difference values between the feature maps extracted from the left-eye image 710L and the right-eye image 710R as a 3D cost volume, and performing hierarchical refinement that up-convolves the accumulated 3D cost volume. In one embodiment of the present disclosure, an IMU sensor measurement value 720 may be applied as an input to the artificial intelligence model 700, and the IMU sensor measurement value 720 may be input to the 3D cost volume. An error may exist between the depth map 730 output during training and the ground truth of the disparity according to the similarity of pixel luminance values between the left-eye image 710L and the right-eye image 710R, and the augmented reality device 100 may train the artificial intelligence model 700 using a loss value 740 in order to correct the error in the depth map 730.

The augmented reality device 100 may acquire information about the direction of gravity based on the IMU sensor measurement value 720, and may calculate the loss value 740 based on the depth values of pixels on a plane defined by the normal vector (N'_{i,j}, see FIGS. 6A and 6B) corrected according to the direction of gravity or a direction perpendicular to the direction of gravity. A specific embodiment in which the augmented reality device 100 calculates the loss value 740 will be described in detail with reference to FIGS. 8 to 10.

FIG. 8 is a flowchart illustrating a method by which the augmented reality device 100 corrects the depth value of each pixel of a depth map according to an embodiment of the present disclosure.

Steps S810 and S820 shown in FIG. 8 are steps that embody step S340 shown in FIG. 3. Step S810 shown in FIG. 8 may be performed after step S330 of FIG. 3 is performed.

In step S810, the augmented reality device 100 calculates a loss value 740 (see FIG. 7) based on the depth values of a pixel and a plurality of adjacent pixels on the plane defined by the corrected normal vector (N'_{i,j}, see FIGS. 6A and 6B). A specific embodiment in which the augmented reality device 100 calculates the loss value 740 will be described in detail with reference to FIGS. 9 and 10.

FIG. 9 is a flowchart illustrating a method by which the augmented reality device 100 calculates the loss value 740 according to an embodiment of the present disclosure.

Steps S910 to S940 shown in FIG. 9 are steps that embody step S810 shown in FIG. 8. After step S940 shown in FIG. 9 is performed, step S820 of FIG. 8 may be performed.

FIG. 10 is a diagram illustrating an operation in which the augmented reality device 100 calculates the loss value 740 according to an embodiment of the present disclosure. Hereinafter, step S810 of FIG. 8 will be described with reference to FIGS. 9 and 10 together.

Referring to step S910 of FIG. 9, the augmented reality device 100 defines a plane composed of a pixel having a corrected normal vector and a plurality of adjacent pixels positionally adjacent to that pixel. Referring to FIG. 10 together, the processor 130 (see FIG. 2) of the augmented reality device 100 may define a plane composed of the 3D coordinate value (P_{i,j}) of the first pixel having the corrected normal vector (N'_{i,j}) and the 3D coordinate values (P_{i-1,j}, P_{i+1,j}, P_{i,j-1}, P_{i,j+1}) of a plurality of adjacent pixels arranged at positions adjacent to the first pixel in the up, down, left, and right directions. The 3D coordinate value (P_{i,j}) of the first pixel and the 3D coordinate values (P_{i-1,j}, P_{i+1,j}, P_{i,j-1}, P_{i,j+1}) of the plurality of adjacent pixels may have different depth values.

In step S920 of FIG. 9, the augmented reality device 100 acquires depth values of the plurality of adjacent pixels based on a plurality of points where the defined plane and a ray vector of the camera meet. Referring to FIG. 10 together, the processor 130 may identify a plurality of points where the plane defined in step S910 and the ray vector (R) of the camera 110 meet. The ray vector (R) of the camera 110 may be determined based on the direction of the camera 110 and a positional relationship that includes at least one of the distance or the height between the camera 110 and the plane. The processor 130 may obtain the depth values (D'_{i,j}, D'_{i-1,j}, D'_{i+1,j}, D'_{i,j-1}, D'_{i,j+1}) of the identified plurality of points. In the embodiment shown in FIG. 10, the first depth value (D'_{i,j}) may be equal to the depth value of the 3D coordinate value (P_{i,j}) of the first pixel. The second depth value (D'_{i-1,j}) may be equal to the depth value of the 3D coordinate value (P_{i-1,j}) of the second pixel. However, the position of the second point having the second depth value (D'_{i-1,j}) may differ from the position of the 3D coordinate value (P_{i-1,j}) of the second pixel. Likewise, the third depth value (D'_{i+1,j}) may be equal to the depth value of the 3D coordinate value (P_{i+1,j}) of the third pixel, but the position of the third point may differ from the position of the 3D coordinate value (P_{i+1,j}) of the third pixel.
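Step S920 amounts to intersecting each pixel's ray with the plane through P_{i,j} whose normal is N'_{i,j}. The following is a minimal sketch of that intersection, assuming the camera is at the origin of the coordinate system and that each ray direction has a unit z-component so that depth equals the z-coordinate of the intersection point. These conventions and the function names are assumptions for illustration, not details taken from the disclosure.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def depth_on_plane(ray, plane_point, plane_normal, eps=1e-9):
    """Depth D' of the point where a camera ray meets the corrected plane.

    ray: direction of the pixel's ray from the camera origin (assumed z = 1).
    plane_point: 3D coordinate value P_{i,j} of the reference pixel.
    plane_normal: corrected normal vector N'_{i,j}.
    Returns None when the ray is parallel to the plane or meets it behind
    the camera.
    """
    denom = dot(plane_normal, ray)
    if abs(denom) < eps:
        return None
    t = dot(plane_normal, plane_point) / denom
    if t <= 0.0:
        return None
    point = tuple(t * c for c in ray)
    return point[2]
```

Calling this function once for the reference pixel's ray and once for each of the four neighboring rays yields the five depth values D'_{i,j}, D'_{i-1,j}, D'_{i+1,j}, D'_{i,j-1}, and D'_{i,j+1}.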

In step S930 of FIG. 9, the augmented reality device 100 calculates the difference value between the depth value of the pixel and the depth value of each of the plurality of adjacent pixels. Referring to FIG. 10 together, the processor 130 may calculate, based on Equation 3 below, each difference value (d_pq) between the depth value (D'_{i,j}) of the first pixel and the depth values (D'_{i-1,j}, D'_{i+1,j}, D'_{i,j-1}, D'_{i,j+1}) of the plurality of adjacent pixels.

In Equation 3, D'_p represents the depth value (D'_{i,j}) of the first pixel, which is the reference pixel having the corrected normal vector (N'_{i,j}), and D'_q represents the depth value (D'_{i-1,j}, D'_{i+1,j}, D'_{i,j-1}, D'_{i,j+1}) of each of the plurality of adjacent pixels positionally adjacent to the first pixel. According to Equation 3, the processor 130 may obtain d_pq by calculating D'_{i,j} - D'_{i-1,j}, D'_{i,j} - D'_{i+1,j}, D'_{i,j} - D'_{i,j-1}, and D'_{i,j} - D'_{i,j+1}, respectively.

In step S940 of FIG. 9, the augmented reality device 100 obtains a loss value through a weighted sum operation that applies weights to the calculated difference values. Referring to FIG. 10 together, the processor 130 may obtain the loss value (Loss_g) through a weighted sum operation that applies a first weight (w_q^D) and a second weight (w_pq^C) to the calculated difference values d_pq. In one embodiment of the present disclosure, the loss value (Loss_g) may be calculated by Equation 4 below.

In Equation 4, the first weight (w_q^D) may be determined based on the distance between the positions of the plurality of adjacent pixels in the depth map 730 (see FIG. 7) and the position of the camera 110. In one embodiment of the present disclosure, the first weight (w_q^D) may be calculated based on the depth values (D_{i-1,j}, D_{i+1,j}, D_{i,j-1}, D_{i,j+1}) of the plurality of adjacent pixels in the depth map 730, as shown in Equation 5 below.

Referring to Equation 5, D_q represents the depth values (D_{i-1,j}, D_{i+1,j}, D_{i,j-1}, D_{i,j+1}) of the plurality of adjacent pixels in the depth map 730, and the smaller these depth values are, the larger the value of the first weight (w_q^D) becomes. In other words, the shorter the distance between the adjacent pixels and the camera 110, the larger the value of the first weight (w_q^D) may be. Because the accuracy of depth values decreases with distance when the depth map 730 is extracted using a stereo camera, Equation 5 is designed to give more weight to pixels that are closer to the camera 110.

In Equation 4, the second weight (w_pq^C) may be determined based on the difference in pixel luminance values between the first pixel and the plurality of adjacent pixels in the depth map 730. In one embodiment of the present disclosure, the second weight (w_pq^C) may be calculated based on the difference in luminance values between pixels in the depth map 730, as shown in Equation 6 below.

Referring to Equation 6, I_p represents the luminance value of the first pixel in the depth map 730, and I_q represents the luminance values of the plurality of adjacent pixels. The smaller the difference between the luminance value of the first pixel and the luminance value of each of the adjacent pixels, the larger the value of the second weight (w_pq^C) becomes. Because pixels with similar luminance values on the depth map 730 are more likely to have the same or similar depth values, Equation 6 is designed to give more weight to adjacent pixels whose luminance values differ less from that of the first pixel.
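Equations 3 to 6 are not reproduced in this text, so the sketch below only illustrates the structure described above: a weighted sum over the adjacent pixels of the difference values d_pq = D'_p - D'_q, with a first weight that grows as the depth D_q shrinks and a second weight that grows as the luminance difference |I_p - I_q| shrinks. The exponential forms chosen for the two weights and the use of the absolute value of d_pq are assumptions made purely for illustration; the actual equations may differ.

```python
import math


def first_weight(depth_q):
    """Hypothetical w_q^D: larger for adjacent pixels closer to the camera."""
    return math.exp(-depth_q)


def second_weight(lum_p, lum_q):
    """Hypothetical w_pq^C: larger when the luminance values are similar."""
    return math.exp(-abs(lum_p - lum_q))


def gravity_loss(plane_depth_p, plane_depths_q, map_depths_q, lum_p, lums_q):
    """Weighted sum of |d_pq| over the adjacent pixels q.

    plane_depth_p: D'_p, depth of the reference pixel on the corrected plane.
    plane_depths_q: D'_q for each adjacent pixel (ray-plane intersections).
    map_depths_q: D_q for each adjacent pixel, taken from the depth map.
    lum_p, lums_q: luminance values I_p and I_q.
    """
    loss = 0.0
    for d_prime_q, d_q, i_q in zip(plane_depths_q, map_depths_q, lums_q):
        d_pq = plane_depth_p - d_prime_q
        loss += first_weight(d_q) * second_weight(lum_p, i_q) * abs(d_pq)
    return loss
```

With these assumed weights, an adjacent pixel that is far from the camera or whose luminance differs strongly from that of the reference pixel contributes little to the loss, which matches the design intent stated for Equations 5 and 6.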

Referring again to FIG. 8, in step S820 the augmented reality device 100 corrects the depth value of at least one pixel of the depth map by performing training in which the calculated loss value (Loss_g, see FIG. 10) is applied to the artificial intelligence model.

In the embodiments shown in FIGS. 8 to 10, the loss value (Loss_g) may be determined based on the depth values (D'_{i,j}, D'_{i-1,j}, D'_{i+1,j}, D'_{i,j-1}, D'_{i,j+1}) obtained from the plane defined by the corrected normal vector (N'_{i,j}, see FIGS. 6A and 6B) and the ray vector (R) of the camera 110. In particular, because the loss value (Loss_g) is calculated by applying the first weight (w_q^D), determined based on the distance from the camera 110 in the depth map, and the second weight (w_pq^C), determined based on the difference in luminance values between pixels of the depth map, the accuracy of the depth values may be improved through training that applies the loss value (Loss_g) to the artificial intelligence model.

FIG. 11 is a diagram illustrating an operation in which the augmented reality device 100 acquires a depth map 1130 using an artificial intelligence model 1100, according to an embodiment of the present disclosure.

Referring to FIG. 11, the augmented reality device 100 may include a stereo camera consisting of a left-eye camera and a right-eye camera. The augmented reality device 100 may input a left-eye image 1110L acquired using the left-eye camera, a right-eye image 1110R acquired using the right-eye camera, and an IMU sensor measurement value 1120 to the artificial intelligence model 1100, and may obtain the depth map 1130 by performing inference using the artificial intelligence model 1100.

The artificial intelligence model 1100 may be a model trained by applying the loss value 740 (see FIG. 7), as in the embodiment shown in FIG. 7. The augmented reality device 100 may input the IMU sensor measurement value 1120 measured by the IMU sensor 120 (see FIG. 2) to the artificial intelligence model 1100, and the input IMU sensor measurement value 1120 may be applied to a three-dimensional cost volume. In the inference process of the artificial intelligence model 1100, the depth map 1130 may be output through hierarchical refinement, such as up-convolution of the three-dimensional cost volume.

In the embodiment shown in FIG. 11, the augmented reality device 100 obtains the depth map 1130 using the artificial intelligence model 1100 trained with the IMU sensor measurement value 1120, which includes information about the direction of gravity, and can thereby improve the accuracy of the depth values of the depth map 1130. In addition, the augmented reality device 100 according to an embodiment of the present disclosure trains the artificial intelligence model 1100 using the IMU sensor measurement value 1120 obtained from the IMU sensor 120, which is an essential component of the device. No additional hardware module is therefore required, which provides the technical effect of low power consumption while maintaining a small form factor.

FIG. 12 is a flowchart illustrating a method by which the augmented reality device 100 according to an embodiment of the present disclosure corrects the depth value of each pixel of a depth map.

Step S1210 shown in FIG. 12 is a step included in step S330 shown in FIG. 3. Steps S1220 to S1240 shown in FIG. 12 are steps that elaborate step S340 shown in FIG. 3.

FIG. 13 is a diagram illustrating an operation in which the augmented reality device 100 according to an embodiment of the present disclosure corrects the depth value of each pixel of the depth map 1320 through a plane sweep method. Hereinafter, the operation in which the augmented reality device 100 corrects the depth value of each pixel of the depth map 1320 is described with reference to FIGS. 12 and 13 together.

Referring to FIG. 12, in step S1210, the augmented reality device 100 corrects the directions of the normal vectors of the left-eye image and the right-eye image based on the direction of gravity measured by the IMU sensor 120 (see FIG. 2). Referring also to FIG. 13, the augmented reality device 100 may include a stereo camera including a left-eye camera and a right-eye camera. The processor 130 (see FIG. 2) of the augmented reality device 100 may obtain gravity direction information from the gyroscope of the IMU sensor 120 (see FIG. 2) and, based on the obtained gravity direction information, may correct the direction of the per-pixel normal vectors of the left-eye image 1310L obtained using the left-eye camera and of the right-eye image 1310R obtained using the right-eye camera. In an embodiment of the present disclosure, the processor 130 may correct the direction of each per-pixel normal vector to the direction of gravity or to a direction perpendicular to the direction of gravity. The specific method of correcting the direction of the normal vector is the same as that described with reference to FIGS. 6A and 6B, so a redundant description is omitted.
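The correction of step S1210 may be illustrated by the following minimal sketch. It is not the method of FIGS. 6A and 6B, which is described elsewhere in the disclosure. It assumes that a normal lying within a tolerance of the gravity axis is aligned with that axis, and that a normal lying within the same tolerance of the plane perpendicular to gravity is projected onto that plane. The function name `correct_normal` and the tolerance `tol_deg` are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def correct_normal(normal, gravity, tol_deg=15.0):
    """Snap a normal to the gravity axis or to the plane perpendicular to it.

    tol_deg is a hypothetical tolerance; the source does not specify a value.
    """
    n = normalize(normal)
    g = normalize(gravity)
    cos_ng = dot(n, g)
    tol = math.radians(tol_deg)
    if abs(cos_ng) >= math.cos(tol):
        # Nearly parallel to gravity (e.g. floor or ceiling): align with the axis.
        sign = 1.0 if cos_ng > 0 else -1.0
        return tuple(sign * c for c in g)
    if abs(cos_ng) <= math.sin(tol):
        # Nearly perpendicular to gravity (e.g. wall): remove the gravity component.
        horizontal = tuple(nc - cos_ng * gc for nc, gc in zip(n, g))
        return normalize(horizontal)
    # Neither case applies: leave the (normalized) normal unchanged.
    return n
```

In this sketch a normal that is neither close to the gravity axis nor close to perpendicular is returned unchanged; whether such normals should also be corrected is a design choice left open here.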

Referring to FIG. 12, in step S1220, the augmented reality device 100 performs a plane hypothesis along the direction of the corrected normal vector or along a direction perpendicular to the direction of the corrected normal vector. Referring also to FIG. 13, the processor 130 may define a plane for each pixel through a plane hypothesis for each of the left-eye image 1310L and the right-eye image 1310R. The processor 130 may define a per-pixel plane in the left-eye image 1310L by performing the plane hypothesis according to the direction of the corrected normal vector of each pixel included in the left-eye image 1310L, or according to a direction perpendicular to that direction. In the same way, the processor 130 may define a per-pixel plane in the right-eye image 1310R by performing the plane hypothesis according to the direction of the corrected normal vector of each pixel included in the right-eye image 1310R, or according to a direction perpendicular to that direction.

Referring to FIG. 12, in step S1230, the augmented reality device 100 obtains a depth value for each pixel by performing a plane sweep along the plane defined by the plane hypothesis. Referring also to FIG. 13, the processor 130 may perform a plane sweep that searches for a matching point in the right-eye image 1310R based on the two-dimensional position coordinates (x, y) of a reference pixel on the plane defined in the left-eye image 1310L. In an embodiment of the present disclosure, the processor 130 may identify, in the right-eye image 1310R, the plane corresponding to the plane defined in the left-eye image 1310L through the plane hypothesis, and may search for the matching point within the plane identified in the right-eye image 1310R. In an embodiment of the present disclosure, the processor 130 may search for the matching point corresponding to the reference pixel of the left-eye image 1310L among the pixels in the plane defined in the right-eye image 1310R, within a disparity search range of d_0 to d_max. The processor 130 may measure the luminance similarity between the reference pixel in the plane of the left-eye image 1310L and the pixels in the corresponding plane of the right-eye image 1310R and, based on the graph 1310 showing the relationship between inter-pixel luminance dissimilarity and disparity, may identify the pixel with the lowest luminance dissimilarity in the right-eye image 1310R as the matching point. The processor 130 may determine the distance (d_x) between the position of the reference pixel of the left-eye image 1310L and the pixel identified as the matching point in the right-eye image 1310R as the disparity. The processor 130 may obtain a depth value for each pixel by performing the plane sweep in the manner described above for all pixels of the left-eye image 1310L and the right-eye image 1310R.
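The search for the matching point may be sketched as follows. This is a simplified illustration, not the plane sweep of FIG. 13: it assumes a rectified stereo pair, searches along a single image row rather than within a hypothesized plane, and uses the sum of absolute differences over a small window as the luminance dissimilarity. The names `best_disparity`, `half_window`, `focal_px`, and `baseline_m` are hypothetical. The conversion from disparity to depth uses the standard rectified-stereo relation Z = f * B / d.

```python
def best_disparity(left_row, right_row, x, d_min, d_max, half_window=1):
    """Return (disparity, cost) with the lowest luminance dissimilarity.

    left_row and right_row are lists of luminance values of equal length;
    x is the column of the reference pixel in the left row.
    Returns None if no candidate window fits inside the rows.
    """
    width = len(left_row)
    best = None
    for d in range(d_min, d_max + 1):
        xr = x - d  # candidate column in the right row
        lo = min(x, xr) - half_window
        hi = max(x, xr) + half_window
        if lo < 0 or hi >= width:
            continue  # window would fall outside the image
        # Sum of absolute luminance differences over the window.
        cost = sum(
            abs(left_row[x + k] - right_row[xr + k])
            for k in range(-half_window, half_window + 1)
        )
        if best is None or cost < best[1]:
            best = (d, cost)
    return best


def depth_from_disparity(disparity, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (meters) using Z = f * B / d."""
    if disparity <= 0:
        return None  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity
```

For example, with a left row whose luminance peak lies at column 4 and a right row whose peak lies at column 2, the search over d = 0..3 selects a disparity of 2.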

Referring to FIG. 12, in step S1240, the augmented reality device 100 corrects the depth value of at least one pixel of the depth map using the obtained depth values. Referring also to FIG. 13, the processor 130 of the augmented reality device 100 may correct the depth values of the depth map 1320 using the obtained per-pixel depth values.

In the embodiment shown in FIGS. 12 and 13, the augmented reality device 100 corrects the per-pixel normal vectors toward the direction of gravity and performs a plane sweep along the plane defined according to the corrected normal vectors, and can thereby obtain a depth map 1320 with high accuracy.

FIG. 14 is a flowchart illustrating a method by which the augmented reality device 100 according to an embodiment of the present disclosure corrects the depth value of each pixel of a depth map.

Steps S1410 to S1430 shown in FIG. 14 are steps that elaborate step S340 shown in FIG. 3. Step S1410 shown in FIG. 14 may be performed after step S330 of FIG. 3 is performed.

FIG. 15 is a diagram illustrating an operation in which the augmented reality device 100 according to an embodiment of the present disclosure corrects the depth value of each pixel of a depth map image 1510 obtained through a time-of-flight (ToF) method.

Referring to FIG. 14, in step S1410, the augmented reality device 100 defines a plane for each pixel in the depth map based on the corrected normal vector. In an embodiment of the present disclosure, the augmented reality device 100 may include a ToF camera. The ToF camera may be configured to irradiate an object with light using a light source, detect the light reflected from the object, and obtain the depth value of the object based on the time of flight, that is, the time difference between the moment the light is irradiated and the moment the reflected light is detected. Referring also to FIG. 15, the augmented reality device 100 may obtain an RGB image 1500 and a depth map image 1510 of the object using the ToF camera. The processor 130 (see FIG. 2) of the augmented reality device 100 may obtain a normal vector for each pixel from the depth map image 1510 and may correct the direction of the normal vector along the direction of gravity or along a direction perpendicular to the direction of gravity. The processor 130 may define a plane for each pixel in the depth map image 1510 based on the direction of the corrected normal vector. In an embodiment of the present disclosure, the processor 130 may define a plane composed of a pixel having a corrected normal vector and a plurality of adjacent pixels positionally adjacent to that pixel.
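The relationship between time of flight and depth may be illustrated by the following short sketch. It assumes the light travels at the vacuum speed of light and that the emitter and detector are co-located, so the measured interval covers the round trip. The function name `tof_depth` is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_depth(t_emit_s, t_detect_s):
    """Depth in meters from emission and detection times in seconds.

    The light travels to the object and back, so the one-way distance is
    half of the distance covered during the time of flight.
    """
    time_of_flight = t_detect_s - t_emit_s
    if time_of_flight < 0:
        raise ValueError("detection time must not precede emission time")
    return SPEED_OF_LIGHT_M_PER_S * time_of_flight / 2.0
```

A time of flight of 20 nanoseconds corresponds to a depth of roughly 3 meters under these assumptions.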

Referring to FIG. 14, in step S1420, the augmented reality device 100 identifies a planar area in the depth map based on areas segmented according to the color information of the RGB image. Referring also to FIG. 15, the processor 130 of the augmented reality device 100 may segment the RGB image 1500 into a plurality of areas 1501 to 1506 based on the color information of each of the plurality of pixels included in the RGB image 1500. In an embodiment of the present disclosure, the processor 130 may segment the RGB image 1500 into the plurality of areas 1501 to 1506 based on the per-pixel luminance values of the plurality of pixels of the RGB image 1500. The processor 130 may calculate the differences between the luminance values of the plurality of pixels of the RGB image 1500 and may segment the RGB image 1500 into the plurality of areas 1501 to 1506 by grouping pixels whose calculated difference is less than or equal to a preset threshold. The processor 130 may identify planar areas in the depth map image 1510 based on at least one of the position, shape, and size of the plurality of segmented areas 1501 to 1506 of the RGB image 1500. In the embodiment shown in FIG. 15, the processor 130 may identify, from the depth map image 1510, a plurality of planar areas 1511 to 1516 corresponding to the plurality of segmented areas 1501 to 1506 of the RGB image 1500.
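The grouping of pixels by luminance difference may be sketched as follows. The disclosure does not specify the grouping algorithm, so this example assumes a 4-connected region growing in which a pixel joins a region when its luminance differs from that of an already grouped neighbor by no more than the threshold. The function name `segment_by_luminance` is hypothetical.

```python
from collections import deque


def segment_by_luminance(lum, threshold):
    """Label 4-connected regions of similar luminance.

    lum is a 2-D list of luminance values. Returns a 2-D list of integer
    region labels with the same shape, numbered from 0 in scan order.
    """
    h, w = len(lum), len(lum[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue  # already assigned to a region
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    if labels[ny][nx] != -1:
                        continue
                    # Group the neighbor if its luminance is close enough.
                    if abs(lum[ny][nx] - lum[y][x]) <= threshold:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

Because the RGB image and the depth map image are assumed here to be pixel-aligned, the resulting label map can be applied directly to the depth map to delimit candidate planar areas.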

Referring to FIG. 14, in step S1430, the augmented reality device 100 corrects the depth value of a pixel in the identified planar area based on the depth values of adjacent pixels. In an embodiment of the present disclosure, the augmented reality device 100 may identify a correction target pixel in a planar area of the depth map and may correct the depth value of the identified correction target pixel based on the depth values of adjacent pixels in the same planar area. In the present disclosure, a 'correction target pixel' is a pixel whose depth value requires correction: a pixel in a planar area for which no depth value has been obtained, or a pixel whose depth value differs from those of adjacent pixels in the planar area by more than a preset threshold. In the embodiment shown in FIG. 15, among the plurality of planar areas 1511 to 1516 identified from the depth map image 1510, the first planar area 1511 to the fifth planar area 1515 may contain no pixel for which a depth value has not been obtained and no pixel whose depth difference exceeds the preset threshold. In the sixth planar area 1516 among the plurality of planar areas 1511 to 1516, at least one correction target pixel 1526 for which no depth value has been obtained may be identified. The processor 130 may correct the depth value of the identified correction target pixel 1526 based on the depth values of adjacent pixels among the plurality of pixels included in the sixth planar area 1516.
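The correction of step S1430 may be sketched as follows. The disclosure states only that the correction is based on the depth values of adjacent pixels, so this example assumes that the replacement value is the median of the valid 4-connected neighbors in the same planar area. Missing depth values are represented by `None`. The names `correct_region_depth` and `max_diff` are hypothetical.

```python
from statistics import median


def correct_region_depth(depth, labels, max_diff):
    """Return a copy of depth with correction target pixels replaced.

    depth  : 2-D list of depth values, None where no depth was obtained
    labels : 2-D list of planar-area labels with the same shape
    max_diff : threshold on the difference from the neighbor median
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            # Valid neighbors that belong to the same planar area.
            neighbors = [
                depth[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w
                and labels[ny][nx] == labels[y][x]
                and depth[ny][nx] is not None
            ]
            if not neighbors:
                continue  # nothing to base a correction on
            reference = median(neighbors)
            is_missing = depth[y][x] is None
            if is_missing or abs(depth[y][x] - reference) > max_diff:
                out[y][x] = reference
    return out
```

The median is used in this sketch instead of the mean so that a single outlier does not pull its otherwise valid neighbors over the threshold.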

A typical ToF camera has the problem that, as the distance from the camera increases, the accuracy of the depth value decreases or no depth value is obtained. In the embodiment shown in FIGS. 14 and 15, the augmented reality device 100 corrects the per-pixel normal vectors in the depth map image 1510 based on the direction of gravity, defines planes based on the corrected normal vectors, identifies the planar areas of the depth map image 1510 based on the per-pixel color information of the RGB image 1500, and corrects the depth value of the correction target pixel 1526 in the planar area, and can thereby improve the accuracy of the depth map.

FIG. 16 is a diagram illustrating a three-dimensional space model 1610 reconstructed from a depth map obtained in a conventional manner and a three-dimensional space model 1620 reconstructed from a depth map obtained by the augmented reality device 100 according to an embodiment of the present disclosure.

Referring to FIG. 16, in the three-dimensional space model 1610 of the real space 1600 obtained through a conventional depth map acquisition method, the surfaces of objects are tilted and the accuracy of the three-dimensional space model 1610 is low. A conventional depth map acquisition method has the problem that the accuracy of the depth value decreases as the distance from the camera increases.

The augmented reality device 100 according to an embodiment of the present disclosure corrects the direction of the per-pixel normal vectors along the direction of gravity measured by the IMU sensor 120 (see FIG. 2) or along a direction perpendicular to the direction of gravity, and corrects the per-pixel depth values based on the corrected normal vectors. The accuracy and resolution of the obtained depth map can therefore be improved compared with a depth map obtained in a conventional manner. In addition, in the three-dimensional space model 1620 reconstructed from the depth map obtained by the augmented reality device 100 according to an embodiment of the present disclosure, it can be seen that the orientation of objects has been corrected to be perpendicular or parallel to the direction of gravity.

In addition, the augmented reality device 100 according to an embodiment of the present disclosure obtains the gravity direction information using the IMU sensor 120, which is an essential component of the device, and can therefore achieve low power consumption while maintaining a small form factor. As a result, the augmented reality device 100 of the present disclosure can provide the technical effects of increased portability, longer device use time, and improved user convenience.

The present disclosure provides an augmented reality device 100 that corrects depth values based on the direction of gravity. The augmented reality device 100 according to an embodiment of the present disclosure may include a camera 110 (see FIG. 2), an inertial measurement unit (IMU) sensor 120 (see FIG. 2), at least one processor 130 (see FIG. 2), and a memory 140 (see FIG. 2). By executing at least one instruction stored in the memory 140, the at least one processor 130 may obtain a depth map from an image obtained using the camera 110. The at least one processor 130 may obtain a normal vector of at least one pixel included in the depth map. The at least one processor 130 may correct the direction of the normal vector of the at least one pixel based on the direction of gravity measured by the IMU sensor 120. The at least one processor 130 may correct the depth value of the at least one pixel based on the direction of the corrected normal vector.

In an embodiment of the present disclosure, the at least one processor 130 may obtain the normal vector by converting the at least one pixel into three-dimensional coordinates based on the direction vector of the at least one pixel included in the depth map and the depth value of the at least one pixel, and by calculating the cross product of the three-dimensional coordinates of a plurality of adjacent pixels that neighbor the pixel in the up, down, left, and right directions.
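The computation of the normal vector may be sketched as follows. This example assumes a pinhole camera model in which the direction vector of pixel (u, v) is ((u - cx) / fx, (v - cy) / fy, 1), and it takes the cross product of the left-to-right and up-to-down difference vectors of the back-projected neighbors. The intrinsic parameters `fx`, `fy`, `cx`, `cy` and the function names are hypothetical.

```python
import math


def backproject(u, v, depth, fx, fy, cx, cy):
    """Convert pixel (u, v) with the given depth to 3-D camera coordinates."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def pixel_normal(depth, u, v, fx, fy, cx, cy):
    """Unit normal at interior pixel (u, v) of a 2-D depth list depth[v][u]."""
    left = backproject(u - 1, v, depth[v][u - 1], fx, fy, cx, cy)
    right = backproject(u + 1, v, depth[v][u + 1], fx, fy, cx, cy)
    up = backproject(u, v - 1, depth[v - 1][u], fx, fy, cx, cy)
    down = backproject(u, v + 1, depth[v + 1][u], fx, fy, cx, cy)
    horizontal = tuple(r - l for r, l in zip(right, left))
    vertical = tuple(d - t for d, t in zip(down, up))
    n = cross(horizontal, vertical)
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0.0:
        return None  # degenerate neighborhood, no normal defined
    return tuple(c / norm for c in n)
```

With this ordering of the cross product, a surface facing the camera squarely yields a normal along +z, that is, pointing away from the camera; the opposite sign convention is obtained by swapping the two operands.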

In an embodiment of the present disclosure, the camera 110 may include a left-eye camera that obtains a left-eye image and a right-eye camera that obtains a right-eye image. The at least one processor 130 may apply the left-eye image and the right-eye image as input to an artificial intelligence model and may obtain the depth map by using the artificial intelligence model to calculate the disparity according to the similarity of the pixel luminance values of the left-eye image and the right-eye image.

In an embodiment of the present disclosure, the at least one processor 130 may calculate the loss value of the depth map obtained by the artificial intelligence model based on the depth values of a pixel on the plane defined by the corrected normal vector and of a plurality of adjacent pixels positionally adjacent to that pixel. The at least one processor 130 may correct the depth value of the at least one pixel by performing training in which the calculated loss value is applied to the artificial intelligence model.

In an embodiment of the present disclosure, the at least one processor 130 may define a plane composed of a pixel having a corrected normal vector and the plurality of adjacent pixels positionally adjacent to that pixel, and may obtain the depth values of the plurality of adjacent pixels based on the plurality of points at which the defined plane meets the ray vectors of the camera 110. The at least one processor 130 may calculate, for each adjacent pixel, the difference between the depth value of the pixel and the obtained depth value of that adjacent pixel. The at least one processor 130 may obtain the loss value by performing a weighted sum operation that applies a weight to the difference calculated for each of the plurality of adjacent pixels.
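The plane and ray intersection and the weighted sum may be sketched as follows. This is an illustration under stated assumptions, not the loss of FIGS. 8 to 10: it assumes each ray vector has a z component of 1, so that the ray parameter equals the depth, and it assumes the difference is taken between each adjacent pixel's predicted depth and the depth implied by the plane. The weights are passed in as given numbers because their formulas are not reproduced in this section. All function and parameter names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def plane_depth_along_ray(normal, point_on_plane, ray):
    """Depth t at which the point t * ray lies on the plane.

    The plane satisfies normal . (X - point_on_plane) = 0, so
    t = (normal . point_on_plane) / (normal . ray).
    """
    denom = dot(normal, ray)
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    return dot(normal, point_on_plane) / denom


def weighted_plane_loss(center_point, corrected_normal, neighbors):
    """Weighted sum of depth differences over adjacent pixels.

    neighbors is a list of (ray, predicted_depth, weight) tuples, one per
    adjacent pixel; weight stands for the combined per-neighbor weight.
    """
    total = 0.0
    for ray, predicted_depth, weight in neighbors:
        plane_depth = plane_depth_along_ray(corrected_normal, center_point, ray)
        if plane_depth is None:
            continue  # skip neighbors whose ray never meets the plane
        total += weight * abs(predicted_depth - plane_depth)
    return total
```

For a plane at z = 2 facing the camera, an adjacent pixel predicted at depth 2.0 contributes nothing, while one predicted at depth 2.5 with a weight of 0.5 contributes 0.25 to the sum.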

In an embodiment of the present disclosure, the weight may include a first weight determined based on the distance from the camera of each of the plurality of adjacent pixels in the depth map, and a second weight determined based on the luminance difference between the pixel and the plurality of adjacent pixels in the depth map.

In an embodiment of the present disclosure, the at least one processor 130 may obtain a corrected depth map by performing inference in which the left-eye image and the right-eye image are input to the trained artificial intelligence model.

In an embodiment of the present disclosure, the camera 110 may include a left-eye camera that obtains a left-eye image and a right-eye camera that obtains a right-eye image. The at least one processor 130 may correct the direction of the normal vector of at least one pixel of the left-eye image and the right-eye image according to the direction of gravity or a direction perpendicular to the direction of gravity, and may perform a plane hypothesis along the direction of the corrected normal vector or along a direction perpendicular to the direction of the corrected normal vector. The at least one processor 130 may obtain the depth value of the at least one pixel by performing a plane sweep along the plane defined by the plane hypothesis. The at least one processor 130 may correct the depth value of the at least one pixel of the depth map using the obtained depth value.

In an embodiment of the present disclosure, the camera 110 may be configured as a time-of-flight (ToF) camera, and the at least one processor 130 may obtain the depth map using the ToF camera.

In an embodiment of the present disclosure, the at least one processor 130 may define a plane for each pixel based on the corrected normal vector. The at least one processor 130 may identify the planar area of the plane defined in the depth map based on areas segmented according to the color information of the RGB image. The at least one processor 130 may correct the depth value of at least one pixel of the depth map based on the depth values of adjacent pixels in the identified planar area.

The present disclosure provides a method by which the augmented reality device 100 corrects depth values. In an embodiment of the present disclosure, the method may include obtaining a depth map from an image obtained using the camera 110 (S310). The method may include obtaining a normal vector of at least one pixel included in the depth map (S320). The method may include correcting the direction of the normal vector of the at least one pixel based on the direction of gravity measured by the IMU sensor 120 (S330). The method may include correcting the depth value of the at least one pixel based on the direction of the corrected normal vector (S340).

In an embodiment of the present disclosure, obtaining the depth map (S310) may include inputting a left-eye image obtained using a left-eye camera and a right-eye image obtained using a right-eye camera to an artificial intelligence model, and obtaining the depth map by using the artificial intelligence model to calculate the disparity according to the similarity of the pixel luminance values of the left-eye image and the right-eye image.

In an embodiment of the present disclosure, correcting the depth value of the at least one pixel (S340) may include calculating the loss value of the depth map obtained by the artificial intelligence model based on the depth values of a pixel on the plane defined by the corrected normal vector and of a plurality of adjacent pixels positionally adjacent to that pixel (S810). Correcting the depth value of the at least one pixel (S340) may include correcting the depth value of the at least one pixel of the depth map by performing training in which the calculated loss value is applied to the artificial intelligence model (S820).

In an embodiment of the present disclosure, calculating the loss value (S810) may include defining a plane composed of a pixel having a corrected normal vector and a plurality of adjacent pixels positionally adjacent to that pixel (S910), and obtaining the depth values of the plurality of adjacent pixels based on the plurality of points at which the defined plane meets the ray vectors of the camera 110 (S920). Calculating the loss value (S810) may include calculating, for each adjacent pixel, the difference between the depth value of the pixel and the obtained depth value of that adjacent pixel (S930). Calculating the loss value (S810) may include obtaining the loss value by performing a weighted sum operation that applies a weight to the difference calculated for each of the plurality of adjacent pixels (S940).

In one embodiment of the present disclosure, the weight may include a first weight determined based on the distance between the camera 110 and each of the plurality of adjacent pixels in the depth map, and a second weight determined based on the difference in luminance value between the pixel and each of the plurality of adjacent pixels in the depth map.
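
The two weights can be illustrated with a short sketch. The disclosure states only what each weight depends on, so the specific formulas below (an inverse falloff with distance, an exponential falloff with luminance difference, and their product as the combined weight) are assumptions chosen for illustration, as are the function names and the `scale` and `sigma` parameters.

```python
import math


def distance_weight(neighbor_depth, scale=1.0):
    """First weight: smaller for adjacent pixels farther from the camera (assumed form)."""
    return 1.0 / (1.0 + neighbor_depth / scale)


def luminance_weight(pixel_luma, neighbor_luma, sigma=10.0):
    """Second weight: smaller when the luminance values differ more (assumed form)."""
    return math.exp(-abs(pixel_luma - neighbor_luma) / sigma)


def combined_weight(neighbor_depth, pixel_luma, neighbor_luma):
    """Weight applied to one adjacent pixel's difference value (product is an assumption)."""
    return distance_weight(neighbor_depth) * luminance_weight(pixel_luma, neighbor_luma)
```

Under these assumed forms, an adjacent pixel that is far from the camera or that differs strongly in luminance from the pixel contributes less to the weighted sum.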

In one embodiment of the present disclosure, the method may further include obtaining a corrected depth map by performing inference in which a left-eye image and a right-eye image are input to the trained artificial intelligence model.

In one embodiment of the present disclosure, the step of correcting the direction of the normal vector of the at least one pixel (S330) may include correcting the directions of the normal vectors of a left-eye image acquired using a left-eye camera and a right-eye image acquired using a right-eye camera according to the direction of gravity or a direction perpendicular to the direction of gravity. The step of correcting the depth value of the at least one pixel (S340) may include performing a plane hypothesis along the direction of the corrected normal vector or a direction perpendicular to the direction of the corrected normal vector (S1220). The step of correcting the depth value of the at least one pixel (S340) may include obtaining the depth value of the at least one pixel by performing a plane sweep along the plane defined by the plane hypothesis (S1230). The step of correcting the depth value of the at least one pixel (S340) may include correcting the depth value of the at least one pixel using the obtained depth value (S1240).
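
The correction of a normal vector toward the direction of gravity or a direction perpendicular to it can be sketched as below. This is an illustrative sketch only: the function names, the angular threshold `max_angle_deg`, and the rule of leaving a normal unchanged when it is close to neither direction are assumptions that do not come from the disclosure.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def snap_normal(normal, gravity, max_angle_deg=15.0):
    """Align a normal with gravity, or with the plane perpendicular to gravity."""
    n = normalize(normal)
    g = normalize(gravity)
    cos_ng = sum(a * b for a, b in zip(n, g))
    threshold = math.radians(max_angle_deg)
    if abs(cos_ng) >= math.cos(threshold):
        # Nearly parallel to gravity (e.g. floor or ceiling): use +/- gravity.
        sign = 1.0 if cos_ng >= 0 else -1.0
        return tuple(sign * c for c in g)
    if abs(cos_ng) <= math.sin(threshold):
        # Nearly perpendicular to gravity (e.g. wall): remove the gravity component.
        return normalize(tuple(a - cos_ng * b for a, b in zip(n, g)))
    return n  # Neither case applies: keep the measured direction.
```

For example, with gravity pointing along (0, -1, 0), a slightly tilted floor normal such as (0.05, -1, 0) is snapped to the gravity direction, while a slightly tilted wall normal such as (1, 0.05, 0) is snapped into the horizontal plane.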

In one embodiment of the present disclosure, in the step of acquiring the depth map (S310), the depth map may be acquired using a Time-of-Flight (ToF) camera.

In one embodiment of the present disclosure, the step of correcting the depth value of the at least one pixel (S340) may include defining a plane for each pixel based on the corrected normal vector (S1410). The step of correcting the depth value of the at least one pixel (S340) may include identifying a planar area of the defined plane in the depth map based on areas segmented according to the color information of an RGB image (S1420). The step of correcting the depth value of the at least one pixel (S340) may include correcting the depth value of the at least one pixel based on the depth values of adjacent pixels within the identified planar area (S1430).
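
Steps S1410 to S1430 can be sketched as follows. This is a hypothetical illustration, not the disclosed method: it assumes that the color-based segmentation is available as an integer label per pixel, that a pixel's 3D point equals its depth multiplied by its ray vector, and that the corrected depth is taken from the median plane offset of the adjacent pixels sharing the target pixel's label. The function name and data layout are likewise assumed.

```python
from statistics import median


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def correct_depth_in_planar_area(target_ray, target_label, normal, neighbors):
    """Corrected depth for the target pixel, or None if it cannot be computed.

    `neighbors` is a list of (ray, depth, label) tuples for adjacent pixels.
    """
    # S1420: keep only adjacent pixels in the same color-segmented area.
    offsets = [
        dot(normal, tuple(depth * c for c in ray))  # plane offset n . X
        for ray, depth, label in neighbors
        if label == target_label
    ]
    denom = dot(normal, target_ray)
    if not offsets or abs(denom) < 1e-9:
        return None
    # S1430: place the target pixel on the plane n . X = median offset.
    return median(offsets) / denom
```

Using the median rather than the mean is a design choice of this sketch: it keeps a single noisy ToF sample inside the area from shifting the estimated plane.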

The present disclosure provides a computer program product including a computer-readable storage medium. The storage medium may store instructions related to an operation of obtaining a depth map from an image acquired using the camera 110. The storage medium may store instructions related to an operation of obtaining a normal vector of at least one pixel included in the depth map. The storage medium may store instructions related to an operation of correcting the direction of the normal vector of the at least one pixel based on the direction of gravity measured by the IMU sensor. The storage medium may store instructions related to an operation of correcting the depth value of the at least one pixel of the depth map based on the direction of the corrected normal vector.

A program executed by the augmented reality device 100 described in the present disclosure may be implemented as hardware components, software components, and/or a combination of hardware components and software components. The program may be executed by any system capable of executing computer-readable instructions.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may command the processing device independently or collectively.

Software may be implemented as a computer program including instructions stored in computer-readable storage media. Computer-readable recording media include, for example, magnetic storage media (e.g., read-only memory (ROM), random-access memory (RAM), floppy disks, hard disks) and optically readable media (e.g., CD-ROM, Digital Versatile Disc (DVD)). A computer-readable recording medium may be distributed over computer systems connected through a network, so that computer-readable code is stored and executed in a distributed manner. The medium is readable by a computer, may be stored in a memory, and may be executed by a processor.

A computer-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory' means only that the storage medium does not include a signal and is tangible; it does not distinguish between data being stored semi-permanently and data being stored temporarily in the storage medium. For example, a 'non-transitory storage medium' may include a buffer in which data is temporarily stored.

In addition, a program according to the embodiments disclosed in this specification may be provided as part of a computer program product. The computer program product may be traded between a seller and a buyer as a commodity.

A computer program product may include a software program and a computer-readable storage medium on which the software program is stored. For example, a computer program product may include a product in the form of a software program (e.g., a downloadable application) that is distributed electronically by the manufacturer of the augmented reality device 100 or through an electronic market (e.g., Samsung Galaxy Store). For electronic distribution, at least a portion of the software program may be stored in a storage medium or may be temporarily generated. In this case, the storage medium may be a storage medium of a server of the manufacturer of the augmented reality device 100, a server of the electronic market, or a relay server that temporarily stores the software program.

In a system composed of the augmented reality device 100 and/or a server, the computer program product may include a storage medium of the server or a storage medium of the augmented reality device 100. Alternatively, when there is a third device (e.g., a mobile device) communicatively connected to the augmented reality device 100, the computer program product may include a storage medium of the third device. Alternatively, the computer program product may include the software program itself, which is transmitted from the augmented reality device 100 to the third device or from the third device to an electronic device.

In this case, one of the augmented reality device 100 and the third device may execute the computer program product to perform the method according to the disclosed embodiments. Alternatively, at least one of the augmented reality device 100 and the third device may execute the computer program product to perform the method according to the disclosed embodiments in a distributed manner.

For example, the augmented reality device 100 may execute a computer program product stored in the memory 140 (see FIG. 2) to control another electronic device (e.g., a mobile device) communicatively connected to the augmented reality device 100 to perform the method according to the disclosed embodiments.

As another example, the third device may execute the computer program product to control an electronic device communicatively connected to the third device to perform the method according to the disclosed embodiments.

When the third device executes the computer program product, the third device may download the computer program product from the augmented reality device 100 and execute the downloaded computer program product. Alternatively, the third device may execute a computer program product provided in a pre-loaded state to perform the method according to the disclosed embodiments.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, those of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components such as the described computer systems or modules are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Claims

An augmented reality device 100 comprising:
a camera 110;
an Inertial Measurement Unit (IMU) sensor 120 configured to measure the direction of gravity;
a memory 140 that stores at least one instruction; and
at least one processor 130 that executes the at least one instruction,
wherein the at least one processor 130 is configured to:
obtain a depth map from an image acquired using the camera 110,
obtain a normal vector of at least one pixel included in the depth map,
correct the direction of the normal vector of the at least one pixel based on the direction of gravity measured by the IMU sensor 120, and
correct the depth value of the at least one pixel based on the direction of the corrected normal vector.

According to claim 1,
The at least one processor 130,
Converting the at least one pixel into a three-dimensional coordinate value based on the direction vector of the at least one pixel included in the depth map and the depth value of the at least one pixel,
An augmented reality device 100 that obtains the normal vector by calculating a cross-product of three-dimensional coordinate values of a plurality of adjacent pixels adjacent in the up, down, left, and right directions.

According to any one of claims 1 and 2,
The camera 110 includes a left eye camera for acquiring a left eye image and a right eye camera for acquiring a right eye image,
The at least one processor 130,
Applying the left eye image and the right eye image as input to the artificial intelligence model,
An augmented reality device 100 that obtains the depth map by calculating disparity according to the similarity of pixel luminance values of the left-eye image and the right-eye image using the artificial intelligence model.

According to claim 3,
The at least one processor 130,
Calculating a loss value of the depth map obtained by the artificial intelligence model based on the depth values of a pixel on the plane defined by the corrected normal vector and of a plurality of adjacent pixels positionally adjacent to the pixel,
An augmented reality device 100 that corrects the depth value of the at least one pixel by performing training to apply the calculated loss value to the artificial intelligence model.

According to claim 4,
The at least one processor 130,
Defining a plane consisting of a pixel having the corrected normal vector and a plurality of adjacent pixels that are positionally adjacent to the pixel,
Obtaining depth values of the plurality of adjacent pixels based on a plurality of points where the defined plane meets the ray vector of the camera 110,
Calculating a difference value between the depth value of the acquired pixel and the depth values of the plurality of adjacent pixels, respectively,
An augmented reality device 100 that obtains the loss value by performing a weighted sum operation that applies a weight to the difference value calculated for each of the plurality of adjacent pixels.

According to claim 5,
An augmented reality device 100 in which the weight includes a first weight determined based on the distance from the camera 110 of each of the plurality of adjacent pixels in the depth map and a second weight determined based on the difference in luminance value between the pixel and the plurality of adjacent pixels in the depth map.

According to any one of claims 3 to 6,
The at least one processor 130,
An augmented reality device 100 that obtains a corrected depth map by performing inference by inputting the left eye image and the right eye image into the trained artificial intelligence model.

According to claim 1,
The camera 110 includes a left eye camera for acquiring a left eye image and a right eye camera for acquiring a right eye image,
The at least one processor 130,
Correcting the directions of normal vectors of the left-eye image and the right-eye image according to the direction of gravity or a direction perpendicular to the direction of gravity,
Performing a plane hypothesis along the direction of the corrected normal vector or a direction perpendicular to the direction of the corrected normal vector,
Obtaining a depth value of the at least one pixel by performing a plane sweep along a plane defined by the plane assumption,
The augmented reality device 100 corrects the depth value of the at least one pixel using the obtained depth value.

According to claim 1,
The camera 110 is configured as a Time-of-Flight camera (ToF camera),
The at least one processor 130,
An augmented reality device 100 that acquires the depth map using the ToF camera.

According to claim 9,
The at least one processor 130,
Defining a plane for each pixel based on the corrected normal vector,
Identifying a planar area of the defined plane in the depth map based on areas segmented according to the color information of an RGB image,
An augmented reality device 100 that corrects the depth value of the at least one pixel based on the depth values of adjacent pixels within the identified planar area.

In the method for the augmented reality device 100 to correct the depth value,
Obtaining a depth map from an image acquired using the camera 110 (S310);
Obtaining a normal vector of at least one pixel included in the depth map (S320);
Correcting the direction of the normal vector of the at least one pixel based on the direction of gravity measured by the IMU sensor 120 (S330); and
Correcting the depth value of the at least one pixel based on the direction of the corrected normal vector (S340);
Method, including.

According to claim 11,
The step of acquiring the depth map (S310) is,
Inputting a left eye image acquired using a left eye camera and a right eye image acquired using a right eye camera into an artificial intelligence model; and
Obtaining the depth map by calculating disparity according to the similarity of pixel luminance values of the left-eye image and the right-eye image using the artificial intelligence model;
Method, including.

According to claim 12,
The step (S340) of correcting the depth value for each pixel of the depth map is,
Calculating a loss value of the depth map obtained by the artificial intelligence model based on the depth values of a pixel on the plane defined by the corrected normal vector and of a plurality of adjacent pixels positionally adjacent to the pixel (S810); and
Correcting the depth value of the at least one pixel by performing training in which the calculated loss value is applied to the artificial intelligence model (S820);
Method, including.

According to claim 13,
The step of calculating the loss value (S810) is,
Defining a plane consisting of a pixel having the corrected normal vector and a plurality of adjacent pixels that are positionally adjacent to the pixel (S910);
Obtaining depth values of the plurality of adjacent pixels based on a plurality of points where the defined plane meets the ray vector of the camera 110 (S920);
Calculating difference values between the depth value of the acquired pixel and the depth values of the plurality of adjacent pixels (S930); and
Obtaining the loss value by performing a weighted sum operation that applies a weight to the difference value calculated for each of the plurality of adjacent pixels (S940);
Method, including.

According to claim 14,
A method in which the weight includes a first weight determined based on the distance from the camera 110 of each of the plurality of adjacent pixels in the depth map and a second weight determined based on the difference in luminance value between the pixel and the plurality of adjacent pixels in the depth map.

According to any one of claims 12 to 15,
Obtaining a corrected depth map by performing inference by inputting the left eye image and the right eye image into the trained artificial intelligence model;
A method further comprising:

According to claim 11,
The step (S330) of correcting the direction of the normal vector for each pixel is,
Correcting the directions of normal vectors of a left-eye image acquired using a left-eye camera and a right-eye image acquired using a right-eye camera according to the direction of gravity or a direction perpendicular to the direction of gravity;
Including,
The step (S340) of correcting the depth value of the at least one pixel is,
Performing a plane hypothesis along the direction of the corrected normal vector or a direction perpendicular to the direction of the corrected normal vector (S1220);
Obtaining the depth value by performing a plane sweep along a plane defined by the plane assumption (S1230); and
Correcting the depth value of the at least one pixel using the obtained depth value (S1240);
Method, including.

According to claim 11,
The step of acquiring the depth map (S310) is,
A method of acquiring the depth map using a Time-of-Flight camera (ToF camera).

According to claim 18,
The step (S340) of correcting the depth value of the at least one pixel is,
defining a pixel plane based on the corrected normal vector (S1410);
Identifying a flat area of the plane defined in the depth map based on the area divided according to the color information of the RGB image (S1420); and
Correcting the depth value of the at least one pixel based on the depth values of adjacent pixels within the identified planar area (S1430);
Method, including.

A computer-readable recording medium on which at least one program for implementing the method according to any one of claims 11 to 19 is recorded.