KR20230017088A

KR20230017088A - Apparatus and method for estimating uncertainty of image points

Info

Publication number: KR20230017088A
Application number: KR1020210127547A
Authority: KR
Inventors: 김윤태; 설상훈; 김재현; 윤성욱; 사공동훈; 주호진; 피재환
Original assignee: 삼성전자주식회사; 고려대학교 산학협력단
Priority date: 2021-07-27
Filing date: 2021-09-27
Publication date: 2023-02-03

Abstract

The present invention relates to an apparatus for estimating uncertainty, which comprises a processor estimating the uncertainty of image coordinates by executing at least one program. The processor can receive first tracking coordinates which correspond to the reference coordinates of first image data acquired through a camera sensor and are image-based tracking coordinates in second image data acquired after acquisition of the first image data; acquire second tracking coordinates which correspond to the reference coordinates based on motion data acquired by a motion sensor and the depth value of the first image data, and are motion-based tracking coordinates in the second image data; calculate target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates; acquire estimated target coordinates and the uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and update the first tracking coordinates based on the estimated target coordinates. Accordingly, the accuracy of tracking image coordinates can be ensured.

Description

Apparatus and method for estimating uncertainty of image coordinates {APPARATUS AND METHOD FOR ESTIMATING UNCERTAINTY OF IMAGE POINTS}

Various embodiments of the present disclosure relate to an apparatus and method for obtaining tracking coordinates for image data by estimating the uncertainty of image coordinates.

In the fields of computer vision and robotics, techniques for visual odometry (VO) and simultaneous localization and mapping (SLAM) are being studied. In particular, these techniques can be applied to autonomous navigation and augmented reality, which are becoming increasingly popular.

Feature-based odometry and SLAM techniques may use an image coordinate tracking method, which acquires continuous image data through a moving camera sensor and analyzes the relationship between consecutive image data by tracking coordinate movement in the acquired image data. The image coordinate tracking method may refer to a method of finding, in current image data, the target coordinates for specific coordinates in past image data. In this case, the specific coordinates in the past image data and the target coordinates in the current image data may be substantially the same point.

In a general image coordinate tracking method, the coordinates in the current image data whose surrounding image is most similar to the surrounding image of the specific coordinates in the past image data may be assumed to be the target coordinates. Thereafter, among the plurality of pairs formed by specific coordinates in the past image data and target coordinates in the current image data, pairs that are clearly distinguished from the majority are estimated to be tracking failures, and repeating this process can improve the reliability of the tracking method.

When coordinate movement is tracked by acquiring continuous image data through a moving camera sensor, the tracking results may vary greatly depending on changes in visual features. For example, when the movement of the camera sensor is large, visual features may be degraded by motion blur, illumination change, occlusion, and the like. In particular, when motion blur occurs, a plurality of coordinates similar to the specific coordinates in the past image data may be distributed along the direction of the motion blur, which makes it difficult to estimate the target coordinates. In addition, because the target coordinates are difficult to estimate, errors may accumulate, and the accuracy of image coordinate tracking may be greatly reduced as a result.

Accordingly, various embodiments of the present disclosure are intended to provide an apparatus, and a method of operating the same, that estimate target coordinates and obtain the uncertainty of the estimated target coordinates by using a motion sensor together with a camera sensor.

The problems to be solved through the embodiments of the present disclosure are not limited to the problems described above, and problems not mentioned will be clearly understood from the present disclosure and the accompanying drawings by those of ordinary skill in the art to which the embodiments belong.

An apparatus for estimating uncertainty according to an embodiment includes a processor that estimates the uncertainty of image coordinates by executing at least one program. The processor may receive first tracking coordinates, which correspond to reference coordinates of first image data acquired through a camera sensor and are image-based tracking coordinates in second image data acquired after the first image data; obtain second tracking coordinates, which correspond to the reference coordinates and are motion-based tracking coordinates in the second image data, based on motion data obtained from a motion sensor and a depth value of the first image data; calculate a target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates; obtain estimated target coordinates and an uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and update the first tracking coordinates based on the estimated target coordinates.

A method for estimating uncertainty according to an embodiment may include: receiving first tracking coordinates, which correspond to reference coordinates of first image data acquired through a camera sensor and are image-based tracking coordinates in second image data acquired after the first image data; obtaining second tracking coordinates, which correspond to the reference coordinates and are motion-based tracking coordinates in the second image data, based on motion data obtained from a motion sensor and a depth value of the first image data; calculating a target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates; obtaining estimated target coordinates and an uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and updating the first tracking coordinates based on the estimated target coordinates.

An electronic device that performs a SLAM operation according to an embodiment includes a camera sensor that acquires image data of a surrounding environment, a motion sensor that acquires motion data by detecting rotation and translation of the electronic device, and a processor electrically connected to the camera sensor and the motion sensor. The processor may receive first tracking coordinates, which correspond to reference coordinates of first image data acquired through the camera sensor and are image-based tracking coordinates in second image data acquired after the first image data; obtain second tracking coordinates, which correspond to the reference coordinates and are motion-based tracking coordinates in the second image data, based on the motion data obtained from the motion sensor and a depth value of the first image data; calculate a target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates; obtain estimated target coordinates and an uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and update the first tracking coordinates based on the estimated target coordinates.

FIG. 1 is a block diagram of components of an electronic device according to an embodiment.
FIG. 2 is an exemplary diagram for explaining a conventional method of tracking image coordinates.
FIG. 3 is a flowchart illustrating a method by which a device according to an embodiment estimates the uncertainty of tracking coordinates.
FIG. 4 is an exemplary diagram for explaining a method by which a device according to an embodiment obtains motion-based tracking coordinates.
FIG. 5 is a detailed flowchart of a method by which a device according to an embodiment estimates the uncertainty of tracking coordinates.
FIG. 6 is an exemplary diagram for explaining a method by which a device according to an embodiment determines a grid for sampling image data.
FIG. 7 is a perspective view of an electronic device according to an embodiment.

The terms used in the present embodiments have been selected, as far as possible, from general terms that are currently in wide use, in consideration of their functions in the present embodiments; however, they may vary depending on the intention of those skilled in the art, precedent, the emergence of new technologies, and the like. In certain cases a term has been selected arbitrarily, and in such a case its meaning will be described in detail in the description of the corresponding embodiment. Therefore, a term used in the present embodiments should be defined based on the meaning of the term and the overall content of the present embodiments, not simply on the name of the term.

In the descriptions of the embodiments, when a part is said to be connected to another part, this includes not only the case where it is directly connected but also the case where it is electrically connected with another component interposed therebetween. In addition, when a part is said to include a certain component, this means that it may further include other components, rather than excluding other components, unless specifically stated otherwise.

Terms such as “consists of” or “includes” used in the present embodiments should not be construed as necessarily including all of the various components or steps described in the present disclosure; they should be construed to mean that some of the components or steps may not be included, or that additional components or steps may be further included.

In addition, terms including ordinal numbers, such as 'first' or 'second', used in the present disclosure may be used to describe various components, but the components should not be limited by these terms. These terms are used only for the purpose of distinguishing one component from another.

In addition, the 'world coordinate system' used in the present disclosure may mean a three-dimensional coordinate system set with reference to the real world.

The following description of the embodiments should not be construed as limiting the scope of rights, and what can be easily inferred by those skilled in the art should be construed as belonging to the scope of rights of the embodiments. Hereinafter, embodiments provided for illustrative purposes only will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of components of an electronic device according to an embodiment.

Referring to FIG. 1, an electronic device 100 according to an embodiment may include an uncertainty estimation device 110, a camera sensor 120, a motion sensor 130, a front-end processor 140, and a back-end processor 150.

In one embodiment, the electronic device 100 may be a device that recognizes the current pose of the electronic device 100 in three-dimensional space based on image information obtained from the camera sensor 120 and models the surrounding environment. The electronic device 100 may be a device to which visual odometry (VO), visual simultaneous localization and mapping (V-SLAM), and/or visual-inertial odometry (VIO) technology is applied. For example, with VO, V-SLAM, and/or VIO technology applied, the electronic device 100 may be an autonomous flying drone, a robot, an autonomous vehicle, or an electronic device that provides virtual reality and/or augmented reality (e.g., smart glasses). However, the electronic device 100 is not limited thereto and may include various electronic devices to which the above technologies can be applied.

In one embodiment, the camera sensor 120 may acquire image data of the surrounding environment. For example, the camera sensor 120 may be an omnidirectional camera, a stereo camera, or a mono camera. In one embodiment, the camera sensor 120 may acquire image data including three-dimensional information of various objects (e.g., static and/or dynamic objects) in a dynamic environment. In this case, the image data acquired by the camera sensor 120 may include not only image data at each pixel but also depth data at each pixel.

In one embodiment, the motion sensor 130 may acquire motion data by detecting rotation and translation of the electronic device 100. For example, the motion sensor 130 may be an inertial measurement unit (IMU) including an acceleration sensor and/or a gyro sensor.

In one embodiment, the front-end processor 140 may process data received from the camera sensor 120 and the motion sensor 130. For example, the front-end processor 140 may process the data received from the camera sensor 120 and the motion sensor 130 to obtain data for tracking image coordinates.

In one embodiment, the front-end processor 140 may analyze image data received from the camera sensor 120. For example, the front-end processor 140 may analyze the image data to obtain a visual correspondence. In the present disclosure, a 'visual correspondence' may refer to the relationship between the two-dimensional coordinates onto which a specific three-dimensional coordinate is projected in each of the successively acquired image data.

In one embodiment, the front-end processor 140 may obtain tracking coordinates through an image coordinate tracking algorithm. That is, the front-end processor 140 may use the image coordinate tracking algorithm to obtain a visual correspondence between a plurality of image data received from the camera sensor 120, and may obtain the tracking coordinates based on the obtained visual correspondence.

For example, the front-end processor 140 may receive first image data from the camera sensor 120. At this time, the front-end processor 140 may set reference coordinates of the first image data, and the reference coordinates may be set to correspond to a key point. After receiving the first image data, the front-end processor 140 may receive second image data from the camera sensor 120. The front-end processor 140 may obtain a visual correspondence based on the image similarity between the first image data and the second image data. At this time, the front-end processor 140 may calculate the image similarity between the reference coordinates set in the first image data and arbitrary coordinates in the second image data, and the image similarity may be calculated through Formula 1.

[Formula 1]

image similarity = [formula image not available]

That is, the image similarity may be calculated as the root mean square error (RMSE) of the pixel values within a square around the coordinates. The front-end processor 140 may determine, through the Kanade-Lucas-Tomasi (KLT) algorithm, the coordinates at which the image similarity is maximized as the first tracking coordinates. Here, the 'first tracking coordinates' may refer to image-based tracking coordinates.
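The patch comparison described above can be illustrated with a minimal sketch. It is not the KLT algorithm named in the text: it swaps in an exhaustive search over a small window, and it assumes that a lower RMSE corresponds to a higher image similarity. The function names `patch_rmse` and `best_match`, the `half` patch half-width, and the `radius` search range are hypothetical and do not appear in the disclosure.

```python
import math


def patch_rmse(img_a, pa, img_b, pb, half):
    """RMSE between the square patches centred at pa in img_a and pb in img_b.

    Images are lists of rows of grey values; coordinates are (x, y).
    The patch side is 2 * half + 1 pixels (an assumed parameterisation).
    """
    (xa, ya), (xb, yb) = pa, pb
    total, count = 0.0, 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = img_a[ya + dy][xa + dx] - img_b[yb + dy][xb + dx]
            total += diff * diff
            count += 1
    return math.sqrt(total / count)


def best_match(img_a, ref, img_b, half, radius):
    """Exhaustive stand-in for KLT: lowest-RMSE coordinate near ref in img_b."""
    h, w = len(img_b), len(img_b[0])
    x0, y0 = ref
    best, best_err = None, float("inf")
    # Clamp the search window so every candidate patch stays inside img_b.
    for y in range(max(half, y0 - radius), min(h - half, y0 + radius + 1)):
        for x in range(max(half, x0 - radius), min(w - half, x0 + radius + 1)):
            err = patch_rmse(img_a, ref, img_b, (x, y), half)
            if err < best_err:
                best, best_err = (x, y), err
    return best, best_err
```

In this sketch the returned coordinate plays the role of the image-based (first) tracking coordinates; a real tracker would typically use a gradient-based, sub-pixel method instead of a brute-force scan.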

In one embodiment, the front-end processor 140 may estimate an inertial pose based on the motion data received from the motion sensor 130. For example, when the motion sensor 130 is a 6DoF inertial measurement unit, the front-end processor 140 may receive data on 3-axis linear acceleration and 3-axis angular velocity from the motion sensor 130. The front-end processor 140 may estimate the initial velocity and the bias values of the sensor, and integrate the linear acceleration and angular velocity data. Because the front-end processor 140 estimates the initial velocity and the sensor bias values before integrating, the accumulation of error can be prevented. By integrating the data in this way, the front-end processor 140 may estimate the inertial pose.
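The integration step can be sketched as follows, under strong simplifying assumptions: the motion is restricted to a plane (one gyroscope axis and two accelerometer axes rather than the full 6DoF case), gravity is ignored, and the initial velocity and biases are taken as already estimated. The function name `integrate_imu_planar` and all parameter names are hypothetical.

```python
import math


def integrate_imu_planar(samples, dt, v0=(0.0, 0.0),
                         gyro_bias=0.0, accel_bias=(0.0, 0.0)):
    """Dead-reckon a planar pose from IMU samples (simplified, assumed model).

    samples: iterable of (gyro_z, (ax_body, ay_body)) measurements.
    Returns (heading, (x, y)) expressed in the starting frame.
    """
    x = y = theta = 0.0
    vx, vy = v0
    for gyro_z, (ax_b, ay_b) in samples:
        # Remove the estimated biases before integrating.
        w = gyro_z - gyro_bias
        ax_b -= accel_bias[0]
        ay_b -= accel_bias[1]
        # Rotate the body-frame acceleration into the starting frame.
        c, s = math.cos(theta), math.sin(theta)
        ax = c * ax_b - s * ay_b
        ay = s * ax_b + c * ay_b
        # Integrate position, then velocity, then heading over one step.
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        vx += ax * dt
        vy += ay * dt
        theta += w * dt
    return theta, (x, y)
```

The sketch shows why the bias estimate matters: an uncorrected gyroscope bias would rotate every later acceleration sample into the wrong direction, so the position error would grow with each step.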

In one embodiment, the front-end processor 140 may estimate a depth value of the image data based on the first tracking coordinates, which are image-based tracking coordinates, and the motion data. For example, the front-end processor 140 may estimate the depth value of the image data by performing triangulation based on the first tracking coordinates and the motion data.
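One common way to triangulate a depth from two views is sketched below; the disclosure does not specify which triangulation method is used, so this is only an illustration. It assumes normalized viewing rays of the form (x, y, 1) in each camera and a known relative pose such that a point X1 in the first camera frame maps to X2 = R·X1 + t in the second. The function name `triangulate_depth` and its parameters are hypothetical.

```python
def _dot(p, q):
    """Dot product of two 3-vectors given as sequences."""
    return sum(pi * qi for pi, qi in zip(p, q))


def triangulate_depth(ray1, ray2, R, t):
    """Least-squares depths along two rays observing the same 3D point.

    Solves lam1 * (R @ ray1) + t = lam2 * ray2 for (lam1, lam2).
    With rays normalised to z = 1, lam1 is the depth in the first camera.
    """
    a = [sum(R[i][j] * ray1[j] for j in range(3)) for i in range(3)]
    b = list(ray2)
    aa, bb, ab = _dot(a, a), _dot(b, b), _dot(a, b)
    at, bt = _dot(a, t), _dot(b, t)
    det = aa * bb - ab * ab
    if abs(det) < 1e-12:
        # Parallel rays carry no depth information (no parallax).
        raise ValueError("rays are parallel; depth is not observable")
    lam1 = (ab * bt - bb * at) / det
    lam2 = (aa * bt - ab * at) / det
    return lam1, lam2
```

In this sketch the relative pose (R, t) would come from the motion data, and the two rays from the reference coordinates and the first tracking coordinates.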

In one embodiment, the front-end processor 140 may transmit data on the first tracking coordinates, the inertial pose, and the depth value of the image data to the uncertainty estimation device 110.

In one embodiment, the uncertainty estimation device 110 may include at least one processor 115. For example, the processor 115 may be an embedded processor, a microprocessor, hardware control logic, a finite state machine (FSM), a digital signal processor (DSP), or a combination thereof.

In the present disclosure, an 'uncertainty estimation device' means a device that estimates the uncertainty of a visual correspondence, and specifically may mean a device that estimates the probabilistic difference between the visual correspondence and the actual correspondence. In addition, the uncertainty estimation device 110 may include at least one processor 115, but is not limited thereto, and the at least one processor 115 may itself be the uncertainty estimation device 110.

In one embodiment, the processor 115 may determine second tracking coordinates based on the data on the first tracking coordinates, the inertial pose, and the depth value of the image data received from the front-end processor 140. Here, the 'second tracking coordinates' may refer to motion-based tracking coordinates. A detailed description of how the processor 115 determines the second tracking coordinates is given later with reference to FIG. 4.
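As a rough illustration of how a motion-based coordinate could be derived from a pose and a depth value, the sketch below back-projects a pixel to 3D, applies a rigid motion, and projects it again. It assumes a pinhole camera with intrinsics (fx, fy, cx, cy) and a relative pose (R, t) mapping first-frame points to the second frame; neither the camera model nor the name `project_with_motion` comes from the disclosure, whose own procedure is described with reference to FIG. 4.

```python
def project_with_motion(ref_px, depth, intrinsics, R, t):
    """Predict where a pixel of the first image lands in the second image.

    ref_px: (u, v) pixel in the first image; depth: its depth in that frame.
    intrinsics: (fx, fy, cx, cy) of an assumed pinhole camera.
    R, t: assumed relative pose, X2 = R @ X1 + t.
    """
    fx, fy, cx, cy = intrinsics
    u, v = ref_px
    # Back-project the pixel to a 3D point in the first camera frame.
    p1 = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Move the point into the second camera frame.
    p2 = [sum(R[i][j] * p1[j] for j in range(3)) + t[i] for i in range(3)]
    if p2[2] <= 0:
        raise ValueError("point is behind the second camera")
    # Project back to pixel coordinates.
    return fx * p2[0] / p2[2] + cx, fy * p2[1] / p2[2] + cy
```

Because this prediction depends only on the motion data and the depth, it is unaffected by motion blur in the second image, which is the property the motion-based coordinates are meant to contribute.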

In one embodiment, after determining the second tracking coordinates, the processor 115 may calculate the target coordinate distribution in the second image data through a series of steps. For example, the processor 115 may calculate a prior probability distribution (first step), determine a candidate coordinate group (second step), and calculate an image-based distribution (third step), and thereby finally calculate the target coordinate distribution in the second image data. A detailed description of how the processor 115 calculates the target coordinate distribution in the second image data is given later with reference to FIG. 5.

In one embodiment, the processor 115 may repeat the process of calculating the target coordinate distribution for a plurality of second tracking coordinates to obtain a plurality of target coordinate distributions and a plurality of grids. The processor 115 may obtain estimated target coordinates and the uncertainty of the estimated target coordinates based on the plurality of target coordinate distributions and the plurality of grids. For example, the processor 115 may obtain the estimated target coordinates by calculating the mean of the plurality of target coordinate distributions, and may obtain the uncertainty of the estimated target coordinates by calculating the covariance matrix of the estimated target coordinates. A detailed description of how the processor 115 obtains the estimated target coordinates and the uncertainty of the estimated target coordinates is given later with reference to FIG. 5.
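The mean and covariance computation can be sketched as below, assuming the target coordinate distribution is available as a set of grid coordinates with non-negative weights. This representation, the function name `mean_and_covariance`, and the normalization step are assumptions made for illustration; the disclosure's own procedure is described with reference to FIG. 5.

```python
def mean_and_covariance(points, weights):
    """Weighted mean and 2x2 covariance of 2D coordinates.

    points: list of (x, y) grid coordinates.
    weights: matching non-negative weights (e.g. unnormalised probabilities).
    Returns ((mean_x, mean_y), ((sxx, sxy), (sxy, syy))).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w = [wi / total for wi in weights]  # normalise to a probability mass
    mx = sum(wi * x for wi, (x, _) in zip(w, points))
    my = sum(wi * y for wi, (_, y) in zip(w, points))
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, points))
    syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, points))
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, points))
    return (mx, my), ((sxx, sxy), (sxy, syy))
```

In this reading, the mean serves as the estimated target coordinates and the covariance matrix as their uncertainty; an elongated covariance along one direction would correspond to the motion-blur ambiguity discussed earlier.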

In one embodiment, the processor 115 may update the first tracking coordinates received from the front-end processor 140 based on the obtained estimated target coordinates. That is, the updated first tracking coordinates may be the same coordinates as the obtained estimated target coordinates. Thereafter, the processor 115 may transmit the updated first tracking coordinates and the uncertainty to the back-end processor 150.

In one embodiment, the back-end processor 150 may receive the updated first tracking coordinates and the uncertainty from the uncertainty estimation device 110, and may receive data on the estimated inertial pose from the front-end processor 140. The back-end processor 150 may calculate the pose (e.g., position and orientation) of the camera sensor 120 based on the received data. In addition, the back-end processor 150 may transmit the result of optimization performed through a Kalman filter and/or bundle adjustment to the uncertainty estimation device 110. For example, the back-end processor 150 may transmit the result data of calculating and optimizing the pose of the camera sensor 120 (e.g., the depth value of the second image data) to the uncertainty estimation device 110.

FIG. 2 is an exemplary diagram for explaining a conventional method of tracking image coordinates.

Referring to FIG. 2, as a user wearing the electronic device 200 moves, the electronic device 200 may output, through a display (not shown), image data reflecting the user's movement. For example, when the pose of the user wearing the electronic device 200 changes from a first pose 210a to a second pose 210b as the user moves, the electronic device 200 may output, through the display, image data reflecting the change in pose. Here, the first pose 210a may include the position and orientation of the user wearing the electronic device 200 at a first time point. The second pose 210b may include the position and orientation of the user wearing the electronic device 200 at a second time point later than the first time point.

In one embodiment, the electronic device 200 may track image coordinates through a conventional image coordinate tracking algorithm. That is, the electronic device 200 may track image coordinates using only the visual correspondence between consecutive image data acquired through a camera (e.g., the camera sensor 120 of FIG. 1).

For example, the electronic device 200 may set reference coordinates 220 of first image data 205 at the first time point. Here, the reference coordinates 220 may refer to coordinates that remain easy to identify even when the viewpoint of the camera or the surrounding environment changes. The reference coordinates 220 may be set through a key point extraction method (e.g., the Harris corner method).

Then, as the user wearing the electronic device 200 moves, the electronic device 200 may output second image data 215 at the second time point. If the user's movement is large, motion blur may occur in the second image data 215, and the image difference between the first image data 205 and the second image data 215 may be flattened, making coordinates difficult to identify. The electronic device 200 samples the image difference between the first image data 205 and the second image data 215, but the sampling becomes less effective because a plurality of coordinates similar to the reference coordinates 220 are distributed in the second image data 215. When the sampling of the image difference becomes less effective, the probability distribution for the final target coordinates may be inaccurate and, as a result, the uncertainty calculation may be degraded.

FIG. 3 is a flowchart illustrating a method by which an apparatus estimates the uncertainty of tracking coordinates, according to an embodiment.

Referring to FIG. 3, in operation 301, a processor (e.g., the processor 115 of FIG. 1) of an uncertainty estimation apparatus (e.g., the uncertainty estimation apparatus 110 of FIG. 1) may receive first tracking coordinates, which are image-based tracking coordinates in second image data. The first tracking coordinates may correspond to the reference coordinates of first image data.

In the present disclosure, 'second image data' means image data acquired after 'first image data', and the 'first image data' and the 'second image data' may be consecutive image data. In addition, the 'uncertainty estimation apparatus' of the present disclosure may estimate the uncertainty of target coordinates even for 'second image data' that differs greatly from the 'first image data' because of motion blur caused by large user movement, parallax, or a change in lighting.

In an embodiment, the processor 115 may receive the first tracking coordinates, which are image-based tracking coordinates in the second image data, from the front-end processor 140. The front-end processor 140 may obtain a visual correspondence based on the image similarity between the first image data and the second image data, and may determine the first tracking coordinates based on the obtained visual correspondence. For example, the front-end processor 140 may calculate the image similarity between the reference coordinates set in the first image data and arbitrary coordinates in the second image data, and may determine, through the KLT algorithm, the coordinates at which the image similarity is maximized as the first tracking coordinates.
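The idea of choosing the coordinate with maximal image similarity can be illustrated with a minimal sketch. It does not implement the KLT algorithm; it swaps in an exhaustive search over a small window, and it assumes that similarity is the negative sum of squared differences between grayscale patches. The function names, the patch radius `r`, and the search radius `search` are hypothetical.

```python
def patch(img, cx, cy, r):
    """Return the (2r+1) x (2r+1) patch centred on (cx, cy) as a flat list."""
    return [img[y][x]
            for y in range(cy - r, cy + r + 1)
            for x in range(cx - r, cx + r + 1)]


def similarity(a, b):
    """Assumed similarity: negative sum of squared differences (larger is more similar)."""
    return -sum((p - q) ** 2 for p, q in zip(a, b))


def track(img1, img2, ref, r=1, search=3):
    """Exhaustively search img2 around ref for the patch most similar to img1's patch at ref.

    ref is an (x, y) coordinate that lies at least r pixels inside img1.
    """
    h, w = len(img2), len(img2[0])
    rx, ry = ref
    template = patch(img1, rx, ry, r)
    best, best_score = None, float("-inf")
    for y in range(max(r, ry - search), min(h - r, ry + search + 1)):
        for x in range(max(r, rx - search), min(w - r, rx + search + 1)):
            score = similarity(template, patch(img2, x, y, r))
            if score > best_score:
                best, best_score = (x, y), score
    return best
```

A gradient-based tracker such as KLT would reach a comparable answer without visiting every candidate, which matters for real image sizes.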

According to an embodiment, in operation 303, the processor 115 may obtain second tracking coordinates, which are motion-based tracking coordinates in the second image data. In an embodiment, the processor 115 may obtain the second tracking coordinates based on motion data received from the front-end processor 140 and a depth value of the first image data. Here, the 'motion data' may mean the inertial pose estimated by the front-end processor 140, and the 'depth value of the first image data' may mean a depth value estimated by the front-end processor 140 through triangulation.

For example, the processor 115 may convert the reference coordinates in the first image data into three-dimensional coordinates based on the depth value of the first image data. The processor 115 may then apply the inertial pose received from the front-end processor 140 to the converted three-dimensional coordinates to obtain three-dimensional coordinates in the second image data. In addition, the processor 115 may obtain the second tracking coordinates by projecting the three-dimensional coordinates in the second image data.

According to an embodiment, in operation 305, the processor 115 may calculate a target coordinate distribution in the second image data. In the present disclosure, the 'target coordinate distribution' may mean a probability distribution over the coordinates of the second image data to which the reference coordinates of the first image data actually correspond.

In an embodiment, the processor 115 may calculate the target coordinate distribution based on the first tracking coordinates, which are image-based tracking coordinates in the second image data, and the second tracking coordinates, which are motion-based tracking coordinates. For example, the processor 115 may finally calculate the target coordinate distribution in the second image data through a step of calculating a prior probability distribution (first step), a step of determining a candidate coordinate group (second step), and a step of calculating an image-based distribution (third step). How the processor 115 calculates the target coordinate distribution in the second image data is described in detail below with reference to FIG. 5.

According to an embodiment, in operation 307, the processor 115 may obtain estimated target coordinates and the uncertainty of the estimated target coordinates. In the present disclosure, 'target coordinates' mean coordinates that actually correspond to the reference coordinates of the first image data, and 'estimated target coordinates' may mean coordinates estimated to correspond to the reference coordinates of the first image data. Here, the initial 'estimated target coordinates' may be the image-based tracking coordinates (e.g., the first tracking coordinates).

In an embodiment, the processor 115 may obtain the estimated target coordinates and the uncertainty of the estimated target coordinates based on the target coordinate distribution calculated in operation 305. For example, the processor 115 may repeat the process of calculating the target coordinate distribution for a plurality of second tracking coordinates, and may obtain the estimated target coordinates and the uncertainty of the estimated target coordinates based on the resulting plurality of target coordinate distributions and plurality of grids.

According to an embodiment, in operation 309, the processor 115 may update the first tracking coordinates based on the estimated target coordinates. For example, the processor 115 may set the estimated target coordinates, which are the average over the plurality of target coordinate distributions, as the updated first tracking coordinates, that is, the image-based tracking coordinates of the second image data.

FIG. 4 is an exemplary diagram for explaining a method by which an apparatus obtains motion-based tracking coordinates, according to an embodiment.

Referring to FIG. 4, an electronic device (e.g., the electronic device 100 of FIG. 1) may output image data through a display. For example, the electronic device 100 may output first image data 405 including a cube-shaped object.

In an embodiment, a front-end processor (e.g., the front-end processor 140 of FIG. 1) may set reference coordinates 420 of the first image data 405. For example, the front-end processor 140 may set a feature point of the first image data 405 (e.g., a corner point of the cube-shaped object) as the reference coordinates 420.

In an embodiment, the front-end processor 140 may determine first tracking coordinates 430, which are image-based tracking coordinates, based on the image similarity between the first image data 405 and second image data 415.

In an embodiment, the front-end processor 140 may estimate an inertial pose based on motion data received from a motion sensor (e.g., the motion sensor 130 of FIG. 1). For example, the front-end processor 140 may estimate the initial velocity and bias values of the motion sensor 130, integrate the linear acceleration and angular velocity data, and estimate the inertial pose from the integration result.
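As a rough illustration of integrating angular velocity and linear acceleration into a pose, the sketch below uses simple Euler integration. It assumes that the initial velocity and the sensor biases are already known, that the acceleration samples are gravity-compensated and expressed in the sensor frame, and that samples arrive at a fixed interval `dt`. All function and parameter names are hypothetical, and converting the result into the camera-to-camera transform used for point mapping (including sensor extrinsics) is left out.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def rodrigues(w, dt):
    """Rotation matrix for turning at angular velocity w (rad/s) for dt seconds."""
    rx, ry, rz = (c * dt for c in w)
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if theta < 1e-12:
        return eye
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = mat_mul(K, K)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[eye[i][j] + s * K[i][j] + c * K2[i][j] for j in range(3)]
            for i in range(3)]


def integrate_imu(samples, dt, v0, gyro_bias, accel_bias):
    """Integrate (gyro, accel) samples into a rotation R and a translation t.

    R and t describe the sensor frame at the end of the interval,
    expressed in the sensor frame at the start of the interval.
    """
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    v = list(v0)
    t = [0.0, 0.0, 0.0]
    for gyro, accel in samples:
        w = [g - b for g, b in zip(gyro, gyro_bias)]          # bias-corrected angular velocity
        a_body = [a - b for a, b in zip(accel, accel_bias)]   # bias-corrected acceleration
        a_start = mat_vec(R, a_body)                          # rotate into the starting frame
        t = [t[i] + v[i] * dt + 0.5 * a_start[i] * dt * dt for i in range(3)]
        v = [v[i] + a_start[i] * dt for i in range(3)]
        R = mat_mul(R, rodrigues(w, dt))
    return R, t
```

Practical visual-inertial front ends typically use preintegration and handle gravity and noise explicitly; this sketch only shows the integration step described above.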

In an embodiment, the front-end processor 140 may obtain a depth value 422 of the first image data 405 from a camera sensor (e.g., the camera sensor 120 of FIG. 1). For example, when the camera sensor 120 can recognize a three-dimensional depth value of image data through a stereo method, a time-of-flight (ToF) method, a structured pattern method, or the like, the front-end processor 140 may obtain the depth value 422 of the first image data 405 from the camera sensor 120. In another embodiment, the front-end processor 140 may instead estimate the depth value using the first tracking coordinates.

In an embodiment, the front-end processor 140 may transmit data on the first tracking coordinates, the inertial pose, and the depth value of the image data to the uncertainty estimation apparatus 110.

In an embodiment, a processor (e.g., the processor 115 of FIG. 1) of an uncertainty estimation apparatus (e.g., the uncertainty estimation apparatus 110 of FIG. 1) may obtain second tracking coordinates, which are motion-based tracking coordinates of the second image data 415, through Equation 2. The processor 115 may evaluate Equation 2 based on the data received from the front-end processor 140.

[Equation 2]
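Read from the surrounding description, Equation 2 composes a back-projection, a rigid transform by the inertial pose, and a projection. A sketch of that form in LaTeX follows. The symbol names are chosen here for illustration and are assumptions, not the notation of the original: \(\mathbf{x}_r\) for the reference coordinates, \(d\) for the depth value, \(R\) and \(\mathbf{t}\) for the rotation and translation of the inertial pose, \(\pi\) and \(\pi^{-1}\) for the projection and back-projection functions, and \(\mathbf{x}_m\) for the second tracking coordinates.

```latex
\mathbf{x}_m = \pi\!\left( R \, \pi^{-1}(\mathbf{x}_r, d) + \mathbf{t} \right)
```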

Here, in Equation 2, the projection function maps three-dimensional coordinates to coordinates in the image (i.e., two-dimensional coordinates), and the back-projection function maps coordinates in the image, together with the depth value, to three-dimensional coordinates. That is, the processor 115 may obtain three-dimensional coordinates 424 for the reference coordinates 420 of the first image data 405 by applying the back-projection function to the reference coordinates 420 and the depth value 422.

The processor 115 may obtain three-dimensional coordinates in the second image data 415 by applying the inertial pose to the obtained three-dimensional coordinates 424. That is, the processor 115 may multiply the three-dimensional coordinates 424 for the reference coordinates 420 by the rotation value of the electronic device 100, add the translation value of the electronic device 100 to the product, and take the resulting sum as the three-dimensional coordinates in the second image data 415.

The processor 115 may obtain second tracking coordinates 440, which are motion-based tracking coordinates, by applying the projection function to the three-dimensional coordinates in the second image data 415.
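The three steps described for FIG. 4 (back-project, apply rotation and translation, project) can be sketched as follows. The sketch assumes an ideal pinhole camera without lens distortion. The intrinsic parameters `fx`, `fy`, `cx`, `cy` and all function names are hypothetical, since the text does not specify a camera model.

```python
def back_project(x, depth, fx, fy, cx, cy):
    """Lift image coordinates (u, v) with a depth value to 3D camera coordinates."""
    u, v = x
    return [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]


def project(p, fx, fy, cx, cy):
    """Project 3D camera coordinates onto the image plane (assumed pinhole model)."""
    X, Y, Z = p
    return (fx * X / Z + cx, fy * Y / Z + cy)


def motion_based_coordinate(x_ref, depth, R, t, intrinsics):
    """Map reference coordinates into the second image using rotation R and translation t."""
    p1 = back_project(x_ref, depth, *intrinsics)                  # 3D point for the first image
    p2 = [sum(R[i][k] * p1[k] for k in range(3)) + t[i]           # rotate, then translate
          for i in range(3)]
    return project(p2, *intrinsics)                               # 2D point in the second image
```

For instance, with `intrinsics = (100.0, 100.0, 50.0, 50.0)`, an identity rotation, and a pure sideways translation, the reference coordinates shift horizontally by an amount that shrinks as the depth value grows.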

FIG. 5 is a detailed flowchart of a method by which an apparatus estimates the uncertainty of tracking coordinates, according to an embodiment.

Referring to FIG. 5, the camera sensor 120 and the motion sensor 130 may each transmit the data they acquire to the front-end processor 140. For example, the camera sensor 120, having acquired image data of the surrounding environment, may transmit the image data to the front-end processor 140 in operation 500. As another example, the motion sensor 130, having acquired motion data by detecting rotation and translation of an electronic device (e.g., the electronic device 100 of FIG. 1), may transmit the motion data to the front-end processor 140 in operation 505.

Although FIG. 5 illustrates only an embodiment in which the camera sensor 120 and the motion sensor 130 transmit their acquired data sequentially, the disclosure is not limited thereto. In another embodiment, the camera sensor 120 and the motion sensor 130 may transmit their acquired data to the front-end processor 140 in parallel.

In an embodiment, the front-end processor 140 may process the image data and the motion data received from the camera sensor 120 and the motion sensor 130. For example, the front-end processor 140 may obtain first tracking coordinates, which are image-based tracking coordinates, based on the image data received from the camera sensor 120. The front-end processor 140 may estimate the inertial pose of the electronic device 100 based on the motion data received from the motion sensor 130. In addition, the front-end processor 140 may estimate the depth value of the image data based on the first tracking coordinates and the motion data.

According to an embodiment, in operation 510, the front-end processor 140 may transmit data on the first tracking coordinates, the inertial pose, and the depth value of the image data to the processor 115 of the uncertainty estimation apparatus (e.g., the uncertainty estimation apparatus 110 of FIG. 1).

According to an embodiment, in operation 515, the processor 115 may obtain second tracking coordinates, which are motion-based tracking coordinates of the second image data. For example, the processor 115 may obtain the second tracking coordinates based on the data on the first tracking coordinates, the inertial pose, and the depth value of the image data received from the front-end processor 140.

In an embodiment, after obtaining the second tracking coordinates, the processor 115 may calculate the target coordinate distribution in the second image data through a series of steps.

For example, the processor 115 may calculate a prior probability distribution in operation 520. In the present disclosure, the 'prior probability distribution' may mean a probability distribution over arbitrary coordinates sampled using the image-based tracking coordinates and the motion-based tracking coordinates. The prior probability distribution may represent a distance-based probability of an arbitrary coordinate. The processor 115 may calculate the prior probability distribution for an arbitrary coordinate through Equation 3.

[Equation 3]

That is, the processor 115 may calculate the prior probability distribution using a weighted average of the distance between an arbitrary coordinate and the image-based tracking coordinates and the distance between the arbitrary coordinate and the motion-based tracking coordinates.

Here, the weighting parameter balances the image-based tracking coordinates against the motion-based tracking coordinates and may range from 0 to 1. For example, when a value included in the motion data (e.g., a value corresponding to the inertial pose) is less than a preset value, the information obtained from the camera sensor 120 may be relatively more accurate, so the weighting parameter may be increased to raise the importance of the image-based tracking coordinates. As the weighting parameter increases, the importance of the motion-based tracking coordinates decreases. Equation 3 also contains a scale parameter and a normalization factor, which may be a constant.
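A distance-based prior of this kind can be sketched as below. The exact form of Equation 3 is not reproduced in this text, so the sketch assumes an exponential of the negative weighted average of Euclidean distances, divided by a scale parameter and normalized over a finite set of candidate points. The names `lam`, `scale`, and `prior` are hypothetical.

```python
import math


def prior(points, x_img, x_mot, lam=0.5, scale=2.0):
    """Distance-based prior over candidate points, normalised to sum to 1.

    lam in [0, 1] weights the image-based coordinates x_img;
    (1 - lam) weights the motion-based coordinates x_mot.
    """
    def weight(p):
        d_img = math.dist(p, x_img)   # distance to the image-based tracking coordinates
        d_mot = math.dist(p, x_mot)   # distance to the motion-based tracking coordinates
        return math.exp(-(lam * d_img + (1.0 - lam) * d_mot) / scale)

    raw = {p: weight(p) for p in points}
    z = sum(raw.values())             # normalisation factor
    return {p: w / z for p, w in raw.items()}
```

With `lam` close to 1 the mass concentrates around the image-based coordinates, and with `lam` close to 0 it concentrates around the motion-based coordinates, which matches the balancing role described for the weighting parameter.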

In an embodiment, the two parameters described above may be adjusted in Equations 4 and 5 using the probability density values of the image-based tracking coordinates and the motion-based tracking coordinates.

[Equation 4]

[Equation 5]

Here, one parameter adjusts the relative importance of the image-based tracking coordinates and the motion-based tracking coordinates, and the other adjusts the height of the prior probability distribution.

After calculating the prior probability distribution in operation 520, the processor 115 may determine a candidate coordinate group in operation 525. For example, the processor 115 may determine, as the candidate coordinate group, the portion of the prior probability distribution in which the prior probability is greater than or equal to a threshold value. The processor 115 may select, as a grid area, the smallest rectangular area that contains the contour line of the threshold value. Accordingly, the selected grid area may differ depending on the shape of the prior probability distribution.
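Selecting the candidate group and its bounding grid area can be sketched as a threshold followed by a bounding-box computation. The sketch assumes the prior is available as a dictionary over integer pixel coordinates and that the rectangle is axis-aligned; `grid_area` is a hypothetical name.

```python
def grid_area(prior, threshold):
    """Return (x_min, y_min, x_max, y_max) of the smallest axis-aligned rectangle
    containing every point whose prior probability is at least the threshold.

    Returns None when no point reaches the threshold.
    """
    candidates = [p for p, value in prior.items() if value >= threshold]
    if not candidates:
        return None
    xs = [x for x, _ in candidates]
    ys = [y for _, y in candidates]
    return (min(xs), min(ys), max(xs), max(ys))
```

Because the rectangle follows the thresholded region, an elongated prior yields an elongated grid area, while a symmetric prior yields a square one.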

The processor 115 may calculate an image-based distribution in operation 530. For example, the processor 115 may calculate the image-based distribution over the grid area of the candidate coordinate group determined in operation 525, based on the image similarity to the reference coordinates. The image-based distribution may be calculated through Equation 6.

[Equation 6]

Here, Equation 6 contains a scale factor and a normalization factor for the grid area, which may be a constant. In an embodiment, the scale factor may be chosen so that the image-based distribution and the prior probability distribution become similar, and may be determined by minimizing the KL divergence, which represents the degree to which the two distributions differ.
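One way to picture the scale-factor selection is the sketch below. It assumes that each grid point has a non-negative dissimilarity cost (for example a sum of squared differences against the reference patch), that the image-based distribution is an exponential of the negative cost divided by the scale factor, and that the scale is picked from a small list of candidates by minimizing the KL divergence from the image-based distribution to the prior. The direction of the divergence, the candidate list, and all names are assumptions.

```python
import math


def image_distribution(costs, scale):
    """Turn per-point dissimilarity costs into a distribution over the grid area."""
    raw = {p: math.exp(-c / scale) for p, c in costs.items()}
    z = sum(raw.values())                      # normalisation over the grid area
    return {p: w / z for p, w in raw.items()}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same keys; q must be positive."""
    return sum(pv * math.log(pv / q[k]) for k, pv in p.items() if pv > 0.0)


def fit_scale(costs, prior, candidates):
    """Pick the candidate scale whose image-based distribution is closest to the prior."""
    return min(candidates,
               key=lambda s: kl_divergence(image_distribution(costs, s), prior))
```

A continuous optimizer could replace the candidate list; the list keeps the sketch short and deterministic.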

The processor 115 may calculate the target coordinate distribution in operation 535. For example, the processor 115 may calculate the target coordinate distribution based on the image-based distribution and the prior probability distribution. The target coordinate distribution may be calculated through Equation 7.

[Equation 7]

The processor 115 may calculate the target coordinate distribution using a weighted geometric mean of the image-based distribution and the prior probability distribution. The weighting parameter balances the image-based distribution against the prior probability distribution and may range from 0 to 1.
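A weighted geometric mean of two discrete distributions can be sketched as follows. Which distribution receives the exponent `kappa` and which receives `1 - kappa` is an assumption, as are the names; the result is renormalized because a geometric mean of two distributions does not generally sum to 1.

```python
def target_distribution(p_img, p_prior, kappa=0.5):
    """Weighted geometric mean of the image-based distribution and the prior.

    kappa in [0, 1] is the weight of the image-based distribution.
    Both inputs must be defined over the same grid points.
    """
    raw = {k: (p_img[k] ** kappa) * (p_prior[k] ** (1.0 - kappa)) for k in p_img}
    z = sum(raw.values())
    return {k: v / z for k, v in raw.items()}
```

Setting `kappa` to 1 returns the image-based distribution unchanged, and setting it to 0 returns the prior, so intermediate values interpolate between the two in log space.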

The processor 115 may perform operations 520 to 535 for a plurality of motion-based tracking coordinates to obtain a plurality of target coordinate distributions and a plurality of grid areas.

The processor 115 may obtain the estimated target coordinates and the uncertainty of the estimated target coordinates in operation 540. For example, the processor 115 may obtain the estimated target coordinates based on the plurality of target coordinate distributions, weights corresponding to the accuracy of each of the plurality of motion-based tracking coordinates, and the plurality of grid areas. The estimated target coordinates may be calculated through Equation 8.

[Equation 8]

In addition, the processor 115 may obtain the uncertainty of the estimated target coordinates based on the estimated target coordinates. The uncertainty of the estimated target coordinates may be calculated through Equation 9.

[Equation 9]

That is, the processor 115 may obtain the estimated target coordinates by calculating the average over the plurality of target coordinate distributions.
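The averaging step can be sketched as a weighted mean over several discrete distributions. Equations 8 and 9 are not reproduced in this text, so the sketch assumes that the estimated target coordinates are the weighted mean of all grid points and that the uncertainty is the corresponding 2x2 covariance matrix. Treating uncertainty as a covariance, and the names used, are assumptions.

```python
def estimate(distributions, weights):
    """Weighted mean and covariance over several target coordinate distributions.

    distributions: list of {(x, y): probability}, one per motion-based coordinate.
    weights: one non-negative weight per distribution (not all zero).
    """
    total = sum(weights)
    mx = my = 0.0
    for dist, w in zip(distributions, weights):
        for (x, y), p in dist.items():
            mx += w / total * p * x
            my += w / total * p * y
    cxx = cxy = cyy = 0.0
    for dist, w in zip(distributions, weights):
        for (x, y), p in dist.items():
            q = w / total * p                  # mass of this point in the mixture
            cxx += q * (x - mx) ** 2
            cxy += q * (x - mx) * (y - my)
            cyy += q * (y - my) ** 2
    return (mx, my), [[cxx, cxy], [cxy, cyy]]
```

A tight covariance indicates that the distributions agree on where the reference coordinates moved, while a wide one signals that the tracked coordinate should be trusted less by later stages.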

FIG. 6 is an exemplary diagram for explaining a method by which an apparatus determines a grid area for sampling image data, according to an embodiment.

Referring to FIG. 6, a processor (e.g., the processor 115 of FIG. 1) may select, as a grid area, the portion of the prior probability distribution calculated in operation 520 of FIG. 5 in which the prior probability is greater than or equal to a threshold value (c).

In an embodiment, when the image-based tracking coordinates (e.g., the first tracking coordinates 430 of FIG. 4) and the motion-based tracking coordinates (e.g., the second tracking coordinates 440 of FIG. 4) are the same, the processor 115 may select a square grid area 600. The length of one side of the grid area 600 may be calculated through Equation 10.

[Equation 10]

In another embodiment, when the image-based tracking coordinates and the motion-based tracking coordinates differ, the processor 115 may select a rectangular grid area 610. For example, when

is less than

, the length of one side of the grid area 610 may be

and the length of the other side may be

. Here, the value

may be

. As another example, when

is greater than or equal to

, the length of one side of the grid area 610 may be

and the length of the other side may be

.

FIG. 7 is a perspective view of an electronic device according to an embodiment.

Referring to FIG. 7, an electronic device 700 according to an embodiment may include a data acquisition unit 710 and a processor 720. The electronic device 700 may estimate its current pose and predict its future pose based on the estimated current pose.

According to an embodiment, the electronic device 700 may estimate a map of its surroundings and/or its current pose through simultaneous localization and mapping (SLAM).

In the present disclosure, 'SLAM' may refer to a technique in which a device acquires information about its surroundings while moving through a space and, based on the acquired information, estimates a map of the space and the current pose of the device; the term is used with the same meaning below.

For example, the processor 720 of the electronic device 700 may estimate the surrounding map and the current pose based on external data (e.g., image data, motion data, etc.) acquired through the data acquisition unit 710.

In the present disclosure, the 'pose of an electronic device' may mean data including location information of the electronic device, and the term is used with the same meaning below. The pose data may include 6DoF pose information, and the 6DoF pose information may include information indicating the position and information indicating the orientation of the electronic device 700.

In an embodiment, the electronic device 700 may be a wearable electronic device that can be worn on a part of the user's body. For example, the electronic device 700 may further include a lens 730 and a connection unit 740 for fixing at least one region of the electronic device 700 to a part of the user's body.

In an embodiment, the electronic device 700 may be a glasses-type wearable electronic device that can be worn on the user's ears, as shown in FIG. 7, but is not limited thereto. In another embodiment, the electronic device 700 may be a head mounted display (HMD) device that can be worn on the user's head.

In an embodiment, the data acquisition unit 710 and the processor 720 may be disposed on the connection unit 740, but the arrangement of the data acquisition unit 710 and the processor 720 is not limited thereto. In another embodiment (not shown), the data acquisition unit 710 and/or the processor 720 may be disposed in a peripheral region (e.g., the rim) of the lens 730.

Although not shown in the drawing, the electronic device 700 may include optical components for emitting light that carries data for an augmented reality image and for adjusting the travel path of the emitted light. The processor 720 may emit the light carrying the data for the augmented reality image through the optical components and cause the emitted light to reach the lens 730.

As the light carrying the data for the augmented reality image reaches the lens 730, the augmented reality image may be displayed on the lens 730, and the electronic device 700 may provide the augmented reality image to the user (or wearer) through the above-described process.

Although FIG. 7 illustrates only an embodiment in which the electronic device 700 is a wearable electronic device, the field of application of the electronic device (e.g., the electronic device 100 of FIG. 1) is not limited thereto. In an embodiment, the electronic device 100 may also be applied to an unmanned aerial vehicle (UAV) and/or an autonomous vehicle capable of estimating a surrounding map and its own current pose through SLAM.

Although the embodiments have been described in detail above, the scope of the present invention is not limited thereto, and various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of the present invention.

Claims

An apparatus for estimating uncertainty, the apparatus comprising:
a processor configured to estimate the uncertainty of image coordinates by executing at least one program,
wherein the processor is configured to:
receive first tracking coordinates that correspond to reference coordinates of first image data acquired through a camera sensor and are image-based tracking coordinates in second image data obtained after the first image data;
obtain second tracking coordinates that correspond to the reference coordinates and are motion-based tracking coordinates in the second image data, based on motion data obtained from a motion sensor and a depth value of the first image data;
calculate a target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates;
obtain estimated target coordinates and an uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and
update the first tracking coordinates based on the estimated target coordinates.

The apparatus of claim 1,
wherein the processor is configured to:
obtain three-dimensional coordinates for the reference coordinates based on the depth value of the first image data;
convert the three-dimensional coordinates into three-dimensional coordinates in the second image data based on the motion data; and
obtain the second tracking coordinates by projecting the converted three-dimensional coordinates.
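
As a non-limiting illustration of the three operations recited above (back-projection using the depth value, conversion using the motion data, and projection), the following Python sketch may be considered. It assumes a pinhole camera model with intrinsics `(fx, fy, cx, cy)` and assumes that the motion data has already been integrated into a rotation matrix and a translation vector that map the first camera frame to the second. The function name, the parameter names, and the numeric values are hypothetical and do not appear in the disclosure.

```python
def motion_based_tracking(ref_uv, depth, intrinsics, rotation, translation):
    """Predict where a reference pixel of the first image lands in the second image.

    Assumptions (not taken from the claims): a pinhole camera with intrinsics
    (fx, fy, cx, cy), and motion data already integrated into a 3x3 rotation
    matrix and a translation vector mapping the first camera frame to the second.
    """
    u, v = ref_uv
    fx, fy, cx, cy = intrinsics

    # Step 1: back-project the reference pixel to 3D using its depth value.
    p1 = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)

    # Step 2: express the 3D point in the second camera frame (p2 = R * p1 + t).
    p2 = [sum(rotation[i][j] * p1[j] for j in range(3)) + translation[i]
          for i in range(3)]
    if p2[2] <= 0:
        raise ValueError("point is behind the second camera")

    # Step 3: project the converted 3D point onto the second image plane.
    return (fx * p2[0] / p2[2] + cx, fy * p2[1] / p2[2] + cy)


# Hypothetical values chosen only to exercise the function.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = (500.0, 500.0, 320.0, 240.0)
static = motion_based_tracking((400.0, 300.0), 2.0, K, IDENTITY, [0.0, 0.0, 0.0])
shifted = motion_based_tracking((400.0, 300.0), 2.0, K, IDENTITY, [0.1, 0.0, 0.0])
```

With an identity rotation and zero translation the point projects back onto the reference pixel. With a 0.1 shift along x at a depth of 2 and `fx = 500`, the predicted coordinates move by about 25 pixels horizontally.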

The apparatus of claim 1,
wherein the processor is configured to:
calculate a weighted average of the distance values from each of a plurality of coordinates included in the second image data to the first tracking coordinates and to the second tracking coordinates;
calculate a prior probability distribution based on the calculated weighted average; and
determine, as a candidate coordinate group, a portion of the prior probability distribution having a probability greater than or equal to a threshold value.

The apparatus of claim 3,
wherein the processor is configured to:
when a value included in the motion data is less than a preset value, set a weighting parameter for the distance values between the plurality of coordinates included in the second image data and the first tracking coordinates higher than a weighting parameter for the distance values between the plurality of coordinates included in the second image data and the second tracking coordinates.
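
As a non-limiting illustration of the prior probability distribution and the candidate coordinate group described in the two preceding claims, the following Python sketch may be considered. The claims do not state how the weighted average is mapped to a probability. The Gaussian-shaped mapping, the weight values 0.7 and 0.3, the reversed weighting for large motion, and all function and parameter names below are assumptions made only for this example.

```python
import math


def prior_and_candidates(coords, first_xy, second_xy, motion_value,
                         preset_value=0.5, sigma=2.0, prob_threshold=0.01):
    """Return (prior, candidates) for a list of (x, y) pixel coordinates."""
    # When the motion value is below the preset value, the distance to the
    # image-based (first) tracking coordinates is weighted more heavily.
    # The reversed weighting in the other case is an assumption.
    w_image = 0.7 if motion_value < preset_value else 0.3
    w_motion = 1.0 - w_image

    scores = []
    for x, y in coords:
        d_image = math.hypot(x - first_xy[0], y - first_xy[1])
        d_motion = math.hypot(x - second_xy[0], y - second_xy[1])
        d = w_image * d_image + w_motion * d_motion  # weighted average distance
        # Assumed mapping: a smaller weighted distance gives a higher probability.
        scores.append(math.exp(-d * d / (2.0 * sigma * sigma)))

    total = sum(scores)
    prior = [s / total for s in scores]

    # Keep only coordinates whose prior probability reaches the threshold.
    candidates = [c for c, p in zip(coords, prior) if p >= prob_threshold]
    return prior, candidates


# Hypothetical 11 x 11 search window with the two tracking results 2 px apart.
grid = [(x, y) for y in range(11) for x in range(11)]
prior_slow, cand_slow = prior_and_candidates(grid, (4, 5), (6, 5), motion_value=0.1)
prior_fast, cand_fast = prior_and_candidates(grid, (4, 5), (6, 5), motion_value=0.9)
```

In this sketch the peak of the prior lies at the first tracking coordinates when the motion value is small and at the second tracking coordinates otherwise, and coordinates far from both are excluded from the candidate coordinate group.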

The apparatus of claim 3,
wherein the processor is configured to:
calculate an image-based distribution for the determined candidate coordinate group based on image similarity with the reference coordinates.

The apparatus of claim 5,
wherein the processor is configured to:
calculate the target coordinate distribution based on the prior probability distribution and the image-based distribution.
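
As a non-limiting illustration of how an image-based distribution and a prior probability distribution might be combined, the following Python sketch may be considered. The claims state only that the target coordinate distribution is based on both distributions. The sum-of-squared-differences similarity, the exponential mapping, the element-wise product with renormalization, and all names and values below are assumptions made only for this example.

```python
import math


def image_based_distribution(ref_patch, candidate_patches, sigma=10.0):
    """Turn patch similarity into a normalized distribution over candidates.

    Assumption: similarity is measured by the sum of squared differences (SSD)
    between flattened intensity patches; a lower SSD gives a higher probability.
    """
    scores = []
    for patch in candidate_patches:
        ssd = sum((a - b) ** 2 for a, b in zip(ref_patch, patch))
        scores.append(math.exp(-ssd / (2.0 * sigma * sigma)))
    total = sum(scores)
    return [s / total for s in scores]


def target_distribution(prior, image_based):
    """Combine the two distributions (assumed: element-wise product, renormalized)."""
    unnormalized = [p * q for p, q in zip(prior, image_based)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("distributions do not overlap")
    return [u / total for u in unnormalized]


# Hypothetical 2 x 2 patches flattened to lists, one per candidate coordinate.
reference = [10, 20, 30, 40]
patches = [[10, 20, 30, 40], [12, 22, 28, 38], [50, 60, 70, 80]]
prior = [0.2, 0.5, 0.3]  # hypothetical prior over the same three candidates

likelihood = image_based_distribution(reference, patches)
posterior = target_distribution(prior, likelihood)
```

Under these assumptions, a candidate that looks nothing like the reference patch receives almost no probability even when its prior is moderate, while a candidate with a strong prior and a close match dominates the result.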

The apparatus of claim 1,
wherein the processor is configured to:
transmit the updated first tracking coordinates and the uncertainty to an external device.

The apparatus of claim 1,
wherein the processor is configured to:
estimate the target coordinates by calculating an average of the target coordinate distribution, and estimate the uncertainty by calculating a covariance matrix for the estimated target coordinates.
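
As a non-limiting illustration of estimating the target coordinates as the average of the target coordinate distribution and the uncertainty as a covariance matrix, the following Python sketch may be considered. It assumes that the distribution is available as a list of probabilities over discrete (x, y) coordinates. The function name and the example values are hypothetical.

```python
def mean_and_covariance(coords, probs):
    """Return the weighted mean (x, y) and the 2x2 covariance of a discrete distribution."""
    total = sum(probs)
    weights = [p / total for p in probs]  # renormalize defensively

    # Estimated target coordinates: probability-weighted average.
    mean_x = sum(w * x for w, (x, _) in zip(weights, coords))
    mean_y = sum(w * y for w, (_, y) in zip(weights, coords))

    # Uncertainty: probability-weighted covariance about the estimated coordinates.
    cov_xx = sum(w * (x - mean_x) ** 2 for w, (x, _) in zip(weights, coords))
    cov_yy = sum(w * (y - mean_y) ** 2 for w, (_, y) in zip(weights, coords))
    cov_xy = sum(w * (x - mean_x) * (y - mean_y) for w, (x, y) in zip(weights, coords))

    return (mean_x, mean_y), ((cov_xx, cov_xy), (cov_xy, cov_yy))


# Hypothetical distribution: four equally likely candidate coordinates.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
estimate, covariance = mean_and_covariance(square, [0.25, 0.25, 0.25, 0.25])
```

A distribution concentrated on a single coordinate yields a covariance close to zero, whereas a spread-out distribution yields larger diagonal entries, which is what allows the covariance matrix to serve as the uncertainty of the estimated target coordinates.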

A method of estimating uncertainty, the method comprising:
receiving first tracking coordinates that correspond to reference coordinates of first image data obtained through a camera sensor and are image-based tracking coordinates in second image data obtained after the first image data;
obtaining second tracking coordinates that correspond to the reference coordinates and are motion-based tracking coordinates in the second image data, based on motion data obtained from a motion sensor and a depth value of the first image data;
calculating a target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates;
obtaining estimated target coordinates and an uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and
updating the first tracking coordinates based on the estimated target coordinates.

The method of claim 9,
wherein the obtaining of the second tracking coordinates comprises:
obtaining three-dimensional coordinates of the reference coordinates based on depth data of the first image data;
converting the three-dimensional coordinates into three-dimensional coordinates in the second image data based on the motion data; and
obtaining the second tracking coordinates by projecting the converted three-dimensional coordinates.

The method of claim 9, comprising:
calculating a weighted average of the distance values from each of a plurality of coordinates included in the second image data to the first tracking coordinates and to the second tracking coordinates;
calculating a prior probability distribution based on the calculated weighted average; and
determining, as a candidate coordinate group for target coordinates, a plurality of coordinates of the prior probability distribution having a probability greater than or equal to a threshold value.

The method of claim 11, comprising:
calculating an image-based distribution for the determined candidate coordinate group based on image similarity with the reference coordinates.

The method of claim 12, comprising:
calculating the target coordinate distribution based on the prior probability distribution and the image-based distribution.

The method of claim 9, comprising:
transmitting the updated first tracking coordinates and the uncertainty to an external device.

The method of claim 9, comprising:
estimating the target coordinates by calculating a mean of the target coordinate distribution, and estimating the uncertainty by calculating a covariance matrix for the estimated target coordinates.

An electronic device that performs a SLAM operation, the electronic device comprising:
a camera sensor configured to acquire image data about a surrounding environment;
a motion sensor configured to obtain motion data by detecting rotation and movement of the electronic device; and
a processor electrically connected to the camera sensor and the motion sensor,
wherein the processor is configured to:
receive first tracking coordinates that correspond to reference coordinates of first image data obtained through the camera sensor and are image-based tracking coordinates in second image data obtained after the first image data;
obtain second tracking coordinates that correspond to the reference coordinates and are motion-based tracking coordinates in the second image data, based on motion data obtained from the motion sensor and a depth value of the first image data;
calculate a target coordinate distribution in the second image data based on the first tracking coordinates and the second tracking coordinates;
obtain estimated target coordinates and an uncertainty of the estimated target coordinates based on the calculated target coordinate distribution; and
update the first tracking coordinates based on the estimated target coordinates.

The electronic device of claim 16,
wherein the processor is configured to:
obtain three-dimensional coordinates for the reference coordinates based on the depth value of the first image data;
convert the three-dimensional coordinates into three-dimensional coordinates in the second image data based on the motion data; and
obtain the second tracking coordinates by projecting the converted three-dimensional coordinates.

The electronic device of claim 16,
further comprising a back-end processor configured to calculate a rotation angle and a position of the camera sensor,
wherein the processor is configured to:
transmit the updated first tracking coordinates and the uncertainty to the back-end processor.

The electronic device of claim 16,
wherein the processor is configured to:
estimate the target coordinates by calculating an average of the target coordinate distribution, and estimate the uncertainty by calculating a covariance matrix for the estimated target coordinates.