KR20230127830A

KR20230127830A - Methods performed by electronic devices, electronic devices and storage media

Info

Publication number: KR20230127830A
Application number: KR1020220072356A
Authority: KR
Inventors: 시옹펭 펭; 지후아 리우; 창 왕; 김윤태
Original assignee: 삼성전자주식회사
Priority date: 2022-02-25
Filing date: 2022-06-14
Publication date: 2023-09-01
Also published as: CN116701700A

Abstract

Embodiments of the present disclosure provide a method performed by an electronic device, an electronic device, and a computer-readable storage medium, and relate to the field of methods performed by electronic devices. The method includes: obtaining a search image for a query image; obtaining spatial features of each of the query image and the search image; and estimating a relative pose between the query image and the search image based on the spatial features. The method performed by the electronic device according to embodiments of the present disclosure may determine the relative pose using artificial intelligence and may optimize a global map more accurately.

Description

Methods performed by electronic devices, electronic devices and storage media

The present disclosure relates to the field of simultaneous localization and mapping (SLAM), and in particular to a method performed by an electronic device, the electronic device, and a computer-readable storage medium.

SLAM is a technology that uses sensors on a device, such as cameras and lidar, to build and describe a three-dimensional map of the space in which the device is located in real time and to determine the pose (position and orientation) of the device. Because of camera calibration errors and the limited accuracy of feature matching, visual SLAM inevitably accumulates error during mapping and localization. To address this, a loop closure (LC) module can be added to the SLAM system. The module identifies co-visibility relationships between the current frame and earlier keyframes and optimizes the global map to reduce the accumulated error, thereby achieving drift-free localization.

Currently, the related art generally builds visual constraints through methods such as feature matching and then computes the relative pose between a query image and a search image to optimize the global map. However, this approach cannot cope with relatively large visual changes, and global map optimization takes a relatively long time, so the current SLAM loop closure module needs to be improved.

An object is to provide a method performed by an electronic device, an electronic device, and a computer-readable storage medium. The technical problems to be solved by the present embodiments are not limited to those described above, and other technical problems may be inferred from the following embodiments.

The present disclosure provides a method performed by an electronic device, an electronic device, and a computer-readable storage medium, and the technical solution is as follows.

In one embodiment, a method performed by an electronic device of the present disclosure includes: obtaining a search image for a query image; obtaining spatial features of each of the query image and the search image; and estimating a relative pose between the query image and the search image based on the spatial features.

In another embodiment, an electronic device of the present disclosure includes one or more processors, a memory, and one or more application programs. The one or more application programs are stored in the memory, are configured to be executed by the one or more processors, and are configured to perform operations corresponding to the method performed by the electronic device according to the above embodiment.

In yet another embodiment, a computer-readable storage medium of the present disclosure stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the method performed by the electronic device as described in the above embodiment.

The technical effects provided by the present disclosure are as follows.

The present disclosure provides a method performed by an electronic device, an electronic device, and a computer-readable storage medium. Compared with the prior art, the present disclosure estimates the relative pose between the query image and the search image from the spatial features of the two images. Because the spatial features cover a wider field of view and carry spatial information, the global map can be optimized more precisely.

In addition, the present disclosure densely and uniformly extracts keypoints and ORB descriptors from the image and then completes stereo matching and triangulation using epipolar constraints to estimate a three-dimensional point set, so the estimated three-dimensional point set is distributed more uniformly and densely in space than the global map. The relative pose can therefore be determined more accurately to optimize the global map.

In addition, the present disclosure can effectively optimize the global map through incremental bundle adjustment and full bundle adjustment, reducing running time and increasing accuracy.

To explain the technical solutions described in the embodiments of the present disclosure more clearly, the drawings used in describing the embodiments are briefly introduced below.
FIG. 1 is a flowchart of a method performed by an electronic device provided by an embodiment of the present disclosure.
FIG. 2 is a schematic diagram illustrating a method for clustering a three-dimensional point set according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram illustrating a method of generating a first feature matching pair and a second feature matching pair according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram illustrating a method for generating a third feature matching pair according to an embodiment of the present disclosure.
FIG. 5 is a schematic diagram illustrating a method for generating an optimized global map according to an embodiment of the present disclosure.
FIG. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
FIG. 7 is a schematic diagram illustrating a method performed by an electronic device provided in an embodiment of the present disclosure.
FIG. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
FIG. 9 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.

Hereinafter, embodiments of the present disclosure are described in detail. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar components or components having the same or similar functions. The embodiments described below with reference to the drawings are illustrative and are intended only to explain the present disclosure, not to limit it.

Unless otherwise specified, singular forms such as "a" and "one" used herein may also include plural forms. It will also be understood that the word "comprise" as used in this specification indicates the presence of the stated features, integers, steps, operations, parts, and/or components, and does not exclude the presence or addition of one or more other features, integers, steps, operations, parts, components, and/or combinations thereof.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to the other component, or intermediate components may be present. In addition, "connection" or "coupling" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes any one of, and all combinations of, one or more of the associated listed items.

To make the objects, technical solutions, and advantages of the present disclosure clearer, embodiments are described in more detail below with reference to the drawings.

SLAM is a technology that uses sensors on a device, such as cameras and inertial measurement units, to build a three-dimensional map of the space in which the device is located in real time and to determine the position and orientation of the device within the map in real time. Cameras and inertial measurement units are cheaper than LiDAR sensors, are standard components of devices such as mobile phones, augmented reality glasses, and indoor robots, and can be used in a wide range of situations. In addition, existing SLAM research has focused mainly on building maps and obtaining device poses in real time using cameras and inertial measurement units as sensors. Compared with a monocular camera, a three-dimensional map built with a stereo camera has a true physical scale, so in practical applications the visual sensor on the device is often a stereo camera.

Existing SLAM systems are mainly based on multiple-view geometry: they track and match point features in images to obtain the pose of the device (its three-dimensional position and orientation in space) and three-dimensional information about the environment. Point features in temporally consecutive video frames are tracked and matched according to multiple-view geometry, and point features in stereo image pairs are matched according to epipolar constraints. These matches establish geometric constraints between the device poses and the three-dimensional map points, and the poses and map points can then be solved by filtering or bundle adjustment.

Because of errors in camera calibration and feature matching, visual SLAM inevitably accumulates error during mapping and localization. Achieving drift-free localization and building an accurate global map are challenges that remain to be solved. To address them, a loop closure (LC) module can be added to the SLAM system. The module identifies co-visibility relationships between the current frame and earlier keyframes and optimizes the global map to reduce the accumulated error, thereby achieving drift-free localization. LC is thus a key module of a SLAM system and can significantly improve SLAM performance.

LC is generally divided into three stages. The first stage is similar to an image retrieval task and aims to retrieve images that are semantically similar to the query image. A suitable image representation is essential, and most methods are based on the Bag of Words (BoW) model. The second stage forms visual constraints through feature matching, such as BoW-based matching, ORB (Oriented FAST and Rotated BRIEF) feature matching, and projection matching, and then estimates the relative pose between the query image and the search image. The third stage optimizes the global map to achieve drift-free localization.

Recently, LC has been actively studied. In some related art, four-degree-of-freedom (4DOF) pose graph optimization has been proposed to optimize the global consistency between the keyframe poses in the global map and the current frame pose. 4DOF pose graph optimization optimizes the global consistency of keyframe poses in little time, but because it does not maintain a single global map, the accuracy of the optimization is reduced. Other related art improves LC recall by replacing the temporal consistency check over three keyframes with a local consistency check between the query keyframe and three co-visible keyframes. However, when the camera viewpoint changes substantially and the scene exhibits perceptual aliasing, few inliers are found when estimating the relative pose between the query and search keyframes, and LC may fail. In addition, optimizing the global map with full bundle adjustment (FBA) takes a relatively long time.
Still other related art proposes feature re-identification, in which a prior pose helps the proposed spatially and temporally sensitive global sub-map to quickly identify existing features. When the prior pose is unreliable, LC is combined with feature re-identification to obtain a drift-free camera pose. However, when the camera drift is relatively large, feature re-identification does not work. LC is therefore also very likely to fail under large camera viewpoint changes and perceptual aliasing in the scene. Moreover, when the camera drift is relatively large, incremental bundle adjustment (IBA) is not sufficient to optimize the global map.

In summary, accurate and robust LC faces several challenges. First, feature matching relies on local features computed over small patches, such as ORB, BRIEF (Binary Robust Independent Elementary Features), SURF (Speeded Up Robust Features), and SIFT (Scale-Invariant Feature Transform), rather than on hierarchical and spatial information with a larger field of view. This can make LC unstable under large camera viewpoint changes and perceptual aliasing in the scene. Feature matching based on deep learning typically focuses on using convolutional neural networks (CNNs) to learn better sparse detectors and local descriptors from data. Some recent work in this direction uses a neural network to match two sets of local features by jointly finding correspondences and rejecting unmatchable points, which can solve a variety of multiple-view geometry problems that require high-quality feature correspondences. However, deep learning methods require substantial computing resources. Second, no single optimization method reliably solves the global map optimization problem.
For example, IBA is insufficient when the camera drift is large, optimizing the global map with full bundle adjustment (FBA) is very time-consuming, and pose optimization alone cannot maintain an accurate global map.

Hereinafter, technical methods that solve the technical problems of the present disclosure are described in detail through embodiments. The following embodiments may be combined with one another, and the same or similar concepts or processes may not be described again in some embodiments.

Hereinafter, embodiments of the present disclosure are described with reference to the drawings.

A possible implementation is provided in an embodiment of the present disclosure: as shown in FIG. 1, a method performed by an electronic device is provided. The method may include the following steps.

In step S101, a search image is obtained for a query image.

Here, the query image is an image collected by the electronic device during localization and mapping (for example, a scene image of the current frame). It may also be an image received from another device.

In one embodiment, query images may be collected in real time, at periodic intervals, or in response to an event trigger. The process of obtaining the query image is not limited.

In one embodiment, during localization and mapping by the electronic device, an image data set is built for each keyframe. The image data set corresponding to the query image is obtained; it contains a plurality of candidate images, and the candidate images are searched for images that are semantically similar to the query image to obtain the search image.

There may be one or more search images; the present disclosure does not limit their number.

For example, the search image may be retrieved from the candidate images based on the BoW model.
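The BoW-based retrieval mentioned above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: it assumes each image has already been reduced to a list of visual-word IDs by some vocabulary, and it ranks candidates by cosine similarity of their word histograms. The function names `bow_vector`, `cosine_similarity`, and `retrieve` are hypothetical.

```python
from collections import Counter
from math import sqrt


def bow_vector(word_ids):
    """Build an L2-normalized bag-of-words histogram from visual-word IDs."""
    counts = Counter(word_ids)
    norm = sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def cosine_similarity(a, b):
    """Dot product of two sparse, L2-normalized histograms."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def retrieve(query_words, candidates, top_k=1):
    """Return the IDs of the top_k candidates most similar to the query.

    candidates maps a candidate image ID to its list of visual-word IDs
    (an assumed data layout, chosen for illustration only).
    """
    q = bow_vector(query_words)
    scored = [(cosine_similarity(q, bow_vector(words)), cid)
              for cid, words in candidates.items()]
    # Highest similarity first; ties are broken by candidate ID.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [cid for _, cid in scored[:top_k]]
```

Returning more than one ID with `top_k` corresponds to the case in which several search images are used.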

In step S102, spatial features of the query image and spatial features of the search image are obtained.

Here, the spatial features may include a three-dimensional point set.

In some possible embodiments, obtaining the spatial features of each of the query image and the search image in step S102 includes, for each of the two images: (1) extracting image feature points, which include image keypoints and feature descriptors; and (2) performing stereo matching on the image feature points to estimate a three-dimensional point set.

The feature descriptor may be, for example, an ORB descriptor.

In practice, the three-dimensional point set of the query image and that of the search image may each be estimated by performing stereo matching and triangulation using epipolar constraints.
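As an illustration of the triangulation step, the sketch below handles the simplest case: a rectified stereo pair, where the epipolar constraint reduces to matching along the same image row. The rectified pinhole model and the parameter names (`fx`, `fy`, `cx`, `cy`, `baseline`) are assumptions made for this example; the source does not specify the camera model.

```python
def triangulate_rectified(u_left, u_right, v, fx, fy, cx, cy, baseline):
    """Triangulate one stereo match in the left-camera frame.

    Assumes a rectified pair, so both observations share the row v and the
    depth follows from the horizontal disparity. Returns (X, Y, Z), or None
    when the disparity is not positive.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        return None
    z = fx * baseline / disparity   # depth from disparity
    x = (u_left - cx) * z / fx      # back-project the left pixel
    y = (v - cy) * z / fy
    return (x, y, z)
```

Applying this to every matched keypoint of a stereo frame would yield the three-dimensional point set of that frame, with each point keeping the descriptor of its keypoint.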

In step S103, the relative pose between the query image and the search image is estimated based on the spatial features.

In some possible embodiments, the spatial features of the query image and the spatial features of the search image may be matched at least once to obtain a feature matching result, and the relative pose may be determined based on the feature matching result.

In practice, the three-dimensional point set of the query image and the three-dimensional point set of the search image may be matched coarse-to-fine over several levels, and the relative pose may be determined from the final matching result. The method of determining the relative pose is described in more detail below.

In any of the above embodiments, the relative pose between the query image and the search image is estimated from the spatial features of the two images. The spatial features cover a wider field of view and carry spatial information, so the global map can be optimized more accurately.

In some possible embodiments, keypoints and ORB descriptors are densely and uniformly extracted from the image, and stereo matching and triangulation are then completed using epipolar constraints to estimate a three-dimensional point set. Because the three-dimensional point set is distributed more uniformly and densely in space than the global map, the relative pose can be determined more accurately to optimize the global map.

The process of determining the relative pose is described in more detail below with reference to embodiments.

In some possible embodiments, the feature matching result includes a first feature matching pair, and matching the spatial features of the query image with the spatial features of the search image at least once to obtain the feature matching result may include clustering the three-dimensional point set of the query image and that of the search image, and generating the first feature matching pair between the clustering result of the query image and the clustering result of the search image.

In some possible embodiments, the points of a three-dimensional point set are clustered into cubes according to their spatial distribution.
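One simple way to cluster points into cubes by spatial distribution is a uniform grid, sketched below. The fixed edge length `cube_size` and the integer cube index are assumptions for illustration; the source does not state how the cubes are sized or indexed.

```python
from collections import defaultdict
from math import floor


def cluster_into_cubes(points, cube_size):
    """Group 3D points by the integer index of the cube that contains them.

    points is a sequence of (x, y, z) tuples. The result maps a cube index
    (ix, iy, iz) to the list of indices of the points inside that cube.
    """
    cubes = defaultdict(list)
    for idx, (x, y, z) in enumerate(points):
        key = (floor(x / cube_size), floor(y / cube_size), floor(z / cube_size))
        cubes[key].append(idx)
    return dict(cubes)
```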

The centroid descriptor of each cluster is obtained by applying a voting function to the descriptors of all the three-dimensional points in the cube, and thus takes into account spatial information over a larger field of view.

In this expression, the output is the centroid descriptor of a cluster, the inputs are the descriptors of the three-dimensional points in the cube, and the function applied to them is the voting function.
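The voting function can be read as a bitwise majority vote over binary descriptors, which matches the description of FIG. 2 (a descriptor dimension takes the value 1 when 1 is the more frequent value in the cube). The sketch below follows that reading. The function name and the tie rule (ties resolve to 0) are assumptions, since the source does not say how ties are handled.

```python
def vote_centroid_descriptor(descriptors):
    """Bitwise majority vote over equal-length binary descriptors.

    Each descriptor is a sequence of 0/1 bits. A bit of the result is 1
    when strictly more than half of the inputs have 1 in that position.
    Ties resolve to 0 (an assumption, not stated in the source).
    """
    if not descriptors:
        raise ValueError("at least one descriptor is required")
    n = len(descriptors)
    length = len(descriptors[0])
    if any(len(d) != length for d in descriptors):
        raise ValueError("descriptors must have equal length")
    return [1 if 2 * sum(d[i] for d in descriptors) > n else 0
            for i in range(length)]
```

A real ORB descriptor has 256 bits; short descriptors are enough to check the behavior.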

In some possible embodiments, clustering the three-dimensional point sets of the query image and the search image and generating the first feature matching pair between the two clustering results includes: determining at least one first cube generated by clustering the three-dimensional point set of the query image; determining at least one second cube generated by clustering the three-dimensional point set of the search image; determining a first cluster center for each first cube and a second cluster center for each second cube; and, for the first cluster center of each first cube, determining the second cluster center that matches it, and determining the first feature matching pair based on the mutually matching first and second cluster centers.

As in the embodiment shown in FIG. 2, if more of the ORB descriptors in a cube have the value 1 in the first dimension, the first dimension of the cluster centroid descriptor is 1. After the three-dimensional point sets are clustered, a cluster centroid descriptor is obtained for each cube, giving one set of centroid descriptors for the query image and one set for the search image. Coarse matching pairs between the cubes of the query image and those of the search image, that is, the first feature matching pairs, are then obtained through nearest-neighbor search and mutual verification.

This matching is expressed in terms of the cluster centroid descriptors of the query image and of the search image, the Hamming distance between them, a threshold on the Hamming distance, the nearest-neighbor search from the cluster centroid features of the query image to those of the search image, the nearest-neighbor search in the opposite direction, and the mutual verification between the cluster centroid features of the two images.
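The nearest-neighbor search with mutual verification described above can be sketched as follows. Descriptors are stored as Python integers so that the Hamming distance is a bit count of their XOR. The function names, the brute-force search, and the use of `<=` against the threshold are assumptions made for this illustration.

```python
def hamming(a, b):
    """Hamming distance between two descriptors stored as Python ints."""
    return bin(a ^ b).count("1")


def nearest(desc, pool):
    """Index and distance of the descriptor in pool closest to desc."""
    best = min(range(len(pool)), key=lambda j: hamming(desc, pool[j]))
    return best, hamming(desc, pool[best])


def mutual_matches(query, search, max_distance):
    """Pairs (i, j) that are mutual nearest neighbors within max_distance.

    query[i] and search[j] are kept as a pair only if search[j] is the
    nearest neighbor of query[i], query[i] is the nearest neighbor of
    search[j], and their Hamming distance does not exceed max_distance.
    """
    if not query or not search:
        return []
    pairs = []
    for i, q in enumerate(query):
        j, dist = nearest(q, search)
        back, _ = nearest(search[j], query)
        if back == i and dist <= max_distance:
            pairs.append((i, j))
    return pairs
```

The same routine applies at both levels: to cluster centroid descriptors for coarse matching, and to individual point descriptors for fine matching.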

In FIG. 3, the cubes connected by dashed double-headed arrows are coarse matching pairs between the query image and the search image.

일부 가능한 실시예에서, 쿼리 이미지와 검색 이미지에 대한 coarse 매칭을 통하여 큐브 간의 제1 특징 매칭 쌍을 획득하고, 제1 특징 매칭 쌍에 직접적으로 근거하여 쿼리 이미지와 검색 이미지 간의 상대적 포즈를 추정할 수 있다.In some possible embodiments, a first feature matching pair between cubes may be obtained through coarse matching of the query image and the search image, and a relative pose between the query image and the search image may be estimated directly based on the first feature matching pair. there is.

In some possible embodiments, after the first feature matching pairs between cubes are obtained through coarse matching of the query image and the search image, fine matching may then be performed to obtain second feature matching pairs between the three-dimensional point sets.

In an implementation, the feature matching result may further include second feature matching pairs. The step of matching the spatial features of the query image with the spatial features of the search image at least once to obtain the feature matching result may further include performing nearest neighbor search and mutual verification on the three-dimensional points in the regions of the first feature matching pairs, thereby obtaining second feature matching pairs between the three-dimensional point set of the query image and the three-dimensional point set of the search image. The step of determining the relative pose based on the feature matching result may include estimating the relative position based on the second feature matching pairs.

In other words, nearest neighbor search and mutual verification are performed on all three-dimensional points in the regions of the first feature matching pairs, where the region on the query image side and the region on the search image side are each the set of 27 cubes in the spatial region of the corresponding matched cube. Next, a coarse relative pose between the query image and the search image is estimated based on the second feature matching pairs.
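Collecting the points that belong to the region of one matched cube can be sketched as follows. The sketch assumes that the cubes form a regular axis-aligned grid and that the "27 cubes" are the 3 x 3 x 3 block centered on the matched cube; the text does not state either explicitly. The names `cube_index`, `build_grid`, and `neighborhood_points` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Cube = Tuple[int, int, int]


def cube_index(point: Point, cube_size: float) -> Cube:
    """Integer grid coordinates of the cube containing the point
    (regular grid is an assumption of this sketch)."""
    return tuple(math.floor(c / cube_size) for c in point)


def build_grid(points: Sequence[Point], cube_size: float) -> Dict[Cube, List[int]]:
    """Map each occupied cube to the indices of the points inside it."""
    grid: Dict[Cube, List[int]] = defaultdict(list)
    for idx, p in enumerate(points):
        grid[cube_index(p, cube_size)].append(idx)
    return grid


def neighborhood_points(grid: Dict[Cube, List[int]], cube: Cube) -> List[int]:
    """Indices of all points in the 3 x 3 x 3 block of cubes around `cube`."""
    found: List[int] = []
    for offset in product((-1, 0, 1), repeat=3):  # 27 offsets, including (0, 0, 0)
        key = tuple(c + o for c, o in zip(cube, offset))
        found.extend(grid.get(key, []))  # .get avoids creating empty cubes
    return sorted(found)
```

Fine matching would then run nearest neighbor search and mutual verification only between the point indices returned for the two cubes of a first feature matching pair, instead of between the full point sets.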

In the expression, the first two terms are each defined by an equality, and the remaining terms denote, in order: the nearest neighbor search from the three-dimensional point features of the query image to the three-dimensional point features of the search image; the nearest neighbor search from the three-dimensional point features of the search image to the three-dimensional point features of the query image; and the mutual verification between the three-dimensional point features of the query image and those of the search image.

In some possible embodiments, coarse matching is performed between the query image and the search image to obtain the first feature matching pairs between cubes, and fine matching is then performed to obtain the second feature matching pairs between the three-dimensional point sets. A coarse relative pose between the query image and the search image may then be estimated directly from the second feature matching pairs, and this coarse relative pose may be taken as the relative pose between the query image and the search image.

In some possible embodiments, coarse matching is performed on the query image and the search image to obtain the first feature matching pairs between cubes, fine matching is then performed to obtain the second feature matching pairs between the three-dimensional point sets, and pose-guided matching is then performed to obtain third feature matching pairs.

The feature matching result may further include third feature matching pairs. The step of matching the spatial features of the query image with the spatial features of the search image at least once to obtain the feature matching result may further include: estimating a coarse relative pose between the query image and the search image based on the second feature matching pairs; and projecting the three-dimensional point set of the search image into the coordinate system of the query image using the coarse relative pose, and determining third feature matching pairs between the three-dimensional point set of the query image and the three-dimensional point set of the search image.

The step of determining the relative pose based on the feature matching result may include determining the relative position based on the third feature matching pairs.

In the embodiment shown in FIG. 4, the three-dimensional points of the search image are projected into the coordinate system of the query image using the coarse relative pose. Then, similarly to the fine matching step, nearest neighbor search and mutual verification are performed according to the distance between point positions and the Hamming distance between ORB descriptors, yielding the third feature matching pairs between the three-dimensional point set of the query image and the three-dimensional point set of the search image. Finally, the initial (prior) relative pose between the query image and the search image is estimated based on the third feature matching pairs. As shown in FIG. 4, corresponding three-dimensional points of the query image and the search image that overlap can form third feature matching pairs, whereas three-dimensional points that do not overlap at all are outliers.
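The pose-guided matching step can be sketched as follows. The sketch assumes that the coarse relative pose is given as a 3 x 3 rotation matrix `R` and a translation vector `t` that map search-image coordinates into query-image coordinates, that descriptors are tuples of 0/1 bits, and that both thresholds are inclusive. The names `transform`, `hamming`, and `guided_matches` and the threshold parameters are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Descriptor = Tuple[int, ...]


def transform(R: Sequence[Sequence[float]], t: Sequence[float], p: Point) -> Point:
    """Apply the rigid transform p' = R p + t."""
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))


def hamming(a: Descriptor, b: Descriptor) -> int:
    return sum(x != y for x, y in zip(a, b))


def guided_matches(
    query_pts: Sequence[Point], query_desc: Sequence[Descriptor],
    search_pts: Sequence[Point], search_desc: Sequence[Descriptor],
    R: Sequence[Sequence[float]], t: Sequence[float],
    max_dist: float, max_hamming: int,
) -> List[Tuple[int, int]]:
    """Mutual nearest neighbors after projecting search points with (R, t)."""
    projected = [transform(R, t, p) for p in search_pts]

    def best(src_pts, src_desc, dst_pts, dst_desc) -> List[Optional[int]]:
        out: List[Optional[int]] = []
        for p, d in zip(src_pts, src_desc):
            # Candidates must be close in space AND similar in descriptor.
            ok = [
                j for j in range(len(dst_pts))
                if math.dist(p, dst_pts[j]) <= max_dist
                and hamming(d, dst_desc[j]) <= max_hamming
            ]
            out.append(min(ok, key=lambda j: math.dist(p, dst_pts[j])) if ok else None)
        return out

    q2s = best(query_pts, query_desc, projected, search_desc)
    s2q = best(projected, search_desc, query_pts, query_desc)
    # Mutual verification: keep a pair only if both directions agree.
    return [(i, j) for i, j in enumerate(q2s) if j is not None and s2q[j] == i]
```

Points that have no candidate within both thresholds are left unmatched, which corresponds to the non-overlapping outlier points in the description of FIG. 4.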

The foregoing embodiments describe the process of determining the feature matching result. The process of determining the relative pose is described in more detail below with reference to the drawings and embodiments.

In some possible embodiments, the step of determining the relative pose based on the feature matching result may include: estimating an initial relative pose between the query image and the search image based on the feature matching result; determining, based on the initial relative pose, local points in the search image that correspond to key points of the query image, and forming point matching pairs from the key points and the corresponding local points; and estimating the relative pose based on the point matching pairs.

In an implementation, a projection search matching method may be used to determine the local points in the search image that correspond to the key points of the query image. Point matching pairs are formed from the key points and the corresponding local points, and the relative pose between the query image and the search image may then be estimated using the Perspective-n-Point (PnP) algorithm.

The foregoing embodiments describe in detail how the relative pose is determined. Once the relative pose has been obtained, an optimized global map can be obtained according to the relative pose.

In some possible embodiments, the method executed by the electronic device may further include obtaining an optimized global map based on the relative pose.

In some possible embodiments, the optimized global map may be obtained by combining incremental bundle adjustment (IBA) and full bundle adjustment (FBA). Whether incremental bundle adjustment or full bundle adjustment is adopted is decided according to the relative pose, which can improve the accuracy of the optimization.

In some possible embodiments, the step of obtaining the optimized global map based on the relative pose may include optimizing the current global map based on the relative pose to obtain the optimized global map.

In some possible embodiments, the global map is optimized continuously during localization and mapping, so a previously optimized global map may be optimized again based on the relative pose. That is, the current global map is optimized to obtain the optimized global map.

In some possible embodiments, the step of optimizing the current global map based on the relative pose to obtain the optimized global map may include: determining pose drift information based on the relative pose; and determining an optimization strategy based on the pose drift information, and optimizing the current global map according to the optimization strategy to obtain the optimized global map.

Here, the pose drift information includes at least one of a drift angle, a drift distance, and the number of loop closures with similar drift.

Here, the optimization strategy may include incremental bundle adjustment and/or full bundle adjustment.

The process of determining the pose drift information is described below through some embodiments.

In some possible embodiments, when a loop is successfully detected, the pose drift is calculated by the formula below.

In the expression, the terms denote, in order: the relative pose between the query image and the search image as estimated by the SLAM method; the rotational drift; and the translational drift. The angle and the distance of the drift can be calculated from the rotational drift and the translational drift, respectively.
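One way to compute the drift angle and drift distance is sketched below. The composition order is an assumption, since the formula itself is not reproduced in this text: the drift is taken as the inverse of the loop-closure relative pose composed with the SLAM relative pose. The angle uses the standard trace identity for rotation matrices and the distance is the Euclidean norm of the translational drift. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def transpose(A: Sequence[Sequence[float]]) -> Matrix:
    return [list(row) for row in zip(*A)]


def mat_mul(A: Sequence[Sequence[float]], B: Sequence[Sequence[float]]) -> Matrix:
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(A: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def pose_drift(R_slam, t_slam, R_lc, t_lc) -> Tuple[Matrix, List[float]]:
    """Drift = inverse(T_lc) * T_slam (assumed convention).

    For T = [R | t], inverse(T) = [R^T | -R^T t], so
    dR = R_lc^T R_slam and dt = R_lc^T (t_slam - t_lc).
    """
    Rt = transpose(R_lc)
    dR = mat_mul(Rt, R_slam)
    dt = mat_vec(Rt, [a - b for a, b in zip(t_slam, t_lc)])
    return dR, dt


def drift_angle_deg(dR: Sequence[Sequence[float]]) -> float:
    """Rotation angle from the trace: cos(theta) = (trace(dR) - 1) / 2."""
    c = (dR[0][0] + dR[1][1] + dR[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))  # clamp for safety


def drift_distance(dt: Sequence[float]) -> float:
    """Euclidean norm of the translational drift."""
    return math.sqrt(sum(x * x for x in dt))
```

When the two pose estimates agree, the drift reduces to the identity transform, so both the angle and the distance are zero.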

In some possible embodiments, to judge the accuracy of the relative pose produced by the loop closure (LC) module, the loop-closure drift error between a query image within a time window and the current query image is calculated, and the pose drift is calculated by the equation below.

In the expression, the terms denote, in order: the index of the current query image; the drift error of the rotation; and the drift error of the translation.

In some possible embodiments, the angle and the distance of the error can be calculated from the rotational drift error and the translational drift error, respectively. Finally, the number of temporally consistent loops within the time window is counted. When this number is greater than or equal to a predetermined threshold, the estimated relative pose satisfies temporal consistency and is regarded as sufficiently accurate.

The number of temporally consistent loops is given by the equation below.

In the expression, the term denotes the number of temporally consistent loops within the time window.
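The temporal consistency check can be sketched as follows. The counting rule is an assumption, since the equation is not reproduced in this text: a loop in the time window is counted as temporally consistent when both its drift-error angle and its drift-error distance are below chosen limits. The names `count_consistent_loops` and `is_temporally_consistent` and all threshold parameters are hypothetical.

```python
from typing import Sequence, Tuple

# Each entry is (error_angle, error_distance) of one loop in the time window,
# measured against the current query image.
DriftError = Tuple[float, float]


def count_consistent_loops(
    errors: Sequence[DriftError], max_angle: float, max_dist: float
) -> int:
    """Number of loops whose drift error is small in both angle and distance
    (assumed criterion)."""
    return sum(1 for angle, dist in errors if angle < max_angle and dist < max_dist)


def is_temporally_consistent(
    errors: Sequence[DriftError], max_angle: float, max_dist: float, min_count: int
) -> bool:
    """The estimated relative pose is accepted when the count reaches the
    predetermined threshold (greater than or equal, as stated in the text)."""
    return count_consistent_loops(errors, max_angle, max_dist) >= min_count
```

The only part taken directly from the text is the final comparison: the pose is considered sufficiently accurate when the count is greater than or equal to the predetermined threshold.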

The process of determining the pose drift information has been described above. The process of obtaining the optimized global map based on the pose drift information is described in detail below through some embodiments.

In some possible embodiments, the step of determining an optimization strategy based on the pose drift information and optimizing the current global map according to the optimization strategy to obtain the optimized global map may include: when the pose drift information meets a preset error condition, adjusting the initial global map through incremental bundle adjustment to obtain the optimized global map; or, when the pose drift information does not meet the error condition, adjusting the initial global map through full bundle adjustment to obtain the optimized global map.

In other words, when the pose drift information meets the preset error condition, the initial global map is adjusted by incremental bundle adjustment based on the point matching pairs to obtain the optimized global map. When the pose drift information does not meet the preset error condition, the initial global map is adjusted by full bundle adjustment based on the relative pose and the point matching pairs to obtain the optimized global map.

In some possible embodiments, the following optimization strategy is executed according to the drift angle, the drift distance, and the number of temporally consistent loops obtained above.

Here, IBA denotes incremental bundle adjustment and FBA denotes full bundle adjustment.

When the camera drift is very small (the drift angle and the drift distance are both below their predetermined thresholds), or when the temporal consistency of the estimated relative pose has not yet been verified (the number of temporally consistent loops is below its predetermined threshold), only the point matching pair constraints are added, and the poses and map points of the related key frames are optimized through incremental bundle adjustment. Otherwise, when the accumulated error of the current SLAM system is relatively large and the estimated relative pose satisfies temporal consistency and is sufficiently accurate, the estimated relative pose and the point matching pair constraints are added, and the poses of all key frames and all map points are optimized through full bundle adjustment.
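The selection rule described above can be sketched as a small decision function. This is an illustration only: the parameter names and the use of strict "less than" comparisons follow the wording of the text, while the function name `choose_strategy` and the string return values are hypothetical.

```python
def choose_strategy(
    drift_angle: float,
    drift_distance: float,
    consistent_loops: int,
    max_angle: float,
    max_distance: float,
    min_loops: int,
) -> str:
    """Return "IBA" or "FBA" according to the pose drift information."""
    # Drift is "very small" only when BOTH angle and distance are below threshold.
    drift_is_small = drift_angle < max_angle and drift_distance < max_distance
    # Temporal consistency is unverified while too few consistent loops exist.
    pose_unverified = consistent_loops < min_loops
    if drift_is_small or pose_unverified:
        # Add point matching pair constraints only; optimize related key frames.
        return "IBA"
    # Large drift and a verified relative pose: add the relative pose and the
    # point matching pair constraints; optimize all key frames and map points.
    return "FBA"
```

With this rule, full bundle adjustment runs only when it is both needed (large drift) and safe (the relative pose has passed the temporal consistency check); every other case takes the incremental path.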

In some possible embodiments, the step of adjusting the initial global map through full bundle adjustment to obtain the optimized global map may include: optimizing the multi-degree-of-freedom poses of the key frames of the initial global map based on the relative pose to obtain a first global map; and optimizing the key frame poses and map points of the first global map through full bundle adjustment to obtain the optimized global map.

As shown in FIG. 5, the six-degree-of-freedom poses of all key frames may be optimized first, and then all key frame poses and all map points may be optimized through full bundle adjustment (FBA).

In some possible embodiments, the method executed by the electronic device may include: obtaining a search image for a query image; determining a relative pose between the query image and the search image; determining pose drift information based on the relative pose; and determining an optimization strategy based on the pose drift information, and optimizing the current global map according to the optimization strategy to obtain the optimized global map.

In some possible embodiments, the step of determining the relative pose between the query image and the search image may include forming a visual constraint between the query image and the search image through feature matching, and estimating the relative pose between the query image and the search image.

In some possible embodiments, the feature matching may be BoW and ORB feature matching, projection matching, or the like.

In some possible embodiments, the step of determining the relative pose between the query image and the search image may include: obtaining the spatial features of the query image and the spatial features of the search image, respectively; and estimating the relative pose between the query image and the search image based on the spatial features.

In some possible embodiments, the spatial features of the query image and the spatial features of the search image may be matched at least once to obtain a feature matching result, and the relative pose may then be estimated based on the feature matching result.

In an implementation, multi-layer coarse-to-fine matching may be performed on the three-dimensional point set of the query image and the three-dimensional point set of the search image, and the relative pose may be determined based on the final matching result. The way the relative pose is determined is described above and is not repeated here.

In some possible embodiments, when the pose drift information meets the preset error condition, the initial global map is adjusted through incremental bundle adjustment to obtain the optimized global map; when the pose drift information does not meet the error condition, the initial global map is adjusted through full bundle adjustment to obtain the optimized global map.

The method executed by the electronic device of the present disclosure is described below with reference to embodiments.

In an embodiment, as shown in FIG. 6, the electronic device of the present disclosure may include: an image retrieval module, which searches the image data set corresponding to the key frames for a search image that is semantically similar to the query image; an initial relative pose estimation module, which estimates the initial relative pose between the query image and the search image; a precise relative position estimation module, which precisely estimates the relative pose constraint between the query image and the search image and forms constraints between key points of the query image and the corresponding local map points of the search image; and an optimization module, which performs further optimization according to the newly added constraints to precisely estimate the optimized global map.

The method executed by the electronic device of the present disclosure is further described below with reference to embodiments.

As shown in FIG. 7, the method executed by the electronic device of the present disclosure may include: retrieving an image according to the BoW model, that is, searching the image data set for a search image (the search image shown in the figure) that is semantically similar to the query image (the query image shown in the figure); generating a three-dimensional point set of the query image and a three-dimensional point set of the search image; clustering the three-dimensional point set of the query image to form at least one first cube, and clustering the three-dimensional point set of the search image to form at least one second cube; determining a first cluster center of each first cube and a second cluster center of each second cube; determining, for each first cluster center, the second cluster center that matches it, and forming the first feature matching pairs from the matched first and second cluster centers (the coarse matching step shown in the figure); forming, based on the first feature matching pairs, the second feature matching pairs between the three-dimensional point sets of the query image and the search image (the fine matching step shown in the figure); generating, through pose-guided matching, the third feature matching pairs between the three-dimensional point sets of the query image and the search image, and generating the initial relative pose; determining, based on the initial relative pose estimate, point matching pairs between key points of the query image and the corresponding local points of the search image, and estimating the relative pose; and determining an optimization strategy for the initial global map based on the relative pose and the point matching pairs. Full bundle adjustment or incremental bundle adjustment is thereby selected to optimize the initial global map obtained through simultaneous localization and mapping.

The foregoing example presents a novel loop closure (LC) method, named the loop closure method with hierarchical and hybrid properties (DH-LC). The process of three-dimensional point set generation, three-dimensional point set clustering, coarse matching, fine matching, and pose-guided matching is named Hierarchical Spatial Feature Matching (HSFM), and the optimization method that combines IBA and FBA is named Hybrid Bundle Adjustment (HBA). The present disclosure estimates the initial relative pose between the query image and the search image and applies hybrid bundle adjustment (HBA) to optimize the global map.

For each query image, a search image is obtained from the candidate image set by the BoW model, and the relative pose between the query image and the search image is then estimated in coarse-to-fine order through HSFM. Next, the projection search matching method is used to complete the matching between the key points of the query image and the corresponding local map points of the search image, and the PnP algorithm is used to estimate the accurate relative pose between the query image and the search image. Finally, according to the proposed optimization strategy, HBA adaptively selects the IBA or FBA method so that the current global map is optimized more effectively.

To improve the inlier ratio and the efficiency of feature matching, the present disclosure proposes HSFM. Unlike existing methods based on direct local feature matching or on feature clustering to accelerate matching, the present disclosure first extracts key points and ORB descriptors densely and uniformly from the query image and the search image, and then performs stereo matching and triangulation using the epipolar constraint to estimate the corresponding three-dimensional points of the query image and the search image. The three-dimensional points are clustered into cubes according to their spatial distribution, and the centroid descriptor of each cluster, which has a larger receptive field, is obtained by a vote over the descriptors of all three-dimensional points in the cube. Finally, the initial relative pose between the query image key frame and the search image is estimated in a coarse-to-fine manner. After robust pose estimation and point matching, the next step is to optimize the global map. Because a single optimization method cannot achieve both precision and efficiency, the present disclosure proposes HBA, which combines IBA and FBA to optimize the global map effectively with shorter execution time and higher precision.

1) HSFM generates three-dimensional points based on coarse-to-fine hierarchical matching and the epipolar constraint, and performs spatial clustering to estimate the initial relative pose between the query image and the search image.

Compared with the prior art, the method proposed by the present disclosure improves the inlier ratio and the efficiency of feature matching.

2) HBA combines IBA and FBA to optimize the global map effectively, with shorter execution time and higher precision.

3) The present disclosure combines HSFM and HBA to propose the DH-LC method. The method improves the recall and efficiency of loop closure, reduces the accumulated error, and further improves the precision of localization.

The method executed by the electronic device described above estimates the relative pose between the query image and the search image from the spatial features of the query image and the search image. Because the spatial features contain a larger perceptual aliasing hierarchy and spatial information, the global map can be optimized more accurately.

In addition, the present disclosure extracts key points and ORB descriptors densely and uniformly from the images and then completes stereo matching and triangulation using the epipolar constraint to estimate the three-dimensional point sets, so the three-dimensional point sets are distributed more uniformly and more densely in space than the global map. The relative pose can therefore be determined more accurately for optimizing the global map.

In addition, the present disclosure can optimize the global map effectively through incremental bundle adjustment and full bundle adjustment, with shorter running time and higher accuracy.

The above embodiments introduce the method executed by the electronic device from the perspective of the method flow. The method is introduced below from the perspective of virtual modules, as follows.

In an embodiment, as shown in FIG. 8, the electronic device 80 provided by the embodiments of the present disclosure may include a first acquisition module 801, a second acquisition module 802, an estimation module 803, and an optimization module 804. The first acquisition module 801 is configured to obtain a search image for a query image. The second acquisition module 802 is configured to obtain the spatial features of the query image and the spatial features of the search image, respectively. The estimation module 803 is configured to estimate, based on the spatial features, the relative pose between the query image and the search image that is used to obtain the optimized global map.

In one embodiment, the spatial feature includes a three-dimensional point set. When acquiring the spatial feature of either the query image or the search image, the second acquisition module 802 is configured to extract image feature points, which include image key points and feature descriptors, and to perform stereo matching on the image feature points to estimate the three-dimensional point set.

In one possible embodiment, when estimating the relative pose between the query image and the search image based on the spatial features, the estimation module 803 operates as follows.

The estimation module 803 is configured to match the spatial feature of the query image against the spatial feature of the search image at least once to obtain a feature matching result, and to determine the relative pose based on the feature matching result.

In one embodiment, the feature matching result includes a first feature matching pair, and when matching the spatial feature of the query image against the spatial feature of the search image at least once to obtain the feature matching result, the estimation module 803 operates as follows.

The estimation module 803 is configured to cluster the three-dimensional point set of the query image and the three-dimensional point set of the search image separately, and to generate the first feature matching pair between the clustering result of the query image and the clustering result of the search image.

In one possible embodiment, when clustering the three-dimensional point sets of the query image and the search image to generate the first feature matching pair between the two clustering results, the estimation module 803 operates as follows.

The estimation module 803 is configured to determine at least one first cube formed by clustering the three-dimensional point set of the query image and at least one second cube formed by clustering the three-dimensional point set of the search image; to determine a first cluster center of each first cube and a second cluster center of each second cube; to determine, for the first cluster center of each first cube, the second cluster center that matches it; and to form the first feature matching pair from the mutually matching first and second cluster centers.
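As a non-limiting illustration of cube-based clustering, the following sketch bins 3D points into axis-aligned cubes, takes the centroid of each cube as its cluster center, and pairs centers by nearest distance. This passage does not state how cubes are formed or how centers are compared, so the fixed `cube_size`, the centroid as cluster center, the Euclidean comparison, and the `max_dist` gate are all assumptions, as are the function names.

```python
import math
from collections import defaultdict


def cube_centers(points, cube_size):
    """Group 3D points into axis-aligned cubes and return {cube_index: centroid}."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / cube_size) for c in p)  # integer cube index
        buckets[key].append(p)
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }


def match_centers(first_centers, second_centers, max_dist):
    """Pair each first-cube center with its nearest second-cube center.

    Assumption: centers are compared by Euclidean distance and a pair is
    kept only if that distance does not exceed max_dist.
    """
    pairs = []
    for k1, c1 in first_centers.items():
        best = min(
            second_centers.items(),
            key=lambda kv: math.dist(c1, kv[1]),
            default=None,
        )
        if best is not None and math.dist(c1, best[1]) <= max_dist:
            pairs.append((k1, best[0]))  # a candidate first feature matching pair
    return pairs
```

Matching at the cube level first keeps the later point-level search confined to a few candidate regions rather than the full point sets.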

In one embodiment, the feature matching result further includes a second feature matching pair, and when matching the spatial feature of the query image against the spatial feature of the search image at least once to obtain the feature matching result, the estimation module 803 operates as follows.

The estimation module 803 is configured to perform a nearest-neighbor search with mutual verification on the three-dimensional points within the regions of the first feature matching pair, thereby obtaining the second feature matching pair between the three-dimensional point set of the query image and the three-dimensional point set of the search image.
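As a non-limiting illustration of nearest-neighbor search with mutual verification, the following sketch keeps a pair only when each point is the other's nearest neighbor. It assumes that every 3D point carries a binary descriptor (such as ORB) stored as a Python integer and compared by Hamming distance; the function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nearest(descriptor, candidates):
    """Index of the candidate descriptor closest to the given descriptor."""
    return min(range(len(candidates)), key=lambda j: hamming(descriptor, candidates[j]))


def mutual_matches(query_descs, search_descs):
    """Return (query_index, search_index) pairs that pass mutual verification."""
    pairs = []
    if not query_descs or not search_descs:
        return pairs
    for i, desc in enumerate(query_descs):
        j = nearest(desc, search_descs)                   # query -> search
        if nearest(search_descs[j], query_descs) == i:    # search -> query agrees
            pairs.append((i, j))
    return pairs
```

For example, with query descriptors `[0b0000, 0b1111, 0b0011]` and search descriptors `[0b0001, 0b1110]`, the third query descriptor is rejected because its nearest search descriptor prefers the first query descriptor.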

In one possible embodiment, the feature matching result further includes a third feature matching pair, and when matching the spatial feature of the query image against the spatial feature of the search image at least once to obtain the feature matching result, the estimation module 803 operates as follows.

The estimation module 803 is configured to estimate an approximate relative pose between the query image and the search image based on the second feature matching pair, to project the three-dimensional point set of the search image into the coordinate system of the query image using the approximate relative pose, and to determine the third feature matching pair between the three-dimensional point set of the query image and the three-dimensional point set of the search image.
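As a non-limiting illustration of pose-guided matching, the following sketch transforms each search-image point into the query coordinate system with an approximate pose and pairs it with the closest query point inside a search radius. The pose representation (a 3x3 rotation as nested tuples plus a translation), the `radius` gate, and the function names are assumptions made for this example.

```python
import math


def transform(point, rotation, translation):
    """Apply p' = R p + t, with R given as a 3x3 nested sequence (row-major)."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def guided_matches(query_pts, search_pts, rotation, translation, radius):
    """Return (query_index, search_index) pairs found after projection.

    Each search point is moved into the query frame using the approximate
    pose; it is paired with the nearest query point if that point lies
    within radius (an assumed acceptance test).
    """
    pairs = []
    for j, p in enumerate(search_pts):
        projected = transform(p, rotation, translation)
        best = min(
            range(len(query_pts)),
            key=lambda i: math.dist(query_pts[i], projected),
            default=None,
        )
        if best is not None and math.dist(query_pts[best], projected) <= radius:
            pairs.append((best, j))
    return pairs
```

Because the approximate pose already aligns the two point sets roughly, a small radius can recover correspondences that descriptor matching alone missed.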

In one possible embodiment, when determining the relative pose based on the feature matching result, the estimation module 803 operates as follows.

The estimation module 803 is configured to estimate an initial relative pose between the query image and the search image based on the feature matching result; to determine, based on the initial relative pose, the local points in the search image that correspond to the key points of the query image; to form point matching pairs from the key points and their corresponding local points; and to estimate the relative pose based on the point matching pairs.
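As a non-limiting illustration of estimating a pose from point matching pairs, the following sketch solves the least-squares rigid alignment in closed form. For brevity it is a planar (2D) simplification: the disclosure works with three-dimensional points and a full relative pose, which would typically need an SVD-based or PnP-style solver not shown here. The function name and the 2D setting are assumptions.

```python
import math


def estimate_planar_pose(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst (2D).

    src and dst are equal-length lists of (x, y) point matching pairs.
    Returns (theta, (tx, ty)) such that dst ~= R(theta) * src + t.
    """
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two point matching pairs of equal count")
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    # Accumulate the dot and cross terms of the centered pairs.
    s_cos = 0.0
    s_sin = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)
```

The same two-stage idea applies in 3D: a first estimate yields correspondences, and the enlarged set of point matching pairs is then used for a refined estimate.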

One embodiment further includes an optimization module configured to optimize the current global map based on the relative pose to obtain an optimized global map.

In one embodiment, when optimizing the current global map based on the relative pose to obtain the optimized global map, the optimization module operates as follows.

The optimization module is configured to determine pose drift information based on the relative pose, to determine an optimization strategy based on the pose drift information, and to optimize the current global map according to the optimization strategy to obtain the optimized global map.

In one possible embodiment, when determining the optimization strategy based on the pose drift information and optimizing the current global map according to the optimization strategy to obtain the optimized global map, the optimization module operates as follows.

The optimization module is configured to obtain the optimized global map by adjusting the initial global map through incremental bundle adjustment when the pose drift information satisfies a preset error condition, or by adjusting the initial global map through full bundle adjustment when the pose drift information does not satisfy the error condition.
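As a non-limiting illustration of the strategy selection, the following sketch chooses between the two adjustment modes. This passage does not define the preset error condition, so the sketch assumes it means that both the translational and the rotational drift stay within thresholds; the thresholds, the function name, and the returned labels are hypothetical.

```python
import math


def choose_strategy(drift_translation, drift_rotation_deg,
                    max_translation=0.10, max_rotation_deg=2.0):
    """Pick an optimization strategy from pose drift information.

    drift_translation: (dx, dy, dz) drift implied by the relative pose.
    drift_rotation_deg: rotational drift angle in degrees.
    Assumption: the error condition is met when both magnitudes are
    within their thresholds.
    """
    translation_error = math.sqrt(sum(c * c for c in drift_translation))
    within_limits = (
        translation_error <= max_translation
        and abs(drift_rotation_deg) <= max_rotation_deg
    )
    # Condition met: incremental bundle adjustment; otherwise full bundle adjustment.
    return "incremental_ba" if within_limits else "full_ba"
```

For instance, `choose_strategy((0.03, 0.04, 0.0), 1.0)` selects the incremental mode under the default thresholds, while a drift of half a unit selects the full mode.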

In one possible embodiment, when adjusting the initial global map through full bundle adjustment to obtain the optimized global map, the optimization module operates as follows.

The optimization module is configured to optimize the multi-degree-of-freedom poses of the key frames of the initial global map based on the relative pose to obtain a first global map, and to optimize the key frame poses and map points of the first global map through full bundle adjustment to obtain the optimized global map.

The electronic device described above estimates the relative pose between the query image and the search image from the spatial features of both images. Because these features cover a larger sensing field of view and carry spatial information, the global map can be optimized more accurately.

In addition, the three-dimensional point set is estimated by densely and uniformly extracting key points and ORB descriptors from the image and then completing stereo matching and triangulation under epipolar constraints, so it is distributed more uniformly and densely in space than the global map.

In addition, the global map can be effectively optimized through incremental bundle adjustment and full bundle adjustment, with a shorter run time and higher accuracy.

The electronic device of the embodiments of the present disclosure can execute the method provided by any of the foregoing method embodiments, and its implementation principle is similar. The operations performed by each module of the device correspond to the steps of the method executed by the electronic device in the respective embodiments. For a detailed description of the function of each module, refer to the description of the corresponding method above, which is not repeated here.

The apparatus provided in the embodiments of the present disclosure may implement at least one of its modules by means of an artificial intelligence (AI) model. AI-related functions may be performed by a non-volatile memory, a volatile memory, and a processor.

The processor may include one or more processors. The one or more processors may be general-purpose processors such as a central processing unit (CPU) or an application processor (AP); graphics-dedicated processors such as a graphics processing unit (GPU) or a visual processing unit (VPU); and/or AI-dedicated processors such as a neural processing unit (NPU).

The one or more processors control the processing of input data according to predefined operating rules or an AI model stored in the non-volatile memory and the volatile memory. The predefined operating rules or the AI model are provided through training or learning.

Here, being provided through learning means that a learning algorithm is applied to a plurality of training data to obtain predefined operating rules or an AI model with the desired characteristics. The learning may be performed in the device that runs the AI according to the embodiment, and/or may be implemented by a separate server or system.

The AI model may include a plurality of neural network layers. Each layer has a plurality of weights, and the computation of a layer is performed on the output of the previous layer using the weights of the current layer. Examples of neural networks include, but are not limited to, a convolutional neural network (CNN), a deep neural network (DNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), a generative adversarial network (GAN), and a deep Q-network.
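As a non-limiting illustration of the layer-by-layer computation described above, the following sketch evaluates a small fully connected network in which each layer combines the previous layer's output with its own weights. The fully connected structure, the bias terms, the ReLU activation, and the function names are assumptions chosen for brevity; the disclosure does not prescribe a particular layer type.

```python
def dense_layer(inputs, weights, biases):
    """One layer: weighted sum of the previous output plus bias, then ReLU."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Run the input through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        # The current layer consumes the result of the previous layer.
        activations = dense_layer(activations, weights, biases)
    return activations
```

Training, which the surrounding text attributes to a learning algorithm, would adjust the entries of `weights` and `biases`; only inference is sketched here.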

A learning algorithm is a method of training a predetermined target device (for example, a robot) using a plurality of training data so as to cause, allow, or control the target device to make a determination or prediction. Examples of the learning algorithm include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

The above describes the apparatus provided by the embodiments of the present disclosure from the perspective of functional modules. The following describes the electronic device provided by the embodiments from the perspective of its hardware implementation, together with the computing system of the electronic device.

Based on the same principle as the methods shown in the embodiments of the present disclosure, an embodiment of the present disclosure further provides an electronic device that includes, but is not limited to, a memory storing computer operating instructions and a processor configured to execute any of the methods of the foregoing embodiments by invoking the computer operating instructions. Compared with the prior art, the method executed by the electronic device of the present disclosure optimizes the global map more accurately.

One embodiment provides an electronic device. As shown in FIG. 9, the electronic device 1000 includes a processor 1001 and a memory 1003. The processor 1001 is connected to the memory 1003, for example through a bus 1002. Optionally, the electronic device 1000 may further include a transceiver 1004. In practice, the number of transceivers 1004 is not limited to one, and the structure of the electronic device 1000 does not limit the embodiments of the present disclosure.

The processor 1001 may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement the various illustrative logical blocks, modules, and circuits described in the present disclosure. The processor 1001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors or a combination of a DSP and a microprocessor.

The bus 1002 may include a path for transferring information between the components described above. The bus 1002 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 1002 may be classified as an address bus, a data bus, a control bus, and so on. For ease of illustration, FIG. 9 shows only one thick line, but this does not mean that there is only one bus or only one type of bus.

The memory 1003 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, or a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions. It may also be, without limitation, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

The memory 1003 stores the application program code for carrying out the solutions of the present disclosure, and execution is controlled by the processor 1001. The processor 1001 is configured to execute the application program code stored in the memory 1003 to implement what is described in the foregoing method embodiments.

The electronic device may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (PDA), a tablet computer (PAD), a portable multimedia player (PMP), and a vehicle-mounted terminal (for example, a vehicle-mounted navigation terminal); fixed terminals such as a digital TV and a desktop computer; and intelligent robots. The electronic device shown in FIG. 9 is only an example and does not limit the functions or scope of use of the embodiments of the present disclosure.

An embodiment of the present disclosure provides a computer-readable storage medium. A computer program is stored on the computer-readable storage medium and, when run on a computer, causes the computer to perform the corresponding content of the foregoing method embodiments.

Although the steps in the flowcharts of the drawings are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated in the present disclosure, the execution of these steps is not strictly limited in order, and they may be performed in other orders. In addition, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages need not be completed at the same time and may be executed at different times, and they need not be executed sequentially; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

The computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two.

A computer-readable storage medium may be, for example and without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. Examples of computer-readable storage media include, but are not limited to, an electrical connection having one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus, or device.

In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave that carries computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to electric wire, optical cable, radio frequency (RF), or any suitable combination thereof.

The computer-readable medium may be included in the electronic device, or it may exist separately without being assembled into the electronic device.

The computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device performs the method described in the foregoing embodiments.

Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operation of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functions involved. Each block in the block diagrams and/or flowcharts, and each combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The modules mentioned in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a module does not limit the module itself; for example, the second acquisition module may also be described as a "spatial feature acquisition module".

The above description is merely a description of the preferred embodiments of the present disclosure and of the technical principles used. The scope of the present disclosure is not limited to technical solutions formed by the specific combinations of the technical features described above; it also covers other technical solutions formed by any combination of those technical features or their equivalents without departing from the concept of the present disclosure. One example is a technical solution formed by interchanging the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.

1000: electronic device
1001: processor
1003: memory

Claims

A method performed by an electronic device, the method comprising:
acquiring a search image for a query image;
acquiring a spatial feature of each of the query image and the search image; and
estimating a relative pose between the query image and the search image based on the spatial features.

The method of claim 1,
wherein the spatial feature comprises a three-dimensional point set, and
wherein acquiring the spatial feature comprises:
extracting image feature points including image key points and feature descriptors; and
estimating the three-dimensional point set by performing stereo matching on the image feature points.

The method of claim 2,
wherein estimating the relative pose comprises:
acquiring a feature matching result by matching the spatial feature of the query image with the spatial feature of the search image at least once; and
determining the relative pose based on the feature matching result.

The method of claim 3,
wherein the feature matching result includes a first feature matching pair, and
wherein acquiring the feature matching result comprises:
clustering the three-dimensional point set of each of the query image and the search image to generate the first feature matching pair between a clustering result of the query image and a clustering result of the search image.

The method of claim 4,
wherein generating the first feature matching pair comprises:
determining at least one first cube formed by clustering the three-dimensional point set of the query image;
determining at least one second cube formed by clustering the three-dimensional point set of the search image;
determining a first cluster center of each first cube and a second cluster center of each second cube; and
determining, for the first cluster center of each first cube, a second cluster center that matches it, and determining the first feature matching pair based on the mutually matching first cluster center and second cluster center.

The method of claim 4,
wherein the feature matching result further includes a second feature matching pair,
wherein the obtaining of the feature matching result further comprises:
obtaining the second feature matching pair between the three-dimensional point set of the query image and the three-dimensional point set of the search image by performing a nearest neighbor search and mutual verification on the three-dimensional points of the first feature matching pair, and
wherein the determining of the relative pose comprises:
determining the relative pose based on the second feature matching pair.
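A nearest neighbor search with mutual verification is commonly realized as a mutual nearest neighbor check: a pair is kept only when each element is the other's nearest neighbor. The sketch below assumes that each three-dimensional point carries a descriptor vector and that descriptors are compared by Euclidean distance; both are assumptions for illustration, and the function names are hypothetical.

```python
import math


def nearest_index(item, candidates):
    """Index of the candidate closest to item (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda j: math.dist(item, candidates[j]))


def mutual_nearest_neighbors(query_desc, search_desc):
    """Keep (i, j) only when i and j are each other's nearest neighbor."""
    if not query_desc or not search_desc:
        return []
    pairs = []
    for i, q in enumerate(query_desc):
        j = nearest_index(q, search_desc)                  # forward search
        if nearest_index(search_desc[j], query_desc) == i:  # backward check
            pairs.append((i, j))
    return pairs


# Usage: the third query descriptor also prefers search item 1,
# but search item 1 prefers query item 1, so that pair is rejected.
query = [(0.0, 0.0), (1.0, 1.0), (0.9, 0.9)]
search = [(0.1, 0.0), (1.0, 1.1)]
print(mutual_nearest_neighbors(query, search))  # prints [(0, 0), (1, 1)]
```

The backward check removes one-sided matches, which is why the ambiguous third query point produces no pair.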

The method of claim 6,
wherein the feature matching result further includes a third feature matching pair,
wherein the obtaining of the feature matching result further comprises:
estimating an approximate relative pose between the query image and the search image based on the second feature matching pair; and
projecting the three-dimensional point set of the search image into the coordinate system of the query image according to the approximate relative pose to determine the third feature matching pair between the three-dimensional point set of the query image and the three-dimensional point set of the search image, and
wherein the determining of the relative pose comprises:
determining the relative pose based on the third feature matching pair.
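The projection step can be illustrated with a rigid transform followed by a distance-gated nearest neighbor search. In this sketch the approximate relative pose is assumed to be given as a 3x3 rotation matrix `R` and a translation `t` that map search-frame coordinates into the query frame, and a pair is accepted when the projected point lies within `max_dist` of a query point. The gating rule and the function names are assumptions, not details taken from the claims.

```python
import math


def transform_point(R, t, p):
    """Apply the rigid transform p' = R @ p + t to one 3D point."""
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))


def match_after_projection(query_pts, search_pts, R, t, max_dist):
    """Project search points into the query frame and pair close points.

    Returns (query_index, search_index) pairs. The distance gate
    max_dist is an assumed acceptance rule.
    """
    pairs = []
    for j, p in enumerate(search_pts):
        projected = transform_point(R, t, p)
        best_i, best_d = None, max_dist
        for i, q in enumerate(query_pts):
            d = math.dist(projected, q)
            if d <= best_d:
                best_i, best_d = i, d
        if best_i is not None:
            pairs.append((best_i, j))
    return pairs
```

Because the search points are first brought into the query frame, the matching reduces to a purely geometric proximity test, so even a coarse pose can recover pairs that descriptor matching alone missed.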

The method of claim 3,
wherein the determining of the relative pose comprises:
estimating an initial relative pose between the query image and the search image based on the feature matching result;
determining, in the search image, a local point corresponding to a key point of the query image based on the initial relative pose, and forming a point matching pair from the key point and its corresponding local point; and
estimating the relative pose based on the point matching pair.

The method of claim 1, further comprising:
optimizing a current global map based on the relative pose to obtain an optimized global map.

The method of claim 9,
wherein the acquiring of the optimized global map comprises:
determining pose drift information based on the relative pose; and
determining an optimization strategy based on the pose drift information, and obtaining the optimized global map by optimizing the current global map according to the optimization strategy.

The method of claim 10,
wherein the acquiring of the optimized global map comprises:
acquiring the optimized global map by adjusting an initial global map through incremental bundle adjustment when the pose drift information meets a preset error condition; or
acquiring the optimized global map by adjusting the initial global map through full bundle adjustment when the pose drift information does not meet the error condition.
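The choice between the two optimization strategies can be sketched as a threshold test on the pose drift. Here the drift is assumed to be expressed as a rotation matrix `R_drift` and a translation `t_drift`, and the preset error condition is assumed to mean that both the rotation angle and the translation magnitude stay within given limits. The claims do not define the drift representation or the condition, so these choices, the thresholds, and the returned labels are hypothetical.

```python
import math


def rotation_angle(R):
    """Rotation angle in radians of a 3x3 rotation matrix, from its trace."""
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding noise
    return math.acos(cos_theta)


def choose_strategy(R_drift, t_drift, max_angle_rad, max_translation):
    """Pick an optimization strategy from the pose drift.

    Assumed rule: small drift -> gradual (incremental) bundle adjustment,
    otherwise full bundle adjustment.
    """
    angle = rotation_angle(R_drift)
    shift = math.sqrt(sum(c * c for c in t_drift))
    if angle <= max_angle_rad and shift <= max_translation:
        return "incremental_bundle_adjustment"
    return "full_bundle_adjustment"
```

Under this assumed rule a small drift triggers the cheaper incremental adjustment, and the costlier full adjustment is reserved for cases where the accumulated error is large.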

The method of claim 11,
wherein the acquiring of the optimized global map comprises:
obtaining a first global map by optimizing a multi-degree-of-freedom pose of a key frame of the initial global map based on the relative pose; and
obtaining the optimized global map by optimizing key frame poses and map points of the first global map through full bundle adjustment.

An electronic device comprising:
one or more processors;
a memory; and
one or more application programs,
wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute operations corresponding to the method of any one of claims 1 to 12.

A computer-readable recording medium on which a program for executing the method of claim 1 on a computer is recorded.