KR20210099752A

KR20210099752A - Electronic device for conforming pose of street view image based on two-dimension map information and operating method thereof

Info

Publication number: KR20210099752A
Application number: KR1020200013496A
Authority: KR
Inventors: 김덕화; 이동환
Original assignee: 네이버 주식회사; 네이버랩스 주식회사
Priority date: 2020-02-05
Filing date: 2020-02-05
Publication date: 2021-08-13
Also published as: KR102316232B1

Abstract

According to various embodiments, provided are an electronic device and an operating method thereof. The present invention is to match a pose of a street view image based on two-dimensional map information. The operating method is configured to acquire two-dimensional map information for the outdoor environment and a plurality of street view images, use the street view images to generate a three-dimensional map, use the two-dimensional map information to match the three-dimensional map, and provide a visual localization of the outdoor environment based on the matched three-dimensional map information.

Description

An electronic device for matching poses of a street view image based on two-dimensional map information and an operation method thereof

Various embodiments relate to an electronic device for matching the pose of a street view image based on two-dimensional (2D) map information, and an operating method thereof.

In general, visual localization (VL) was proposed to provide navigation in indoor environments. For example, a robot may perform visual localization using map information for an indoor environment and thereby move indoors. Research is currently under way to extend visual localization to outdoor environments. Visual localization in an outdoor environment requires three-dimensional (3D) map information for that environment. However, various errors occur when visual localization is provided based on 3D map information for an outdoor environment. For example, when cursoring is performed on the 3D map information using 2D map information, the shape or size of an item related to the cursoring may be displayed in distorted form.

Various embodiments provide an electronic device for providing visual localization in an outdoor environment based on 3D map information for the outdoor environment, and an operating method thereof.

Various embodiments provide an electronic device capable of processing 3D map information based on 2D map information while generating the 3D map information, and an operating method thereof.

Various embodiments provide an electronic device for matching the pose of a street view image based on 2D map information, and an operating method thereof.

A method of operating an electronic device according to various embodiments may include acquiring 2D map information for an outdoor environment and a plurality of street view images, generating 3D map information using the 2D map information and the street view images, and providing visual localization for the outdoor environment based on the 3D map information.

An electronic device according to various embodiments may include a memory and a processor connected to the memory and configured to execute at least one instruction stored in the memory. The processor may be configured to acquire 2D map information for an outdoor environment and a plurality of street view images, generate 3D map information using the 2D map information and the street view images, and provide visual localization for the outdoor environment based on the 3D map information.

According to various embodiments, the electronic device may provide visual localization in an outdoor environment by generating 3D map information for the outdoor environment. That is, the electronic device may perform positioning in the outdoor environment based on the 3D map information for the outdoor environment. Here, the electronic device matches, based on 2D map information, the poses of the street view images needed to generate the 3D map information, which may minimize errors that occur when visual localization is provided. For example, when cursoring is performed on the 3D map information using the 2D map information, distortion of the shape or size of an item related to the cursoring may be minimized. In addition, when augmented reality (AR) navigation is provided based on visual localization, point of interest (POI) data of the 2D map information (e.g., stores, crosswalks, buildings) may be utilized.

FIG. 1 is a diagram illustrating an electronic device according to various embodiments.
FIG. 2 is a diagram for explaining 2D map information.
FIGS. 3A and 3B are diagrams illustrating 3D map information.
FIG. 4 is a diagram illustrating a method of operating an electronic device according to various embodiments.
FIG. 5 is a diagram illustrating the 3D map information generation operation of FIG. 4.
FIGS. 6 and 7 are diagrams for explaining the 3D map information generation operation of FIG. 4.
FIG. 8 is a diagram illustrating the pose optimization operation for each street view image of FIG. 5.
FIG. 9 is a diagram for explaining the pose optimization operation for each street view image of FIG. 5.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings.

Various embodiments provide an electronic device for providing visual localization (VL) based on outdoor 3D map information, and an operating method thereof. According to various embodiments, the electronic device may provide visual localization in an outdoor environment by generating 3D map information for the outdoor environment. That is, the electronic device may perform positioning in the outdoor environment based on the 3D map information for the outdoor environment. Here, the electronic device may match, based on 2D map information, the poses of the street view images needed to generate the 3D map information. This may minimize errors that occur when the electronic device provides visual localization. For example, when cursoring is performed on the 3D map information using the 2D map information, distortion of the shape or size of an item related to the cursoring may be minimized. In addition, when AR navigation is provided based on visual localization, point of interest (POI) data of the 2D map information (e.g., stores, crosswalks, buildings) may be utilized.

FIG. 1 is a diagram illustrating an electronic device 100 according to various embodiments. FIG. 2 is a diagram for explaining 2D map information. FIGS. 3A and 3B are diagrams illustrating 3D map information.

Referring to FIG. 1, the electronic device 100 according to various embodiments may include at least one of a communication module 110, an input module 120, an output module 130, a memory 140, or a processor 150. In some embodiments, at least one of the components of the electronic device 100 may be omitted, and at least one other component may be added. In some embodiments, at least two of the components of the electronic device 100 may be implemented as a single integrated circuit. Here, the electronic device 100 may include at least one of a server or an electronic apparatus. For example, the electronic apparatus may include at least one of a smartphone, a mobile phone, a navigation device, a computer, a laptop, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a tablet PC, a game console, a wearable device, an Internet of things (IoT) device, a virtual reality (VR) device, an augmented reality (AR) device, or a robot.

The communication module 110 may allow the electronic device 100 to communicate with external devices 181 and 183. The communication module 110 may establish a communication channel between the electronic device 100 and the external devices 181 and 183 and communicate with the external devices 181 and 183 through the communication channel. Here, the external devices 181 and 183 may include at least one of a satellite, a server, or an electronic apparatus. For example, the electronic apparatus may include at least one of a smartphone, a mobile phone, a navigation device, a computer, a laptop, a digital broadcasting terminal, a PDA, a PMP, a tablet PC, a game console, a wearable device, an IoT device, a VR device, an AR device, or a robot. The communication module 110 may include at least one of a wired communication module or a wireless communication module. The wired communication module may be connected to the external device 181 by wire and communicate over that wired connection. The wireless communication module may include at least one of a short-range communication module or a long-range communication module. The short-range communication module may communicate with the external device 181 using a short-range communication scheme. For example, the short-range communication scheme may include Bluetooth, WiFi Direct, or infrared data association (IrDA). The long-range communication module may communicate with the external device 183 using a long-range communication scheme. Here, the long-range communication module may communicate with the external device 183 through a network 190. For example, the network 190 may include a cellular network, the Internet, or a computer network such as a local area network (LAN) or a wide area network (WAN).

The input module 120 may receive a signal to be used by at least one component of the electronic device 100. The input module 120 may include at least one of an input device configured to allow a user to input a signal directly to the electronic device 100, or a sensor device configured to generate a signal by sensing the surrounding environment. For example, the input device may include at least one of a microphone, a mouse, or a keyboard. In some embodiments, the sensor device may include at least one of touch circuitry configured to sense a touch or a sensor circuit configured to measure the strength of the force generated by the touch.

The output module 130 may output information to the outside of the electronic device 100. The output module 130 may include at least one of a display device configured to output information visually or an audio output device capable of outputting information as an audio signal. For example, the display device may include at least one of a display, a hologram device, or a projector. As an example, the display device may be assembled with at least one of the touch circuitry or the sensor circuit of the input module 120 and implemented as a touch screen. For example, the audio output device may include at least one of a speaker or a receiver.

The memory 140 may store various data used by at least one component of the electronic device 100. For example, the memory 140 may include at least one of a volatile memory or a non-volatile memory. The data may include at least one program and input data or output data related to it. The program may be stored in the memory 140 as software including at least one instruction, and may include at least one of an operating system, middleware, or an application.

The processor 150 may execute a program in the memory 140 to control at least one component of the electronic device 100. In this way, the processor 150 may perform data processing or computation. Here, the processor 150 may execute an instruction stored in the memory 140.

According to various embodiments, the processor 150 may provide visual localization based on outdoor 3D map information. That is, the processor 150 may perform localization in an outdoor environment based on the outdoor 3D map information. This may require 3D map information for the outdoor environment. The processor 150 may generate the 3D map information based on at least one of 2D map information or 2D image information for the outdoor environment. For example, the outdoor environment may be defined as a specific area. The processor 150 may store the 3D map information in the memory 140 and, when needed, perform localization based on the 3D map information.

The processor 150 may acquire 2D map information for the outdoor environment from the external devices 181 and 183, for example a server. The 2D map information may include the location of at least one object, for example a global positioning system (GPS) location. For example, the 2D map information may be acquired as shown in FIG. 2.

The processor 150 may acquire 2D image information by scanning the outdoor environment. The 2D image information may include a plurality of street view images and pose information for each of the street view images. Here, the pose information may include a location, for example a GPS location, and a pose, for example 3-axis position values and 3-axis orientation values. For example, the processor 150 may store the 2D image information in the memory 140 in advance and, when needed, acquire the 2D image information from the memory 140.
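As a non-limiting sketch, the 2D image information described above could be held in a structure like the following. The class names, field names, units, and sample values are all hypothetical and chosen only for illustration; the source does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StreetViewPose:
    """Pose information for one street view image (hypothetical layout)."""
    gps: Tuple[float, float]                 # (latitude, longitude), degrees (assumed)
    position: Tuple[float, float, float]     # 3-axis position values (x, y, z)
    orientation: Tuple[float, float, float]  # 3-axis orientation values, radians (assumed)


@dataclass(frozen=True)
class StreetViewImage:
    """One street view image together with its pose information."""
    image_path: str
    pose: StreetViewPose


def make_image_info(records) -> List[StreetViewImage]:
    """Build the 2D image information: a list of images, each with its pose."""
    return [
        StreetViewImage(path, StreetViewPose(gps, position, orientation))
        for path, gps, position, orientation in records
    ]


# Illustrative record only; the values are made up.
image_info = make_image_info([
    ("img_0001.jpg", (37.3595, 127.1052), (0.0, 0.0, 2.5), (0.0, 0.0, 1.57)),
])
```

Keeping the GPS location separate from the 6-degree-of-freedom pose mirrors the distinction the description draws between a location and a pose.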

The processor 150 may generate 3D map information from the street view images. Here, the processor 150 may apply a structure from motion (SfM) algorithm to the street view images to generate the 3D map information. For example, the 3D map information may be generated as shown in FIG. 3A or FIG. 3B.

The processor 150 may update the pose information of the 3D map information based on the 2D map information. Here, the processor 150 may detect at least one object from the 2D map information. For example, the processor 150 may detect the outline of the object in the 2D map information. The processor 150 may then detect a region corresponding to the object in the 3D map information. Here, the processor 150 may detect the region in the 3D map information based on the location of the object in the 2D map information. For example, the processor 150 may detect the boundary points of the region based on a point cloud of the 3D map information. The processor 150 may then update the pose of the 3D map information so that the region is matched to the object. In other words, the processor 150 may match the boundary points of the region in the 3D map information to the outline of the object in the 2D map information. For example, the processor 150 may match the region of the 3D map information to the object of the 2D map information using an iterative closest point (ICP) algorithm.

FIG. 4 is a diagram illustrating a method of operating the electronic device 100 according to various embodiments.

Referring to FIG. 4, in operation 410 the electronic device 100 may acquire 2D map information and 2D image information for an outdoor environment. The processor 150 may acquire the 2D map information for the outdoor environment from the external devices 181 and 183, for example a server. The 2D map information may include the location of at least one object, for example a GPS location. Meanwhile, the processor 150 may acquire the 2D image information by scanning the outdoor environment. The 2D image information may include a plurality of street view images and pose information for each of the street view images. Here, the pose information may include a location, for example a GPS location, and a pose, for example 3-axis position values and 3-axis orientation values. For example, the processor 150 may store the 2D image information in the memory 140 in advance and then acquire the 2D image information from the memory 140.

In operation 420, the electronic device 100 may generate 3D map information for the outdoor environment based on at least one of the 2D map information or the 2D image information. The processor 150 may generate the 3D map information based on the 2D image information. Here, the processor 150 may apply the SfM algorithm to the street view images to generate the 3D map information. This is described in more detail below with reference to FIGS. 5, 6, and 7.

FIG. 5 is a diagram illustrating the 3D map information generation operation of FIG. 4. FIGS. 6 and 7 are diagrams for explaining the 3D map information generation operation of FIG. 4.

Referring to FIG. 5, in operation 510 the electronic device 100 may detect at least one object 610 from the 2D map information. The processor 150 may detect the object 610 from the 2D map information. For example, as shown in FIG. 6, the processor 150 may detect the outline 611 of the object 610 in the 2D map information. Here, the processor 150 may obtain the location of the object 610 from the 2D map information.
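As a non-limiting sketch of operation 510, the outline of an object can be represented as one line per polygon edge. The example assumes the object is available as a closed 2D polygon in a local metric frame and writes each line as n . x + c = 0 with a unit normal n; both the function name `outline_lines` and this particular line representation are assumptions made for illustration.

```python
import math


def outline_lines(footprint):
    """Return one (unit_normal, offset) pair per edge of a closed 2D polygon.

    Every point x on an edge satisfies normal . x + offset = 0.
    """
    lines = []
    count = len(footprint)
    for i in range(count):
        (x1, y1), (x2, y2) = footprint[i], footprint[(i + 1) % count]
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # skip degenerate (zero-length) edges
        # Rotate the edge direction by -90 degrees to obtain a unit normal.
        nx, ny = dy / length, -dx / length
        offset = -(nx * x1 + ny * y1)
        lines.append(((nx, ny), offset))
    return lines


# Hypothetical 4 m x 3 m building footprint.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
lines = outline_lines(square)
```

Each returned line extends infinitely; a fuller implementation would also keep the edge endpoints so that distances can be clamped to the segment.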

In operation 520, the electronic device 100 may generate 3D map information based on the 2D image information. The processor 150 may generate the 3D map information using the street view images of the 2D image information. In operation 530, the electronic device 100 may detect a region 620 corresponding to the object 610 from the 3D map information. Here, the processor 150 may detect the region 620 corresponding to the object 610 in the 3D map information based on the location of the object 610 in the 2D map information. For example, as shown in FIG. 6, the processor 150 may generate a point cloud 621 as the 3D map information and detect the boundary points 623 of the region based on the point cloud 621.
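As a non-limiting sketch of operation 530, candidate points for the region can be selected by projecting the point cloud onto the ground plane and keeping the points near the object's location. The example assumes the point cloud and the 2D map share a local metric frame with z pointing up; the function name `region_points`, the bounding-box test, and the `margin` parameter are hypothetical simplifications, not the detection method of the source.

```python
def region_points(point_cloud, footprint, margin=2.0):
    """Project 3D points to 2D and keep those near the footprint's bounding box."""
    xs = [x for x, _ in footprint]
    ys = [y for _, y in footprint]
    min_x, max_x = min(xs) - margin, max(xs) + margin
    min_y, max_y = min(ys) - margin, max(ys) + margin
    return [
        (x, y)  # drop the height coordinate
        for x, y, _z in point_cloud
        if min_x <= x <= max_x and min_y <= y <= max_y
    ]


# Made-up data: two points near the footprint, one far away.
footprint = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
cloud = [(1.0, 1.0, 5.0), (10.0, 10.0, 2.0), (5.5, 2.0, 0.5)]
selected = region_points(cloud, footprint, margin=2.0)
```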

In operation 540, the electronic device 100 may optimize the pose of the region 620 so that the region 620 is matched to the object 610. As shown in FIG. 7, the processor 150 may process the point cloud 621 of the 3D map information to match the boundary points 623 of the region 620 to the outline 611 of the object 610. For example, the processor 150 may match the region 620 to the object 610 using the ICP algorithm. Here, the boundary points 623 of the region 620 may undergo at least one of rotation or translation. This is described in more detail below with reference to FIGS. 8 and 9.

FIG. 8 is a diagram illustrating the pose optimization operation for each street view image of FIG. 5. FIG. 9 is a diagram for explaining the pose optimization operation for each street view image of FIG. 5.

Referring to FIG. 8, in operation 810 the electronic device 100 may calculate an error between the object 610 and the region 620. The processor 150 may calculate the error based on the distance between the object 610 and the region 620. Here, the processor 150 may calculate an error for each of the boundary points 623 of the region 620 and sum the errors.

For example, a line equation for the points on the outline 611 of the object 610 may be defined as in [Equation 1] below. The distance from the boundary points 623 of the region 620 to the points on the outline 611 of the object 610 may be determined by a distance function such as [Equation 2] below. The processor 150 may then sum the errors from the boundary points 623 of the region 620 to the points on the outline 611 of the object 610 based on an error function such as [Equation 3] below. Here, the error function may be based on the least squares method applied to the distance function. The processor 150 may also assign at least one weight to each of the boundary points 623 of the region 620 based on a predetermined criterion.

Here, the terms of [Equation 1] represent, respectively, the line equation, a vector for the points on the outline 611 of the object 610, a variable value, and a normal vector.

Here, the terms of [Equation 2] represent, respectively, the distance function, a line vector determined based on [Equation 1], and a vector for the boundary points 623 of the region 620. The vector for the boundary points 623 may be related to the quantities defined above as shown in (a) of FIG. 9, so that the distance function of [Equation 2] can be defined based on (b) of FIG. 9.

Here, the terms of [Equation 3] represent, respectively, the error function, a rotation matrix, the total number of boundary points 623 of the region 620, the identifier of each point 623, and a weight. For example, the weight may reflect an importance assigned according to semantic information (buildings, street trees, traffic lights, roads, etc.) obtainable through a semantic segmentation technique when the boundary points 623 of the region 620 are generated, or may be assigned differently according to the number of feature points used to generate the boundary points 623 of the region 620 or according to the SfM estimation error.
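As a non-limiting sketch, a weighted least-squares error of the kind described for operation 810 can be written as follows. It assumes each outline edge is given as a line n . x + c = 0 with a unit normal n, that each point is scored against its nearest line, and that the pose is a 2D rotation angle plus a translation; the function names, the explicit translation term, and the sample values are assumptions for illustration and do not reproduce the symbols of [Equation 1] to [Equation 3].

```python
import math


def point_to_line_distance(point, normal, offset):
    """Signed distance from a 2D point to the line normal . x + offset = 0."""
    return normal[0] * point[0] + normal[1] * point[1] + offset


def transform(point, theta, translation):
    """Rotate a 2D point by theta, then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + translation[0], s * x + c * y + translation[1])


def weighted_error(points, lines, weights, theta=0.0, translation=(0.0, 0.0)):
    """Sum over all points of weight * (distance to the nearest line) squared."""
    total = 0.0
    for point, weight in zip(points, weights):
        moved = transform(point, theta, translation)
        nearest = min(abs(point_to_line_distance(moved, n, c)) for n, c in lines)
        total += weight * nearest ** 2
    return total


# Two outline edges: y = 0 and x = 4. Points and weights are made up.
lines = [((0.0, -1.0), 0.0), ((1.0, 0.0), -4.0)]
points = [(1.0, 0.5), (4.5, 1.0)]
weights = [1.0, 2.0]
error = weighted_error(points, lines, weights)
```

In this example the first point lies 0.5 from the line y = 0 and the second lies 0.5 from the line x = 4, so the weighted sum is 1.0 x 0.25 + 2.0 x 0.25 = 0.75. A larger weight makes a point, for instance one labeled as a building by semantic segmentation, count more toward the total.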

In operation 820, the electronic device 100 may compare the error between the object 610 and the region 620 with a predetermined threshold. The processor 150 may determine whether the error is less than or equal to the threshold. For example, the threshold may be greater than or equal to zero. If the error is less than or equal to the threshold in operation 820, the electronic device 100 may return to FIG. 4 and proceed to operation 430.

Meanwhile, if the error exceeds the threshold in operation 820, the electronic device 100 may estimate the pose of the region 620 in the 3D map information in operation 830. The processor 150 may estimate the pose of the region 620 in the 3D map information in the direction in which the error decreases. That is, the processor 150 may estimate the positions of the boundary points 623 of the region 620 so that their distances to the target points on the outline 611 of the object 610 become shorter. The electronic device 100 may then return to FIG. 5 and proceed to operation 550. Referring again to FIG. 5, in operation 550 the electronic device 100 may update the 3D map information and the street view images based on the optimized pose. The processor 150 may update the 3D map information using the optimized pose. The processor 150 may also update the pose of at least one of the street view images using the optimized pose. The electronic device 100 may then return to FIG. 4 and proceed to operation 430.

According to an embodiment, although not shown, the electronic device 100 may perform operations 530, 540, and 550 again after performing operation 550. In this way, the electronic device 100 may re-detect, in operation 530, the region 620 corresponding to the object 610 from the 3D map information. In operation 540, the electronic device 100 may optimize the pose of the region 620 so that the region 620 is matched to the object 610. The electronic device 100 may then update, in operation 550, the 3D map information and the street view images based on the optimized pose. Here, the electronic device 100 may repeat operations 530, 540, and 550 until the error becomes less than or equal to the threshold in operation 820 of FIG. 8.
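The loop of operations 820 and 830 (compute the error, compare it with the threshold, re-estimate the pose so that the error shrinks, repeat) can be illustrated with a small 2D sketch. This is not the implementation described in this document: it assumes the object 610 is a closed polygon given by its vertices, the region 620 is a set of outline points already projected onto the map plane, the error is the root-mean-square point-to-outline distance, and the pose is a 2D rigid transform fitted by least squares. The function names, the default threshold, and the iteration cap are hypothetical.

```python
import numpy as np

def closest_points_on_outline(points, outline):
    """Nearest point on a closed polygon outline for each input point."""
    a = outline                                # segment start points, (M, 2)
    b = np.roll(outline, -1, axis=0)           # segment end points (closed polygon)
    ab = b - a
    ap = points[:, None, :] - a[None, :, :]    # (N, M, 2)
    # Parameter of the projection onto each segment, clamped to the segment.
    t = np.clip((ap * ab).sum(-1) / (ab * ab).sum(-1), 0.0, 1.0)
    proj = a[None] + t[..., None] * ab[None]   # (N, M, 2)
    dist = np.linalg.norm(points[:, None, :] - proj, axis=-1)
    nearest = dist.argmin(axis=1)
    return proj[np.arange(len(points)), nearest]

def outline_error(points, outline):
    """RMS distance between the region points and the object outline."""
    targets = closest_points_on_outline(points, outline)
    return float(np.sqrt(((points - targets) ** 2).sum(axis=1).mean()))

def rigid_fit(src, dst):
    """Least-squares rotation R and translation t such that R @ src + t ~ dst."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    u, _, vt = np.linalg.svd((src - cs).T @ (dst - cd))
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0  # avoid reflections
    r = vt.T @ np.diag([1.0, sign]) @ u.T
    return r, cd - r @ cs

def align_region_to_object(region_pts, outline, threshold=1e-3, max_iters=50):
    """Repeat 'measure error, compare, re-estimate pose' until error <= threshold."""
    pts = np.asarray(region_pts, dtype=float).copy()
    for _ in range(max_iters):
        targets = closest_points_on_outline(pts, outline)
        error = np.sqrt(((pts - targets) ** 2).sum(axis=1).mean())
        if error <= threshold:                 # analogue of operation 820
            break
        r, t = rigid_fit(pts, targets)         # analogue of operation 830
        pts = pts @ r.T + t                    # apply the updated pose
    return pts, outline_error(pts, outline)
```

In this sketch the same rigid transform would also be applied to the camera poses of the street view images that observe the region, which corresponds to the update in operation 550.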

Referring back to FIG. 4, in operation 430, the electronic device 100 may provide visual localization based on the 3D map information.

According to an embodiment, the processor 150 may perform positioning in the outdoor environment based on the 3D map information. For example, the processor 150 may perform positioning in response to a query image received from the external devices 181 and 183. The processor 150 may receive the query image from the external devices 181 and 183 through the communication module 110. Based on the query image, the processor 150 may detect the 3D map information for one of the plurality of regions 510. Here, the processor 150 may extract feature information from the query image through a deep learning model and detect the 3D map information using the feature information. The processor 150 may then estimate the location of the point corresponding to the query image in the 3D map information.
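The retrieval step described above, in which feature information from the query image selects the 3D map information of one region, can be sketched as a nearest-neighbour search over descriptors. The sketch assumes that a deep learning model (not specified in this document) has already produced one fixed-length descriptor for the query image and one for each of the regions 510, and that cosine similarity is used for comparison; both the function name and the similarity measure are assumptions made for illustration.

```python
import numpy as np

def retrieve_region(query_desc, region_descs):
    """Return the index of the region whose descriptor best matches the query.

    query_desc:   (D,) descriptor of the query image (assumed precomputed).
    region_descs: (K, D) one descriptor per mapped region (assumed precomputed).
    """
    q = np.asarray(query_desc, dtype=float)
    r = np.asarray(region_descs, dtype=float)
    q = q / np.linalg.norm(q)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    scores = r @ q                      # cosine similarity to every region
    best = int(np.argmax(scores))
    return best, float(scores[best])
```

After the region is selected, the position of the query image would still have to be estimated within that region's 3D map information; that step is outside this sketch.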

According to another embodiment, the processor 150 may perform cursoring based on the 3D map information. For example, the processor 150 may perform cursoring in response to a user's request. The processor 150 may receive, through the input module 120, a user's request for a specific region 620 in the 3D map information. The processor 150 may determine the object 610 corresponding to the specific region 620 from the 2D map information and detect depth information of the object 610. The processor 150 may then estimate depth information of the specific region 620 in the 3D map information using the depth information of the object 610. Because the specific region 620 is matched to the object 610, the processor 150 may take the depth information of the object 610 as the depth information of the specific region 620. The processor 150 may thereby perform cursoring on the specific region 620 in the 3D map information using the depth information of the specific region 620.
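This document does not state how the depth information of the object 610 is derived from the 2D map information. One plausible reading, shown here only as an assumption, is that the depth along a viewing direction is the distance from the viewpoint to the nearest edge of the object's footprint polygon in the map plane. The function name and the footprint representation below are hypothetical.

```python
import numpy as np

def ray_depth_to_footprint(origin, direction, footprint):
    """Distance along a 2D ray to the nearest edge of a closed footprint polygon.

    Returns None when the ray does not hit the footprint.
    """
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    a = np.asarray(footprint, dtype=float)
    b = np.roll(a, -1, axis=0)              # closed polygon: last vertex joins first
    best = None
    for p, q in zip(a, b):
        e = q - p
        denom = d[0] * e[1] - d[1] * e[0]   # 2D cross product of ray and edge
        if abs(denom) < 1e-12:
            continue                        # ray is parallel to this edge
        w = p - o
        t = (w[0] * e[1] - w[1] * e[0]) / denom   # distance along the ray
        s = (w[0] * d[1] - w[1] * d[0]) / denom   # position along the edge, 0..1
        if t >= 0.0 and 0.0 <= s <= 1.0 and (best is None or t < best):
            best = t
    return best
```

Under this assumption, because the region 620 is matched to the object 610, the returned distance could be used directly as the depth at which a cursor is placed in the 3D map information.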

According to yet another embodiment, in providing augmented reality (AR) navigation based on visual localization, the processor 150 may utilize point-of-interest (POI) data (e.g., stores, crosswalks, buildings) from the 2D map information.

A method of operating the electronic device 100 according to various embodiments may include acquiring 2D map information for an outdoor environment and a plurality of street view images, generating 3D map information using the 2D map information and the street view images, and providing visual localization for the outdoor environment based on the 3D map information.

According to various embodiments, generating the 3D map information may include generating the 3D map information from the street view images, and updating a pose of the 3D map information based on the 2D map information.

According to various embodiments, generating the 3D map information may further include detecting the object 610 from the 2D map information, detecting the region 620 corresponding to the object 610 from the 3D map information, updating the pose of the 3D map information so that the region 620 is matched to the object 610, and updating the pose of at least one of the street view images so that the region 620 is matched to the object 610.

According to various embodiments, generating the 3D map information may include generating the 3D map information from the street view images using an SfM algorithm.

According to various embodiments, updating the pose of the 3D map information may include calculating an error between the object 610 and the region 620, estimating the pose of the 3D map information so that the error is reduced, and updating the 3D map information with the estimated pose.

According to various embodiments, estimating the pose of the 3D map information may be performed when the error exceeds a predetermined threshold.

According to various embodiments, detecting the region 620 corresponding to the object 610 may be repeated after updating with the estimated pose.

According to various embodiments, updating the pose of the 3D map information may include updating the pose of the 3D map information using an ICP algorithm so that the region 620 is matched to the object 610.

According to various embodiments, providing the visual localization may include detecting depth information of the object 610 from the 2D map information, estimating depth information of the region 620 corresponding to the object 610 in the 3D map information using the detected depth information, and providing the visual localization using the estimated depth information.

The electronic device 100 according to various embodiments may include a memory 140, and a processor 150 connected to the memory 140 and configured to execute at least one instruction stored in the memory 140.

According to various embodiments, the processor 150 may be configured to acquire 2D map information for an outdoor environment and a plurality of street view images, generate 3D map information using the 2D map information and the street view images, and provide visual localization for the outdoor environment based on the 3D map information.

According to various embodiments, the processor 150 may be configured to generate the 3D map information from the street view images and update a pose of the 3D map information based on the 2D map information.

According to various embodiments, the processor 150 may be configured to detect the object 610 from the 2D map information, detect the region 620 corresponding to the object 610 from the 3D map information, update the pose of the 3D map information so that the region 620 is matched to the object 610, and update the pose of at least one of the street view images so that the region 620 is matched to the object 610.

According to various embodiments, the processor 150 may be configured to generate the 3D map information from the street view images using an SfM algorithm.

According to various embodiments, the processor 150 may be configured to calculate an error between the object 610 and the region 620, estimate the pose of the 3D map information so that the error is reduced, and update the 3D map information with the estimated pose.

According to various embodiments, the processor 150 may be configured to estimate the pose of the 3D map information when the error exceeds a predetermined threshold.

According to various embodiments, the processor 150 may be configured to re-detect the region 620 corresponding to the object 610 from the 3D map information after updating the 3D map information with the estimated pose.

According to various embodiments, the processor 150 may be configured to update the pose of the 3D map information using an ICP algorithm so that the region 620 is matched to the object 610.

According to various embodiments, the processor 150 may be configured to detect depth information of the object 610 from the 2D map information, estimate depth information of the region 620 corresponding to the object 610 in the 3D map information using the detected depth information, and provide the visual localization using the estimated depth information.

Various embodiments of this document may be implemented as a computer program including one or more instructions stored in a storage medium (e.g., the memory 140) readable by a computer device (e.g., the electronic device 100). For example, a processor (e.g., the processor 150) of the computer device may call at least one of the one or more instructions stored in the storage medium and execute it. This enables the computer device to be operated to perform at least one function according to the at least one called instruction. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The computer-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory' only means that the storage medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between a case where data is stored semi-permanently in the storage medium and a case where data is stored temporarily.

A computer program according to various embodiments may be stored in a computer-readable storage medium so as to, in combination with a computer device, cause the computer device to execute a method of providing visual localization.

According to various embodiments, the method of providing visual localization may include acquiring 2D map information for an outdoor environment and a plurality of street view images, generating 3D map information using the 2D map information and the street view images, and providing visual localization for the outdoor environment based on the 3D map information.

According to various embodiments, the electronic device 100 may provide visual localization in an outdoor environment by generating 3D map information for the outdoor environment. That is, the electronic device 100 may perform positioning in the outdoor environment based on the 3D map information for the outdoor environment. Because the electronic device 100 matches, based on the 2D map information, the poses of the street view images needed to generate the 3D map information, errors that occur when providing visual localization may be minimized. For example, when cursoring is performed on the 3D map information using the 2D map information, distortion of the shape or size of an item related to the cursoring may be minimized.

The various embodiments of this document and the terms used herein are not intended to limit the technology described in this document to a specific embodiment, and should be understood to include various modifications, equivalents, and/or alternatives of the embodiments. In connection with the description of the drawings, like reference numerals may be used for like components. A singular expression may include the plural unless the context clearly indicates otherwise. In this document, expressions such as "A or B", "at least one of A and/or B", "A, B or C", or "at least one of A, B and/or C" may include all possible combinations of the items listed together. Expressions such as "first" and "second" may modify the corresponding components regardless of order or importance, and are used only to distinguish one component from another without limiting the components. When a (e.g., first) component is referred to as being "(functionally or communicatively) connected" or "coupled" to another (e.g., second) component, the component may be connected to the other component directly or through yet another component (e.g., a third component).

As used in this document, the term "module" includes a unit composed of hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed component, or a minimum unit or part thereof that performs one or more functions. For example, a module may be implemented as an application-specific integrated circuit (ASIC).

According to various embodiments, each of the described components (e.g., a module or a program) may include a single entity or a plurality of entities. According to various embodiments, one or more of the above-described components or operations may be omitted, or one or more other components or operations may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into one component. In this case, the integrated component may perform one or more functions of each of the plurality of components in the same or a similar manner as they were performed by the corresponding component before the integration. According to various embodiments, operations performed by a module, a program, or another component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

Claims

A method of operating an electronic device, comprising:
acquiring two-dimensional map information for an outdoor environment and a plurality of street view images;
generating 3D map information by using the 2D map information and the street view images; and
providing visual localization for the outdoor environment based on the 3D map information.

The method of claim 1, wherein the generating of the 3D map information comprises:
generating the 3D map information from the street view images; and
updating the pose of the 3D map information based on the 2D map information.

The method of claim 2, wherein the generating of the 3D map information further comprises:
detecting an object from the 2D map information;
detecting a region corresponding to the object from the 3D map information;
updating the pose of the 3D map information so that the region is matched to the object; and
updating the pose of at least one of the street view images so that the region is matched to the object.

The method of claim 2, wherein the generating of the 3D map information comprises:
generating the 3D map information from the street view images by using a structure from motion (SfM) algorithm.

The method of claim 3, wherein updating the pose of the 3D map information comprises:
calculating an error between the object and the region;
estimating the pose of the 3D map information so that the error is reduced; and
updating the 3D map information with the estimated pose.

The method of claim 5, wherein the estimating of the pose of the 3D map information is performed if the error exceeds a predetermined threshold.

The method of claim 6, wherein the detecting of the region corresponding to the object is repeated after the updating with the estimated pose.

The method of claim 3, wherein updating the pose of the 3D map information comprises:
updating the pose of the 3D map information so that the region is matched to the object by using an iterative closest point (ICP) algorithm.

The method of claim 1 , wherein providing the visual localization comprises:
detecting depth information of an object from the two-dimensional map information;
estimating depth information of a region corresponding to the object from the 3D map information by using the detected depth information; and
providing the visual localization by using the estimated depth information.

An electronic device comprising:
a memory; and
a processor coupled to the memory and configured to execute at least one instruction stored in the memory,
wherein the processor is configured to:
acquire two-dimensional map information for an outdoor environment and a plurality of street view images;
generate 3D map information by using the 2D map information and the street view images; and
provide visual localization for the outdoor environment based on the 3D map information.

The device of claim 10, wherein the processor is configured to:
generate the 3D map information from the street view images; and
update the pose of the 3D map information based on the 2D map information.

The device of claim 11, wherein the processor is configured to:
detect an object from the 2D map information;
detect a region corresponding to the object from the 3D map information;
update the pose of the 3D map information so that the region is matched to the object; and
update the pose of at least one of the street view images so that the region is matched to the object.

The device of claim 10, wherein the processor is configured to:
detect depth information of an object from the 2D map information;
estimate depth information of a region corresponding to the object in the 3D map information by using the detected depth information; and
provide the visual localization by using the estimated depth information.

A computer program stored in a computer-readable recording medium to execute, in combination with a computer device, a method of providing visual localization on the computer device,
wherein the method of providing visual localization comprises:
acquiring two-dimensional map information for an outdoor environment and a plurality of street view images;
generating 3D map information by using the 2D map information and the street view images; and
providing visual localization for the outdoor environment based on the 3D map information.