KR20210050997A

KR20210050997A - Method and apparatus for estimating pose, computer-readable storage medium and computer program for controlling the holder device

Info

Publication number: KR20210050997A
Application number: KR1020190135690A
Authority: KR
Inventors: 임승욱; 유연걸; 윤찬민; 정준영
Original assignee: 에스케이텔레콤 주식회사
Priority date: 2019-10-29
Filing date: 2019-10-29
Publication date: 2021-05-10

Abstract

A pose estimation method according to an embodiment may include: collecting a plurality of first images from a camera of a terminal; collecting a plurality of second images from a DB using GPS information of the terminal; estimating a first pose of the camera by comparing any one of the plurality of first images with the plurality of second images; estimating a second pose of the camera using the plurality of first images; and estimating a third pose of the camera by comparing the first pose and the second pose estimated at the same time point.
By using the first pose and the second pose from the same time point, the embodiment can exclude cases in which either the first pose or the second pose is inaccurate, and can thereby estimate the camera pose more accurately.

Description

Pose estimation method and apparatus, computer-readable recording medium and computer program {METHOD AND APPARATUS FOR ESTIMATING POSE, COMPUTER-READABLE STORAGE MEDIUM AND COMPUTER PROGRAM FOR CONTROLLING THE HOLDER DEVICE}

The embodiment relates to a pose estimation method for effectively estimating the pose of a camera provided in a terminal.

Augmented Reality (AR) is a technology that determines an approximate location from position and direction information, compares facility information, such as information on nearby buildings, with the live image information captured as the camera moves, identifies the service the user wants, and provides the related information.

More specifically, augmented reality is a field of Virtual Reality (VR). It is a computer graphics technique that composites virtual objects into a real environment so that they appear to be objects that exist in the original environment. Unlike conventional virtual reality, which deals only with virtual spaces and objects, augmented reality composites virtual objects onto the real world and can thereby supply additional information that is difficult to obtain from the real world alone.

With the commercialization of 5G communication, augmented reality technology is attracting attention in the field of mobile AR used in communication terminals. Current mobile AR applications generally use either marker-based mobile AR technology or sensor-based mobile AR technology.

Marker-based mobile AR technology photographs a marker corresponding to a real object together with the real object that is to be augmented with a virtual object, and recognizes the real object by recognizing the marker. Sensor-based mobile AR technology infers the current position of the terminal and the direction in which it is facing using the GPS, digital compass, and similar sensors mounted on the terminal, and overlays POI (Point of Interest) information corresponding to the image in the inferred direction.

However, marker-based mobile AR technology cannot augment a virtual object without a marker, and sensor-based mobile AR technology cannot place a virtual object accurately on a specific object because of errors in the sensed position and direction of the terminal.

To solve the above problems, an object of the embodiment is to provide a pose estimation method and a pose estimation apparatus for accurately estimating the pose of a camera.

A pose estimation method according to an embodiment may include: collecting a plurality of first images from a camera of a terminal; collecting a plurality of second images from a DB using GPS information of the terminal; estimating a first pose of the camera by comparing any one of the plurality of first images with the plurality of second images; estimating a second pose of the camera using the plurality of first images; and estimating a third pose of the camera by comparing the first pose and the second pose estimated at the same time point.

In the step of estimating the third pose of the camera, the third pose may be estimated using the Mahalanobis distance between the first pose and the second pose.

In the step of estimating the third pose of the camera, if the Mahalanobis distance is smaller than a first threshold, the third pose may be estimated as a weighted average of the first pose and the second pose.

In the step of estimating the third pose of the camera, if the Mahalanobis distance is greater than a second threshold, which is greater than the first threshold, the second pose may be taken as the third pose.

In the step of estimating the third pose of the camera, if the Mahalanobis distance lies between the first threshold and the second threshold, the third pose may be estimated by increasing the weight given to the second pose.

In the step of estimating the third pose of the camera, the first pose may be measured every first period and the second pose may be measured every second period, which is shorter than the first period. At time points at which the first pose is not measured, the second pose may be taken as the third pose.

After the step of estimating the third pose, the method may include determining, using the third pose, the position at which a virtual object is to be rendered on the screen of the terminal.

A pose estimation apparatus according to an embodiment may include: a camera image collection unit that collects a plurality of first images captured by a camera of a terminal; a DB image collection unit that collects a plurality of second images from a DB using GPS information of the terminal; a first pose estimating unit that estimates a first pose of the camera by comparing the first image with the plurality of second images; a second pose estimating unit that estimates a second pose of the camera using the plurality of first images; and a third pose estimating unit that estimates a third pose of the camera by comparing the first pose and the second pose at the same time point.

By using the first pose and the second pose from the same time point, the embodiment can exclude cases in which either the first pose or the second pose is inaccurate, and can thereby estimate the camera pose more accurately.

FIG. 1 is a block diagram showing an augmented reality system according to an embodiment.
FIG. 2 is a block diagram showing a pose estimation apparatus according to an embodiment.
FIG. 3 is a block diagram showing a pose estimation method according to an embodiment.
FIG. 4 is a diagram for explaining a process of collecting DB images according to an embodiment.
FIG. 5 is a diagram for explaining a step of estimating a third pose according to an embodiment.

Hereinafter, embodiments will be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing an augmented reality system according to an embodiment, and FIG. 2 is a block diagram showing a pose estimation apparatus according to an embodiment.

Referring to FIG. 1, an augmented reality system 1000 according to an embodiment may include a pose estimation apparatus 100.

The pose estimation apparatus 100 may estimate the pose of the camera 240. The pose estimation apparatus 100 includes a processor, and the processor may control the overall operation of the pose estimation apparatus 100.

The pose estimation apparatus 100 may include a transceiver (not shown). Using the transceiver, the pose estimation apparatus 100 may receive images from the map generation server 300 and transmit pose information to the terminal 200. A pose may include information on the position of the camera 240 and information on its direction.

Using the transceiver, the pose estimation apparatus 100 may receive from the terminal 200 an initialization value, map information, GPS information of the terminal 200, 2D images captured by the terminal 200, and/or information about the camera 240 included in the terminal 200.

The pose estimation apparatus 100 may include a memory (not shown). The memory may store the information required to execute a program and the received images.

The augmented reality system according to the embodiment may include a terminal 200.

The terminal 200 may include a processor 210, a transceiver 220, a memory 230, a camera 240, a sensor unit 250, and an AR program 260.

The processor 210 may control the overall operation of the terminal 200.

The processor 210 may collect images captured by the camera 240 and store them as 2D images.

In addition, the processor 210 may transmit an initialization value, map information of the area in which a virtual object is to be augmented, GPS information, and/or information about the camera 240 to the pose estimation apparatus 100 using the transceiver 220.

The processor 210 may receive, using the transceiver 220, the pose information of the camera 240 generated by the pose estimation apparatus 100.

The processor 210 may execute the augmented reality (AR) program 260. By executing the augmented reality program 260, the processor 210 may determine the pose of the camera 240 from the pose value estimated by the pose estimation apparatus 100, determine the position within the image being captured by the camera 240 at which a virtual object is to be augmented, and augment the virtual object at the determined position.

The processor 210 may determine the pose of the camera 240 using the pose information of the camera 240 received from the pose estimation apparatus 100, and may update the pose of the camera 240 using the sensor unit 250.

The memory 230 may store the augmented reality program 260, and the processor 210 may load the augmented reality program 260 and the information required to execute it from the memory 230.

The sensor unit 250 may be used to determine the position and direction of the camera 240. The sensor unit 250 may include at least one of a GPS sensor, a tilt sensor, a geomagnetic sensor, a gravity sensor, a gyro sensor, and an acceleration sensor.

The augmented reality system according to the embodiment may include a map generation server 300 (hereinafter referred to as the 'DB').

The DB 300 may generate a plurality of second images of a specific area captured from multiple viewpoints and transmit the generated second images to the pose estimation apparatus 100.

A second image may include a 2D image, the position (u, v) on the 2D image of each of a plurality of feature points included in the 2D image, the absolute 3D coordinates (x, y, z) on the earth of each of the feature points, and an image descriptor for each of the feature points. Here, the image descriptor distinguishes a feature point in an image of one area from feature points in images of other areas, and may be a multidimensional vector that expresses the correlation between the feature point and the pixels surrounding it.
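The contents of a second image listed above can be pictured as a small record type. The following is a minimal sketch only; the class names `FeaturePoint` and `SecondImage`, the `image_id` field, and the sample values are hypothetical and do not come from the publication.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FeaturePoint:
    """One feature point of a DB (second) image."""
    uv: Tuple[float, float]          # position (u, v) on the 2D image
    xyz: Tuple[float, float, float]  # absolute 3D coordinates (x, y, z)
    descriptor: List[float]          # multidimensional image descriptor


@dataclass
class SecondImage:
    """A DB image together with its annotated feature points."""
    image_id: str                    # hypothetical identifier
    pixels: bytes                    # encoded 2D image data
    features: List[FeaturePoint] = field(default_factory=list)


# Hypothetical sample record with a single feature point.
record = SecondImage(
    image_id="db-0001",
    pixels=b"",
    features=[FeaturePoint(uv=(120.5, 64.0),
                           xyz=(10.0, 2.0, 35.5),
                           descriptor=[0.1, 0.7, 0.2])],
)
```

Keeping the 2D position, the 3D coordinate, and the descriptor together per feature point is what later allows a match in a camera image to be turned directly into a 2D-to-3D correspondence.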

The DB 300 may be a store of directly captured images, or a store of map information provided by external portal sites such as Naver, Daum, and Google.

As shown in FIG. 2, the pose estimation apparatus 100 according to the embodiment may include a camera image collection unit 110.

The camera image collection unit 110 may collect a plurality of first images. The first images may be collected from the camera of the terminal.

The pose estimation apparatus 100 according to the embodiment may include a DB image collection unit 120.

The DB image collection unit 120 may collect a plurality of second images from the DB. The second images may be collected using the GPS information of the terminal: the position of the terminal is determined from the GPS information, and the second images may be those whose capture positions are closest to the determined position of the terminal. A second image may include a 2D image, the position (u, v) on the 2D image of each of a plurality of feature points included in the 2D image, the absolute 3D coordinates (x, y, z) on the earth of each of the feature points, and an image descriptor for each of the feature points.

The pose estimation apparatus 100 according to the embodiment may include a first pose estimating unit 130.

The first pose estimating unit 130 may estimate the first pose of the camera by comparing a first image with the plurality of second images. Specifically, the first pose estimating unit 130 may estimate the first pose by comparing the feature points of the first image with the feature points of the plurality of second images.

The pose estimation apparatus 100 according to the embodiment may include a second pose estimating unit 140.

The second pose estimating unit 140 may estimate the second pose of the camera using the plurality of first images. Specifically, the second pose estimating unit 140 may estimate the second pose by comparing the feature points of the plurality of first images with one another.

The pose estimation apparatus 100 according to the embodiment may include a third pose estimating unit 150.

The third pose estimating unit 150 may estimate the third pose of the camera using the first pose and the second pose. Estimating the first pose can be computationally expensive because the DB images are compared with the first image in the earth coordinate system. For this reason, the first pose may be measured every first period. Here, the first period may be 30 Hz or 60 Hz.

By contrast, estimating the second pose may require less computation because the plurality of first images are compared with one another and each image overlaps the preceding one. For this reason, the second pose may be measured every second period. Here, the second period may be several hundred ms to 1 sec.

Because the first pose and the second pose are measured with different periods, the third pose estimating unit 150 may estimate the third pose by comparing the first pose and the second pose obtained at the same time point. At time points at which the first pose is not measured, the second pose may be taken as the third pose.

The second images have absolute positions but are not continuous. The first images are continuous, but because the capture position may change, camera drift may occur.

Therefore, by using the first pose and the second pose from the same time point, the third pose estimating unit 150 can compensate for the weaknesses of each and estimate the pose of the camera more accurately.

The third pose estimating unit 150 may use the Mahalanobis distance between the first pose and the second pose.

If the Mahalanobis distance is smaller than a first threshold, the third pose may be estimated using the average of the first pose and the second pose.

If the Mahalanobis distance is greater than a second threshold, which is greater than the first threshold, the second pose may be taken as the third pose.

If the Mahalanobis distance lies between the first threshold and the second threshold, the third pose may be estimated by reducing the weight of the second pose.

The pose estimation apparatus 100 according to the embodiment may further include an object augmentation unit 160.

The object augmentation unit 160 may use the third pose of the camera, for example the position and direction of the camera, to determine the position at which a virtual object is to be rendered on the screen displayed on the terminal. The object augmentation unit may also be omitted from the pose estimation apparatus.

Hereinafter, a pose estimation method according to an embodiment will be described with reference to FIGS. 3 to 5.

FIG. 3 is a block diagram showing a pose estimation method according to an embodiment, FIG. 4 is a diagram for explaining a process of collecting DB images according to an embodiment, and FIG. 5 is a diagram for explaining a step of estimating a third pose according to an embodiment.

Referring to FIG. 3, the pose estimation method according to the embodiment may perform a step S100 of collecting camera images.

In the step S100 of collecting camera images, a plurality of first images may be collected from the terminal. A first image may be an image obtained from the camera or a sensor of the terminal.

In the step S100 of collecting camera images, a plurality of first images captured while the terminal is moving may be collected. A first image may be a 2D image, and may be a still image at a specific time point.

The pose estimation method according to the embodiment may perform a step S200 of collecting DB images.

In the step S200 of collecting DB images, a plurality of second images may be collected from the DB. As shown in FIG. 4, the second images 20 may be images that, according to the GPS information of the terminal, are close to the position at which the terminal captured the first image 10. The second images 20 may therefore be received sorted in order of proximity to the terminal. For example, the 2-1 image 21, which is closest to the terminal at the time the first image 10 was stored, may be received first, the 2-2 image 22, which is second closest, may be received second, and the 2-5 image 25, which is fifth closest, may be received last.
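The proximity ordering described for FIG. 4 can be sketched as a sort by great-circle distance. This is an illustrative sketch, not the publication's implementation: the function names, the `(image_id, (lat, lon))` entry layout, and the use of the haversine formula are all assumptions.

```python
import math
from typing import List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in meters between two GPS fixes."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))


def nearest_db_images(terminal_gps: LatLon,
                      db_entries: List[Tuple[str, LatLon]],
                      k: int = 5) -> List[Tuple[str, LatLon]]:
    """Return the k DB entries closest to the terminal, nearest first."""
    ranked = sorted(db_entries, key=lambda e: haversine_m(terminal_gps, e[1]))
    return ranked[:k]


# Hypothetical data: three DB images at increasing distance from the terminal.
terminal = (37.5665, 126.9780)
db = [("far", (37.5765, 126.9780)),
      ("near", (37.5666, 126.9780)),
      ("mid", (37.5685, 126.9780))]
closest_ids = [image_id for image_id, _ in nearest_db_images(terminal, db, k=2)]
```

With `k=5` the result corresponds to the five images 21 to 25 in the description, received nearest first.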

A second image 20 may include a 2D image, the position (u, v) on the 2D image of each of a plurality of feature points included in the 2D image, the absolute 3D coordinates (x, y, z) on the earth of each of the feature points, and an image descriptor for each of the feature points.

The pose estimation method according to the embodiment may perform a step S300 of estimating a first pose.

In the step S300 of estimating the first pose, the first pose of the camera may be estimated by comparing a first image with the plurality of second images. The first pose may be estimated by matching the feature points of the first image with the feature points of the plurality of second images. Because the first image and the second images may contain unnecessary feature points, a step of removing the unnecessary feature points may additionally be performed.
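One common way to match feature points while discarding unreliable ones is nearest-neighbor descriptor matching with a ratio test. The publication does not say which matching or filtering method it uses, so the sketch below is an assumption throughout: the function names, the Euclidean descriptor distance, and the `ratio` value are illustrative. Each accepted match pairs a 2D position in the first image with an absolute 3D coordinate from the DB, which is the kind of correspondence a camera pose solver would take as input (the solver itself is not shown).

```python
import math
from typing import List, Sequence, Tuple


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_features(query: List[Tuple[tuple, Sequence[float]]],
                   database: List[Tuple[tuple, Sequence[float]]],
                   ratio: float = 0.8) -> List[Tuple[tuple, tuple]]:
    """Match first-image features to DB features.

    query:    list of ((u, v), descriptor) from the first image
    database: list of ((x, y, z), descriptor) from the second images
    Returns (uv, xyz) pairs; ambiguous matches are dropped by the ratio test.
    """
    matches = []
    for uv, desc in query:
        ranked = sorted(database, key=lambda item: l2(desc, item[1]))
        if not ranked:
            continue
        best = ranked[0]
        if len(ranked) > 1:
            d_best = l2(desc, best[1])
            d_second = l2(desc, ranked[1][1])
            if d_best > ratio * d_second:
                continue  # best and runner-up too similar: treat as unnecessary
        matches.append((uv, best[0]))
    return matches


# Hypothetical data: the first query point is distinctive, the second is ambiguous.
query_pts = [((10, 20), [1.0, 0.0]), ((30, 40), [0.5, 0.6])]
db_pts = [((1, 2, 3), [0.9, 0.1]), ((4, 5, 6), [0.0, 1.0])]
pairs = match_features(query_pts, db_pts)
```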

The pose estimation method according to the embodiment may perform a step S400 of estimating a second pose.

In the step S400 of estimating the second pose, the second pose of the camera may be estimated using the plurality of first images. Specifically, the second pose may be estimated by matching the feature points of the plurality of first images with one another.

The pose estimation method according to the embodiment may perform a step S500 of estimating a third pose.

In the step S500 of estimating the third pose, the third pose may be estimated using the first pose and the second pose.

As shown in FIG. 5, the first pose (T1) may be measured every first period (30 Hz, 60 Hz), and the second pose (T2) may be measured every second period (several hundred ms to 1 sec).

Accordingly, in the step S500 of estimating the third pose, the third pose may be estimated using the first pose (T1) and the second pose (T2) from the same time point. In the drawing, the third pose is estimated in this way at a first time point (t6) and a second time point (t11), but the embodiment is not limited thereto.

At the time points at which the first pose (T1) is not measured (t2 to t5, t7 to t10), the second pose (T2) may be taken as the third pose.
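The scheduling in FIG. 5 can be sketched as a per-time-point selection: fuse when both poses exist, otherwise pass the second pose through. In this sketch the poses are reduced to scalars, the integer keys stand for the time indices t2 to t11, and `fuse` is a placeholder callback; all of these are simplifying assumptions for illustration.

```python
from typing import Callable, Dict


def third_pose_sequence(first: Dict[int, float],
                        second: Dict[int, float],
                        fuse: Callable[[float, float], float]) -> Dict[int, float]:
    """Build the third pose for every time index at which a second pose exists.

    first:  sparse first-pose estimates keyed by time index
    second: dense second-pose estimates keyed by time index
    fuse:   combines a first and a second pose from the same time index
    """
    third = {}
    for t, pose2 in sorted(second.items()):
        pose1 = first.get(t)
        # No first pose at this time point: the second pose is used as-is.
        third[t] = pose2 if pose1 is None else fuse(pose1, pose2)
    return third


# Hypothetical values: second pose at t2..t11, first pose only at t6 and t11.
second_poses = {t: float(t) for t in range(2, 12)}
first_poses = {6: 8.0, 11: 13.0}
result = third_pose_sequence(first_poses, second_poses,
                             fuse=lambda a, b: (a + b) / 2)
```

Here the plain average stands in for the fusion rule; the description goes on to define the actual rule in terms of the Mahalanobis distance.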

The third pose may be estimated using the Mahalanobis distance (Md). Here, the Mahalanobis distance (Md) may be defined by Equation 1.

[Equation 1]

Here, x denotes the position in the first pose, and y denotes the position in the second pose. S^-1 denotes the inverse of the variance (covariance) S. The variance may be obtained by projecting 3D coordinate values onto the image.
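The body of Equation 1 is not reproduced in this text. Given the definitions of x, y, and S^-1 above, it presumably takes the standard form of the Mahalanobis distance, shown here as an assumption rather than as the publication's exact notation:

```latex
M_d = \sqrt{(x - y)^{\top} S^{-1} (x - y)}
```

When S is the identity matrix this reduces to the ordinary Euclidean distance between the two positions; a larger variance along some axis makes disagreement along that axis count for less.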

The third pose may be calculated by Equation 2. The third pose depends on the condition satisfied by the Mahalanobis distance (Md). That is, the third pose may be calculated differently depending on whether the Mahalanobis distance (Md) is smaller than the first threshold (d1), lies between the first threshold (d1) and the second threshold (d2), or is greater than the second threshold (d2).

[Equation 2]

The third pose may be calculated by Equation 3.

[Equation 3]

Here, σ denotes the standard deviation and μ denotes the mean. The square of the standard deviation is the variance. The mean may be obtained by projecting 3D coordinate values onto the image.

For example, when the mean of the first pose (T1) is 2 with a weight of 4, and the mean of the second pose (T2) is 1 with a weight of 2, the third pose may be 5/3. Here, a weight may be the reciprocal of the variance.
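The body of Equation 3 is not reproduced in this text. The worked example is consistent with an inverse-variance weighted mean, which is shown below as a reconstruction under that assumption; the subscripts 1, 2, and 3 for the first, second, and third pose are introduced here for illustration.

```latex
\mu_3 = \frac{w_1 \mu_1 + w_2 \mu_2}{w_1 + w_2},
\qquad w_i = \frac{1}{\sigma_i^{2}}

% Worked example from the description: \mu_1 = 2, w_1 = 4, \mu_2 = 1, w_2 = 2
\mu_3 = \frac{4 \cdot 2 + 2 \cdot 1}{4 + 2} = \frac{10}{6} = \frac{5}{3}
```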

That is, when the Mahalanobis distance (Md) is smaller than the first threshold (d1), the third pose may be estimated as the weighted average of the first pose (T1) and the second pose (T2).

When the Mahalanobis distance (Md) lies between the first threshold (d1) and the second threshold (d2), T'local in Equation 2 may be calculated by Equation 4.

[Equation 4]

Here, v may be an arbitrary value for lowering the weight, and v may be greater than 1.

When the Mahalanobis distance (Md) is greater than the second threshold (d2), the value of the second pose (T2) may be used.
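The three cases can be put together in a short sketch. It is written for a one-dimensional position so that it stays readable, and it rests on several assumptions that the text does not confirm: the Mahalanobis distance is computed with the sum of the two variances, the intermediate case lowers the weight of the second pose by multiplying its variance by v, and the function names and default value of `v` are hypothetical. Equations 2 and 4 themselves are not reproduced in this text, so this is not a transcription of them.

```python
import math


def mahalanobis_1d(x: float, y: float, var: float) -> float:
    """Mahalanobis distance between two scalar positions for variance var."""
    return abs(x - y) / math.sqrt(var)


def weighted_mean(mu1: float, var1: float, mu2: float, var2: float) -> float:
    """Inverse-variance weighted mean (weight = reciprocal of variance)."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    return (w1 * mu1 + w2 * mu2) / (w1 + w2)


def fuse_third_pose(t1: float, var1: float, t2: float, var2: float,
                    d1: float, d2: float, v: float = 4.0) -> float:
    """Estimate the third pose from the first pose t1 and the second pose t2.

    d1 < d2 are the first and second thresholds; v > 1 lowers a weight.
    """
    md = mahalanobis_1d(t1, t2, var1 + var2)  # assumed choice of variance
    if md < d1:
        return weighted_mean(t1, var1, t2, var2)
    if md > d2:
        return t2  # poses disagree strongly: use the second pose only
    # Intermediate case (assumption): inflate the second pose's variance by v,
    # which divides its weight by v.
    return weighted_mean(t1, var1, t2, var2 * v)


# Weights 4 and 2 correspond to variances 1/4 and 1/2, as in the worked example.
close = fuse_third_pose(2.0, 0.25, 1.0, 0.5, d1=2.0, d2=3.0)
between = fuse_third_pose(3.2, 0.25, 1.0, 0.5, d1=2.0, d2=3.0)
far = fuse_third_pose(10.0, 0.25, 1.0, 0.5, d1=2.0, d2=3.0)
```

In the first call the two poses agree closely, so the result matches the 5/3 of the worked example; in the last call they disagree strongly and the second pose is returned unchanged.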

실시예는 제1 포즈와 제2 포즈를 이용함으로써, 제1 포즈가 부정확하거나 제2 포즈가 부정확한 경우를 모두 제거하여 보다 정확한 카메라의 포즈를 추정할 수 있는 효과가 있다.In the embodiment, by using the first pose and the second pose, it is possible to estimate a more accurate camera pose by removing all cases in which the first pose is incorrect or the second pose is incorrect.

The pose estimation method according to the embodiment may perform a step (S600) of determining a position at which to implement a virtual object.

In step S600, the position at which the virtual object is to be implemented on the screen displayed on the terminal may be determined using the position and direction of the camera given by the third pose.
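One common way to turn a camera position and direction into a screen position is a pinhole projection. The sketch below is an illustration under that assumption; the specification does not state which projection model is used. The function `project_to_screen`, the camera-to-world rotation convention, and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are all hypothetical.

```python
def project_to_screen(point_world, cam_position, cam_rotation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    Assumptions: cam_rotation is a 3x3 camera-to-world rotation (row-major lists),
    the camera looks along its +z axis, and fx, fy, cx, cy are pixel intrinsics.
    Returns None when the point is behind the camera.
    """
    # Vector from the camera centre to the point, in world coordinates.
    d = [p - c for p, c in zip(point_world, cam_position)]
    # Rotate into the camera frame: multiply by the transpose of cam_rotation.
    x, y, z = (sum(cam_rotation[r][k] * d[r] for r in range(3)) for k in range(3))
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A virtual object 5 units ahead and 1 unit to the right of a camera at the origin.
pixel = project_to_screen([1.0, 0.0, 5.0], [0.0, 0.0, 0.0], identity,
                          fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

In this usage the camera position and rotation would come from the third pose, and the returned pixel is where the virtual object would be drawn.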

Although the above description refers to the drawings and embodiments, those skilled in the art will understand that the embodiments may be variously modified and changed without departing from the technical spirit of the embodiments set forth in the claims below.

100: pose estimation device
200: terminal
300: DB

Claims

A method performed by a pose estimation apparatus for estimating a pose of a camera provided in a terminal, the pose estimation method comprising:
collecting a plurality of first images from the camera;
collecting a plurality of second images from a DB using GPS information of the terminal;
estimating a first pose of the camera by comparing any one of the plurality of first images with the plurality of second images;
estimating a second pose of the camera using the plurality of first images; and
estimating a third pose of the camera by comparing the first pose and the second pose estimated at the same time point.

The method of claim 1,
wherein, in the step of estimating the third pose of the camera,
the third pose is estimated using a Mahalanobis distance between the first pose and the second pose.

The method of claim 2,
wherein, in the step of estimating the third pose of the camera,
when the Mahalanobis distance is less than a first threshold, the third pose is estimated using a weighted average of the first pose and the second pose.

The method of claim 3,
wherein, in the step of estimating the third pose of the camera,
when the Mahalanobis distance is greater than a second threshold that is greater than the first threshold, the second pose is estimated as the third pose.

The method of claim 4,
wherein, in the step of estimating the third pose of the camera,
when the Mahalanobis distance is between the first threshold and the second threshold, the third pose is estimated by giving increased weight to the second pose.

The method of claim 1,
wherein, in the step of estimating the third pose of the camera,
the first pose is measured every first period, and the second pose is measured every second period, the second period being shorter than the first period, and
when the first pose has not been measured, the second pose is estimated as the third pose.

The method of claim 1,
further comprising, after estimating the third pose, determining a position at which to implement a virtual object on a screen of the terminal using the third pose.

A pose estimation device comprising:
a camera image collection unit that collects a plurality of first images from the camera;
a DB image collection unit that collects a plurality of second images from a DB using GPS information of the terminal;
a first pose estimating unit that estimates a first pose of the camera by comparing any one of the plurality of first images with the plurality of second images;
a second pose estimating unit that estimates a second pose of the camera using the plurality of first images; and
a third pose estimating unit that estimates a third pose of the camera by comparing the first pose and the second pose estimated at the same time point.

A computer-readable recording medium storing a computer program,
wherein the computer program
comprises instructions for causing a processor to perform the method according to any one of claims 1 to 7.

A computer program stored in a computer-readable recording medium,
wherein the computer program
comprises instructions for causing a processor to perform the method according to claim 1.