KR20130084849A

KR20130084849A - Method and apparatus for camera tracking

Info

Publication number: KR20130084849A
Application number: KR1020120005745A
Authority: KR
Inventors: 김지원; 구팽 장; 박두식; 이호영; 하오민 류; 후준 바오
Original assignee: 삼성전자주식회사
Priority date: 2012-01-18
Filing date: 2012-01-18
Publication date: 2013-07-26
Also published as: CN103218799A; KR101926563B1; US8873802B2; US20130182894A1; CN103218799B

Abstract

PURPOSE: A method and an apparatus for tracing the location of a camera are provided to trace first features within multiple frames and to trace second features within a single frame based on the traced first features. CONSTITUTION: A feature extracting unit extracts third features from at least three images in a first frame among multiple frames (410). A feature tracing unit traces the extracted third features to the last frame among the multiple frames (420). A dynamic point detector removes features having a dynamic trace among the traced third features and determines first features (430). A camera location estimating unit estimates the location of each camera within each frame based on the first features (440). [Reference numerals] (410) Extract features from a first frame among multiple frames; (420) Trace the extracted features to the last frame among the multiple frames; (430) Remove features having a dynamic trace among the traced features; (440) Estimate the location of each camera based on remaining features after the removal; (AA) Start; (BB) End

Description

METHOD AND APPARATUS FOR CAMERA TRACKING

The following embodiments relate to a method and apparatus for tracking the position of a camera.

A method and apparatus are disclosed for tracking the position of each camera based on frames captured by at least three cameras.

Camera tracking is a fundamental problem in computer vision. The purpose of camera tracking may be to automatically recover the motion of a camera from a video sequence. The basic idea of camera tracking is to select scene points that appear in a sequence of frames and to simultaneously estimate the three-dimensional (3D) positions of the selected scene points and the camera motion based on a set of correspondences between 2D feature points.

Camera tracking has many applications, such as depth recovery, 3D reconstruction, location recognition, and autonomous robot navigation.

In particular, with the widespread use of digital cameras, monocular cameras have become readily accessible, and their price keeps falling. Accordingly, methods for tracking a monocular camera are widely used.

However, 3D information of a dynamic object cannot be recovered from images captured by a monocular camera. In addition, because of accumulation error, it is difficult to accurately recover camera motion for large-scale scenes using images captured by a monocular camera.

Several methods have been proposed for recovering camera motion and depth maps of images using stereo cameras. However, when images are captured by a stereo camera, it may be difficult to handle occlusions in the images.

An embodiment may provide a method and apparatus for extracting features from frames captured by at least three cameras and estimating the position of each of the cameras based on the extracted features.

An embodiment may provide a method and apparatus for tracking first features in multi-frames and, based on the tracked first features, tracking second features corresponding to the first features in a single-frame.

According to one aspect, there is provided a camera position tracking method for tracking the positions of cameras using frames captured by at least three cameras, the method including: extracting and tracking one or more first features in multi-frames, and tracking the position of each of the cameras in each of the multi-frames based on the first features; and tracking the position of each of the cameras in each of one or more single-frames based on second features of each of the one or more single-frames, wherein the one or more single-frames are the frames that precede the first frame, among the frames following the multi-frames, in which fewer than a threshold number of second features corresponding to the first features are tracked.

The tracking of the position of each of the cameras in each of the multi-frames may include: extracting third features from at least three images of the first frame of the multi-frames; tracking the third features up to the last frame of the multi-frames; determining the first features by removing, from the third features tracked up to the last frame, features having a dynamic trajectory; and estimating the position of each of the cameras in each of the multi-frames based on the first features.

The extracting of the third features may include: extracting points from the at least three images of the first frame and generating scale-invariant feature transform (SIFT) descriptors; and generating the third features by matching the extracted points with one another using descriptor comparison between the generated SIFT descriptors and connecting the matched points as features.
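The descriptor comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the descriptors are short lists of floats rather than real SIFT descriptors, the function name `match_descriptors` is hypothetical, and the mutual-nearest-neighbor rule is an assumed matching criterion that the text itself does not specify.

```python
import math

def match_descriptors(desc_a, desc_b):
    """Return index pairs (i, j) where desc_a[i] and desc_b[j] are mutual nearest neighbors."""
    def nearest(query, pool):
        # Index of the descriptor in `pool` closest to `query` (Euclidean distance).
        return min(range(len(pool)), key=lambda k: math.dist(query, pool[k]))

    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        # Keep the pair only if the match holds in both directions.
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches

# Toy 3-dimensional "descriptors" for points in two images (assumed data).
left_desc = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
middle_desc = [[1.0, 0.1, 0.0], [0.0, 0.1, 1.0]]
pairs = match_descriptors(left_desc, middle_desc)
```

Matched pairs found this way between the left, middle, and right images could then be linked into one feature per scene point; the third toy descriptor stays unmatched because neither candidate prefers it.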

The extracting of the third features may further include removing outliers from the third features using a geometric constraint.

The geometric constraint may be one or more of an epipolar constraint, a re-projection constraint, and a depth range constraint.
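Of the geometric constraints named above, the epipolar constraint is the simplest to illustrate. The sketch below assumes normalized homogeneous image coordinates (intrinsics already removed) and a known relative pose x2 = R x1 + t between two cameras. The helper names and the tolerance `tol` are hypothetical, and this is not presented as the outlier test actually used in the embodiments.

```python
def mat_vec(m, v):
    # 3x3 matrix times 3-vector.
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def cross(a, b):
    # Cross product a x b.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def epipolar_residual(x1, x2, R, t):
    """Algebraic residual x2^T E x1 with essential matrix E = [t]_x R."""
    e_x1 = cross(t, mat_vec(R, x1))
    return sum(a * b for a, b in zip(x2, e_x1))

def is_inlier(x1, x2, R, t, tol=1e-6):
    # A correct match of a static point gives a residual close to zero.
    return abs(epipolar_residual(x1, x2, R, t)) < tol

# Assumed setup: second camera displaced one unit along x, same orientation.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-1.0, 0.0, 0.0]
X = [0.5, 0.2, 4.0]                                  # scene point in camera-1 coordinates
x1 = [X[0] / X[2], X[1] / X[2], 1.0]
X2 = [X[i] + t[i] for i in range(3)]                 # same point in camera-2 coordinates
x2 = [X2[0] / X2[2], X2[1] / X2[2], 1.0]

good = is_inlier(x1, x2, R, t)                        # consistent match
bad = is_inlier(x1, [x2[0], x2[1] + 0.05, 1.0], R, t)  # match shifted off the epipolar line
```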

The tracking of the position of each of the cameras in each of the single-frames may include: setting the frame following the multi-frames as a current frame; extracting, from the current frame, the second features each corresponding to one of the first features; estimating the position of each of the cameras in the current frame when the number of the second features is greater than or equal to a threshold; and, when the number of the second features is greater than or equal to the threshold, setting the frame following the current frame as the current frame and repeating the extracting of the second features.

The tracking of the position of each of the cameras in each of the single-frames may further include performing again the tracking of the position of each of the cameras in each of the multi-frames when the number of the second features is less than the threshold.

According to another aspect, there is provided a camera position tracking apparatus that tracks the positions of cameras using frames captured by at least three cameras, the apparatus including: a multi-frames processor that extracts and tracks one or more first features in multi-frames and tracks the position of each of the cameras in each of the multi-frames based on the first features; and a single-frame processor that tracks the position of each of the cameras in each of one or more single-frames based on second features of each of the one or more single-frames, wherein the one or more single-frames are the frames that precede the first frame, among the frames following the multi-frames, in which fewer than a threshold number of second features corresponding to the first features are tracked.

The multi-frames processor may include: a feature extractor that extracts third features from at least three images of the first frame of the multi-frames; a feature tracker that tracks the third features up to the last frame of the multi-frames; a dynamic point detector that determines the first features by removing, from the third features tracked up to the last frame, features having a dynamic trajectory; and a camera position estimator that estimates the position of each of the cameras in each of the multi-frames based on the first features.

The dynamic point detector may compute a four-dimensional (4D) trajectory subspace for each of the third features and determine, based on the 4D trajectory subspace, whether each of the third features has a dynamic trajectory.

The feature extractor may extract points from the at least three images of the first frame, generate scale-invariant feature transform (SIFT) descriptors, match the extracted points with one another using descriptor comparison between the generated SIFT descriptors, and generate the third features by connecting the matched points as features.

The feature extractor may determine the first features by removing outliers from the third features using a geometric constraint.

The single-frame processor may include: a current frame setting unit that sets the frame following the multi-frames as a current frame; a current frame feature estimator that extracts, from the current frame, the second features each corresponding to one of the first features; and a threshold comparator that estimates the position of each of the cameras in the current frame when the number of the second features is greater than or equal to a threshold.

When the number of the second features is greater than or equal to the threshold, the current frame setting unit may newly set the frame following the current frame as the current frame, and the current frame feature estimator may extract, from the newly set current frame, the second features each corresponding to one of the first features.

When the number of the second features is less than the threshold, the multi-frames processor may be executed again.

A method and apparatus are provided for tracking first features in multi-frames and, based on the tracked first features, tracking second features corresponding to the first features in a single-frame.

A method and apparatus are provided for extracting features from frames captured by at least three cameras and estimating the position of each of the cameras based on the extracted features.

FIG. 1 illustrates an operation of a camera tracking apparatus according to an embodiment.
FIG. 2 is a structural diagram of a camera position tracking apparatus according to an embodiment.
FIG. 3 is a flowchart of a camera position tracking method according to an embodiment.
FIG. 4 is a flowchart of a multi-frames processing operation according to an example.
FIG. 5 illustrates a feature extraction operation according to an example.
FIG. 6 is a flowchart of a feature tracking operation according to an example.
FIG. 7 is a flowchart of a dynamic point detection operation according to an example.
FIG. 8 is a flowchart of camera position estimation according to an example.
FIG. 9 is a flowchart of a single-frame processing operation according to an example.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. However, the present invention is not restricted or limited by the embodiments. Like reference numerals in the drawings denote like elements.

A trinocular camera may include three synchronized cameras. The three cameras may be arranged collinearly.

In the following, a sequence may refer to a series of images generated by continuously capturing a scene. Trinocular sequences may refer to the three sequences generated when the three cameras of the trinocular camera each continuously capture the scene.

The 3D position M of a scene point may be represented by [X, Y, Z]^T, where X, Y, and Z may denote the X, Y, and Z coordinates of the scene point, respectively. The 2D image position m of the scene point may be represented by [u, v]^T.

The camera transformation may be modeled as K[R|t]. Here, K may be an intrinsic matrix, and [R|t] may be a projective matrix.

The intrinsic matrix may depend on the intrinsic properties of the camera. These intrinsic properties may encompass 1) the focal lengths, in pixels, along the x axis and the y axis of the image, 2) the skew ratio between the x axis and the y axis, and 3) the principal point. Here, the principal point may refer to the projection of the camera center onto the image plane.

The extrinsic matrix may model camera motion including a rotation R and a translation t. Accordingly, the projection procedure may be expressed as λm = K[R|t]M, with m and M taken in homogeneous coordinates. Here, λ may be a scale factor. Structure-from-motion may refer to estimating all Ms and [R|t]s from the image measurements m.
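The projection of a scene point through an intrinsic matrix, a rotation, and a translation can be made concrete with a small numeric sketch. The values of `K`, `R`, `t`, and the scene point are assumptions chosen for illustration, and `project` is a hypothetical helper name.

```python
def project(K, R, t, M):
    """Project a 3D scene point M to a 2D image position m using K (R M + t)."""
    # Camera-frame coordinates of the scene point: R M + t.
    cam = [sum(R[r][c] * M[c] for c in range(3)) + t[r] for r in range(3)]
    # Homogeneous image coordinates: K (R M + t).
    h = [sum(K[r][c] * cam[c] for c in range(3)) for r in range(3)]
    lam = h[2]  # scale factor; equals the depth when the last row of K is [0, 0, 1]
    return [h[0] / lam, h[1] / lam], lam

# Assumed intrinsics: focal length 800 px, zero skew, principal point (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
R_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no rotation
t_zero = [0.0, 0.0, 0.0]                                     # no translation

m, lam = project(K, R_id, t_zero, [0.5, -0.25, 2.0])
```

Structure-from-motion runs this relation in reverse: given many observed image positions, it searches for the scene points and camera motions that reproduce them.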

If all of the selected scene points are static, the above estimation is highly reliable. However, when the scene contains moving objects, some points do not have a single 3D position during capture, so the estimation may be seriously confused. To determine whether a point is static, it may generally be checked whether all pairs of 2D image positions corresponding to the point satisfy the epipolar geometry.

When a stereo rig is available, the 3D positions of matched points may be calculated more reliably through triangulation. The two images belonging to the same frame and the relative position between the left view and the right view may be obtained directly from the stereo rig.
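Triangulation from a rig with known geometry can be sketched as below. The text says only "triangulation"; the midpoint method used here is one simple choice, assumed for illustration. Camera centers, viewing-ray directions, and the helper name `triangulate_midpoint` are likewise assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + u*d2."""
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b            # zero only when the rays are parallel
    s = (b * e - c * d) / denom      # parameter of the closest point on ray 1
    u = (a * e - b * d) / denom      # parameter of the closest point on ray 2
    p1 = [c1[i] + s * d1[i] for i in range(3)]
    p2 = [c2[i] + u * d2[i] for i in range(3)]
    return [(p1[i] + p2[i]) / 2.0 for i in range(3)]

# Assumed rig: two parallel cameras with a baseline of one unit along x.
center_left = [0.0, 0.0, 0.0]
center_right = [1.0, 0.0, 0.0]
# Viewing rays toward the same scene point, as normalized image coordinates.
ray_left = [0.125, 0.05, 1.0]
ray_right = [-0.125, 0.05, 1.0]

point = triangulate_midpoint(center_left, ray_left, center_right, ray_right)
```

With noise-free rays the two lines intersect and the midpoint is the scene point itself, here approximately (0.5, 0.2, 4.0). With noisy matches the length of the shortest segment gives a rough measure of how well the match agrees with the rig geometry.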

Information about 3D positions may also be useful for feature tracking. Feature tracking may refer to connecting the image points that correspond to the same scene point.

FIG. 1 illustrates an operation of a camera tracking apparatus according to an embodiment.

When a trinocular camera capable of capturing a left image, a middle image, and a right image is used, it rarely happens that a particular pixel in the middle image is occluded in both the left image and the right image.

In addition, when a trinocular camera is used, three images may be obtained at each timestamp. Accordingly, a geometric constraint stronger than epipolar geometry may be used, based on the three images, to calculate the 3D positions of tracked features and to remove outliers.

In the configuration of the trinocular camera, the relative positions of the three cameras may be regarded as fixed. In addition, the intrinsic parameters of the three cameras may be assumed not to change while images are captured. Accordingly, the input values to the camera position tracking apparatus 100 may include not only the trinocular sequences but also the intrinsic matrices and relative camera positions of the three cameras. The output values of the camera position tracking apparatus 100 may be an extrinsic matrix for each image. Here, the extrinsic matrix may include a rotation matrix and a translation vector.

The intrinsic parameters and positions of the three cameras may be calibrated with a tool such as the Camera Calibration Toolbox for Matlab. If the extrinsic camera parameters of two cameras are [R|t]_1 and [R|t]_2, the relative position [R|t]_1→2 between the two cameras may be calculated based on Equation 1 below.

[Equation 1]
[R|t]_1→2 = [R_2 R_1^T | t_2 - R_2 R_1^T t_1]
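The relative position between two calibrated cameras can be checked numerically. The sketch below assumes each camera maps a world point X to camera coordinates as R X + t, and derives the pose that maps camera-1 coordinates directly to camera-2 coordinates. The function name `relative_pose` and all numeric values are assumptions for illustration.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def relative_pose(R1, t1, R2, t2):
    """Return (R12, t12) such that x2 = R12 x1 + t12 for camera coordinates x1, x2."""
    R12 = mat_mul(R2, transpose(R1))          # R2 R1^T
    R12_t1 = mat_vec(R12, t1)
    t12 = [t2[i] - R12_t1[i] for i in range(3)]  # t2 - R2 R1^T t1
    return R12, t12

# Assumed extrinsics: camera 1 rotated 90 degrees about z, camera 2 unrotated.
R1 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t1 = [1.0, 0.0, 0.0]
R2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t2 = [0.0, 0.0, 5.0]

R12, t12 = relative_pose(R1, t1, R2, t2)

# Consistency check on one world point: both routes must give the same x2.
X_world = [2.0, 3.0, 4.0]
x1_cam = [v + o for v, o in zip(mat_vec(R1, X_world), t1)]
x2_direct = [v + o for v, o in zip(mat_vec(R2, X_world), t2)]
x2_via_relative = [v + o for v, o in zip(mat_vec(R12, x1_cam), t12)]
```

Because the rig is rigid, this relative pose is the same in every frame, which is why it can be calibrated once and then passed in as an input parameter.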

A trinocular sequence may be input to the camera position tracking apparatus 100. In addition, the intrinsic matrices K_left, K_middle, and K_right of the three cameras of the trinocular camera may be used as input parameters of the camera position tracking apparatus 100. Furthermore, among the three cameras, the projection matrix [R|t]_left→middle from the left camera to the middle camera and the projection matrix [R|t]_left→right from the left camera to the right camera may be used as input parameters of the camera position tracking apparatus 100.

Once the intrinsic parameters and relative positions of the three cameras have been calculated, the camera position tracking apparatus 100 may estimate, through triangulation, the 3D points of the feature points matched between the images captured by the three cameras.

In addition, once the 3D points of the tracked features have been calculated, the camera position tracking apparatus 100 may reliably and rapidly calculate the positions of the cameras based on the correspondence between 3D and 2D.

When the trinocular sequence includes F frames, the camera position tracking apparatus 100 may output the position of each of the three cameras for each frame. For example, [R|t]_1,left may denote the position of the left camera in the first frame, and [R|t]_F,right may denote the position of the right camera in the F-th frame, which is the last frame.

FIG. 2 is a structural diagram of a camera position tracking apparatus according to an embodiment.

The camera position tracking apparatus 100 tracks the positions of cameras using frames captured by at least three cameras.

The camera position tracking apparatus 100 may include a multi-frames processor 210 and a single-frame processor 220.

The multi-frames processor 210 may extract and track one or more first features in multi-frames, and may track the position of each of the cameras in each of the multi-frames based on the first features. The multi-frames processor 210 may provide information about the first features to the single-frame processor 220.

The single-frame processor 220 may track the position of each of the cameras in each of one or more single-frames based on second features of each of the single-frames. Here, the one or more single-frames may be the frames that precede the first frame, among the frames following the multi-frames, in which fewer than a threshold number of second features corresponding to the first features are tracked.

That is, once the multi-frames processor 210 has extracted and tracked the first features in the multi-frames, the single-frame processor 220 may process, one at a time, the frames that follow the frames processed by the multi-frames processor 210. Here, processing a frame may mean tracking features in the frame and tracking the position of each of the cameras in the frame based on the tracked features. That is, when the first features are provided from the multi-frames processor 210 to the single-frame processor 220, the single-frame processor 220 may track second features in the frames that follow the frames processed by the multi-frames processor 210. Here, each of the second features is a feature corresponding to one of the first features. Alternatively, each of the second features tracked in the current frame being processed by the single-frame processor 220 may be a feature corresponding to one of the second features tracked in the previous frame, which the single-frame processor 220 processed before the current frame.

The number of second features tracked by the single-frame processor 220 may be smaller than the number of first features. For example, when the scene point corresponding to one of the first features disappears from the sequence of frames, the second feature corresponding to that first feature may fail to be tracked. As the single-frame processor 220 processes, one at a time, the frames following those processed by the multi-frames processor 210, the number of tracked second features may fall below the threshold, at which point tracking the position of each camera with the single-frame processor 220 may become unsuitable. Accordingly, when a frame in which fewer than the threshold number of second features are tracked is found among the frames following the multi-frames, multi-frames that include the found frame may again be processed by the multi-frames processor 210.

That is, the multi-frames processor 210 and the single-frame processor 220 may process the sequence of frames by alternating with each other. The multi-frames processor 210 and the single-frame processor 220 may each provide the other with information (e.g., a frame number) identifying the frame to be processed.

Because the multi-frames processor 210 uses multi-frames to extract the first features common to the multi-frames, it may extract the first features accurately. Because the single-frame processor 220 extracts, from a single frame, the second features corresponding to the first features, it may extract the second features quickly. That is, because the multi-frames processor 210 and the single-frame processor 220 run alternately, the camera position tracking apparatus 100 may balance accuracy and speed in tracking the positions of the cameras.
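The alternation between the two processors can be sketched as a simple scheduling loop. This is a toy model under stated assumptions: frames are numbered from 0, `count_tracked` is a hypothetical stand-in for real feature tracking, and the linear decay of the feature count is invented data. Only the control flow (a multi-frame block, then single frames until the count drops below the threshold, then a new block that includes the failing frame) follows the description above.

```python
def schedule(num_frames, n_f, threshold, count_tracked):
    """Return the processing plan as a list of ("multi", frames) / ("single", frame)."""
    plan = []
    start = 0
    while start < num_frames:
        end = min(start + n_f, num_frames)         # multi-frame block is [start, end)
        plan.append(("multi", list(range(start, end))))
        current = end
        # Process the following frames one at a time while enough features survive.
        while current < num_frames and count_tracked(end - 1, current) >= threshold:
            plan.append(("single", current))
            current += 1
        # The first frame with too few features starts the next multi-frame block.
        start = current
    return plan

def toy_count(last_multi_frame, frame):
    # Assumed decay: 30 features are lost per frame after the multi-frame block.
    return 100 - 30 * (frame - last_multi_frame)

plan = schedule(num_frames=10, n_f=3, threshold=50, count_tracked=toy_count)
```

In this toy run each multi-frame block is followed by exactly one single frame before the count falls under the threshold. A slower decay would give longer single-frame runs and therefore less of the more expensive multi-frame work.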

다중-프레임들 처리부(210)는 특징 추출부(211), 특징 추적부(212), 동적 포인트 검출부(213) 및 카메라 위치 추정부(214)를 포함할 수 있다.The multi-frames processor 210 may include a feature extractor 211, a feature tracker 212, a dynamic point detector 213, and a camera position estimator 214.

단일-프레임 처리부(220)는 현재 프레임 설정부(221), 현재 프레임 특징 추정부(222), 현재 프레임 문턱치 비교부(223) 및 현재 프레임 카메라 위치 추정부(224)를 포함할 수 있다.The single-frame processor 220 may include a current frame setter 221, a current frame feature estimator 222, a current frame threshold comparator 223, and a current frame camera position estimator 224.

각 구성요소들의 구체적인 기능 및 동작 원리에 대해서 하기에서 도 3 내지 도 9를 참조하여 상세히 설명된다.
Specific functions and operating principles of each component will be described in detail with reference to FIGS. 3 to 9 below.

도 3 은 일 실시예에 따른 카메라 위치 추적 방법의 흐름도이다.3 is a flowchart of a camera location tracking method, according to an embodiment.

하기의 단계들(310 및 320)에서, 카메라 위치 추적 장치(200)는 적어도 3 개의 카메라들에 의해 촬영된 프레임들의 부분열(subsequence)을 사용하여, 각 프레임 내에서의 카메라들 각각의 위치를 추적할 수 있다. 적어도 3 개의 카메라들은 각각 좌측 카메라, 중간 카메라 및 우측 카메라일 수 있다. 각 프레임은 좌측 카메라에 의해 캡춰된 좌측 이미지, 중간 카메라에 의해 캡춰된 중간 이미지 및 우측 카메라에 의해 캡춰된 우측 이미지를 포함할 수 있다.In the following steps 310 and 320, the camera position tracking device 200 uses a subsequence of frames taken by at least three cameras to determine the position of each of the cameras within each frame. Can be traced At least three cameras may be a left camera, an intermediate camera and a right camera, respectively. Each frame may include a left image captured by the left camera, an intermediate image captured by the intermediate camera, and a right image captured by the right camera.

다중-프레임들 처리 단계(310)에서, 다중-프레임들 처리부(210)는 다중 프레임들 내에서 하나 이상의 제1 특징들을 추출 및 추적할 수 있다. 여기서, 다중-프레임들은 2 개 이상의 연이은(consecutive) 프레임들일 수 있다. 다중-프레임들 처리부(210)는 추적된 제1 특징들에 기반하여 다중-프레임들 각각 내에서의 카메라들의 위치를 추적할 수 있다. 여기서, 다중-프레임들의 개수는 미리 정의된 것일 수 있다. 이하, 다중-프레임들의 개수를 N _f 로 표시한다. 즉, 다중-프레임들 처리부(210)는 N _f 개 프레임들로 구성된 다중-프레임들에 공통된 특징들을 추출, 삼각 측량 및 추적할 수 있다. 다중-프레임들 처리부(210)는 추출된 특징들 중 동적인 씬 포인트들에 대응하는 특징들을 감지 및 제거할 수 있다. 즉, 제1 특징들은 다중-프레임들 처리부(210)에 의해 추출된 특징들 중 다중-프레임의 N _f 번째 프레임까지 성공적으로 추적되는 N _p 개의 정적인 제1 포인트들에 대응할 수 있다.. 다중-프레임들 처리부(210)는 제1 특징들을 사용하여 N _f 개의 프레임들 내에서의 3N _f 개의 카메라 위치들 및 3N _p 개의 제1 포인트들의 위치들을 동시에 추정할 수 있다.In the multi-frames processing step 310, the multi-frames processor 210 may extract and track one or more first features within the multiple frames. Here, the multi-frames may be two or more consecutive frames. The multi-frames processor 210 may track the position of the cameras in each of the multi-frames based on the tracked first features. Here, the number of multi-frames may be predefined. Hereinafter, the number of multi-frames is denoted by N _f . That is, the multi-frames processor 210 may extract, triangulate, and track features common to the multi-frames composed of N _f frames. The multi-frames processor 210 may detect and remove features corresponding to dynamic scene points among the extracted features. That is, the first features may correspond to N _p static first points successfully tracked up to the N _f th frame of the multi-frame among the features extracted by the multi-frames processing unit 210. The frames processing unit 210 may simultaneously estimate 3N _f camera positions and 3N _p first points positions in the N _f frames using the first features.

In the single-frame processing step 320, the single-frame processor 220 may track the position of each camera in each of one or more single-frames based on second features of each of the single-frames. Here, the one or more single-frames may be the frames that follow the multi-frames and precede the first frame in which the number of tracked second features, each corresponding to one of the first features, falls below a threshold.

The single-frame processor 220 may search the (N_f + 1)-th frame, which is the frame following the multi-frames, for second features that match the N_p first features described above. The number of matched second features found is denoted by N'_p. The single-frame processor 220 may obtain the 3D positions of the second features. Next, the single-frame processor 220 may estimate the position of each camera in the (N_f + 1)-th frame based on the 2D-3D correspondences. Next, the single-frame processor 220 may search the (N_f + 2)-th frame for third features that match the N'_p second features. Each of the second features matches one of the first features, so each of the third features also matches one of the first features. This procedure may be repeated until the number of features that the single-frame processor 220 finds in the (N_f + n)-th frame and that match the first features falls below the threshold.

The single-frame processing step 320 may run much faster than the multi-frames processing step 310. To allow more frames to be processed in the single-frame processing step 320, the single-frame processor 220 may project those first features that were not matched in a single-frame onto the N_f-th frame of the multi-frames. The single-frame processor 220 may then compare the local appearance of the point generated by the projection with that of the original feature.

After the single-frame processing step 320, the multi-frames processing step 310 may be restarted. In the restarted multi-frames processing step 310, the multi-frames processor 210 may re-extract new first features from N'_f frames. In the subsequent single-frame processing step 320, the single-frame processor 220 may find new second features in the following frames that match the new first features.

This two-step procedure may be repeated until all frames have been processed.
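The alternation between the two steps can be sketched as a simple loop. The following Python sketch is illustrative only and is not taken from the embodiment: `track_sequence`, `n_f`, `threshold`, and the `count_matches` callback are hypothetical names, feature extraction and matching are stubbed out, and restarting step 310 at the first frame that falls below the threshold is an assumption.

```python
from typing import Callable, List, Tuple


def track_sequence(
    num_frames: int,
    n_f: int,
    threshold: int,
    count_matches: Callable[[int, int], int],
) -> List[Tuple[str, int, int]]:
    """Split frame indices into alternating multi-frame and single-frame runs.

    count_matches(anchor, frame) is a hypothetical callback that returns how
    many first features of the multi-frame block starting at `anchor` are
    still matched in `frame`.
    Returns a list of (kind, first_frame, last_frame) with inclusive indices.
    """
    segments = []
    start = 0
    while start < num_frames:
        # Step 310: process a block of up to n_f consecutive frames jointly.
        end = min(start + n_f, num_frames)
        segments.append(("multi", start, end - 1))
        # Step 320: keep tracking frame by frame while enough features match.
        frame = end
        while frame < num_frames and count_matches(start, frame) >= threshold:
            frame += 1
        if frame > end:
            segments.append(("single", end, frame - 1))
        start = frame  # restart step 310 at the first frame that failed
    return segments


if __name__ == "__main__":
    # Toy matcher: matches decay by 10 per frame from 100 at the block start.
    decay = lambda anchor, frame: 100 - 10 * (frame - anchor)
    print(track_sequence(num_frames=20, n_f=3, threshold=50, count_matches=decay))
```

In this toy run each block of three jointly processed frames is followed by a run of single frames that lasts until the match count drops below the threshold.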

Through the two-step procedure described above, features may be extracted and tracked within a subsequence of frames. The multi-frames processor 210 and the single-frame processor 220 may each automatically remove features that represent dynamic points and may use the features remaining after the removal to estimate the position of each camera in each frame.

Through the two-step procedure described above, the camera motion of an input trinocular sequence may be recovered robustly and efficiently, and tracking efficiency may be improved without reducing tracking accuracy.

The two-step procedure, which includes the multi-frames processing step 310 and the single-frame processing step 320, is described in detail below with reference to FIGS. 4 to 9.

FIG. 4 is a flowchart of a multi-frames processing step according to an example.

The multi-frames processing step 310 may include steps 410 to 440 below.

In the feature extraction step 410, the feature extractor 211 may extract third features from at least three images of the first frame of the multi-frames. The feature extraction step 410 is described in detail below with reference to FIG. 5.

In the feature tracking step 420, the feature tracker 212 may track the extracted third features up to the last frame of the multi-frames.

A traditional feature tracking method extracts feature points in the first frame and tracks the extracted feature points frame by frame. In a trinocular camera configuration, if each feature is tracked from the previous frame to the current frame in this way, more candidates are required to reduce the feature loss problem. The feature tracker 212 may therefore use a 3D tracking algorithm that performs tracking only once and does not maintain multiple candidates.

The feature tracking step 420 is described in detail below with reference to FIG. 6.

In the dynamic point detection step 430, the dynamic point detector 213 may determine the first features by removing, from the third features tracked up to the last frame, those that have a dynamic trajectory. That is, the third features that remain after the removal become the first features. The dynamic point detection step 430 is described in detail below with reference to FIG. 7.

In the camera position estimation step 440, the camera position estimator 214 may estimate the position of each camera in each of the multi-frames based on the first features. The camera position estimation step 440 is described in detail below with reference to FIG. 8.

FIG. 5 illustrates a feature extraction step according to an example.

In the point and Scale Invariant Feature Transform (SIFT) descriptor generation step 510, the feature extractor 211 may extract points from at least three images of the first frame of the multi-frames. Here, a point may be a Harris corner point detected by a corner and edge detector.

The feature extractor 211 may also generate SIFT descriptors with a constant scale from the three images of the first frame of the multi-frames. In general, the scale variation among the three images is small.

In the feature generation step 520, the feature extractor 211 may match the extracted points to one another by comparing the generated SIFT descriptors, and may generate the third features by linking the matched points into features. To speed up the descriptor comparison, the feature extractor 211 may use a k-d tree.
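Descriptor matching with a k-d tree can be illustrated with a minimal, pure-Python sketch. It uses toy low-dimensional descriptors and an exact nearest-neighbour search; the names `build`, `nearest`, `match_descriptors`, and the acceptance threshold `max_sq_dist` are hypothetical and are not taken from the embodiment, which does not specify how the tree is built or queried.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


class Node:
    def __init__(self, index: int, axis: int,
                 left: "Optional[Node]", right: "Optional[Node]"):
        self.index, self.axis, self.left, self.right = index, axis, left, right


def build(points: List[Vector], indices: List[int], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree over points[indices], cycling through the axes."""
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return Node(indices[mid], axis,
                build(points, indices[:mid], depth + 1),
                build(points, indices[mid + 1:], depth + 1))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], points: List[Vector], query: Vector,
            best: Tuple[float, int] = (float("inf"), -1)) -> Tuple[float, int]:
    """Return (squared distance, index) of the stored point closest to query."""
    if node is None:
        return best
    d = sq_dist(points[node.index], query)
    if d < best[0]:
        best = (d, node.index)
    diff = query[node.axis] - points[node.index][node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, points, query, best)
    if diff * diff < best[0]:  # the far side can still hold a closer point
        best = nearest(far, points, query, best)
    return best


def match_descriptors(desc_a: List[Vector], desc_b: List[Vector],
                      max_sq_dist: float) -> List[Tuple[int, int]]:
    """Match each descriptor in desc_a to its nearest neighbour in desc_b."""
    tree = build(desc_b, list(range(len(desc_b))))
    matches = []
    for i, d in enumerate(desc_a):
        dist, j = nearest(tree, desc_b, d)
        if dist <= max_sq_dist:  # reject matches that are too far apart
            matches.append((i, j))
    return matches
```

Real SIFT descriptors are 128-dimensional, where practical systems often prefer approximate tree searches; the exact search above is only meant to show the pruning idea.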

The features generated by the matching may include outliers. Below, a third feature generated by the matching is denoted as a candidate feature x.

In the outlier removal step 530, the feature extractor 211 may remove outliers from the generated third features using one or more of three geometric constraints. Here, the three geometric constraints may be an epipolar constraint, a re-projection constraint, and a depth range constraint.

Based on Equation 2 below, the feature extractor 211 may derive a fundamental matrix from the relative positions of view i and view j. The fundamental matrix relates view i and view j and may be the fundamental matrix from view i to view j. Here, view i may be the view of the i-th camera among the three cameras.

Here, K may represent the intrinsic parameters. The vector t may represent the translation vector from the first camera to the second camera among the three cameras, and the accompanying rotation may be the rotation from the first camera to the second camera. [t]_x may be the skew-symmetric matrix of the vector t.

[t]_x may be defined as in Equation 3 below.
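For reference, the standard definition of the skew-symmetric (cross-product) matrix of a 3-vector is given below. The component names t_1, t_2, and t_3 are assumed here for illustration and may differ from the notation of Equation 3.

```latex
[t]_{\times} =
\begin{pmatrix}
0 & -t_3 & t_2 \\
t_3 & 0 & -t_1 \\
-t_2 & t_1 & 0
\end{pmatrix},
\qquad
[t]_{\times}\, a = t \times a \quad \text{for any } a \in \mathbb{R}^3 .
```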

Accordingly, Equation 4 below may hold for every x. The feature extractor 211 may regard a candidate feature x for which Equation 4 does not hold as an outlier.
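An epipolar test of this kind can be sketched as follows. The sketch makes several assumptions that are not stated in the text: image points are given in normalized homogeneous coordinates (identity intrinsics, so the fundamental matrix reduces to [t]_x R), the relative pose maps a point X_i of view i to X_j = R X_i + t in view j, and the tolerance `tol` and all function names are hypothetical.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Mat3 = List[List[float]]


def skew(t: Vec3) -> Mat3:
    """Skew-symmetric matrix [t]_x, so that skew(t) applied to a equals t x a."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]


def matmul(a: Mat3, b: Mat3) -> Mat3:
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def epipolar_residual(m_i: Vec3, m_j: Vec3, rotation: Mat3, t: Vec3) -> float:
    """Return |m_j^T [t]_x R m_i| for homogeneous normalized points m_i, m_j."""
    e = matmul(skew(t), rotation)
    e_m = [sum(e[r][c] * m_i[c] for c in range(3)) for r in range(3)]
    return abs(sum(m_j[r] * e_m[r] for r in range(3)))


def is_outlier(m_i: Vec3, m_j: Vec3, rotation: Mat3, t: Vec3,
               tol: float = 1e-6) -> bool:
    """A candidate is rejected when the epipolar residual exceeds tol."""
    return epipolar_residual(m_i, m_j, rotation, t) > tol
```

With measured image points the residual is never exactly zero, so in practice the tolerance would be chosen from the expected pixel noise rather than fixed at a tiny value.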

Re-projection validation may be applied as the second geometric test.

For example, the feature extractor 211 may set the left camera as the reference camera. When the left camera is set as the reference camera, Equation 5 below may hold for the projection matrix of the left camera.

With this setting, the feature extractor 211 may calculate the projection matrix of the middle camera and the projection matrix of the right camera based on Equations 6 and 7 below.

Next, the feature extractor 211 may triangulate the 3D position M of each candidate feature x using the positions of the cameras.

Two-view triangulation may be more stable and efficient than three-view triangulation. The feature extractor 211 may therefore initialize M from the matched feature point m_left in the left camera (that is, the left image) and the matched feature point m_right in the right camera (that is, the right image), for example using the Sampson suboptimal triangulation algorithm. Here, a feature point may mean the point in an image that corresponds to a feature.

After initializing M, the feature extractor 211 may add the feature point m_middle in the middle camera and refine M by minimizing the energy function of Equation 8 below using m_left, m_middle, and m_right.

Here, the projection function may be defined as in Equation 9 below.

The minimized value of the energy function may be the re-projection error of the candidate feature x.

The feature extractor 211 may use the minimized value of the energy function for the candidate feature x as the criterion for re-projection validation.
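A re-projection error of this kind is commonly written as the sum of squared distances between the projections of M and the observed feature points. The sketch below assumes that form; it is not a transcription of Equations 8 and 9. The 3 x 4 projection matrices and the helper `translated_camera` (identity intrinsics and rotation, shifted along x) are hypothetical stand-ins for the left, middle, and right cameras.

```python
from typing import List, Sequence, Tuple

Point3 = Sequence[float]
Point2 = Tuple[float, float]
Camera = List[List[float]]  # 3 x 4 projection matrix


def project(camera: Camera, point: Point3) -> Point2:
    """Apply a 3 x 4 projection matrix and divide by the third coordinate."""
    hom = list(point) + [1.0]
    x, y, w = (sum(row[k] * hom[k] for k in range(4)) for row in camera)
    return (x / w, y / w)


def reprojection_error(cameras: List[Camera], observations: List[Point2],
                       point: Point3) -> float:
    """Sum of squared distances between projected and observed 2D points."""
    total = 0.0
    for camera, (u, v) in zip(cameras, observations):
        pu, pv = project(camera, point)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


def translated_camera(tx: float) -> Camera:
    """Hypothetical camera: identity intrinsics and rotation, shifted along x."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]
```

A candidate whose minimized error stays above a chosen threshold would then be rejected; the threshold itself is not given in the text.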

The feature extractor 211 may recognize candidate features that fail to satisfy either of the two geometric constraints above as outliers.

Even if a candidate feature satisfies both of the geometric constraints described above, it may still be an outlier. In such a case, the triangulated depth of the candidate feature is generally abnormal. Here, an abnormal depth may mean a very small or a very large depth. The feature extractor 211 may therefore use the depth range constraint to remove outliers. The feature extractor 211 may regard candidate features whose depth values fall outside a specified depth range as outliers and may remove them.

The feature extractor 211 may use a two-step adaptive threshold selection method to automatically calculate the depth range for each view i.

In the first step, the feature extractor 211 may calculate, for each view i, the depth values of all features that appear in view i. The feature extractor 211 may select the smallest 80% of the calculated depth values. Using the selected depth values, the feature extractor 211 may calculate the mean depth value and the variance for view i.

Based on Equations 10 and 11 below, the feature extractor 211 may use the calculated mean depth value and variance to calculate the minimum value and the maximum value of the depth range.

Here, the value of the parameter used in these equations may be set to 5.
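The first step can be sketched as follows. Because Equations 10 and 11 are not reproduced here, the sketch assumes a common form: the range is the mean plus or minus the parameter times the standard deviation of the kept depths, with the lower bound clamped at zero. The names `depth_range`, `keep_ratio`, and `lam` are hypothetical.

```python
import math
from statistics import fmean, pvariance
from typing import List, Tuple


def depth_range(depths: List[float], keep_ratio: float = 0.8,
                lam: float = 5.0) -> Tuple[float, float]:
    """Estimate (z_min, z_max) from the smallest keep_ratio share of depths."""
    kept = sorted(depths)[: max(1, int(len(depths) * keep_ratio))]
    mean = fmean(kept)
    std = math.sqrt(pvariance(kept, mu=mean))
    z_min = max(mean - lam * std, 0.0)  # assumed form; tends to sit near 0
    z_max = mean + lam * std            # assumed form
    return z_min, z_max


def remove_depth_outliers(depths: List[float], z_min: float,
                          z_max: float) -> List[float]:
    """Keep only the depths that lie inside the estimated range."""
    return [z for z in depths if z_min <= z <= z_max]
```

Under this assumed form the lower bound is loose whenever the spread of depths is large relative to the mean, which is consistent with the need for a second step that refines the minimum value.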

In general, however, the calculated minimum value is very close to 0 and is therefore of little use.

In the second step, the feature extractor 211 may calculate a more accurate minimum value.

Three view pairs may be obtained from the three given views. The three view pairs may be expressed as in Equation 12 below.

Because the trinocular cameras have almost the same orientation, the small rotation component may be neglected. Accordingly, if (X_i, Y_i, Z_i)^T is the 3D position of a feature point with respect to the i-th camera, the feature extractor 211 may calculate x_i and x_j based on Equations 13 and 14 below. x_i may be the x-coordinate on the focal image plane of the i-th view, and x_j may be the x-coordinate on the focal image plane of the j-th view.

Accordingly, the feature extractor 211 may calculate the depth Z_i of the feature point in the i-th view based on Equation 15 below.

Here, the feature extractor 211 may calculate dx_ij based on Equation 16 below.
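The relationship between depth and disparity that this passage relies on can be written out under the stated assumption of negligible rotation. The symbols f (focal length) and t_{x,ij} (x-component of the translation from view i to view j) are assumed here for illustration; the notation of Equations 13 to 16 may differ.

```latex
x_i = f\,\frac{X_i}{Z_i}, \qquad
x_j = f\,\frac{X_i + t_{x,ij}}{Z_i}
\quad\Longrightarrow\quad
dx_{ij} = \lvert x_i - x_j \rvert = \frac{f\,\lvert t_{x,ij}\rvert}{Z_i}
\quad\Longrightarrow\quad
Z_i = \frac{f\,\lvert t_{x,ij}\rvert}{dx_{ij}} .
```

Because Z_i is inversely proportional to dx_ij in this form, the smallest depth corresponds to the largest disparity.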

As described above, selecting the minimum depth value may be equivalent to selecting the maximum value of dx_ij.

The feature extractor 211 may calculate dx_ij for each candidate feature that appears in both the i-th view and the j-th view. The feature extractor 211 may collect all the calculated dx_ij values and sort them in descending order as {dx_1, dx_2, ...}.

The feature extractor 211 may select, as a reference value, the dx_ij at the upper 80% position of the values sorted in descending order.

Based on Equation 17 below, the feature extractor 211 may calculate the minimum depth value of the i-th view.

FIG. 6 is a flowchart of a feature tracking step according to an example.

In the feature tracking step 420, the movement of the third features between two frames is tracked. The flowchart describes a method of calculating the position of a feature in the current frame based on the position of the feature in the previous frame and its positional movement.

It may reasonably be assumed that the 3D position of an extracted feature does not change much between the coordinate systems of consecutive frames. The 3D positional movement V of the feature may be formulated as in Equation 18 below.

Here, M may represent the 3D position of the feature in the previous frame, and M' may represent the 3D position of the feature in the current frame.

The most basic measure for calculating the positional movement may be the similarity between an image patch in the previous frame and an image patch in the current frame.

The feature tracker 212 may minimize the energy function f(V) based on Equation 19 below. Here, the energy function f(V) may represent the similarity between the image patch in the previous frame and the image patch in the current frame.

Here, the two image terms in Equation 19 may be the i-th image in the previous frame and the i-th image in the current frame, respectively. Loc(M, i, j) may be the position of the j-th pixel of the local window that is centered on the projection of M in the i-th image plane.

The feature tracker 212 may calculate Loc(M, i, j) based on Equation 20 below.

Here, v_j may represent the offset of the j-th pixel from the center of the local window.

Below, a simplified notation is used for the image terms. Accordingly, Equation 19 may be simplified as in Equation 21 below.

Equation 21 may be rewritten as Equation 22 below.

In addition, Equation 18 may be rewritten as Equation 23, and M' may be defined as in Equation 23 below.

When the 3D movement V is small, the two image terms may be approximated as in Equation 24 below.

The feature tracker 212 may analytically calculate the derivative term of this approximation based on the chain rule of Equation 25 below.

Here, M_c may be the 3D position in the camera coordinate system, m_i may be the 2D position in the image plane, and m_j may be the 2D position of the j-th pixel in the local window centered at m_i.

Because Equations 26, 27, and 28 below hold, Equation 29 below also holds. The feature tracker 212 may calculate the derivative term based on Equation 29 below.

Here, one term of Equation 29 may be the image gradient, and another may be a Jacobian matrix, which may be defined as in Equation 30 below.

By combining Equations 22, 24, and 29, f(V) may be approximated as in Equation 31 below.

Here, g_{i,j}, T_i, and d_{i,j} may be defined as in Equations 32, 33, and 34 below, respectively.

Here, a subscript indicates a dependency. For example, the subscript of HR_i indicates that HR_i depends only on the index of the view and is independent of the image patches. Minimizing f(V) in Equation 31 may be equivalent to solving the 3 x 3 linear system of Equation 35 below.
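Solving a 3 x 3 linear system is inexpensive and can be done in closed form. The sketch below uses Cramer's rule on a generic system a v = b; the coefficient matrix and right-hand side of Equation 35 are not reproduced here, so `a` and `b` are placeholders, and the function names are hypothetical.

```python
from typing import List, Sequence

Mat3 = List[List[float]]


def det3(a: Mat3) -> float:
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(a: Mat3, b: Sequence[float]) -> List[float]:
    """Solve a * v = b for a 3-vector v using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("system is singular or badly conditioned")
    solution = []
    for col in range(3):
        # Replace one column of a with b and take the ratio of determinants.
        replaced = [[b[r] if c == col else a[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution
```

A singular system typically corresponds to a textureless patch, where the image gradients do not constrain the movement in all three directions.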

The feature tracker 212 may use an iterative scheme to solve for the 3D positional movement V.

In the iteration count initialization step 610, the feature tracker 212 may initialize the iteration count k to 0.

In the energy function initialization step 620, the feature tracker 212 may calculate the initial value V^(0) based on Equation 35 above.

In the position initialization step 630, the feature tracker 212 may calculate the initial position M'^(0) of the feature according to Equation 36 below.

In the iteration count increment step 640, the feature tracker 212 may increase the iteration count k by 1.

In the energy function calculation step 650, the feature tracker 212 may calculate the energy function f(V^(k)) at the k-th iteration based on Equation 37 below, which uses the iteration count k.

In the position calculation step 660, the feature tracker 212 may calculate the position M'^(k) of the feature at the k-th iteration based on Equation 38 below, which uses the iteration count k.

In the iteration count check step 670, the feature tracker 212 may perform step 640 again if the iteration count k is below the specified threshold. The feature tracker 212 may end the iterative calculation of f(V^(k)) and M'^(k) when the iteration count k reaches the specified threshold.
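The control flow of steps 610 to 670 can be sketched as a fixed-count refinement loop. The sketch only mirrors the loop structure: `solve_increment` is a hypothetical callback that stands in for the linear solve at the current estimate, and the additive update of the position is an assumption, since Equations 36 to 38 are not reproduced here.

```python
from typing import Callable, List, Sequence

Vec3 = List[float]


def track_feature(
    m_prev: Sequence[float],
    solve_increment: Callable[[Vec3], Vec3],
    max_iterations: int,
) -> Vec3:
    """Iteratively refine the current-frame position M' of one feature.

    solve_increment(m_current) is a hypothetical callback that returns a 3D
    displacement for the current position estimate.
    """
    k = 0                                             # step 610
    v = solve_increment(list(m_prev))                 # step 620
    m_cur = [m + dv for m, dv in zip(m_prev, v)]      # step 630
    while k < max_iterations:                         # step 670
        k += 1                                        # step 640
        v = solve_increment(m_cur)                    # step 650
        m_cur = [m + dv for m, dv in zip(m_cur, v)]   # step 660 (assumed update)
    return m_cur
```

For example, a callback that always returns half of the remaining displacement to a target position converges to that target within a few dozen iterations.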

In the feature tracking step 680, the feature tracker 212 may track each of the features in the current frame based on the calculated position. That is, the feature tracker 212 may simultaneously estimate the correspondences between features in consecutive frames and the 3D positions of the features in those frames.

The feature tracker 212 may track each of the third features up to the last frame of the multi-frames by applying steps 610 to 680 sequentially to the consecutive frames of the multi-frames.

FIG. 7 is a flowchart of a dynamic point detection step according to an example.

In this example, a point may mean a third feature extracted and tracked in steps 410 and 420.

To detect dynamic points, the 2D background subtraction algorithm for freely moving cameras proposed, for example, by Yaser Sheikh may be generalized so that it can be applied in 3D trajectory space. If the three cameras are regarded as static, the static points can be regarded as undergoing a rigid motion in the direction opposite to the actual motion of the trinocular rig. The trajectories of the static points may therefore lie in a low-dimensional subspace.

The trajectory of a point may be defined as the concatenation of its 3D coordinates in consecutive frames.

When N_p points have been extracted and tracked in the N_f multi-frames in steps 410 and 420, the dynamic point detector 213 may calculate the trajectory w_i of the i-th point based on Equation 39 below.

Here, M_{i,j} may represent the local coordinates of the i-th point in the j-th frame.

As in Equation 40 below, the dynamic point detector 213 may arrange the trajectories of all N_p points in a 3N_f x N_p matrix W.
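Building the trajectories and the matrix W amounts to concatenation and stacking. The sketch below assumes that each trajectory forms one column of W, which matches a trajectory living in a 3N_f-dimensional space; the function names are hypothetical.

```python
from typing import List, Sequence


def trajectory(positions: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-frame 3D coordinates into one vector of length 3 * N_f."""
    return [coordinate for position in positions for coordinate in position]


def trajectory_matrix(tracks: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Stack the trajectories of all points as columns of a 3 * N_f by N_p matrix."""
    columns = [trajectory(track) for track in tracks]
    return [list(row) for row in zip(*columns)]
```

Here `tracks[i][j]` holds the 3D coordinates of the i-th point in the j-th frame, so the result has one row per coordinate per frame and one column per point.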

If all points are static, M_{i,j} may equal the product of the rigid motion matrix of the j-th frame and the world coordinate of the i-th point, for every i and j. Here, the world coordinate may be a 4D world homogeneous coordinate, and the rigid motion matrix may be a 3 x 4 matrix of rigid motion for the j-th frame.

Accordingly, W may be factored as in Equation 41 below.

The factorization of Equation 41 indicates that the rank of W is at most 4. The trajectories of static points therefore lie in a subspace spanned by four basic trajectories.
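The rank bound follows from the shapes of the two factors. A factorization consistent with the description is written out below; the symbols A_j (the 3 x 4 rigid motion matrix of the j-th frame) and \tilde{M}_i (the 4D homogeneous world coordinate of the i-th point) are assumed names, and the layout of Equation 41 may differ.

```latex
W =
\begin{pmatrix}
M_{1,1} & \cdots & M_{N_p,1} \\
\vdots & \ddots & \vdots \\
M_{1,N_f} & \cdots & M_{N_p,N_f}
\end{pmatrix}
=
\begin{pmatrix}
A_1 \\ \vdots \\ A_{N_f}
\end{pmatrix}
\begin{pmatrix}
\tilde{M}_1 & \cdots & \tilde{M}_{N_p}
\end{pmatrix},
\qquad
M_{i,j} = A_j \tilde{M}_i .
```

The left factor has size 3N_f x 4 and the right factor has size 4 x N_p, so their product has rank at most 4.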

The dynamic point detector 213 may use the RANdom SAmple Consensus (RANSAC) algorithm to robustly calculate the best estimate of the 4-dimensional trajectory subspace while identifying the trajectories that lie in the subspace. In each iteration, four trajectories, denoted by w_1, w_2, w_3, and w_4, are selected at random to generate a subspace, and the matrix W_4 (W_4^T W_4)^-1 W_4^T is used to project the other trajectories onto the subspace. Here, W_4 = [w_1 ... w_4]. To evaluate the likelihood that a given trajectory belongs to a given subspace, the dynamic point detector 213 may directly measure the Euclidean distance between the original trajectory and the projected trajectory.

In practice, it is difficult to tune a threshold for the Euclidean distance defined in the 3N_f-dimensional space in order to decide whether a trajectory lies in the subspace. Instead, to evaluate w_i, the dynamic point detector 213 may split the projected trajectory W_4 (W_4^T W_4)^-1 W_4^T w_i into N_f points. The dynamic point detector 213 may then calculate the projection error f(w_i) based on Equation 42 below.

Here, m_{i,j,k} may represent the position of the i-th point on the k-th image in the j-th frame.

궤적 선택 단계(710)에서, 동적 포인트 검출부(213)는 부공간을 생성하기 위한 4 개의 궤적들 w ₁, w ₂, w ₃ 및 w ₄를 선택한다.In the trajectory selection step 710, the dynamic point detector 213 selects four trajectories w ₁ , w ₂ , w ₃ and w ₄ for generating a subspace.

컨센셔스 검출 단계(720)에서, 동적 포인트 검출부(210)는 RANSAC 알고리즘에 기반하여 선택된 4 개의 궤적들 w ₁, w ₂, w ₃ 및 w ₄ 내에서 컨센셔스(consensus)를 검출한다.In the consensus detection step 720, the dynamic point detector 210 detects a consensus within four trajectories w ₁ , w ₂ , w _3, and w ₄ selected based on the RANSAC algorithm.

컨센셔스 비교 단계(730)에서, 선택된 궤적들을 지원하는 데이터 내에 충분한 컨센셔스가 있다면, 동적 포인트 검출부(213)는 루틴을 종료할 수 있다. 그렇지 않다면, 단계(710)가 반복된다. 동적 포인트 검출부(213)는 최대 컨센셔스 집합(set)이 발견될 때까지 다른 4 개의 궤적들을 선택할 수 있다.In the consensus comparison step 730, if there is sufficient consensus in the data supporting the selected trajectories, the dynamic point detector 213 may end the routine. Otherwise, step 710 is repeated. The dynamic point detector 213 may select four other trajectories until the maximum consensus set is found.

동적 포인트 결정 단계(740)에서, 동적 포인트 검출부(213)는 최대 컨센셔스 집합에 속하지 않는 궤적들을 동적 포인트들로 간주할 수 있다.In the dynamic point determination step 740, the dynamic point detector 213 may regard the trajectories that do not belong to the maximum concentration set as the dynamic points.
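Steps 710 to 740 can be sketched as the following loop. This is only an illustration under stated assumptions: the names `detect_dynamic`, `threshold`, `enough`, and `max_iterations` are hypothetical, and the residual is the plain Euclidean distance to the sampled subspace, standing in for the projection error of Equation 42, which is not reproduced in this text.

```python
import math
import random

def orthonormal_basis(vectors, eps=1e-9):
    """Gram-Schmidt basis of the subspace spanned by the sampled trajectories."""
    basis = []
    for v in vectors:
        r = list(v)
        for q in basis:
            dot = sum(a * b for a, b in zip(r, q))
            r = [a - dot * b for a, b in zip(r, q)]
        norm = math.sqrt(sum(a * a for a in r))
        if norm > eps:
            basis.append([a / norm for a in r])
    return basis

def residual(basis, w):
    """Distance from trajectory w to the subspace (assumed stand-in for Equation 42)."""
    r = list(w)
    for q in basis:
        dot = sum(a * b for a, b in zip(r, q))
        r = [a - dot * b for a, b in zip(r, q)]
    return math.sqrt(sum(a * a for a in r))

def detect_dynamic(trajectories, threshold, enough, max_iterations=500, seed=0):
    rng = random.Random(seed)
    indices = range(len(trajectories))
    best = set()
    for _ in range(max_iterations):
        sample = rng.sample(indices, 4)                     # step 710: pick four trajectories
        basis = orthonormal_basis([trajectories[i] for i in sample])
        if len(basis) < 4:
            continue                                        # degenerate sample, draw again
        consensus = {i for i in indices
                     if residual(basis, trajectories[i]) <= threshold}  # step 720
        if len(consensus) > len(best):
            best = consensus
        if len(best) >= enough:                             # step 730: sufficient consensus
            break
    dynamic = set(indices) - best                           # step 740: the rest are dynamic
    return best, dynamic

# Toy data: six trajectories inside a 4-D subspace and two that leave it.
static = [[1.0, x, x ** 2, x ** 3, 0.0, 0.0] for x in (-2, -1, 0, 1, 2, 3)]
moving = [[1.0, 1.0, 1.0, 1.0, 5.0, 0.0], [1.0, -1.0, 1.0, -1.0, 0.0, 5.0]]
inliers, dynamic = detect_dynamic(static + moving, threshold=1e-6, enough=6)
```

With this toy data the two trajectories that leave the subspace end up in `dynamic`, and the remaining six form the consensus set.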

As described above, the dynamic point detector 213 may calculate a four-dimensional trajectory subspace for each of the third features and, based on the calculated subspace, determine whether each of the third features has a dynamic trajectory.

The dynamic point detector 213 may determine the first features by removing the features having a dynamic trajectory from the third features tracked to the last frame. That is, the first features may be those third features extracted in the first frame that are tracked to the last frame and do not have a dynamic trajectory.

FIG. 8 is a flowchart of camera position estimation according to an example.

In the multi-frames processing step 310, once the first features have been extracted and tracked within the N_f frames and the dynamic points have been removed, the camera position estimator 214 may use a Structure-from-Motion technique to simultaneously estimate the world coordinates of the N_p points and the 3N_f camera motions. The world coordinate of each of the N_p points is denoted M_i, where i = 1, …, N_p.

In practice, instead of estimating the 3N_f camera motions, the camera position estimator 214 may estimate only N_f − 1 frame rigs for the currently tracked subsequence. The relative positions of the three cameras are fixed and known. In addition, the first frame rig of the subsequence is either set to a predetermined value (for the first frame of the whole sequence) or has already been estimated in the previously tracked subsequence. Accordingly, the camera position estimator 214 needs to estimate only N_f − 1 frame rigs for the currently tracked subsequence.

The camera position estimator 214 may set the frame rig to the left camera. The camera position estimator 214 may derive the 3N_f camera positions from the frame rigs based on Equation 43.

In Equation 43, one symbol denotes one of the 3N_f camera positions and another denotes the frame rig. The camera position is indexed by the frame j = 1, …, N_f together with a second index, and the frame rig is indexed by j = 1, …, N_f.
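Because the relative positions of the three cameras are fixed and known, each frame rig determines three camera poses. The sketch below shows one way this composition could look; Equation 43 is not reproduced in this text, so the pose convention (X_cam = R·X + t), the function names, and the idea of expressing the middle and right cameras relative to the left camera are all assumptions made for illustration.

```python
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec3(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose(rel, rig):
    """Pose of one camera: apply the rig pose first, then the fixed relative pose."""
    R_rel, t_rel = rel
    R_rig, t_rig = rig
    R = matmul3(R_rel, R_rig)
    t = [a + b for a, b in zip(matvec3(R_rel, t_rig), t_rel)]
    return R, t

def camera_poses(rigs, rel_middle, rel_right):
    """Derive 3 * N_f camera poses from N_f frame rigs (rig = left camera pose)."""
    rel_left = (I3, [0, 0, 0])  # the frame rig is set to the left camera
    return [[compose(rel, rig) for rel in (rel_left, rel_middle, rel_right)]
            for rig in rigs]

# Two frame rigs and a purely translational camera arrangement (toy values).
rigs = [(I3, [0, 0, 5]), (I3, [1, 0, 5])]
poses = camera_poses(rigs, rel_middle=(I3, [-1, 0, 0]), rel_right=(I3, [-2, 0, 0]))
```

Only the N_f rig poses are free variables here; the remaining poses follow from the fixed calibration, which is why estimating the rigs is sufficient.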

In the first frame rig initialization step 810, because the N_p points have been triangulated in the first frame of the current subsequence, the camera position estimator 214 may initialize the world coordinates of the N_p points in the first frame through an inverse transformation by the known frame rig.

In the remaining frame rig initialization step 820, the camera position estimator 214 may initialize the remaining frame rigs using a camera position estimation method (for example, the one proposed by Long Quan et al.) and non-linear optimization with the Levenberg-Marquardt algorithm.

Specifically, the camera position estimator 214 may perform the initialization for the j-th frame based on Equation 44.

Here, m_{i,j,k} may be the 2D measurement of the i-th point in the k-th image of the j-th frame, the corresponding projected term may be the re-projection of the i-th point into the k-th image of the j-th frame, and N_j may be the number of visible points in the j-th frame.
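The quantities named above (2D measurements, re-projections, and visible points) are the ingredients of a re-projection error. The following sketch computes a generic sum of squared pixel differences for one frame; it is not Equation 44 itself, which is not reproduced in this text, and the pinhole model with intrinsic matrix `K`, the function names, and the dictionary layout of `observations` are assumptions.

```python
def project_point(K, R, t, M):
    """Pinhole projection of world point M into a camera with pose (R, t)."""
    Xc = [sum(R[r][c] * M[c] for c in range(3)) + t[r] for r in range(3)]
    x = [sum(K[r][c] * Xc[c] for c in range(3)) for r in range(3)]
    return x[0] / x[2], x[1] / x[2]

def reprojection_error(K, poses, points, observations):
    """Sum of squared distances between measurements and re-projections.

    observations maps (i, k) -> (u, v): the measured 2D position of point i
    in image k of the frame; only visible points have an entry.
    """
    total = 0.0
    for (i, k), (u, v) in observations.items():
        R, t = poses[k]
        pu, pv = project_point(K, R, t, points[i])
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total

K = [[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]]
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
points = [[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]]
observations = {(0, 0): (50.0, 50.0), (1, 0): (103.0, 54.0)}
error = reprojection_error(K, [identity], points, observations)
```

A non-linear optimizer such as Levenberg-Marquardt would adjust the pose parameters to reduce this value.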

After the initialization, in the re-projection error minimization step 830, the camera position estimator 214 may minimize the re-projection error over all 3D points and camera parameters based on Equation 45.

Unlike Equation 44, the normal equation of Equation 45 has a sparse block structure, because the parameters of different 3D points and different cameras do not interact with one another.

A sparse variant of the Levenberg-Marquardt algorithm exploits the zero pattern of the normal equation by avoiding storing and operating on zero elements. Exploiting the sparse block structure in this way yields a substantial computational benefit. This approach is known as bundle adjustment, and it is used as the standard final step of almost every feature-based Structure-from-Motion system.
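The sparse block structure can be made concrete by listing which blocks of the normal matrix can be non-zero. In the sketch below (hypothetical function name, parameter blocks ordered cameras first and then points), each observation couples exactly one camera block with one point block, so no block ever links two different cameras or two different points.

```python
def normal_equation_blocks(observations, n_cams):
    """Return the set of (row_block, col_block) pairs that can be non-zero.

    observations is a list of (camera_index, point_index) pairs; point blocks
    are numbered after the camera blocks.
    """
    blocks = set()
    for cam, pt in observations:
        c, p = cam, n_cams + pt
        blocks.update({(c, c), (p, p), (c, p), (p, c)})
    return blocks

# Two cameras observing three points each.
observations = [(cam, pt) for cam in range(2) for pt in range(3)]
blocks = normal_equation_blocks(observations, n_cams=2)
```

Even in this small case only 17 of the 25 blocks are populated, and the fraction shrinks quickly as the number of points grows, which is what a sparse solver takes advantage of.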

In particular, for Equations 44 and 45, the camera position estimator 214 may parameterize each pose with six parameters, for example based on the method proposed by Noah Snavely et al. and shown in Equation 46: three parameters that parameterize an incremental rotation matrix, and three parameters for the camera center c.

Here, the skew-symmetric matrix is as defined in Equation 3, and R^init may be the initial rotation matrix. The camera position estimator 214 may project M_i onto the three views of the j-th frame based on the functions of Equation 47.
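An incremental rotation of this kind is commonly built from a three-vector with Rodrigues' formula and applied on top of the initial rotation. The sketch below follows that common form, R = R_incr(ω)·R^init; whether it matches Equation 46 term for term is an assumption, since the equation is not reproduced in this text, and the function names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def skew(w):
    """Skew-symmetric (cross-product) matrix of a 3-vector."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def incremental_rotation(omega):
    """Rodrigues' formula: rotation by angle |omega| about the axis omega/|omega|."""
    theta = math.sqrt(sum(c * c for c in omega))
    if theta < 1e-12:
        return [row[:] for row in IDENTITY]  # no rotation
    K = skew([c / theta for c in omega])
    K2 = matmul(K, K)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[IDENTITY[i][j] + s * K[i][j] + c * K2[i][j] for j in range(3)]
            for i in range(3)]

def updated_rotation(omega, R_init):
    """Apply the incremental rotation on top of the initial rotation."""
    return matmul(incremental_rotation(omega), R_init)

R_quarter = updated_rotation([0.0, 0.0, math.pi / 2], IDENTITY)  # 90 degrees about z
```

Optimizing over the three components of ω keeps the rotation on the manifold of valid rotation matrices without extra constraints, which is the usual motivation for this parameterization.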

For f(v) in both Equation 44 and Equation 45, the camera position estimator 214 may compute the Jacobian matrix analytically using the chain rule.

The camera position estimator 214 may calculate the middle view M_middle based on Equation 48.

Here, the coordinates of M are expressed with respect to the middle camera.

The camera position estimator 214 may calculate three further terms based on Equations 49, 50, and 51, respectively.

Here, the quantity appearing in these equations may be the same as the one defined in Equation 30 above.

Two of these terms are the same for all points. The camera position estimator 214 may therefore pre-compute them only once, each time the frame rig is updated.

Two of the terms differ from each other only in sign, so they need not be calculated twice. The camera position estimator 214 may derive the Jacobian matrix for the right view in a manner similar to that described above.

In the camera position derivation step 840, the camera position estimator 214 may derive the 3N_f camera positions from the frame rigs based on Equation 43 described above.

FIG. 9 is a flowchart of a single-frame processing step according to an example.

The single-frame processing step 320 may include the following steps 910 to 950.

In the single-frame processing step 320, the position of each of the cameras in the current frame may be estimated. The current frame may be initialized by a Linear N-Point Camera Pose Determination method, and the operation of the single-frame processing step 320 may be optimized based on Equation 43 and Equation 44. One difference between the multi-frames processing step 310 and the single-frame processing step 320 is that bundle adjustment in the single-frame processing step 320 can become very local. That is, the points that can be adjusted in the single-frame processing step 320 may be limited to the points that appear in the current frame and the current frame rig. To keep the optimization from becoming too local, the single-frame processor 220 may also measure the projections of those points in previous frames. The camera parameters of the previous frames involved may then be used as constants in Equation 45.

The single-frame processing step 320 may be performed after the first features have been extracted and tracked in the multi-frames processing step 310.

In the current frame setting step 910, the current frame setting unit 221 may set the frame following the multi-frames as the current frame.

In the current frame feature tracking step 920, the current frame feature estimator 222 may extract and track second features within the current frame. Each second feature may correspond to one of the first features extracted and tracked in the multi-frames processing step 310. That is, the second features may be the first features extracted in the multi-frames that continue to appear in the current frame.

In the threshold comparison step 930, the current frame threshold comparison unit 223 may check whether the number of second features extracted in the current frame is greater than or equal to a threshold. If it is, step 940 may be performed.

If the number of extracted second features is less than the threshold, the single-frame processing step 320 may end. After the single-frame processing step 320 ends, and if, for example, frames remain to be processed, the multi-frames processing step 310 may be performed again, so that the multi-frames processor 210 is re-executed. In this case, the multi-frames of the multi-frames processing step 310 may be two or more consecutive frames starting from the current frame.

In the current frame position estimation step 940, the current frame camera position estimator 224 may estimate the position of each of the cameras in the current frame.

In the current frame update step 950, the current frame setting unit 221 may set the frame following the current frame as the new current frame. The current frame feature tracking step 920 may then be repeated. That is, the current frame feature estimator 222 may extract the second features within the newly set current frame.
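The control flow of steps 910 to 950 can be summarized as a simple loop. The sketch below is illustrative only: `track_second_features` and `estimate_poses` are hypothetical callbacks standing in for the current frame feature estimator 222 and the current frame camera position estimator 224, and the return convention is an assumption.

```python
def single_frame_processing(frames, start, track_second_features, estimate_poses, threshold):
    """Process frames one at a time until too few second features remain.

    Returns the estimated poses per frame index and the index of the frame at
    which tracking stopped (None if every frame was processed).
    """
    poses = {}
    current = start                                           # step 910
    while current < len(frames):
        features = track_second_features(frames[current])     # step 920
        if len(features) < threshold:                          # step 930
            return poses, current  # multi-frames processing may restart here
        poses[current] = estimate_poses(frames[current], features)  # step 940
        current += 1                                           # step 950
    return poses, None

# Toy run: each "frame" is just the number of features that survive in it.
frames = [10, 9, 8, 3, 7]
poses, stopped_at = single_frame_processing(
    frames, 0,
    track_second_features=lambda frame: list(range(frame)),
    estimate_poses=lambda frame, features: len(features),
    threshold=5,
)
```

In the toy run, tracking stops at the fourth frame because only three features remain, which is the point where the multi-frames processing step 310 would be restarted from the current frame.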

The technical details of the embodiments described above with reference to FIGS. 1 to 8 apply to the present embodiment as they are. A more detailed description is therefore omitted.

The method according to an embodiment may be implemented in the form of program instructions executable through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the present invention has been described with reference to a limited number of embodiments and drawings, it is not limited to those embodiments, and a person having ordinary skill in the art to which the invention pertains can make various modifications and variations from this description.

Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims set out below and by their equivalents.

100: camera position tracking device
210: multi-frames processor
220: single-frame processor

Claims

A method of tracking the positions of cameras using frames captured by at least three cameras, the method comprising:
extracting and tracking one or more first features within multi-frames, and tracking the position of each of the cameras within each of the multi-frames based on the first features; and
tracking the position of each of the cameras within each of one or more single-frames based on second features of each of the one or more single-frames,
wherein the one or more single-frames are the frames, among the frames subsequent to the multi-frames, that precede the first frame in which the number of tracked second features, each corresponding to one of the first features, is less than a threshold.

The method of claim 1, wherein tracking the position of each of the cameras within each of the multi-frames comprises:
extracting third features from at least three images of the first frame of the multi-frames;
tracking the third features to the last frame of the multi-frames;
determining the first features by removing, from the third features tracked to the last frame, features having a dynamic trajectory; and
estimating the position of each of the cameras within each of the multi-frames based on the first features.

The method of claim 2, wherein extracting the third features comprises:
extracting points from the at least three images of the first frame and generating Scale Invariant Feature Transform (SIFT) descriptors; and
matching the extracted points to one another using descriptor comparison between the generated SIFT descriptors, and generating the third features by connecting the matched points as features.

The method of claim 3, wherein extracting the third features further comprises:
removing outliers from the third features using geometric constraints.

The method of claim 4,
wherein the geometric constraints are one or more of an epipolar constraint, a re-projection constraint, and a depth range constraint.

The method of claim 1, wherein tracking the position of each of the cameras within each of the single-frames comprises:
setting the frame following the multi-frames as a current frame;
extracting, within the current frame, the second features each corresponding to one of the first features;
estimating the position of each of the cameras in the current frame when the number of the second features is greater than or equal to a threshold; and
setting the frame following the current frame as the new current frame and extracting the second features from it, when the number of the second features is greater than or equal to the threshold.

The method of claim 6, wherein tracking the position of each of the cameras within each of the single-frames further comprises:
tracking the position of each of the cameras within each of the multi-frames again when the number of the second features is less than the threshold.

A computer-readable recording medium containing a program for performing the method of any one of claims 1 to 7.

A camera position tracking device for tracking the positions of cameras using frames captured by at least three cameras, the device comprising:
a multi-frames processor for extracting and tracking one or more first features within multi-frames, and tracking the position of each of the cameras within each of the multi-frames based on the first features; and
a single-frame processor for tracking the position of each of the cameras within each of one or more single-frames based on second features of each of the one or more single-frames,
wherein the one or more single-frames are the frames, among the frames subsequent to the multi-frames, that precede the first frame in which the number of tracked second features, each corresponding to one of the first features, is less than a threshold.

The device of claim 9, wherein the multi-frames processor comprises:
a feature extractor for extracting third features from at least three images of the first frame of the multi-frames;
a feature tracker for tracking the third features to the last frame of the multi-frames;
a dynamic point detector for determining the first features by removing, from the third features tracked to the last frame, features having a dynamic trajectory; and
a camera position estimator for estimating the position of each of the cameras within each of the multi-frames based on the first features.

The device of claim 10,
wherein the dynamic point detector calculates a four-dimensional trajectory subspace of each of the third features, and determines whether each of the third features has a dynamic trajectory based on the four-dimensional trajectory subspace.

The device of claim 10,
wherein the feature extractor extracts points from the at least three images of the first frame, generates Scale Invariant Feature Transform (SIFT) descriptors, matches the extracted points to one another using descriptor comparison between the generated SIFT descriptors, and generates the third features by connecting the matched points as features.

The device of claim 12,
wherein the feature extractor determines the first features by removing outliers from the third features using geometric constraints.

The device of claim 13,
wherein the geometric constraints are one or more of an epipolar constraint, a re-projection constraint, and a depth range constraint.

The device of claim 9, wherein the single-frame processor comprises:
a current frame setting unit for setting the frame following the multi-frames as a current frame;
a current frame feature estimator for extracting, within the current frame, the second features each corresponding to one of the first features; and
a threshold comparison unit for estimating the position of each of the cameras in the current frame when the number of the second features is greater than or equal to a threshold,
wherein, when the number of the second features is greater than or equal to the threshold, the current frame setting unit sets the frame following the current frame as the new current frame, and the current frame feature estimator extracts, from the newly set current frame, the second features each corresponding to one of the first features.

The device of claim 15,
wherein the multi-frames processor is re-executed when the number of the second features is less than the threshold.