JP2002532770A

JP2002532770A - Method and system for determining a camera pose in relation to an image

Info

Publication number: JP2002532770A
Application number: JP2000587206A
Authority: JP
Inventors: カイン，ジム; イェーツ，チャーリー; ツェーン，アーサー; フェジェス，サンダー; チェン，ジンロン; ジャブロンスキー，マーク
Original assignee: ジオメトリックスインコーポレイテッド; コルビンシステムズインコーポレイテッド
Priority date: 1998-11-20
Filing date: 1999-11-19
Publication date: 2002-10-02
Also published as: WO2000034803A3; EP1068489A2; WO2000034803A2

Abstract

(57) [Abstract] A fully passive and self-contained system for determining the pose information of a platform is disclosed. The system comprises a motion sensing device and an imaging device that operate together in a known temporal relationship, so that each image generated by the imaging device corresponds to a set of motion data provided by the motion sensing device. In a preferred embodiment, the motion sensing device and the imaging device are integrated together and/or operate synchronously. The imaging device senses the surrounding scene, whose features are extracted and tracked to determine the motion of the imaging device. Hence, neither prior information about the scene nor special scene preparation is required. In addition, a statistical estimation process such as a Kalman filter is used to aid feature tracking. To determine the pose information, the features and the motion data propagated by a strapdown navigation process are provided to the statistical estimation process. Errors from the statistical estimation process are used to update the features and the motion data. As a result, the pose information output by the statistical estimation process is highly accurate regardless of the accuracy and characteristics of the motion data and of the associated instruments.

Description

DETAILED DESCRIPTION OF THE INVENTION

BACKGROUND OF THE INVENTION

Field of the Invention

[0001] The present invention relates to the field of object pose determination and, more particularly, to a method and system for determining the position and orientation of a rigid body that includes a camera, under arbitrary motion conditions, using a combination of image analysis and motion sensing instruments.

Related Prior Art

The problem of accurately determining the position and orientation of a platform from images of its surroundings taken from the platform is an important branch of navigation, and certain solutions have been available for decades. The platform may be a camera subsystem carried by a person, a vehicle with a roof-mounted camera, an aircraft with a downward-looking camera, or any other moving arrangement in which a camera and accompanying instruments are configured to image the surrounding environment. In the simple case, the exact position and orientation (collectively referred to as the "pose") of a platform carrying an instrumented camera, relative to some local space, is determined from reference images of fixed features of the surrounding environment. Collecting image perspectives with knowledge of the elevation and rotation angles between separate images, as measured by the motion sensing components, simplifies the automatic location of corresponding features across the sequence.

[0002] The traditional mechanization of this approach goes back hundreds of years to the sextant, with which manual observation of known star patterns, together with knowledge of the horizon, served as the navigation reference. Celestial-aided navigation was further mechanized as a component of both early and current strategic missiles. There, the vehicle pose is approximately known through the use of a self-contained inertial navigation system, and this initial pose is updated using star observations from a camera.

[0003] A second traditional approach to integrating imagery with navigation is the Digital Scene Matching Area Correlation (DSMAC) guidance method, which is used routinely for cruise missile guidance. DSMAC uses a set of stored reference scenes along with an onboard inertial navigation system. As the missile flies near a reference scene, a camera views the suspected area of the scene, and a correction is derived by correlating the scene observed by the onboard digital camera with the reference. The correction is then applied to the inertial navigation solution. Similar image-aided guidance solutions are used routinely for aircraft guidance, with navigation updates made at known visual landmarks.

[0004] A common constraint of these navigation schemes is that the visual reference cues used for navigation must be known before navigating. Celestial-aided navigation does not work without known star patterns, and DSMAC requires extensive advance preparation of the reference scenes.

[0005] U.S. Pat. Nos. 4,070,307, 5,576,964, 4,494,200, 4,347,511, and 4,179,693 describe navigation methods that make pictorial use of road and terrain features whose locations are known from maps. Celestial-aided navigation is described in U.S. Pat. No. 4,746,976. Further, U.S. Pat. Nos. 5,525,883 and 5,784,282 describe navigation using specially designed visual targets placed in the scene beforehand, and U.S. Pat. No. 5,517,419 describes a system for geolocating arbitrary points of the terrain, with the vehicle trajectory determined primarily by GPS. Pre-planning is required for all of these prior art systems.

[0006] Furthermore, U.S. Pat. No. 5,699,444 locates camera positions and scene features from multiple features seen by multiple cameras. The system uses only image sensors, with no requirement for separate motion sensing components. Overall system performance suffers because pose information is not provided until all views are available, and because feature selection is a manual process. In addition, an image-only solution to pose determination is constrained by the geometry of the features in the various views. U.S. Pat. No. 5,511,153 improves on other methods by processing a high-rate sequence of images to provide a recursive estimate of feature locations, thereby offering the possibility of automating the feature tracking process. However, U.S. Pat. No. 5,511,153 requires that at least seven features be tracked continuously, and each feature is represented by a single range parameter rather than a full 3D position vector. In addition, such image-only solutions place severe constraints on the allowable camera motion and are prone to failure in cases involving the camera nodal position, rapid rotation, looming motion, and narrow field-of-view cameras. Finally, image-only methods such as that of U.S. Pat. No. 5,511,153 have no independent means of predicting feature locations in future frames, so the feature tracking search window must be large enough to cover all potential camera motion, which results in very long computation times.

[0007] In practice, however, there are many applications and areas in which prior knowledge of the scene is not available when accurate navigation guidance is required. There is therefore a great need for a system in which the determination of the platform's pose, in combination with the surrounding scene, is obtained automatically and efficiently without prior knowledge of the scene.

SUMMARY OF THE INVENTION

The present invention relates to techniques for extracting and processing visual information from images to aid an instrument-based navigation system in achieving a high level of pose determination accuracy, particularly under conditions in which one or more of the instruments fail to provide accurate data. A navigation system using the invention disclosed herein can be used to obtain highly accurate platform pose information despite large drift and noise components in the motion sensing data, and despite large uncertainties in the camera/lens/optics used for image collection.

[0008] According to one aspect of the present invention, images are collected by a camera that is rigidly attached to, or pointed in a controlled orientation relative to, a cluster of motion sensing instruments on a platform. Platforms include, but are not limited to, moving systems (e.g., vehicles) and flying systems (e.g., missiles). One example of the motion sensing instruments is an inertial measurement unit (IMU) consisting of devices, rigidly fixed to the platform, that measure acceleration and rotation rate along three orthogonal axes. The IMU may be augmented with a GPS sensor that measures the position and velocity of the platform with respect to a geodetic reference frame. A salient feature operator is applied to the images obtained from the camera to detect regions of the image that are relatively unique in an image-processing sense. Image templates of these regions are stored, and a visual feature tracking operator attempts to identify the corresponding regions in later images. When the corresponding features come from stationary objects or terrain (e.g., buildings), the platform position is determined with respect to a coordinate system fixed relative to these features.

[0009] The motion sensors in the IMU instrument cluster, such as accelerometers and rate gyros, are each complex sensor subsystems in their own right. Recent developments in the field of micro-electro-mechanical systems (MEMS) have transformed the concept of the IMU in size and performance. However, MEMS IMU components are far less accurate than the conventional IMU components used for navigation. Likewise, digital camera components have evolved into highly integrated single-chip cameras that include all of the light sensing, digitization, and image processing elements. These cameras can be fitted with miniature lenses to arrive at extremely small digital camera components. As with MEMS IMU components, the new camera optics are not of survey-grade (metric) accuracy; that is, light rays are not mapped onto the light-sensitive two-dimensional array with high accuracy. Further complicating the problem, the lens/optics may vary the focal length, so that the optical parameters change considerably over the course of image collection. For both the IMU and the camera, an accurate system mechanization must cope with these inaccuracies in the constituent subsystems.

[0010] One feature of the present invention is its inherent tolerance of subsystem inaccuracies. The invention includes a substantial and easily extensible model of the sensing component errors, parameterized by statistically based coefficients. The errors in these coefficients are taken into account in the processing and, in typical circumstances, are estimated with high accuracy.
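As a rough illustration of what an error model parameterized by statistical coefficients might look like, the Python sketch below models one sensor axis with a bias and a scale-factor error, each carrying its own uncertainty. The model form `measured = (1 + scale) * true + bias`, the class names `Coefficient` and `AxisErrorModel`, and all numbers are assumptions chosen for illustration; the text does not specify the form of the model.

```python
from dataclasses import dataclass


@dataclass
class Coefficient:
    value: float  # current estimate of the error coefficient
    sigma: float  # 1-sigma uncertainty of that estimate


@dataclass
class AxisErrorModel:
    """Hypothetical single-axis model: measured = (1 + scale) * true + bias."""
    bias: Coefficient
    scale: Coefficient

    def correct(self, measured: float) -> float:
        """Remove the currently estimated bias and scale-factor error."""
        return (measured - self.bias.value) / (1.0 + self.scale.value)

    def apply_update(self, d_bias: float, d_scale: float) -> None:
        """Fold corrections from an estimator into the coefficient estimates."""
        self.bias.value += d_bias
        self.scale.value += d_scale


# Illustrative numbers only: a gyro axis with 0.02 rad/s bias and 1% scale error.
gyro_x = AxisErrorModel(bias=Coefficient(0.02, 0.01), scale=Coefficient(0.01, 0.005))
true_rate = 1.0
measured_rate = (1.0 + 0.01) * true_rate + 0.02
corrected_rate = gyro_x.correct(measured_rate)
```

Extending such a model would amount to adding further coefficients (for example, axis misalignment), each with its own uncertainty, which is one plausible reading of "easily extensible".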

[0011] A system employing the present invention contains self-contained motion sensing and camera devices. Navigation using these sensors is relative to the local scene, and the axes of the pose information are tied arbitrarily to the features detected early on. All subsequent feature and platform data are provided with respect to this initial coordinate system. According to one embodiment, geodetic pose information is provided. Integration of GPS and IMU information is well known to those skilled in the art, and such integration may optionally be included to work in conjunction with the present invention. As a result, only sparse GPS information, such as is available in urban environments, is required to allow poses to be referenced to geographic coordinates. Unlike prior art systems, this embodiment provides pose determination in geographic coordinates even during prolonged periods of GPS unavailability, such as indoors, in tunnels, and in densely built urban environments.

[0012] It is known that certain naturally occurring scenes contain features that can be classified by their image formation. Such features include intersections of straight lines (e.g., building corners), occurrences of vertical lines (e.g., building edges or tree trunks), or objects that reflect other features intentionally designed into the scene. The invention also includes the optional possibility of restricting features to those matching a spherical class.

[0013] Some navigation scenarios require, for example, repeated navigation within an area. In this case it is appropriate to begin by exercising the invention as described above in a learning mode, in order to collect the required archive of navigation features. Thereafter, the same previously learned features are used without the need to add new features. The archive can also be passed among multiple platforms that wish to navigate the area. Similarly, it may be appropriate to insert arbitrary features that are easily recognizable in images (traffic cones, colored spheres, posts) at arbitrary unsurveyed locations. These features are then treated preferentially in the feature selection process. However, the locating and tracking of these preferred features is handled exactly as it is for naturally occurring features.

[0014] The present invention can be implemented in numerous ways, including as a method, a system, and a computer-readable medium containing program code for automatically obtaining pose information of a platform that includes a motion sensor and an imaging device. Different embodiments or implementations have one or more of the following unique advantages and benefits.

[0015] One advantage and benefit of the present invention is that the pose of any moving platform is determined with high accuracy without external information or prior mission planning. Furthermore, the invention has an inherent resilience to subsystem errors, which permits the use of less accurate inertial sensing and camera components. Finally, accurate pose information allows a wide range of metric uses of image sets for 3D scene reconstruction and measurement.

[0016] Other advantages, benefits, objects, and features of the present invention will be realized from the following description of the embodiments, given with reference to the drawings, and from the practice of the invention.

Detailed Description of the Invention

These and other features and advantages of the present invention will be more clearly understood from the following description and claims, taken with reference to the drawings.

Notation and Nomenclature

In the following detailed description of the invention, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be apparent to those skilled in the art that the invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits are not described in detail, to avoid unnecessarily obscuring aspects of the invention.

[0017] The detailed description of the invention that follows is presented largely in terms of procedures, steps, logic blocks, processing, and other symbolic representations of data processing in computing devices. These process descriptions and representations are the means used by those skilled in the art to convey the substance of their work most effectively to others skilled in the art. The method, along with the system and computer-readable medium described in detail below, is a self-contained sequence of processes or steps leading to a desired result. These steps or processes are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical signals capable of being stored, transferred, combined, compared, displayed, and otherwise manipulated in a computer system or electronic computing device. It is convenient, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, operations, messages, terms, numbers, and the like. It should be borne in mind that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, as is apparent from the following description, discussions throughout the present invention that use terms such as "processing", "computing", "verifying", or "comparing" refer to the actions and processes of a computing device that manipulates and transforms data represented as physical quantities within the registers and memories of the computing device into other data similarly represented as physical quantities within the computing device or other electronic devices.

System Overview and Data Acquisition

Referring now to the drawings, in which like numerals refer to like parts throughout the several views, FIG. 1 shows a configuration in which the present invention may be practiced. An object 100 typically has identifiable landscape features, and a system 110 employing the present invention is carried around it along an arbitrary, unplanned path 101. Objects include, but are not limited to, natural scenes, terrain, and man-made structures. Examples of the system 110 include vehicles, aircraft, and human-operated and self-cruising missiles. Typically, the system 110 comprises an imaging device 103 and a motion sensing device 104. In one embodiment, the imaging device, or imager, 103 is a digital camera that generates digital images, and the motion sensor 104 is an inertial measurement unit (IMU) that provides measurements of all six degrees of freedom of the system 110 (i.e., pose information) with respect to a common coordinate space. The imaging device 103 builds an image archive 102 of the object, in which each image is tagged with the pose at the time the image was taken. To image such an object, the imager 103 is mounted so as to operate together with the IMU device 104 and thus generates a sequence of images, in the format of video frames 105, as the imager is moved gradually around or relatively inside the scene. For example, the imager may be mounted on an aircraft when a particular set of images of flooded terrain is required; a camera may be mounted on the roof of an automobile when image features are used to aid automobile navigation; or a camera with an attached IMU may be carried by a human operator when 3D reconstruction is planned. A sequence of images of a particular area is thus generated as, for example, a vehicle moves through urban terrain. In the present invention, a motion sensing device such as the IMU 104, which senses rotation rates and accelerations along orthogonal axes, is coupled with the imaging process. Furthermore, the timing of the discrete IMU measurements is precisely synchronized with the timing of each of the image frames.

[0018] Synchronization of the image frames and the motion sensing data can be achieved in various ways by those skilled in the art. For example, the camera frames may be triggered exactly by the IMU sample clock, or both the camera frames and the IMU samples may be triggered in time from a common clock.
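The first synchronization option, triggering camera frames from the IMU sample clock, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the record types `ImuSample` and `Frame`, the helper names, and the ratio of ten IMU ticks per frame are all hypothetical and are not taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImuSample:
    tick: int     # index on the IMU sample clock
    accel: tuple  # (ax, ay, az) along the three orthogonal axes
    gyro: tuple   # (wx, wy, wz) rotation rates about the same axes


@dataclass(frozen=True)
class Frame:
    tick: int      # IMU tick that triggered the exposure
    image_id: int  # position of the frame in the image sequence


def trigger_frames(imu_samples, ticks_per_frame):
    """Emit one frame for every `ticks_per_frame`-th IMU sample."""
    frames = []
    for sample in imu_samples:
        if sample.tick % ticks_per_frame == 0:
            frames.append(Frame(tick=sample.tick, image_id=len(frames)))
    return frames


def imu_between(imu_samples, frame_a, frame_b):
    """IMU samples after frame_a's trigger, up to and including frame_b's trigger."""
    return [s for s in imu_samples if frame_a.tick < s.tick <= frame_b.tick]


# Illustrative data: 25 IMU samples, one camera frame every 10 ticks.
imu = [ImuSample(t, (0.0, 0.0, 9.81), (0.0, 0.0, 0.01)) for t in range(25)]
frames = trigger_frames(imu, ticks_per_frame=10)
segment = imu_between(imu, frames[0], frames[1])
```

Because each frame stores the tick that triggered it, the set of motion data belonging to the interval between two frames is selected by integer comparison alone, with no timestamp interpolation.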

[0019] To facilitate the description of the invention, the object 100 is assumed to be a building. A user or operator therefore travels around the object 100 to produce a sequence of images providing a surrounding view of the object 100. The imager 103 is a video camera whose focal length may be varied while the surrounding images are generated. The imager 103 is associated with a digital sampling process that produces digital images on a suitable medium 105, together with time tags that associate the images with the IMU data. The imager 103 and its accompanying IMU instrument 104 are coupled to a computer system 106 that includes means for receiving the images and the accompanying motion sensing instrument data during or after imaging.

[0020] Hence, the suitable medium 105 is in effect a dedicated data bus. Digital images are generated either by the camera mechanization or after the collection stage. FIG. 1 shows a digital video (DV) camera that records video directly onto magnetic tape. Alternatively, the camera may be used to generate an analog signal that is digitized via a frame grabber, in real time or in a post-processing stage. The exact configuration of the imager 103 does not affect the operation of the invention. In the following, the imager 103 is taken to produce a sequence of digital images $C_1, C_2, \ldots, C_N$ in a commonly used color format, coordinate system, or space.

[0021] One commonly used color space is the RGB color space, in which each color pixel of an image is represented as a vector $C(i,j) = [R(i,j),\ G(i,j),\ B(i,j)]^T$, where $(i,j)$ are the coordinates of the image pixel $C(i,j)$ and $R$, $G$, and $B$ are the three intensity images of the color image $C$. The R, G, B representation of color image data is not necessarily the best color space for a given computation, and many other color spaces exist that are particularly useful for one purpose or another.
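The pixel-vector view of an RGB image can be made concrete with a short Python sketch. It splits a color image into its three intensity images and, as one example of moving to a different representation, forms a single intensity image as a weighted sum. The function names are hypothetical, and the Rec. 601 luma weights (0.299, 0.587, 0.114) are an assumption chosen for illustration; the text does not name any particular alternative color space.

```python
def split_channels(color_image):
    """Split an image of (R, G, B) pixel vectors into three intensity images."""
    r = [[px[0] for px in row] for row in color_image]
    g = [[px[1] for px in row] for row in color_image]
    b = [[px[2] for px in row] for row in color_image]
    return r, g, b


def to_intensity(color_image, weights=(0.299, 0.587, 0.114)):
    """Weighted sum of R, G, B per pixel (Rec. 601 luma weights, an assumption)."""
    wr, wg, wb = weights
    return [[wr * r + wg * g + wb * b for (r, g, b) in row] for row in color_image]


# A 2x2 color image; image[i][j] is the pixel vector C(i, j).
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
R, G, B = split_channels(image)
gray = to_intensity(image)
```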

[0022] The computer system 106 is a computing system that may be, without limitation, a desktop computer, a laptop computer, or a portable device integrated with the imaging system. FIG. 2 is a block diagram showing an exemplary internal configuration of the computer system 106. As shown in FIG. 2, the computer system 106 includes a central processing unit (CPU) 122 interfaced to a data bus 120, and a device interface 124. The CPU 122 executes instructions to manage all devices and interfaces coupled to the data bus 120 for synchronized operation, and the device interface 124 is coupled to external devices such as the imaging system 103 and the IMU instrument 104, so that image data and IMU data are received into memory or storage through the data bus 120. Also interfaced to the data bus 120 are a display interface 126, a network interface 128, a printer interface 130, and a floppy disk drive interface 138. Generally, a compiled and linked version of the present invention is loaded into storage 136 through the floppy disk drive interface 138, the network interface 128, the device interface 124, or another interface coupled to the data bus 120.

[0023] A main memory 132, such as random access memory (RAM), is also interfaced to the data bus 120 to provide the CPU 122 with instructions and with access to memory storage 136 for data and other instructions. In particular, when executing stored application program instructions, such as the compiled and linked version of the present invention, the CPU 122 manipulates the image data to achieve the desired results. A ROM (read-only memory) 134 is provided for storing invariant instruction sequences, such as a basic input/output operating system (BIOS), for operation of a keyboard 140, a display 126, and a pointing device 142.

Feature Extraction and Tracking

One feature of the present invention is to provide an automatic mechanism that extracts and tracks only the most salient features in the image sequence and uses them, together with the IMU measurements, to estimate the motion of the imager automatically. The features used in the present invention are those characterized by minimal visual change from one frame to the adjacent frame, which can be located most accurately in the image by automatic image processing methods. For example, a salient feature may be characterized by a sharp peak in the autocorrelation surface, or by a corner-like feature in each of the image frames. Many techniques exist for extracting salient features and locating them in subsequent frames, and the particular method should not be regarded as a limitation of the invention.

[0024] To speed up feature extraction and tracking, the present invention uses a salient feature operator to detect features only in the initial image, or in those images in which some of the features being tracked appear to have been lost. For the images that follow an image to which the salient feature operator has been applied, the invention uses multi-resolution hierarchical feature tracking to establish the features corresponding to those detected by the salient feature operator. The search space for a corresponding feature starts at the point indicated by the feature location estimated by the navigation processing subsystem.

Salient Feature Extraction

According to one embodiment, the extracted salient features are typically corner-like features in the image. FIG. 3 shows a 3D drawing 202 of an intensity image 200 that includes a white area 204 and a dark area 206. The drawing 202 shows a raised stage 208 corresponding to the white area 204 and a flat plane 210 corresponding to the dark area 206. A corner 212 is the salient feature of interest: its change in position can be determined most accurately, and it is typically the least affected from one frame to the next.

【００２５】The salient feature detection process is designed to detect all of the salient features in an image. It consists of applying a feature detection operator to the image. According to one embodiment, the feature detection operator, or feature operator, O(I) on an image I is a function of the Hessian matrix of a local area of the image that is based on the Laplacian operator performed on that area. Specifically, the salient feature operator O(I) is defined as:

$$I_f = O(I) = \mathrm{Det}[H(I)] - \lambda\, G(I)$$

where $I_f$, the result of O(I), is defined as the feature image obtained from the salient feature detection process, Det() is the determinant of the matrix H, λ is a controllable scaling constant, and $G(I) = I_{xx} + I_{yy}$.

【００２６】The Hessian matrix is further expressed as:

【００２７】

【数１】 (Equation 1)

$$H(I) = \begin{pmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{pmatrix}$$

【００２８】

【数２】 (Equation 2) where x and y denote the horizontal and vertical directions, respectively, and the second derivatives are:

$$I_{xx} = \frac{\partial^2 I_s}{\partial x^2}, \qquad I_{yy} = \frac{\partial^2 I_s}{\partial y^2}, \qquad I_{xy} = \frac{\partial^2 I_s}{\partial x\, \partial y}$$

$I_s$ is a smoothed version of the image I, obtained by convolving the image with a 2D Gaussian kernel that is typically 11×11 to 15×15 pixels in size.

【００２９】One of the distinctive properties of the salient feature operator described here is that it can emphasize only corner-like regions, such as 212, while suppressing edges and uniform regions, such as 214, 208, and 210 in FIG. 3. This is useful because only corner-like features are constrained along two axes. After the image I has been processed by the salient feature operator, the local maxima of the feature image $I_f$ are extracted; these correspond to the salient features. Typically, the image I is an intensity image, which is either the intensity component of the HIS color space or a luminance component derived from the original color image.
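The operator defined above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the implementation of the patent: the function names are hypothetical, the second derivatives are approximated by central finite differences, borders are handled by clamping, and a 5×5 Gaussian with sigma 1.0 stands in for the 11×11 to 15×15 kernel mentioned in the text so that the example stays small.

```python
import math

def gaussian_kernel(size, sigma):
    """Normalized 1D Gaussian weights (the 2D Gaussian kernel is separable)."""
    half = size // 2
    weights = [math.exp(-((i - half) ** 2) / (2.0 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(img, size=5, sigma=1.0):
    """Return I_s: img convolved with a size x size Gaussian, borders clamped."""
    k, half = gaussian_kernel(size, sigma), size // 2
    h, w = len(img), len(img[0])

    def clamp(v, hi):
        return max(0, min(hi, v))

    # Horizontal pass, then vertical pass.
    rows = [[sum(k[t] * img[y][clamp(x + t - half, w - 1)] for t in range(size))
             for x in range(w)] for y in range(h)]
    return [[sum(k[t] * rows[clamp(y + t - half, h - 1)][x] for t in range(size))
             for x in range(w)] for y in range(h)]

def salient_response(img, lam=0.1, size=5, sigma=1.0):
    """Feature image I_f = Det[H(I)] - lam * (I_xx + I_yy), computed on I_s."""
    s = smooth(img, size, sigma)
    h, w = len(s), len(s[0])
    out = [[0.0] * w for _ in range(h)]  # the one-pixel border is left at zero
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ixx = s[y][x + 1] - 2.0 * s[y][x] + s[y][x - 1]
            iyy = s[y + 1][x] - 2.0 * s[y][x] + s[y - 1][x]
            ixy = (s[y + 1][x + 1] - s[y + 1][x - 1]
                   - s[y - 1][x + 1] + s[y - 1][x - 1]) / 4.0
            out[y][x] = ixx * iyy - ixy * ixy - lam * (ixx + iyy)
    return out
```

With `lam=0.0` the response vanishes on uniform regions and along straight edges, because the Hessian determinant is zero wherever the intensity varies in at most one direction. Salient features would then be taken at the local maxima of the returned feature image.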

【００３０】In general, each salient feature is represented as a template, such as an 11×11 or 13×13 image template. The characteristics or attributes of a salient feature template consist of the location of the feature in the image, its color information, and its strength. The location indicates where in the image the detected salient feature, or template, is situated, and is generally expressed as coordinates (i, j). The color information carries the color information of the template centered at (i, j). The strength records how strongly the salient feature was extracted, or computed, as $I_f$.

【００３１】In operation, N color images are received sequentially from the imager. As each color image is received, it is first converted to a color space in which the luminance or intensity component is separated from the chrominance components. As will be apparent to those skilled in the art, the color conversion is needed only when the original color image is represented in a format that is not suitable for the feature extraction process. For example, many color images are in the RGB color space and are therefore preferably converted to a color space in which the luminance component is consolidated into a single image. The feature operator described above is then applied to the luminance component to produce a plurality of salient features, which are preferably indexed and kept in a table as a plurality of templates. Each template records the characteristics or attributes of its feature.

【００３２】By the time the N color images have been processed, there are N corresponding feature tables, each comprising a plurality of salient features. The tables are then organized into a map, referred to herein as a feature tracking map, which is used to detect how each of the features moves from one image frame to another.

Salient feature tracking

The goal of tracking is to find the image features in successive image frames that correspond to the same physical point in the world. Because computerized tracking of salient features across image frames is based on searching for an image region that has the same visual appearance as the area around the center of the salient feature, incorrect matches are often found. These are usually caused by changes in imaging conditions or perspective, or by the occurrence of repeated patterns. When the feature tracker produces an incorrect feature correspondence, the error has a harmful effect on the subsequent pose estimation. It is therefore extremely important to remove these mismatches while retaining the largest possible number of correct matches. According to one feature of the present invention, a combination of techniques, namely multi-resolution feature tracking, tracking based on epipolar geometry, and navigation-aided tracking, is designed to find only the correct feature correspondences in the image sequence with the least possible computation.

Multi-resolution feature tracking

In a preferred embodiment, a multi-resolution hierarchical feature structure is used to extract features for tracking. More specifically, FIG. 4A shows two consecutive images 402 and 404 received in succession from the imager. After the salient feature operator is applied to the image 402, a feature 406 is detected and its characteristics are recorded. When the second image 404 arrives, a multi-resolution hierarchical image pyramid is generated from that image.

【００３３】FIG. 4B shows an exemplary multi-resolution hierarchical feature structure 408 for extracting the feature 406 in the image 404. The image structure 408 has a number of image layers 410 (for example, L layers). Each of the image layers 410 is generated successively from the original image 404 by a decimation process around the feature location. For example, layer 410-L is generated by decimating layer 410-(L-1). The decimation factor is typically a constant, preferably equal to 2. Given the characteristics of the feature found in the image 402, and knowing that the images 402 and 404 are two consecutive images, the feature and its location 405 in the image 404 should not have changed dramatically. An approximate search area for the feature can therefore be defined in the second image, centered on the original location of the feature. More specifically, if the feature 406 is located at coordinates (152, 234) in the image 402, the window in which the same feature is searched for in the image 404 is a square centered at (152, 234).
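The layer construction described above can be sketched as follows. The sketch assumes that decimation by a factor of 2 is done by averaging 2×2 pixel blocks of the whole image; the text only states that each layer is decimated from the one below it around the feature location, so the averaging, the whole-image scope, and all function names are illustrative assumptions.

```python
def decimate(img, factor=2):
    """Average each factor x factor block into one pixel (partial blocks are dropped)."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / float(factor * factor)
             for x in range(w)] for y in range(h)]

def build_pyramid(img, levels, factor=2):
    """Return [layer 0 (full resolution), layer 1, ...], each decimated from the previous."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(decimate(pyramid[-1], factor))
    return pyramid

def to_level(coord, level, factor=2):
    """Map full-resolution pixel coordinates (x, y) to the given pyramid level."""
    scale = factor ** level
    return (coord[0] // scale, coord[1] // scale)

def window_coverage(window, level, factor=2):
    """Side length, in full-resolution pixels, covered by a fixed-size window at a level."""
    return window * factor ** level
```

For instance, the feature at (152, 234) maps to (38, 58) two levels up, and a 15-pixel search window at that level spans 60 pixels of the original image, which is why a fixed window size covers a larger area on coarser layers.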

【００３４】The multi-resolution hierarchical feature structure 408 shows that the resolution of each layer 410 decreases as the layer number increases upward. In other words, while the size of the search window remains the same, the search area is effectively enlarged. As shown in the figure, the search window 412 covers a relatively larger area in layer 410-L than in layer 410-(L-1). In operation, layer 410-L is used first to find the approximate location of the feature within the search window 412. One of the available methods for finding the location of a corresponding feature in consecutive images is template matching. A template is typically defined as a square image region (11×11 to 15×15) centered at the location of the original feature extracted by the salient feature operator, or at the predicted location of the feature in the new frame. The sub-pixel-accurate location of the match is then found at the position where the normalized cross-correlation of the two corresponding image regions is largest (ideally 1 for a perfect match). Layer 410-(L-1) is then used to refine the approximate location of the feature within the nearest area of the same window size, and finally layer 410 is used to determine the exact location (x, y) of the feature. The use of the feature structure has many advantages over prior-art feature extraction approaches. In essence, an effectively larger representation of the feature template is achieved, which makes it possible to track features efficiently and accurately and is particularly well suited to a hierarchical tracking mechanism.
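The template matching step can be illustrated with a small normalized cross-correlation search. This is a sketch under assumptions: it works at whole-pixel resolution on a single layer (the sub-pixel refinement and the coarse-to-fine hand-off between layers are omitted), and the function names and the exhaustive search are illustrative rather than taken from the patent.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-sized patches (lists of rows)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb))
    return num / den if den > 0 else 0.0  # flat patches carry no information

def patch(img, cx, cy, half):
    """Square patch of side 2*half+1 centered at (cx, cy)."""
    return [row[cx - half:cx + half + 1] for row in img[cy - half:cy + half + 1]]

def match_template(img, template, cx, cy, radius):
    """Search a square window around (cx, cy); return (best score, (x, y))."""
    half = len(template) // 2
    h, w = len(img), len(img[0])
    best = (-2.0, (cx, cy))
    for y in range(max(half, cy - radius), min(h - half, cy + radius + 1)):
        for x in range(max(half, cx - radius), min(w - half, cx + radius + 1)):
            score = ncc(template, patch(img, x, y, half))
            if score > best[0]:
                best = (score, (x, y))
    return best
```

A score close to 1 indicates a near-perfect match. In a hierarchical tracker, the location found on a coarse layer would be scaled up and used as the search center on the next finer layer.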

【００３５】An image contains K salient features, where K ranges from 1 to 1000. There are therefore K feature structures like the one in FIG. 4B. FIG. 4C shows K feature structures 420 from a single image, one feature structure 420 per feature. As a result of feature extraction, a set of attributes F(...) describing each of the K features is formed, comprising the location, strength, and color information of the feature.

【００３６】With N image frames and K attribute sets $F_i(\ldots)$, i = 1, 2, ..., K, FIG. 4D shows what is referred to herein as a "feature tracking map", or simply a feature map, which collectively represents all of the features found for the N images and is used to track the features in order to estimate the motion of the imager. In addition, FIG. 4E shows a flowchart of the feature extraction process. FIGS. 4D and 4E are described together to give a full understanding of the feature detection and tracking process of the present invention.

【００３７】At 452, color images are received successively from the imager. At 454, a dominant component, preferably the luminance or intensity component, is extracted from the color image. In one embodiment, the color image is simply converted to another color space that provides a separate luminance component. At 456, the process looks for feature templates stored, for example, in a memory area. If enough feature templates are present in the memory area, the process proceeds to feature tracking in the next image; otherwise, it must check at 458 whether new features need to be extracted. In operation, the first image received always undergoes feature extraction with the salient feature operator, because no stored features or feature templates exist yet for the feature tracking process. The process therefore proceeds to 460.

【００３８】At 460, the feature extraction process generates K features for the received image (for example, frame #1). As shown in FIG. 4D, there are K features in the received image frame #1. The characteristics of the K features are stored, preferably as feature templates, in a memory space for the subsequent feature extraction process.

【００３９】When the next image arrives at 462, the process proceeds to 464 to generate a multi-resolution hierarchical image pyramid, preferably with the newly arrived image as its base. As described above and shown in FIG. 4C, the tracking process searches the image pyramid for the locations that are most similar to the respective layers of the feature templates stored in the feature structures. Using each of the K multi-resolution hierarchical feature structures, K or fewer corresponding features are located at 466 from the corresponding layers of the image pyramid; the feature locations are then collected and appended to the features for frame 2. Similarly, for the next n1 frames, the process repeatedly proceeds through 456 to 462 to extract K features from each of the n1 frames.

【００４０】As images are generated, the imager is moved considerably around the object relative to the initial position from which the first image was captured. Some of the K features will not necessarily be found in these later images. Because of perspective changes and the motion of the imager, those features have either moved out of view or changed completely, so that they can no longer be tracked. For example, a corner of the roof of a house may lose its salience, or go out of view, when seen from a particular perspective. This is reflected in the representation 430 of the K features for the n1 images in FIG. 4D. As one feature of the present invention, additional new features are generated when the number of features drops beyond a predetermined threshold (T). At 456, when a certain number of features are not found in the incoming image, the process proceeds to 458 to determine whether new features need to be extracted to make up the K features. As described above, when a certain number of features have dropped out because of perspective changes or occlusion, new features must be extracted and added in order to maintain a sufficient number of tracked features in the image. The process restarts feature detection at 460, that is, it applies the salient feature operator to the image to generate a set of salient features that replace the lost ones. As an example, FIG. 4D shows the process restarting feature detection at frame n1.

【００４１】As shown in FIGS. 4E and 4D, the feature templates matched to the successive images remain the original set for tracking the features in subsequent images and do not change from one frame to another. Typically, feature correspondences between successive image frames can be established in two ways. One is to establish them directly between consecutive image pairs; the other is to fix the first frame as a reference and find the corresponding locations in all other frames with respect to that reference frame. In one embodiment, the second approach is used because it minimizes possible bias or drift in finding the exact feature locations, in contrast to the first approach, in which significant drift can accumulate over several image frames. The second approach, however, permits only short-lived feature persistence over a few frames, because the scene captured by the camera undergoes large changes in view as the camera covers a large displacement; this ultimately causes the tracking process to proceed to 472.

【００４２】To maintain feature tracking over many successive frames with the second approach, a feature template update mechanism is included at 474. As shown in FIG. 4F, when no corresponding feature location is found in the most recent frame 492 for a feature of the reference frame 490, the template of the lost feature is replaced by one located in the most recent frame 494 in which it was successfully tracked. The templates updated at 474 in FIG. 4E have the advantage that they can be tracked well even under significant perspective changes, while minimizing the accumulated drift typical of the first approach.

【００４３】Naturally, the feature representation is produced after every certain number of frames. FIG. 4D shows feature sets 432 to 436 for the images at frame numbers n1, n2, n3, n4, n5, and n6, respectively. The frame numbers n1, n2, n3, n4, n5, and n6 need not be separated by the same number of frames. As the imager moves further and generates more images, some of the features reappear in some of the subsequent images, as shown at 438 to 440, and may be reused depending on the implementation preference. At 470, all of the frames have been processed and their features obtained. As a result, a feature map such as the example of FIG. 4D is obtained.

Feature tracking based on epipolar constraints

While the multi-resolution approach significantly reduces the search area for the tracker, there is no guarantee that a feature is matched correctly, because the approach uses only a similarity measure on the feature templates and does not use the geometric constraints on 3D points in the world frame.

【００４４】As is well known to those skilled in the art, the epipolar constraint defines a one-dimensional constraint on the projected image locations of stationary points in the 3D environment observed by the imaging device. Under this constraint, given a pair of corresponding points $p_1 = (x_1, y_1, 1)^T$ and $p_2 = (x_2, y_2, 1)^T$ in two image frames, the points must satisfy $p_1^T F\, p_2 = 0$, where F is the fundamental matrix. This matrix is computed from seven or more pairs of corresponding points. Once the matrix has been computed, it yields the epipolar constraint for all remaining correspondences of feature points in that particular pair of image frames. To ensure that the initial seven point pairs are correctly matched, a technique based on RANSAC (RANdom SAmple Consensus), well known to those skilled in the art, is applied. After the constraint has been computed, the remaining feature matches are found by searching only along these epipolar lines. This approach greatly reduces the search area and reduces the problem of false feature matches that fall outside this search area.
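How the constraint screens candidate matches can be sketched as below. The sketch assumes that a fundamental matrix F is already available; estimating F from seven or more pairs with RANSAC is not shown. The function names, the algebraic residual used as the error measure, and the example matrix `F_SIDEWAYS` are illustrative assumptions. `F_SIDEWAYS` corresponds to an idealized camera (identity intrinsics, no rotation) translating along its x axis, for which corresponding points must lie on the same image row.

```python
def epipolar_residual(p1, p2, F):
    """Return p1^T F p2 for homogeneous image points p1 and p2 (3-tuples)."""
    Fp2 = [sum(F[r][c] * p2[c] for c in range(3)) for r in range(3)]
    return sum(p1[r] * Fp2[r] for r in range(3))

def filter_matches(matches, F, tol=1e-6):
    """Keep only the (p1, p2) pairs whose algebraic epipolar residual is within tol."""
    return [(p1, p2) for p1, p2 in matches
            if abs(epipolar_residual(p1, p2, F)) <= tol]

# Hypothetical example matrix: pure translation along x with identity intrinsics.
# For this F the residual reduces to y2 - y1, so epipolar lines are image rows.
F_SIDEWAYS = [[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]]
```

With real image data the tolerance would have to be chosen to reflect pixel noise, and a geometric distance to the epipolar line is often preferred over the raw algebraic residual.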

【００４５】Combining the epipolar constraints of the pairs among three image frames yields a much stronger constraint, known to those skilled in the art as the trifocal constraint. This latter constraint completely determines the location of a feature point in a third frame, given its locations in the two other frames. FIG. 4G shows a flowchart of enforcing the epipolar geometry between the current image frame #n 480 and the previous image frames, which results from a number of enforcements 482 and from verifying the trifocal constraint on the locations of the feature points in frame #n given their locations in earlier frames.

【００４６】Kalman filter navigation processing is used as the processing that follows the extraction of suitable features from the images. As used herein, Kalman navigation refers to navigation based on Kalman filtering, which is well known to those skilled in the art. It is often desirable to estimate the state of a system given a set of measurements made over a period of time. The state of a system refers to a set of variables that describe the inherent properties of the system at a particular point in time. The Kalman filter is a useful technique for estimating the system state, or updating a previous estimate, from, for example, indirect measurements of the state variables, using the covariance information of both the state variables and the indirect measurements.
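The measurement update at the heart of a Kalman filter can be shown for a single scalar state. This is a textbook sketch rather than the filter of the patent, whose state would contain the platform pose and the feature locations: the scalar measurement model z = h·x + noise, with h standing in for the indirect relation between state and measurement, and the function name are assumptions made for illustration.

```python
def kalman_update(x, p, z, r, h=1.0):
    """One scalar Kalman measurement update.

    x, p : prior state estimate and its variance
    z, r : measurement and its noise variance
    h    : measurement model, z = h * x + noise (an indirect measurement of x)
    Returns the updated estimate and variance.
    """
    k = p * h / (h * p * h + r)      # Kalman gain from the two variances
    x_new = x + k * (z - h * x)      # correct the estimate by the weighted innovation
    p_new = (1.0 - k * h) * p        # the variance shrinks after the update
    return x_new, p_new
```

For example, a prior of 0 with variance 1 combined with a direct measurement of 1 with variance 1 gives an estimate of 0.5 with variance 0.5, which shows how the covariance information weights the prior against the measurement.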

【００４７】Mismatched features, and features that are not static, that is, that move independently, cause serious problems for Kalman navigation. By using the epipolar and trifocal constraints in tracking, most of these bad or undesirable correspondences between the current frame (frame #n) and the previous frames can be removed without knowing the motion of the camera platform. The Kalman filter is one of many statistical estimation processes that can be used to provide navigation feedback to feature tracking. For simplicity, the Kalman filter is used to describe the present invention according to one embodiment.

【００４８】The combination of hierarchical feature tracking and epipolar-constrained tracking provides a powerful tool for accurate tracking of features during initialization of the Kalman filter navigation process or during the estimation of newly extracted features, and thereby provides optimal convergence during the critical initialization phase of the Kalman filter.

Navigation-aided feature tracking

As described above, the Kalman filter process forms an optimal estimate of the platform position and of the feature locations, given the set of features provided by the visual processing. Performance is achieved through maximum feature persistence and through reacquisition of previously acquired features.

【００４９】Using the estimates of the platform position and the feature positions, it is possible to accurately predict the image locations of those salient features whose 3D locations are already known, or to constrain the tracking of newly acquired features to the epipolar lines. As soon as the camera motion estimate becomes accurate, the epipolar constraint can likewise be computed from these parameters, and this constraint is then applied to those newly acquired features whose 3D locations are yet to be determined.
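Predicting where a feature with a known 3D location should appear, and centering the search there, can be sketched with an ideal pinhole camera. Everything about the camera model here is an assumption for illustration: the convention that camera coordinates are R·X + t, the single focal length in pixels, the principal point, and the function names are not taken from the patent.

```python
def project(point, rotation, translation, focal, center):
    """Pinhole projection of a 3D world point into pixel coordinates.

    Camera coordinates are rotation * point + translation (assumed convention).
    Returns None when the point lies behind the camera.
    """
    xc = [sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
          for r in range(3)]
    if xc[2] <= 0:
        return None
    return (focal * xc[0] / xc[2] + center[0],
            focal * xc[1] / xc[2] + center[1])

def search_window(predicted, half):
    """Square search window (x_min, y_min, x_max, y_max) around a predicted pixel."""
    u, v = int(round(predicted[0])), int(round(predicted[1]))
    return (u - half, v - half, u + half, v + half)
```

The more accurate the pose and feature estimates, the smaller the window half-size can be, which is the sense in which navigation feedback both speeds up tracking and lowers the risk of false matches.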

【００５０】The combination of these techniques is the key aid to the feature tracker. The estimated platform position and attitude, together with the current feature location estimates, allow the pixel values in the next image formation to be estimated. The Kalman filter thus provides a natural link to the feature tracking process. This navigation feedback to the tracker not only greatly reduces the risk of false feature matches but also reduces the search area, thereby speeding up the tracking process. The more accurate the Kalman filter navigation estimate becomes, the better the aid to tracking, and hence the more accurate the tracking process.

【００５１】The integration of feature tracking and Kalman filter navigation processing is one of the most important features of the present invention. It not only improves the accuracy and robustness of the navigation method but also makes it possible to use navigation and imaging components of significantly lower quality than similar attempts in prior-art systems.

【００５２】FIG. 5A shows a flowchart of the integration of feature tracking and navigation according to one embodiment of the present invention, and should be understood in conjunction with FIG. 6, which shows a functional block diagram of a system employing the present invention. The Kalman filter 604 estimates the platform state using information from the motion sensors shown in FIG. 5B. The estimated platform position and attitude, together with the current feature location estimates, allow the pixel values in the next image formation 603 to be estimated. This information is provided to the feature estimation process. The search area for the next occurrence of a feature is guided by the Kalman filter. If an expected feature is not found for a certain number of frames, the feature is declared lost, and its image template and estimated location are archived 610.

【００５３】As described above, one of the advantages of the navigation estimation of the present invention is that it improves on image-only solutions by estimating the translational and rotational motion of the camera between image collections, thereby avoiding a major cause of the feature tracking failures often seen in prior-art systems. This provides tolerance to large angular motions between frames, including features that leave the field of view of the imager and later return to it. The residual feature motion in subsequent frames is then primarily the result of errors in the estimated relative platform-to-feature position, as determined by the epipolar geometry. This tight coupling between navigation and feature tracking processing provides major benefits in both the reliability of feature tracking and the reduced search window size.

Feature management in the Kalman filter

The dynamic nature of feature tracking requires a feature management process for the Kalman filter, which is shown in FIG. 5A. The feature manager is designed to keep the current number of features in the Kalman filter at a constant value.

[0054] If a feature is not detected in a series of N frames, the feature is declared lost. When a feature is lost, a new feature is selected 601. An attempt is first made to reacquire an archived feature 606 that is expected to be visible. Expected visibility is determined from the stored feature location, the current platform location, and the aspect of the previously stored feature template. That is, if the line-of-sight from the platform to the feature corresponds to an earlier line-of-sight for which a template is available from the archive, the feature is tentatively declared visible. The feature tracker then attempts to acquire this archived feature. If it succeeds, the feature is reinserted into the Kalman filter as an active feature. If no archived feature is available, or if the archived feature cannot be acquired, a new feature is inserted into the filter 608 and initialized to lie along the LOS ray at a preset range.
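The feature management cycle described above can be sketched as follows. This is a minimal illustration only: the class and field names (`FeatureManager`, `target_count`, `max_missed`) are hypothetical, and the visibility test and the tracker are reduced to a list of archived identifiers that the caller reports as successfully reacquired.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    fid: int
    missed: int = 0  # consecutive frames in which the feature was not detected


@dataclass
class FeatureManager:
    target_count: int  # number of active features the manager tries to hold
    max_missed: int    # N: missed frames after which a feature is declared lost
    active: dict = field(default_factory=dict)
    archive: dict = field(default_factory=dict)
    next_id: int = 0

    def _new_feature(self):
        feature = Feature(self.next_id)
        self.next_id += 1
        return feature

    def step(self, detected_ids, reacquired_ids=()):
        """Process one frame.

        detected_ids: ids of active features found in this frame.
        reacquired_ids: archived ids that were expected to be visible and
            that the tracker managed to reacquire (assumed to be supplied).
        """
        # Declare features lost after max_missed consecutive misses and archive them.
        for fid, feature in list(self.active.items()):
            feature.missed = 0 if fid in detected_ids else feature.missed + 1
            if feature.missed >= self.max_missed:
                self.archive[fid] = self.active.pop(fid)
        # Prefer reacquired archived features when refilling the active set.
        for fid in reacquired_ids:
            if len(self.active) >= self.target_count:
                break
            if fid in self.archive:
                feature = self.archive.pop(fid)
                feature.missed = 0
                self.active[fid] = feature
        # Otherwise insert brand-new features until the target count is reached.
        while len(self.active) < self.target_count:
            feature = self._new_feature()
            self.active[feature.fid] = feature
```

In a real system each new feature would also be given an initial position along its line-of-sight ray and its own block in the filter covariance; those steps are left out of this sketch.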

[0055] In one embodiment, the number of features in the Kalman filter is set between 1 and 20. The upper limit on the number of features is determined only by the processing power available in the system. Not all features need be visible in a given frame; however, a full covariance is maintained for all features held in the Kalman filter.

Navigation Data Processing

For a rigid body moving with respect to the earth, the following equations of motion describe how position and attitude are computed from acceleration and rotation rate.

[0056]

[Equation 3]
where
$v^e$ = velocity vector in the earth-centered, earth-fixed (ECEF) coordinate system
$r^e$ = position vector in ECEF coordinates
$C_b^e$ = direction cosine matrix from the body frame to the ECEF frame
$f^b$ = specific force (non-gravitational acceleration) in body axes (measured directly by the IMU accelerometers)
$\Omega_e$ = rotation rate of the earth
$g^e$ = earth gravity vector in ECEF coordinates
$\omega_{b/i}^b$ = inertial angular rate of the body in body axes (measured directly by the IMU rate gyros)
$\omega_{e/i}^e$ = inertial angular rate of the earth in ECEF axes

The resulting position and attitude are relative to earth-fixed axes, and the equations account for the rotation of the earth. The present invention requires this level of detail for two reasons. First, it uses rate gyros capable of sensing the rotation rate of the earth. Second, it allows the use of GPS, which provides measurements in earth-centered, earth-fixed (ECEF) coordinates, as an optional input.
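The image of Equation 3 is not reproduced in this text. For orientation only, the symbols defined above are consistent with the standard textbook ECEF strapdown mechanization, which is commonly written as shown below. This is an assumed standard form, not the equation recovered from the patent. Here $[\,\cdot\,\times]$ denotes the skew-symmetric cross-product matrix, $\omega_{e/i}^e = (0, 0, \Omega_e)^T$, and $g^e$ is taken to include the centrifugal term.

```latex
\begin{aligned}
\dot{r}^e   &= v^e \\
\dot{v}^e   &= C_b^e f^b - 2\,[\omega_{e/i}^e \times]\, v^e + g^e \\
\dot{C}_b^e &= C_b^e\,[\omega_{b/i}^b \times] - [\omega_{e/i}^e \times]\, C_b^e
\end{aligned}
```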

[0057] In the above computation, the IMU measurements are fed to a strapdown navigation algorithm well known to those skilled in the art. The strapdown navigation algorithm computes position, velocity, and attitude with respect to ECEF coordinates. The trajectory produced by the nonlinear differential equations given above forms the basis for a linearized Kalman filter algorithm. The state vector of this Kalman filter contains the errors in position, velocity, and attitude relative to the strapdown navigation trajectory estimate.

[0058] Because of errors inherent in the IMU measurements and errors in initializing the strapdown navigation solution, the strapdown navigation solution drifts significantly from the true trajectory. Some external aiding must be applied to keep the pose estimate close to the true solution. In a conventional aided IMU system, the external aid comes from GPS or another independent navigation system. In conventional vision-aided navigation, such as star-aided navigation or DSMAC, the aid comes from known georeferenced attributes of the visual features, such as known geodetic points or known geodetic aspects. For example, in star-aided navigation, the earth-referenced line of sight to an observed star is known in advance. For DSMAC, the features of the visual reference scene have known geodetic latitude, longitude, and altitude. In the present invention, by contrast, features that are not known in advance are acquired and used, as described above.

[0059] As shown in FIG. 6, the central estimation process uses an extended Kalman filter process 614. The Kalman filter process is governed by the following factors:
(1) A "whole value" strapdown algorithm 604 runs in parallel with the Kalman filter. The strapdown algorithm propagates position and attitude from high-rate measurements of acceleration and rotation rate from the IMU 600.
(2) The Kalman filter determines the errors relative to the strapdown navigator.
(3) The Kalman filter state vector is based on the strapdown instrument error model 607.
(4) The instrument errors are fed back to the strapdown navigation model 604 to prevent large drift induced by nonlinear effects.
(5) The Kalman filter uses partial derivatives of the nonlinear system dynamics with respect to the system state to form the propagation 605 and update 611 stages.

[0060] External aiding is accomplished by applying measurement updates to the basic extended Kalman filter. These updates may be made completely asynchronously with respect to the IMU measurements, but their temporal relationship to those measurements must be precisely known.
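One simple way to handle asynchronous aiding with known timing is to merge the time-stamped IMU samples and measurement updates into a single time-ordered stream before filtering. The sketch below assumes each stream is already sorted by timestamp; the function name and the tuple layout are hypothetical and not taken from the patent.

```python
import heapq


def merge_by_time(imu_samples, aiding_updates):
    """Yield (t, kind, data) in time order.

    imu_samples, aiding_updates: iterables of (timestamp, data), each already
    sorted by timestamp. On equal timestamps the IMU sample is emitted first,
    so that the state is propagated before the update is applied.
    """
    imu = ((t, 0, "imu", data) for t, data in imu_samples)
    aid = ((t, 1, "update", data) for t, data in aiding_updates)
    # Order by timestamp, then by stream priority; payloads are never compared.
    for t, _, kind, data in heapq.merge(imu, aid, key=lambda e: (e[0], e[1])):
        yield t, kind, data
```

A filter loop would propagate on each `"imu"` event and apply a measurement update on each `"update"` event, using the timestamp differences to set the propagation interval.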

[0061] The vision processing provides feature measurements 610, expressed as the pixel coordinates of the features within the field of view of the camera. The Kalman filter is provided with a mathematical formulation 609 that relates these pixel measurements to the position and attitude of the platform and to the feature positions. Each feature is modeled as an X-Y-Z triple in the ECEF space of the navigation process. Thus, if ten features are in use at a given time, thirty additional states in total are needed in the Kalman filter 610. The formulation is flexible with respect to the number of features supported by the Kalman filter.
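The bookkeeping implied by this layout (three states per feature, appended after the navigation and instrument error states) can be sketched as follows. The base state count of 15 used in the usage note is a hypothetical placeholder, not a figure from the patent.

```python
def total_states(n_base_states, n_features):
    """State vector length with one X-Y-Z triple appended per feature."""
    return n_base_states + 3 * n_features


def feature_state_slice(n_base_states, feature_index):
    """Slice of the state vector that holds the ECEF position of one feature."""
    start = n_base_states + 3 * feature_index
    return slice(start, start + 3)
```

For example, with an assumed 15 base states, ten features bring the total to 45 states, which is the 30 additional states mentioned in the text.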

[0062] As shown in FIG. 7, a feature 702 is a discrete point detected in an image 700 observed by the digital camera. The digital camera boresight 709 is approximately known with respect to the IMU axis set 701. This relationship is either rigidly fixed, or the camera is moved in a precisely known manner with respect to the IMU axes. The pixel positions in the image frame 700, tracked over time by the vision subsystem, provide motion information that aids the strapdown navigation algorithm. It is important to model the camera image correction process and the geometry of the camera 609 with respect to the IMU axes, and to ensure that the timing of the camera frames is known relative to the IMU samples.

[0063] The most critical aspects of the features are that each feature must be static with respect to its surroundings and, most importantly, that the "feature correspondence" must be accurate. Feature correspondence means that a feature observed in a given frame corresponds to the same physical point as the feature observed in the previous frame. Features are detected automatically on the basis of the characteristics of the scene content, and thousands of features may be considered for processing by the Kalman filter. The feature processes 603 and 606 are run in parallel by the vision processor.

[0064] For the purpose of modeling the feature measurements, each feature is represented exactly by three components of position in the earth-fixed reference frame 704. Thus, for each feature currently being processed by the Kalman filter, there are three unknown time-invariant quantities. These unknown quantities are appended to the state vector formed by the IMU instrument errors.

[0065] Kalman filtering requires propagation 605 of the covariance matrix of the state vector, using methods well known to those skilled in the art. The computational load of this propagation is typically assumed to grow as the cube of the state vector length. To avoid excessive computational load, the Kalman filter includes the ability to maintain an adaptive state vector, in which features are archived and de-archived 613 according to their presence in the field of view (FOV) and the features currently in the state vector 610.
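To give a feel for why the state vector is kept adaptive, the cubic assumption can be turned into a quick relative cost estimate. The sketch below is illustrative only; the base state count passed in is a hypothetical value and the cubic law is the assumption stated in the text, not a measured cost.

```python
def relative_propagation_cost(n_states):
    """Relative covariance propagation cost under the cubic assumption."""
    return n_states ** 3


def cost_ratio(n_base_states, features_a, features_b):
    """Cost of carrying features_a features relative to carrying features_b."""
    a = relative_propagation_cost(n_base_states + 3 * features_a)
    b = relative_propagation_cost(n_base_states + 3 * features_b)
    return a / b
```

With an assumed 15 base states, going from 10 to 20 active features raises the state count from 45 to 75 and the estimated propagation cost by a factor of about 4.6.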

[0066] A feature is "archived" when it is observed to leave the camera FOV. Archiving a feature means storing its current position estimate, the feature-component covariance matrix, the feature-component-to-platform correction matrix, and the feature reference image template. De-archiving restores the feature into the Kalman filter formulation and resets its position and correction properties.

[0067] The relationship between the Y, Z feature position (in pixels) on the imaging sensor plane 700 and the unit vector $u^c$ to the feature in camera axes is given by the following equation.

[0068]

[Equation 4]
By inspection, this can also be written as follows.

[0069]

[Equation 5]
The physical measurement of the feature position is made by interpreting the pattern formed by the grayscale values from the CCD elements. This measurement process includes random frame-to-frame noise associated with the image processing technique. Errors also arise from the physical layout of the CCD, the CCD signal sampling process, and the lens/optical path through which light reaches the CCD array. Nominally, the resulting pixel space represents samples from an ideal rectangular grid. The exact grid dimensions (in pixels) and the approximate pixel spacing in physical units are available from the manufacturer.
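The images of Equations 4 and 5 are not reproduced in this text. As a hedged illustration of the kind of relationship described in [0067], the sketch below uses an ideal pinhole model. It assumes that the camera x-axis lies along the boresight and that the focal length is expressed in the same units as Y and Z; these conventions and the function names are assumptions, not the formulation of the patent.

```python
import math


def unit_vector_from_focal_plane(y, z, focal_length):
    """Unit line-of-sight vector in camera axes for focal-plane position (y, z)."""
    norm = math.sqrt(focal_length ** 2 + y ** 2 + z ** 2)
    return (focal_length / norm, y / norm, z / norm)


def focal_plane_from_unit_vector(u, focal_length):
    """Inverse mapping: focal-plane position (y, z) from a unit vector in camera axes."""
    ux, uy, uz = u
    if ux <= 0.0:
        raise ValueError("feature is not in front of the camera")
    return (focal_length * uy / ux, focal_length * uz / ux)
```

The two functions are inverses of each other for any feature in front of the camera, which mirrors the statement that the relationship can be written in either direction.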

[0070] Empirical models are often used to represent the systematic measurement errors of feature position on the focal plane.

[0071]

[Equation 6]
where
$Y, Z$ = true physical displacement of the feature ray at the focal plane
$Y', Z'$ = measured pixel count of the feature position
$Y_{PIX}, Z_{PIX}$ = number of pixels in the Y and Z dimensions
$\varepsilon_{11}, \varepsilon_{22}$ = errors 706 in the pixel spacing in the two dimensions
$\varepsilon_{12}, \varepsilon_{21}$ = non-rectangular pixel skew terms 707
$K_1, K_2$ = first- and second-order radial distortion terms 708

This describes the basic camera-based measurement for a point feature, expressed in terms of the "error states" estimated by the filter. These error states are:
• error in platform position 705 ($\delta r_p^e$) (3)
• error in feature position 704 ($\delta r_f^e$) (3)
• error in camera-to-IMU alignment 701 ($\Delta\psi$) (3)
• error in ECEF (e-frame) alignment ($\Delta\theta$) (3)

[0072] To complete the extended Kalman filter formulation for the camera update process, the camera measurements (Y, Z) must be linearized with respect to these error states. This linearization is given by the following equation.

[0073]

[Equation 7]
where, from Equations 5 and 6,

[0074]

[Equation 8]
is obtained.

[0075] The expressions for the required partial derivatives of the measurement with respect to the states are given as follows.

[0076]

[Equation 9]
It will be apparent to those skilled in the art that the corresponding partial derivatives for the "Z" feature measurement are identical to the above expressions except that "Y" is replaced by "Z".
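The partial derivatives of Equations 7 to 9 are not reproduced in this text. The sketch below illustrates the linearization idea on a simplified case: an ideal pinhole measurement Y = f·y/x of a feature at position (x, y, z) relative to the camera, in camera axes with x along the boresight. The model, the names, and the numbers are assumptions made for illustration; the analytic gradient is checked against central finite differences.

```python
def measure_y(p, focal_length):
    """Ideal pinhole Y measurement of a point p = (x, y, z) in camera axes."""
    x, y, _ = p
    return focal_length * y / x


def dy_dfeature(p, focal_length):
    """Analytic partial derivatives of Y with respect to the feature position."""
    x, y, _ = p
    return (-focal_length * y / x ** 2, focal_length / x, 0.0)


def numeric_gradient(func, p, step=1e-6):
    """Central finite-difference gradient of a scalar function of a 3-vector."""
    grad = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += step
        lo[i] -= step
        grad.append((func(hi) - func(lo)) / (2.0 * step))
    return tuple(grad)
```

Because the relative position is the feature position minus the platform position, the partial derivatives with respect to the platform position in this simplified model are the negatives of those returned by `dy_dfeature`.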

[0077] These equations represent the complete three-dimensional linearized model of the feature measurement process. Feature updates in the Kalman filter are performed using these partial derivatives, completely asynchronously with any optional GPS pseudorange or delta-pseudorange processing. The extended Kalman filter feature update process uses methods well known to those skilled in the art. The optional GPS and/or feature updates are processed as they become available, and the Kalman filter ensures optimal processing based on the assumed statistics and the fitted models.

[0078] In practice, at least three reference features from the archive of candidate features are kept "active". If necessary, this active feature set is expanded to 10 to 20 features (or more). Active features are maintained in the Kalman filter and have their covariance properties updated, whereas the feature archive holds thousands of candidate features, at various stages of estimation accuracy, that are available for use.

[0079] A feature may also become so well located that its position is no longer updated by the Kalman filter; the feature then becomes a "landmark". When the standard deviation of a feature becomes sufficiently small, further attempts to update the feature can actually introduce numerical instability into the filter computation. In some applications (for example, navigation of a robot over a closed course), the local scene is expected to become "calibrated", with its features located well enough that all of them can be considered landmarks.

[0080] The Kalman filter requires an initial estimate of the feature location at each new feature occurrence. This appears to contradict the premise that no prior information about the feature set is available. The solution is to initialize each feature to lie along the ray defined by the center of its template projected through the camera geometry model, and to assume a range along that ray. Typically, this range is assumed to be 100 m. The processing results are insensitive to this initial guess to within a factor of about ten; that is, the true range may lie anywhere between 10 m and 1000 m. As an alternative, the Kalman filter processing may be iterated, with the processing run for a few seconds at an assumed initial range. By observing the behavior of the mean-square Kalman filter residuals, it can be inferred whether the initial guess was poor. The iteration is carried out by re-initializing with different assumed ranges until acceptable convergence is obtained. For typical urban navigation situations, this capability is not required.
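The initialization rule and the optional range iteration can be sketched as follows. The function names are hypothetical, positions are plain 3-tuples in a common frame, and the residual histories for each trial range are assumed to be supplied by short filter runs that are not shown here.

```python
def initialize_feature(camera_position, los_unit_vector, assumed_range=100.0):
    """Place a new feature on its line-of-sight ray at the assumed range (metres)."""
    return tuple(c + assumed_range * u
                 for c, u in zip(camera_position, los_unit_vector))


def mean_square(residuals):
    """Mean-square value of a sequence of filter residuals."""
    return sum(r * r for r in residuals) / len(residuals)


def best_initial_range(residuals_by_range):
    """Pick the trial range whose short filter run gave the smallest mean-square residual.

    residuals_by_range: dict mapping each trial range to the residuals observed
    when the filter was run for a few seconds from that initial range.
    """
    return min(residuals_by_range,
               key=lambda rng: mean_square(residuals_by_range[rng]))
```

This selects among a fixed set of trial ranges; the text describes repeating the re-initialization until convergence is acceptable, which would wrap a convergence test around the same comparison.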

[0081] According to another embodiment, GPS is treated as an optional measurement for the visual data processing system of the present invention. The GPS measurement update process handles each satellite measurement separately. Typically, eight (or more) satellites are in view, and each satellite provides a code-based range measurement as well as a carrier-phase-based delta-range measurement. Both the range and delta-range measurements contain bias errors associated with unknown parameters of the GPS receiver clock. GPS-based Kalman filter processing therefore includes two measurement update stages for each satellite in view, at a rate of once per second.
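Handling each satellite measurement separately corresponds to sequential scalar measurement updates. The sketch below shows the idea on a deliberately simplified one-dimensional state that is observed directly; a real GPS update would use range and delta-range measurement models and a vector state with clock bias terms, none of which are modeled here.

```python
def scalar_update(x, p, z, r):
    """One scalar Kalman measurement update for a directly observed scalar state.

    x, p: state estimate and its variance; z, r: measurement and its variance.
    """
    k = p / (p + r)  # Kalman gain
    return x + k * (z - x), (1.0 - k) * p


def sequential_updates(x, p, measurements):
    """Apply a list of (z, r) measurements one at a time."""
    for z, r in measurements:
        x, p = scalar_update(x, p, z, r)
    return x, p
```

With eight satellites in view and two measurements per satellite, such a loop would run sixteen scalar updates per one-second epoch.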

[0082] FIG. 8 shows an exemplary set of states, including sensor error model parameters and dynamic platform motion parameters. These states are propagated by the Kalman filter process 614 of FIG. 6 between asynchronous measurements and may be updated at each measurement.

[0083] (Via Arthur: copied from the Background section of the original version) One of the key aspects of the present invention is that the pose of the platform is determined with respect to the surrounding scene without any prior knowledge of the scene and without special pre-surveyed targets placed in the scene. The invention achieves this through a tight integration of inertial navigation, image processing, and photogrammetric processing. Furthermore, because two different navigation modalities are used, the instruments that mechanize each navigation solution can be calibrated as part of the navigation process, so that very low-accuracy instruments can be used.

[0084] The invention described above avoids the limitations of prior art systems by including a second motion sensing modality. The invention requires an independent measurement of the motion associated with the camera pose at each image acquisition time. Integrating the pose information between frames produces a pose time history. This solution is initialized using an approximate pose obtained either from reference features or from some external geodetic information. The independently derived pose time history drifts from the true pose unless it is updated by observing reference features fixed in the surrounding scene. The camera subsystem images its surroundings, and image analysis methods are used to automatically select a set of features suitable as navigation reference points. This automated process uses the properties of each feature within the context of its local image characteristics, together with the spatial diversity of the feature set within the scene.

[0085] The feature set is tracked by the camera/image processing system through successive image frames. Feature tracking is simplified because the position of each feature in the next image is predicted from the independent navigation process and from the assumption that the features are static in the scene (non-static features are automatically detected and discarded). Measurements of the feature locations in the camera field of view (FOV) are compared with the predicted values to provide converging estimates of the locations of the selected features in the surrounding scene. As the platform moves about the scene, the features become progressively better located, and the platform pose is determined with respect to a coordinate system fixed to the scene. Features in the feature set may occasionally be lost from the camera FOV, but their last known characteristics (image reference template and position estimate) are retained in a database. Archived features can be reacquired and used again as reference features without needing to be re-located. As navigation within the scene continues, the feature archive is built up and the navigation becomes increasingly accurate. A good estimate of the camera position in the horizontal plane can be recovered using observations of only a single feature.

[0086] The above process involves redundant information about the platform pose dynamics, obtained from both the image and the motion measurements. This redundancy is most useful in that the additional information can be used to learn about errors in the sensing components. Through mathematical and statistical modeling of the errors of the accelerometers, rate gyros, and digital camera sensor, the respective errors can be estimated in parallel with the estimation of feature locations and platform pose. This self-calibration objective is very important to the invention, because many applications require very small, low-power, low-cost sensing components. These requirements preclude the use of high-precision components. The invention is insensitive to significant accelerometer and gyro errors, misalignments, and the scale, skew, and radial distortion typical of low-precision camera systems. This insensitivity results from modeling these known error phenomena and estimating the required model coefficients simultaneously with the pose information.

[0087] It will be apparent to those skilled in the art that the present invention departs significantly from the prior art and provides a system for automatically obtaining pose information for a platform that includes a motion sensing system and an imaging device. The motion sensing system and the imaging device are configured to cooperate to provide pose information for the platform without prior knowledge of the environment in which the platform navigates.

[0088] The present invention has been described in sufficient detail with a certain degree of particularity. This disclosure of embodiments is made by way of example only, and numerous changes in the arrangement and combination of parts may be made without departing from the spirit and scope of the invention as claimed. Furthermore, the disclosed invention may be implemented in various forms, including as a method, a system, and a computer-readable medium. Accordingly, the scope of the present invention is defined by the claims rather than by the foregoing embodiments.

[Brief description of the drawings]

FIG. 1 shows a system in which the present invention is practiced.

FIG. 2 is a block diagram of a preferred internal structure of a computer system used in the system of FIG. 1.

FIG. 3 shows a 3D representation of an intensity image that includes white areas and black areas.

FIG. 4A shows two exemplary consecutive images received sequentially from the imager.

FIG. 4B shows an exemplary multi-resolution hierarchical feature structure for extracting a feature from one of the images of FIG. 4A.

FIG. 4C shows K image structures from a single image, each image structure corresponding to one feature.

FIG. 4D shows an example of what is referred to herein as a "feature tracking map", or simply a feature map.

FIG. 4E shows a flowchart of the feature extraction process.

FIG. 4F shows the updating of a template during feature tracking within a set of consecutive images.

FIG. 4G shows a flowchart of a process for enforcing the epipolar geometry between the current image frame and the previous image frame.

FIG. 5A shows a flowchart of a process for integrating feature tracking and navigation according to one embodiment of the present invention.

FIG. 5B shows the characteristics of the information derived from the motion sensor.

FIG. 6 shows the overall processing functional diagram of the navigation computation elements of the system.

FIG. 7 shows that features are discrete points detected in an image observed by the digital camera.

FIG. 8 shows an exemplary set of states received from the inertial device and further propagated by the Kalman filter process of FIG. 6.

Continuation of front page
(81) Designated states: EP (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), JP
(72) Inventor: Cain, Jim, 501 Lexington Street, Waltham, Massachusetts 02452, United States of America
(72) Inventor: Yates, Charlie, 924 Central Avenue, Ft. Walton Beach, Florida 32547, United States of America
(72) Inventor: Tene, Arthur, 2226 Coastland Avenue, San Jose, California 95125, United States of America
(72) Inventor: Fejes, Sander, 4859 Cliday Avenue, San Jose, California 95124, United States of America
(72) Inventor: Chen, Jinlong, 4664 Cheney Street, Santa Clara, California 95054, United States of America
(72) Inventor: Jablonski, Mark, 3625A 17th Street, San Francisco, California 94114, United States of America
F-term (reference): 2F105 AA01 BB17 5J062 AA03 BB01 BB02 CC07 5L096 AA02 AA06 BA08 CA02 CA04 DA02 FA67 FA69 GA05 HA04 HA05 JA11

Claims

1. A method comprising: receiving a series of images of a surrounding scene from an imaging device as the platform navigates around the surrounding scene; obtaining platform pose information through an estimation process with respect to features extracted from each of the images. A motion detector and an imaging device operative in a temporal relationship known as a motion detector that supplies motion data with respect to the platform such that each image corresponds to a respective set of motion data. How to get platform pose information including:

2. The method of claim 1, wherein each of the features reflects certain characteristics of an object of the surrounding scene, and further comprising processing at least one of the images to extract the features.

3. The method of claim 2, further comprising the step of continuously tracking features in each of the images following the at least one processed image.

4. The method of claim 3, further comprising determining, in cooperation with the motion detection device, a respective location of each of the features with respect to a first coordinate space.

5. The method of claim 2, wherein each of the features is salient and changes only minimally from one image to the next in the series of images.

6. The method of claim 3, wherein the certain characteristics of the object are used in the tracking of the features such that the tracking is made more efficient.

7. The method of claim 6, wherein certain characteristics of the object include predefined characteristics of the object intentionally placed in the scene.

8. The method of claim 7, wherein the predefined characteristics include at least one of (i) an intersection of linear features, (ii) perpendicularity of linear features, and (iii) a color-based feature.

9. The method of claim 2, wherein the processing of at least one of the images to extract the features comprises detecting each of the features by applying a salient feature operator, wherein the salient feature operator, when applied to the at least one of the images, enhances corner-like regions while suppressing edge-like and uniform regions of the image.

10. The method of claim 9, wherein the salient feature operator is based on a function of a Hessian matrix of a Laplacian operator and operates on a smoothed version of the at least one of the images.
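
The claims do not give the exact function used by the salient feature operator. The sketch below is therefore only one operator with the stated behavior: the absolute determinant of the Hessian of a smoothed image, which is large at corners and close to zero on straight edges and in uniform regions. The 3x3 box smoothing (a stand-in for Gaussian smoothing), the finite-difference stencils, and both function names are assumptions for this example.

```python
def smooth(img):
    """3x3 box smoothing with edge replication (assumed stand-in for a Gaussian)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def hessian_response(img):
    """Return |det(Hessian)| of the smoothed image; the one-pixel border stays 0."""
    s = smooth(img)
    h, w = len(s), len(s[0])
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Second derivatives by central finite differences.
            ixx = s[y][x + 1] - 2.0 * s[y][x] + s[y][x - 1]
            iyy = s[y + 1][x] - 2.0 * s[y][x] + s[y - 1][x]
            ixy = (s[y + 1][x + 1] - s[y + 1][x - 1]
                   - s[y - 1][x + 1] + s[y - 1][x - 1]) / 4.0
            resp[y][x] = abs(ixx * iyy - ixy * ixy)
    return resp
```

On a straight edge the intensity varies in one direction only, so one second derivative and the mixed derivative vanish and the determinant is zero; this is why the measure suppresses edges as well as uniform regions.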

11. The method of claim 9, wherein the continuous tracking of the features in each of the images comprises detecting the features in each of the images following the at least one processed image along epipolar lines associated with the features extracted from the at least one processed image.
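
A minimal sketch of the epipolar search in claim 11 follows. It assumes that a 3x3 fundamental matrix F relating the two images is already available (for example, derived from the motion data and the imaging model); how F is obtained is not shown. All function names, the distance threshold, and the example matrix are hypothetical.

```python
def epipolar_line(F, point):
    """Return (a, b, c) of the line a*u + b*v + c = 0 in the second image."""
    x = (point[0], point[1], 1.0)
    return tuple(sum(F[r][k] * x[k] for k in range(3)) for r in range(3))


def distance_to_line(line, point):
    """Perpendicular pixel distance from a point to the line (a, b, c)."""
    a, b, c = line
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * point[0] + b * point[1] + c) / norm


def candidates_near_line(F, feature, detections, max_dist=1.5):
    """Keep only detections lying within max_dist pixels of the epipolar line."""
    line = epipolar_line(F, feature)
    return [p for p in detections if distance_to_line(line, p) <= max_dist]


# Assumed example: pure sideways camera translation with identity intrinsics,
# for which every epipolar line is the image row of the original feature.
F_SIDEWAYS = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
```

Restricting the search to a band around the line turns a two-dimensional search into an essentially one-dimensional one.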

12. The method of claim 9, wherein the continuous tracking of the features in each of the images comprises estimating, by using an estimation process, the location of each of the features in each of the images following the at least one processed image, such that the search area for each of the features in the successive images is sufficiently narrowed.

13. The method of claim 12, wherein the estimating process is based on a Kalman filter.
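
Claims 12 and 13 can be illustrated with the prediction step of a Kalman filter. The sketch below assumes a constant-velocity model for a single image coordinate and derives a search interval from the predicted variance; the state layout, the process noise parameter q, and the three-sigma window are assumptions for this example and are not taken from the claims.

```python
def predict_search_window(x, v, P, dt, q, n_sigma=3.0):
    """Kalman prediction for one image coordinate under constant velocity.

    x, v: current position and velocity estimates (pixels, pixels per second).
    P: 2x2 covariance [[pxx, pxv], [pxv, pvv]].
    q: process noise added to the velocity variance (hypothetical tuning value).
    Returns the predicted position, predicted covariance, and search interval.
    """
    x_pred = x + v * dt
    # P_pred = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = diag(0, q).
    pxx = P[0][0] + 2.0 * dt * P[0][1] + dt * dt * P[1][1]
    pxv = P[0][1] + dt * P[1][1]
    pvv = P[1][1] + q
    half_width = n_sigma * pxx ** 0.5
    return x_pred, [[pxx, pxv], [pxv, pvv]], (x_pred - half_width, x_pred + half_width)
```

The feature is then searched for only inside the returned interval; a well-converged filter has a small predicted variance and therefore a narrow window.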

14. The method of claim 1, further comprising calculating a set of parameters from the motion data using a strapdown navigation process.

15. The method according to claim 14, wherein the estimation process is based on a Kalman filter.

16. The method of claim 15, wherein the set of parameters includes at least one of (i) a location of the platform, (ii) a velocity of the platform, and (iii) attitude information of the platform.
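
The strapdown navigation process of claims 14 to 16 integrates gyro and accelerometer data into location, velocity, and attitude. The following is a deliberately simplified planar sketch: one yaw gyro, two body-frame accelerometer axes, no gravity compensation, and no sensor bias terms. These simplifications and the function name are assumptions for this example; a full strapdown mechanization works in three dimensions.

```python
import math


def strapdown_step(state, gyro_z, accel_body, dt):
    """One planar strapdown update.

    state: (x, y, vx, vy, heading) in a fixed navigation frame.
    gyro_z: yaw rate in rad/s; accel_body: (ax, ay) in the body frame in m/s^2.
    """
    x, y, vx, vy, heading = state
    heading += gyro_z * dt                        # attitude from the gyro
    c, s = math.cos(heading), math.sin(heading)
    ax = c * accel_body[0] - s * accel_body[1]    # rotate acceleration into nav frame
    ay = s * accel_body[0] + c * accel_body[1]
    x += vx * dt + 0.5 * ax * dt * dt             # location
    y += vy * dt + 0.5 * ay * dt * dt
    vx += ax * dt                                 # velocity
    vy += ay * dt
    return (x, y, vx, vy, heading)
```

Because each step integrates noisy measurements, the propagated parameters drift over time, which is consistent with claim 17 stating that they need not be exact.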

17. The method of claim 16, wherein each of the parameters need not be exact.

18. The method of claim 17, further comprising updating the set of parameters with input from an external positioning system when the external positioning system becomes available.

19. The method of claim 18, wherein the external positioning system is a global positioning system (GPS).
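
The update recited in claims 18 and 19 can be sketched as a scalar Kalman measurement update applied to one position coordinate whenever a GPS fix is present. The function name, the use of None to signal that no fix is available, and the variance arguments are assumptions for this example.

```python
def fuse_position(est, est_var, gps, gps_var):
    """Blend a propagated position with a GPS position, if one is available.

    Pass gps=None when the external positioning system is unavailable; the
    propagated estimate is then returned unchanged.
    """
    if gps is None:
        return est, est_var
    gain = est_var / (est_var + gps_var)          # scalar Kalman gain
    return est + gain * (gps - est), (1.0 - gain) * est_var
```

With equal variances the result is the midpoint of the two values and the variance is halved.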

20. The method of claim 19, wherein the parameters further include features from the image.

21. The method of claim 20, wherein the step of obtaining the platform pose information further comprises obtaining an imaging model of the imaging device and providing the parameters to the estimation process for estimating the platform pose information, wherein the imaging model reflects how the features are transformed from the objects in the surrounding scene into the images.

22. The method of claim 21, wherein the step of obtaining an imaging model further comprises updating the imaging model in response to error data from the estimation process such that the imaging model is constantly updated.
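
The claims do not specify the form of the imaging model. As a sketch only, the example below assumes an ideal pinhole model with a focal length and principal point in pixels, and shows the model being replaced by a corrected copy when error data arrives, in the spirit of claim 22. The class name, its fields, and the additive form of the correction are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinholeModel:
    focal: float   # focal length in pixels (assumed parameterization)
    cx: float      # principal point, horizontal
    cy: float      # principal point, vertical

    def project(self, point_cam):
        """Map a 3-D point in camera coordinates to pixel coordinates."""
        x, y, z = point_cam
        if z <= 0:
            raise ValueError("point must be in front of the camera")
        return (self.focal * x / z + self.cx, self.focal * y / z + self.cy)

    def corrected(self, d_focal=0.0, d_cx=0.0, d_cy=0.0):
        """Return a new model with additive corrections from the estimator applied."""
        return PinholeModel(self.focal + d_focal, self.cx + d_cx, self.cy + d_cy)
```

For example, PinholeModel(800.0, 320.0, 240.0).project((1.0, 2.0, 4.0)) gives the pixel (520.0, 640.0).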

23. The method of claim 1, wherein the platform is selected from the group consisting of a vehicle, an aircraft, a boat, a human operator, and a missile, each member of the group being provided with a motion detection device integrated with the imaging device.

24. A method for obtaining pose information of a platform, the method comprising: forming a series of images of the surroundings from an imaging device as the platform navigates; processing at least one of the images to extract features that reflect certain characteristics of surrounding objects; successively tracking, based on the extracted features, each of the features in the images following the at least one processed image; obtaining a set of parameters from motion data through a strapdown navigation process; and obtaining the pose information of the platform by using an estimation process, wherein the imaging device operates in a known temporal relationship with a motion detection device that supplies the motion data about the platform, such that each image corresponds to a respective set of the motion data, and wherein the estimation process is coupled with the strapdown navigation process and receives the parameters, such that the pose information of the platform is statistically estimated by the estimation process.

25. The method of claim 24, wherein certain properties are included in the processing so that the processing is performed more efficiently.

26. The method of claim 24, wherein said processing of at least one of the images to extract features comprises detecting each of the features by applying a feature operator.

27. The method of claim 26, wherein the feature operator is a salient feature operator that, when applied to the at least one of the images, enhances corner-like regions while suppressing edge-like and uniform regions of the image.

28. The method of claim 25, wherein the successive tracking of the features in each image comprises estimating, by using an estimation process, a location of each of the features in each of the images following the at least one processed image, such that a search area for each of the features in the successive images is sufficiently narrowed.

29. The method of claim 28, wherein the estimating process is based on a Kalman filter.

30. The method of claim 24, further comprising: maintaining a feature list including the extracted features; and updating the feature list each time one of the extracted features disappears from one of the series of images, such that the tracking is performed continuously.

31. The method of claim 30, wherein the updating of the feature list further comprises processing one of the subsequent images to extract new features to be inserted into the feature list, such that the number of features in the feature list is kept constant.
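
The feature list maintenance of claims 30 and 31 can be sketched as follows. Features that are no longer visible are dropped, and new ones are requested so that the list returns to a fixed size. The dictionary layout, the extract_new callback, and the parameter names are assumptions for this example.

```python
def update_feature_list(features, visible_ids, extract_new, target_count):
    """Drop features that disappeared and top the list back up to target_count.

    features: list of dicts, each with an 'id' key (hypothetical layout).
    visible_ids: set of ids still found in the current image.
    extract_new: callable taking a count n and returning up to n new features
                 extracted from the current image.
    """
    kept = [f for f in features if f["id"] in visible_ids]
    shortfall = target_count - len(kept)
    if shortfall > 0:
        kept.extend(extract_new(shortfall))
    return kept
```

If the current image yields fewer new features than requested, the list is temporarily shorter and is topped up again on a later image.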

32. The method of claim 31, further comprising: determining position information for each feature with respect to a coordinate space in which the motion detection device operates; providing the motion data to the estimation process along with the position information to estimate an error in an imaging model; and receiving error data from the estimation process and updating the imaging model to have the least error, wherein the imaging model indicates a mapping relationship from the objects to the features in the images.

33. The method of claim 32, further comprising the step of receiving error data from the estimation process and updating the position information of each feature such that the position information has a minimum error.

34. The method of claim 33, wherein the motion data includes at least one of (i) rotation data and (ii) translation data for the platform.

35. The method of claim 24, wherein the motion data need not be accurate, and the pose information is obtained by the estimation process operating on the motion data together with the extracted features.

36. The method of claim 35, wherein the motion detection device is a global positioning system (GPS) sensing device and provides pseudoranges and pseudorange rates from the imaging device to GPS satellites.

37. The method of claim 35, wherein the motion sensing device is an inertial measurement unit (IMU) and the sensors included are at least one gyro and one accelerometer that provide rotation and translation data, respectively.

38. A system for obtaining pose information of a platform, the system comprising: a motion detection device integrated with the platform and providing motion data about the platform; an imaging device generating a series of images of a scene and operating in a known temporal relationship with the motion detection device, such that each image corresponds to a respective set of the motion data; and a computing system coupled to the motion detection device and the imaging device and receiving the motion data and the images, the computing system comprising a processor and a memory space for storing code for an application module, wherein the code, when executed by the processor, causes the application module to perform: processing at least one of the images to extract features that reflect certain characteristics of surrounding objects; successively tracking, based on the extracted features, each of the features in the images following the at least one processed image; obtaining a set of parameters from the motion data through a strapdown navigation process; and obtaining the pose information of the platform by using an estimation process that takes the features as part of its input, wherein the estimation process is coupled with the strapdown navigation process and receives the parameters, such that the pose information of the platform is statistically estimated by the estimation process without prior knowledge of the scene.

39. The system of claim 38, wherein each of the features is extracted by applying a feature operator according to the certain characteristics.

40. The system of claim 39, wherein each of the features is a salient feature and the feature operator is a salient feature operator that, when applied to the at least one of the images, enhances corner-like regions while suppressing edge-like and uniform regions of the image.

41. The system of claim 40, wherein the successive tracking of the features in each image comprises estimating, by using an estimation process, a location of each of the features in each of the images following the at least one processed image, such that a search area for each of the features in the successive images is sufficiently narrowed.

42. The system according to claim 41, wherein the estimation processing is based on a Kalman filter.

43. The system of claim 38, wherein the application module further performs: maintaining a feature list including the extracted features; and updating the feature list each time one of the extracted features disappears from one of the series of images, such that the tracking is continuous.

44. The system of claim 43, wherein the updating of the feature list further comprises processing one of the subsequent images to extract new features to be inserted into the feature list, such that the number of features in the feature list is kept constant.

45. The system of claim 38, wherein the application module further performs: determining position information for each feature with respect to a coordinate space in which the motion detection device operates; providing the motion data to the estimation process along with the position information to estimate an error in an imaging model; and updating the imaging model in response to error data from the estimation process to have the least error, wherein the imaging model indicates a mapping relationship from the objects to the features in the images.

46. The system of claim 45, wherein the application module further comprises: receiving the error data from the estimation process and updating the position information for each feature such that the position information has the least error.

47. The system of claim 46, wherein the motion data includes at least one of (i) rotation data and (ii) translation data for the platform.

48. The system of claim 47, wherein the motion data need not be accurate, and the pose information is obtained by the estimation process operating on the motion data together with the extracted features.