JP2018022247A

JP2018022247A - Information processing apparatus and control method thereof

Info

Publication number: JP2018022247A
Application number: JP2016151519A
Authority: JP
Inventors: Toshihiro Honda; Akihiro Katayama; Daisuke Kotake; Hisayoshi Furihata
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-08-01
Filing date: 2016-08-01
Publication date: 2018-02-08
Anticipated expiration: 2036-08-01
Also published as: JP6817742B2

Abstract

PROBLEM TO BE SOLVED: To calculate a position and orientation stably, with high accuracy and robustness.
SOLUTION: An information processing apparatus holds a three-dimensional map in which three-dimensional information of features existing in an image and position and orientation information of an imaging unit at the time the image was captured are recorded; inputs an image captured by the imaging unit; determines consistency between the input image and the features recorded in the three-dimensional map; calculates an evaluation value indicating a result of evaluating the existence state of the features associated between the three-dimensional map and the input image; calculates, on the basis of the consistency and the evaluation value, a weight for controlling the influence of the features in the three-dimensional map on the calculation of the position and orientation of the imaging unit; and calculates the position and orientation of the imaging unit at the time of capturing the input image by use of the three-dimensional map and the weight.
SELECTED DRAWING: Figure 1A

Description

The present invention relates to an information processing apparatus that measures the position and orientation of an imaging apparatus, and to a control method thereof.

Techniques are known for calculating the position and orientation of an imaging apparatus by matching an image against a three-dimensional map of the real space. Such techniques are used for self-localization of robots and automobiles, and for aligning the real space with virtual objects in augmented and mixed reality.

Patent Document 1 discloses a method for calculating the position and orientation of a camera, in a scene where a moving object exists in the real space, based on images input by the camera capturing the scene. In this method, it is determined whether each feature point on the three-dimensional map is a point on a moving object, and the position and orientation of the camera are calculated using the feature points that remain after those on the moving object are excluded.

JP 2012-221042 A

G. Klein and D. Murray, “Parallel tracking and mapping for small AR workspaces,” International Symposium on Mixed and Augmented Reality, pp. 225-234, 2007.
Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330-1334, 2000.
H. Kato and M. Billinghurst, “Marker tracking and HMD calibration for a video-based augmented reality conferencing system,” International Workshop on Augmented Reality, 1999.
R. Y. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses,” IEEE Journal of Robotics and Automation, vol. 3, no. 4, pp. 323-344, 1987.
R. I. Hartley, “Self-calibration from multiple views with a rotating camera,” European Conference on Computer Vision, pp. 471-478, 1994.
C. Pirchheim, D. Schmalstieg, G. Reitmayr, “Handling pure camera rotation in keyframe-based SLAM,” International Symposium on Mixed and Augmented Reality, pp. 229-238, 2013.
J. Engel, T. Schöps, D. Cremers, “LSD-SLAM: Large-Scale Direct Monocular SLAM,” European Conference on Computer Vision, pp. 834-849, 2014.

However, in Patent Document 1, a feature point on the three-dimensional map is excluded once it is determined to be a point on a moving object. Consequently, the accuracy and robustness of the camera position and orientation decrease when the number of feature points used for the position and orientation calculation is small or when the feature distribution is biased.

Accordingly, an object of the present invention is to make it possible to calculate a position and orientation stably, with high accuracy and high robustness.

In order to achieve the above object, an information processing apparatus according to one aspect of the present invention has the following configuration. That is, the apparatus comprises:
holding means for holding a three-dimensional map in which three-dimensional information of features existing in an image and position and orientation information of an imaging unit at the time the image was captured are recorded;
input means for inputting an image captured by the imaging unit;
determining means for determining, for the features recorded in the three-dimensional map, consistency with the input image;
evaluation means for calculating an evaluation value indicating a result of evaluating the existence state of the features associated between the three-dimensional map and the input image;
weight calculation means for calculating, based on the consistency and the evaluation value, a weight for controlling the influence that the features of the three-dimensional map have on the calculation of the position and orientation of the imaging unit; and
calculating means for calculating, using the three-dimensional map and the weight, the position and orientation of the imaging unit at the time the input image was captured.

According to the present invention, the position and orientation can be calculated stably, with high accuracy and high robustness.

A block diagram showing an example of the functional configuration of the information processing apparatus in the first embodiment.
A block diagram showing an example of the hardware configuration of the information processing apparatus.
A flowchart showing the processing procedure in the first embodiment.
A diagram showing an example of the list structure held for features on the three-dimensional map.
A flowchart showing the processing procedure for consistency determination in the first embodiment.
A block diagram showing an example of the functional configuration of the information processing apparatus in the fourth embodiment.
A flowchart showing the processing procedure in the fourth embodiment.
A graph showing how the relationship between consistency and weight changes with the level of the evaluation value.

Embodiments of the present invention are described below with reference to the accompanying drawings.

<First Embodiment>
The following describes an information processing apparatus that measures the position and orientation of an imaging unit in the real space based on a three-dimensional map holding the three-dimensional coordinates of feature points existing in the real space. The information processing apparatus according to the present embodiment can calculate the position and orientation with high accuracy and high robustness even when the number of feature points used for the position and orientation calculation is small and a moving object exists in the real space.

In the first embodiment, the image input from the imaging unit is matched against the three-dimensional map to calculate the position and orientation of the imaging unit. In doing so, the magnitude of the influence that a feature point on a moving object has on the position and orientation calculation (hereinafter referred to as the weight) is reduced. However, the larger the number of feature points used for the position and orientation calculation, the more the weight is confined to feature points that are unlikely to be on a moving object; the smaller the number, the more weight is also given to feature points that are likely to be on a moving object. Adjusting the weights according to the number of feature points used for the position and orientation calculation in this way allows the position and orientation to be calculated with high accuracy and high robustness.

FIG. 1A is a block diagram showing an example of the functional configuration of the information processing apparatus 1 according to the first embodiment. The information processing apparatus 1 includes, as functional units, a three-dimensional information holding unit 110, an image input unit 120, a consistency determination unit 130, a feature evaluation unit 140, a weight calculation unit 150, and a position and orientation calculation unit 160. The three-dimensional information holding unit 110 holds the three-dimensional map used by the consistency determination unit 130 and the position and orientation calculation unit 160. The image input unit 120 is connected to an imaging unit 180 and inputs images captured by the imaging unit 180 to the consistency determination unit 130 and the position and orientation calculation unit 160. In the first embodiment, a single color camera is used as the imaging unit 180. An initialization unit 170 sets various kinds of initial information when the information processing apparatus 1 starts calculating the position and orientation of the imaging unit 180.

The consistency determination unit 130 determines the consistency of the feature point group used for the position and orientation calculation, based on the three-dimensional map held by the three-dimensional information holding unit 110, the image input by the image input unit 120, and the position and orientation calculated by the position and orientation calculation unit 160. Here, the consistency is a value representing the likelihood that a feature point on the three-dimensional map is not on a moving object. Details of the consistency are described later. The feature evaluation unit 140 calculates an evaluation value relating to the distribution of the feature point group that is extracted by the consistency determination unit 130 and used for the position and orientation calculation. The weight calculation unit 150 calculates the weight of each feature point in that feature point group, based on the consistency determined by the consistency determination unit 130 and the evaluation value obtained by the feature evaluation unit 140. The position and orientation calculation unit 160 calculates the position and orientation of the imaging unit 180 in the world coordinate system, based on the three-dimensional map held by the three-dimensional information holding unit 110, the image input by the image input unit 120, and the weights calculated by the weight calculation unit 150.

FIG. 1B is a block diagram showing an example of the hardware configuration of the information processing apparatus 1 according to the present embodiment. The CPU 10 realizes each of the functional units described above by executing programs stored in the ROM 11 or the RAM 12. The ROM 11 stores programs executed by the CPU 10 and various data. The RAM 12 provides a work area used when the CPU 10 executes various processes. Programs stored in the external storage device 13 are loaded into the RAM 12 and executed by the CPU 10. The external storage device 13 consists of a hard disk, flash memory, or the like, and holds various kinds of information. The keyboard 14 and the pointing device 15 are instruction input units with which a user gives various instructions to the CPU 10. The interface 16 is connected to external devices and enables communication between the information processing apparatus 1 and those devices. In the present embodiment, the imaging unit 180 is connected to the interface 16, and images captured by the imaging unit 180 are input to the information processing apparatus 1 via the interface 16. The bus 17 connects the units described above so that they can communicate with one another.

Although the functional units shown in FIG. 1A have been described as being realized by the CPU 10 executing software, the configuration is not limited to this. At least some of the functional units shown in FIG. 1A may be realized by dedicated hardware. Each function may be realized by a single CPU (processor) or by a plurality of CPUs (processors).

Next, the processing procedure for the position and orientation calculation in the first embodiment is described. FIG. 2 is a flowchart showing the processing procedure in the first embodiment.

In step S101, the initialization unit 170 reads an initial three-dimensional map and stores it in the three-dimensional information holding unit 110. The initialization unit 170 also reads the intrinsic parameters of the imaging unit 180 and calculates the initial position and initial orientation of the imaging unit 180. In the present embodiment, the three-dimensional map is held as a set of feature points on key frame images. Here, a key frame image is an image to which the position and orientation of the imaging unit 180 in the world coordinate system at the time of capture are attached as attributes.

The feature point group of the three-dimensional map has, for each key frame image, a list structure such as that shown in FIG. 3. FIG. 3 shows an example of the list structure for M feature points. The three-dimensional map records three-dimensional information of features existing in an image and the position and orientation information of the imaging unit at the time the image was captured. The feature point ID in FIG. 3 is a number that uniquely identifies each feature point within one key frame image. The luminance value is the luminance value of each feature point obtained from the key frame image. The image coordinates are the two-dimensional coordinates of each feature point in the key frame image coordinate system. The depth value is the depth of each feature point with respect to the imaging unit coordinate system. The feature amount is the feature amount extracted at each feature point. The three-dimensional coordinates of a feature point in the world coordinate system can be calculated from its image coordinates and depth value together with the position and orientation attached to the key frame image. Each three-dimensional map holds, as attributes, a key frame ID identifying the key frame from which the information was extracted and the position and orientation (in the world coordinate system) of the imaging unit 180 at the time that key frame was captured.
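
The per-key-frame list structure and the recovery of world coordinates from image coordinates, depth, and key frame pose can be sketched as follows. This is a minimal illustration, not the apparatus's actual data layout: the class and field names, the distortion-free pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the camera-to-world pose convention are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]


@dataclass
class FeaturePoint:
    feature_id: int                # unique within one key frame image
    luminance: float               # luminance value taken from the key frame image
    image_xy: Tuple[float, float]  # 2D coordinates in the key frame image
    depth: float                   # depth in the imaging unit coordinate system
    descriptor: List[float]        # feature amount extracted at the point


@dataclass
class KeyFrame:
    keyframe_id: int
    rotation: Mat3       # assumed convention: camera-to-world rotation
    translation: Vec3    # assumed convention: camera position in world coordinates
    features: List[FeaturePoint]


@dataclass
class Intrinsics:
    # Assumed pinhole model without lens distortion.
    fx: float
    fy: float
    cx: float
    cy: float


def to_world(fp: FeaturePoint, kf: KeyFrame, k: Intrinsics) -> Vec3:
    """Back-project a feature point to world coordinates."""
    u, v = fp.image_xy
    # Pixel plus depth -> point in the imaging unit (camera) coordinate system.
    cam = ((u - k.cx) / k.fx * fp.depth, (v - k.cy) / k.fy * fp.depth, fp.depth)
    # Camera coordinates -> world coordinates using the key frame pose.
    return tuple(
        sum(kf.rotation[r][c] * cam[c] for c in range(3)) + kf.translation[r]
        for r in range(3)
    )
```

Storing depth per feature rather than a world point keeps each entry tied to its key frame, so a later correction of the key frame pose would move all of its feature points consistently.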

The initial three-dimensional map can be created in advance by, for example, the method of Klein et al. (Non-Patent Document 1), which performs three-dimensional map creation and measurement of the position and orientation of the imaging unit 180 simultaneously. The intrinsic parameters of the imaging unit 180 are calibrated in advance by, for example, Zhang's method (Non-Patent Document 2), which uses images of a planar pattern captured from multiple viewpoints. The initial position and initial orientation of the imaging unit 180 are calculated by, for example, the method of Kato et al. (Non-Patent Document 3), which uses an artificial marker of known size. When the position and orientation of the imaging unit 180 are calculated in step S106, described later, a new three-dimensional map is generated with the input image in question as a key frame and is additionally held. The additionally held three-dimensional map can be used in subsequent position and orientation calculations. The three-dimensional map may be saved every time the position and orientation are calculated in step S106, or once every n calculations (n > 1). In addition, when the calculated position and orientation are similar to those of a three-dimensional map that has already been saved, the new map need not be saved.

Next, in step S102, the image input unit 120 inputs an image captured by the imaging unit 180 to the information processing apparatus 1. In step S103, the consistency determination unit 130 determines the consistency of the feature points used for the position and orientation calculation, based on the three-dimensional map held by the three-dimensional information holding unit 110, the image input in step S102, and the position and orientation of the imaging unit 180 in the immediately preceding frame. In the present embodiment, the consistency is a value that is set low when a feature point on the three-dimensional map is a point on a moving object, or a point that is occluded by a moving object and therefore not correctly observed in the input image. A more specific description follows.

FIG. 4 is a flowchart showing the procedure of the consistency determination process in step S103. In the present embodiment, for features that correspond between the three-dimensional map and the input image, the consistency determination unit 130 obtains the difference between the three-dimensional map and the input image and determines the consistency based on that difference. In step S111, the consistency determination unit 130 first takes the position and orientation calculated for the previous frame as the predicted position and orientation of the current frame, and selects from the three-dimensional map the key frame image whose attributes hold the position and orientation closest to this prediction. It then extracts features from the input image and associates the feature point group on the key frame image with the feature point group on the input image by comparing feature amounts. Hereinafter, a feature point on the key frame image that has been associated with a feature point on the input image is called a selected feature point, and the set of all selected feature points in one key frame image is called a selected feature point group. When no previous frame exists, the initial position and initial orientation are used. For the feature amount comparison described above, a template matching technique, for example, can also be used.

In step S112, the consistency determination unit 130 selects, from the selected feature point group, a selected feature point whose consistency has not yet been determined. In step S113, the consistency determination unit 130 projects the selected feature point onto the input image based on the predicted position and orientation, and thereby predicts its coordinates (position p) in the input image coordinate system. In step S114, the consistency determination unit 130 determines the consistency based on the absolute value of the difference between the pixel value (luminance value) of the input image at the predicted coordinates (position p) and the luminance value of the selected feature point recorded in the three-dimensional map. The larger the absolute luminance difference, the more likely it is judged that the selected feature point is a point on a moving object, or a point that is occluded by a moving object and not correctly observed in the input image, and the lower the consistency is set. This is because, when a moving object moves, the pixel value at the place (background) where the feature point existed changes greatly. The present embodiment uses the absolute value of the difference obtained by comparing the pixel value (luminance value) at the projected position of the feature point in the input image with the attribute value (luminance value) of the feature point in the three-dimensional map, but the comparison is not limited to this. For example, the difference obtained by comparing the average pixel value of the pixels within a predetermined range around the projected position of the feature point in the input image with the attribute value of the feature point in the three-dimensional map may be used.
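
Steps S113 and S114 can be illustrated with a short sketch that projects a world point into the input image and takes the absolute luminance difference at the projected position. The function names, the world-to-camera pose convention, the distortion-free pinhole projection, and the nearest-pixel sampling are assumptions made for illustration; the text does not specify them.

```python
def project(point_world, rotation_wc, translation_wc, fx, fy, cx, cy):
    """Project a world point to image coordinates (position p).

    rotation_wc and translation_wc are an assumed world-to-camera pose,
    here playing the role of the predicted position and orientation.
    Returns None when the point lies behind the camera.
    """
    pc = [
        sum(rotation_wc[r][c] * point_world[c] for c in range(3)) + translation_wc[r]
        for r in range(3)
    ]
    if pc[2] <= 0:
        return None
    return (fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy)


def luminance_difference(image, p, stored_luminance):
    """Absolute difference s between the image luminance at p and the map value.

    image is a list of rows of luminance values. Nearest-pixel sampling is an
    assumption. Returns None when p is missing or falls outside the image.
    """
    if p is None:
        return None
    col, row = int(round(p[0])), int(round(p[1]))
    if not (0 <= row < len(image) and 0 <= col < len(image[0])):
        return None
    return abs(image[row][col] - stored_luminance)
```

Replacing the single-pixel lookup with a mean over a small window around p would give the averaged variant mentioned at the end of the paragraph.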

Specifically, letting s be the absolute value of the luminance difference, the consistency c is determined as in Equation 1.

[Equation 1]

Here, s_th is the threshold of the absolute luminance difference at which the consistency becomes 0.
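
The text fixes only one property of Equation 1: the consistency c reaches 0 when the absolute luminance difference s reaches s_th. The sketch below assumes a linear falloff from c = 1 at s = 0. That form and the function name `consistency` are assumptions for illustration and may differ from the actual Equation 1.

```python
def consistency(s: float, s_th: float) -> float:
    """Map an absolute luminance difference s to a consistency in [0, 1].

    Assumed form: linear falloff, clamped to 0 for s >= s_th.
    """
    if s_th <= 0:
        raise ValueError("s_th must be positive")
    if s < 0:
        raise ValueError("s is an absolute difference and cannot be negative")
    return max(0.0, 1.0 - s / s_th)
```

Any monotonically decreasing function that hits 0 at s_th would satisfy the stated property equally well; the linear form is chosen here only because it is the simplest.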

In step S115, the consistency determination unit 130 determines whether the consistency has been determined for all selected feature points. If so, the consistency determination process of step S103 ends. If any selected feature point remains unprocessed, the processing from step S112 is repeated.

Returning to FIG. 2, in step S104, the feature evaluation unit 140 evaluates the existence state of the selected feature points obtained in step S103 and obtains an evaluation value. Hereinafter, the evaluation of the existence state of feature points performed by the feature evaluation unit 140 in step S104 is also referred to simply as feature point evaluation. In the first embodiment, the feature evaluation unit 140 calculates the evaluation value based on the number of selected feature points. The larger the number of selected feature points, the higher the feature evaluation unit 140 sets the evaluation value, so that only feature points with high consistency are used for the position and orientation calculation; the smaller the number, the lower it sets the evaluation value, so that feature points with low consistency are also used. In the first embodiment, letting N_A be the number of selected feature points, the evaluation value E is calculated as in Equation 2.

[Equation 2]

In step S105, the weight calculation unit 150 calculates a weight (w) based on the consistency (c) determined in step S103 and the evaluation value (E) calculated in step S104. The weight w is calculated individually for each selected feature point. The higher the evaluation value E, the more the weight is confined to feature points with high consistency; the lower the evaluation value, the more weight is also given to feature points with low consistency. Specifically, letting E be the evaluation value and c the consistency, the weight w is calculated as in Equations 3 and 4.
When E ≥ E_th,

[Equation 3]

When E < E_th,

[Equation 4]

Here, E_th is the threshold for judging whether the evaluation value is high or low, and c_th1 and c_th2 are the consistency thresholds at which the weight becomes 0 (c_th1 > c_th2).
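
Equations 3 and 4 are characterized here only by their thresholds: when E ≥ E_th the weight becomes 0 at the stricter consistency threshold c_th1, and when E < E_th it becomes 0 at the more permissive threshold c_th2. The sketch below assumes a linear ramp from 0 at the active threshold to 1 at c = 1. The ramp shape and the function name `weight` are assumptions for illustration, not the equations themselves.

```python
def weight(c: float, e: float, e_th: float, c_th1: float, c_th2: float) -> float:
    """Weight of one selected feature point from consistency c and evaluation value e.

    Assumed form: w = 0 for c <= c_th, rising linearly to 1 at c = 1,
    where c_th = c_th1 if e >= e_th, else c_th2.
    """
    if not (c_th2 < c_th1 < 1.0):
        raise ValueError("expected c_th2 < c_th1 < 1")
    # A high evaluation value selects the stricter threshold.
    c_th = c_th1 if e >= e_th else c_th2
    if c <= c_th:
        return 0.0
    return (c - c_th) / (1.0 - c_th)
```

With c_th1 = 0.6 and c_th2 = 0.2, for example, a feature point with c = 0.5 is ignored when many feature points are available (high E) but still contributes when few are available (low E), which is the behavior the paragraph describes.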

In step S106, the position and orientation calculation unit 160 calculates the position and orientation of the imaging unit 180 in the world coordinate system, based on the three-dimensional map held by the three-dimensional information holding unit 110, the image input in step S102, and the weights calculated in step S105. The position and orientation are calculated based on the method of Klein et al. (Non-Patent Document 1) used in step S101, taking into account the weights calculated in step S105. Specifically, the position and orientation that minimize S in Equation 5 are calculated.

S is obtained by multiplying, for each selected feature point, the squared Euclidean distance between the two-dimensional coordinates m_i' of the selected feature point in the input image coordinate system and the two-dimensional coordinates m_i of the feature point on the input image associated with m_i' by feature amount comparison, by the weight w_i, and summing the results over all selected feature points; that is, $S = \sum_i w_i \lVert m_i' - m_i \rVert^2$.
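
The quantity S described above can be evaluated directly. The sketch below only computes S for given coordinates and weights; the minimization over position and orientation is not shown. The function and parameter names are illustrative assumptions.

```python
def weighted_reprojection_cost(projected, observed, weights):
    """Sum over feature points of w_i * ||m_i' - m_i||^2.

    projected: 2D coordinates m_i' of the selected feature points in the input image
    observed:  2D coordinates m_i of the associated feature points on the input image
    weights:   per-point weights w_i
    """
    if not (len(projected) == len(observed) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for (u_p, v_p), (u_o, v_o), w in zip(projected, observed, weights):
        total += w * ((u_p - u_o) ** 2 + (v_p - v_o) ** 2)
    return total
```

A feature point with weight 0 drops out of the sum entirely, so a point judged likely to lie on a moving object does not pull the estimated position and orientation toward its displaced observation.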

In step S107, it is determined whether to end the entire process. If a command to end the entire process has been input by the user via a mouse, keyboard, or the like, the entire process ends. Otherwise, the processing from step S102 is repeated to continue.

As described above, in the first embodiment, the larger the number of feature points used for the position and orientation calculation, the more the weight is confined to feature points that are unlikely to be on a moving object; the smaller the number, the more weight is also given to feature points that are likely to be on a moving object. Adjusting the weights according to the number of feature points in this way makes it possible to calculate the position and orientation with high accuracy and high robustness.

<Second Embodiment>
In the first embodiment, the weight, which is the magnitude of the influence of each feature point on the position and orientation calculation, was calculated based on the number of feature points used for the calculation. In the second embodiment, the weight is calculated based on how sparsely or densely the feature points used for the position and orientation calculation are distributed. That is, in the second embodiment, in regions where the feature points used for the calculation are densely distributed, weight is given to feature points that are unlikely to be on a moving object, whereas in regions where they are sparsely distributed, weight is also given to feature points that are likely to be on a moving object. Preventing a biased distribution of the feature points used for the position and orientation calculation in this way allows the position and orientation to be calculated with high accuracy and high robustness.

The configuration of the apparatus in the second embodiment is the same as that of the information processing apparatus 1 described in the first embodiment. The processing procedures for initialization, image input, consistency determination, weight calculation, and position and orientation calculation are also the same as in the first embodiment. The main difference between the first and second embodiments is the evaluation of the presence state of features (step S104) in the flowchart of FIG. 2.

In step S104, the feature evaluation unit 140 calculates an evaluation value of the feature distribution based on how densely or sparsely the selected feature points obtained in step S103 are distributed in the input image coordinate system. The density of the distribution is judged, for example, by dividing the input image into four regions in a grid pattern and counting the selected feature points projected into each region. Specifically, letting N_D be the number of selected feature points projected into divided region D, the evaluation value E of the selected feature points projected into divided region D is calculated as in Equation 6.
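The counting step described above can be sketched in Python as follows. This is a minimal sketch, assuming a 2x2 grid whose cells are indexed row-major and assuming that points projected outside the image are ignored; the function name is hypothetical. The sketch stops at the per-region counts N_D, because the exact form of Equation 6 is not reproduced in this text.

```python
from typing import List, Sequence, Tuple


def count_points_per_quadrant(
    points: Sequence[Tuple[float, float]],  # projected positions (x, y) in pixels
    width: int,
    height: int,
) -> List[int]:
    """Count projected feature points in each cell of a 2x2 grid.

    Cells are indexed row-major: 0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right.
    """
    counts = [0, 0, 0, 0]
    for x, y in points:
        # Assumption: projections that fall outside the image are skipped.
        if not (0 <= x < width and 0 <= y < height):
            continue
        col = 1 if x >= width / 2 else 0
        row = 1 if y >= height / 2 else 0
        counts[2 * row + col] += 1
    return counts
```

Each selected feature point would then be assigned the evaluation value of the region in which its projection falls.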

The weight calculation unit 150 then determines E by Equation 6 according to which divided region the projection of the selected feature point onto the input image falls in. When E ≥ E_th, the weight w is calculated using Equation 3 above; when E < E_th, the weight w is calculated using Equation 4 above. The position and orientation calculation unit 160 calculates the position and orientation of the imaging unit 180 by Equation 5, using the weights that the weight calculation unit 150 calculated for each feature point. Note that the value of E_th differs from that in the first embodiment. For example, because the image is divided into four regions in the second embodiment, E_th is set to 1/4 of the threshold used in the first embodiment.

As described above, in the second embodiment, weight is given to feature points that are unlikely to lie on a moving object in regions where the feature points used for the position and orientation calculation are densely distributed, and weight is also given to feature points that are likely to lie on a moving object in regions where they are sparsely distributed. Preventing the distribution of the features used for the calculation from becoming biased in this way allows the position and orientation to be calculated with high accuracy and robustness.

＜Third Embodiment＞
The first embodiment showed an example in which the weight that controls the influence of each feature point on the position and orientation calculation is calculated based on the number of feature points used for that calculation. The second embodiment showed an example in which the weight is calculated based on how densely or sparsely those feature points are distributed in the input image coordinate system. The third embodiment uses both the number of feature points and the density of their distribution. Accordingly, in the third embodiment, the more feature points are used for the calculation and the denser their distribution in a region, the more the weight is concentrated on feature points that are unlikely to lie on a moving object. Conversely, the fewer the feature points and the sparser their distribution in a region, the more weight is also given to feature points that are likely to lie on a moving object. Adjusting the weights according to the number of feature points and preventing the distribution of the features from becoming biased in this way allows the position and orientation to be calculated with high accuracy and robustness.

The configuration of the apparatus in the third embodiment is the same as that of the information processing apparatus 1 described in the first embodiment. The processing procedures for initialization, image input, consistency determination, weight calculation, and position and orientation calculation are also the same as in the first embodiment. The main difference between the first and third embodiments is the evaluation of the presence state of features (step S104) in the flowchart of FIG. 2.

In step S104, the feature evaluation unit 140 calculates an evaluation value of the presence state of the selected feature points based on the number of selected feature points obtained in step S103 and on how densely or sparsely they are distributed in the input image coordinate system. As in the second embodiment, the density of the distribution is judged, for example, by dividing the input image into four regions in a grid pattern and counting the selected feature points projected into each region. Specifically, letting N_A' be the normalized total number of selected feature points and N_D' be the normalized number of selected feature points projected into divided region D, the evaluation value E of the selected feature points projected into divided region D is calculated as in Equation 7.

The weight calculation unit 150 then determines E by Equation 7 according to which divided region the projection of the selected feature point onto the input image falls in. When E ≥ E_th, the weight w is calculated using Equation 3 above; when E < E_th, the weight w is calculated using Equation 4 above. The position and orientation calculation unit 160 calculates the position and orientation of the imaging unit 180 by Equation 5, using the weights that the weight calculation unit 150 calculated for each feature point. Note that E_th is set to a value suited to the evaluation value E calculated by Equation 7.

As described above, in the third embodiment, the more feature points are used for the position and orientation calculation and the denser their distribution in a region, the more the weights are controlled so as to favor feature points that are unlikely to lie on a moving object. Conversely, the fewer the feature points and the sparser their distribution in a region, the more the weights are controlled so that feature points likely to lie on a moving object also receive weight. Adjusting the weights according to the number of feature points and preventing the distribution of the features from becoming biased in this way allows the position and orientation to be calculated with high accuracy and robustness.

The calculation of the evaluation value is not limited to Equation 7 above; any calculation based on the number of selected feature points and the density of their distribution may be used. For example, with N_A' and N_D' defined as for Equation 7, the evaluation value may be the sum of N_A' and N_D'. Alternatively, the evaluation value may be 0 if either N_A' or N_D' is at or below a threshold, and the product of N_A' and N_D' otherwise.
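The two alternative evaluation values mentioned above can be sketched as follows. The function names are hypothetical, and the sketch assumes that N_A' and N_D' have already been normalized and that a single threshold is shared by both quantities.

```python
def evaluation_sum(n_a: float, n_d: float) -> float:
    """Evaluation value as the sum of N_A' and N_D'."""
    return n_a + n_d


def evaluation_gated_product(n_a: float, n_d: float, threshold: float) -> float:
    """Evaluation value as the product of N_A' and N_D',
    forced to 0 when either one is at or below the threshold."""
    if n_a <= threshold or n_d <= threshold:
        return 0.0
    return n_a * n_d
```

The gated product yields a low evaluation value whenever either the overall count or the local count is small, whereas the sum lets a large value of one quantity compensate for a small value of the other.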

＜Fourth Embodiment＞
In the first to third embodiments, the consistency is determined independently for each frame, so it can change greatly from one frame to the next, which may make the position and orientation unstable. In the fourth embodiment, past consistencies (the consistency history) are used to suppress large changes in consistency, which makes the position and orientation more accurate and more robust.

FIG. 5 is a diagram showing the configuration of the apparatus in the fourth embodiment. The imaging unit 180, the three-dimensional information holding unit 110, the image input unit 120, the consistency determination unit 130, and the position and orientation calculation unit 160 in the fourth embodiment are the same as those of the information processing apparatus 1 (FIG. 1) described in the first embodiment. The main differences from the first embodiment are that the feature evaluation unit 140 of the information processing apparatus 1 is omitted and that a weight calculation unit 150a is used. The weight calculation unit 150a calculates the weight of each feature point in the feature point group used for the position and orientation calculation, based on the consistency determined by the consistency determination unit 130.

Next, the processing procedure in this embodiment is described. FIG. 6 is a flowchart showing the processing procedure in the fourth embodiment. The processing procedures for initialization, image input, and position and orientation calculation in the fourth embodiment are the same as in the first embodiment (S101, S102, and S106 in FIG. 2). The main differences from the first embodiment are that the evaluation of feature points (S104) in the flowchart of FIG. 2 is omitted, and the consistency determination (S201) and the weight calculation (S202).

In step S201, a provisional consistency is first calculated, as in step S103 of the first embodiment, based on the absolute value s of the difference between the luminance value of the selected feature point and the luminance value of the input image. The calculated provisional consistency is stored as history. The consistencies for the most recent n frames, including the provisional consistency, are then extracted from the stored history, and the median of the extracted values is taken as the current consistency c.
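The median filtering of step S201 can be sketched as below. This is a minimal sketch rather than the implementation of the embodiment: the class name is hypothetical, one history is assumed to be kept per feature point, and when fewer than n frames have been stored the median is assumed to be taken over the frames available so far.

```python
from collections import deque
from statistics import median


class ConsistencyHistory:
    """Keeps the most recent n provisional consistencies of one feature point."""

    def __init__(self, n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        # A bounded deque discards the oldest value automatically.
        self._values = deque(maxlen=n)

    def update(self, provisional: float) -> float:
        """Store the provisional consistency and return the current consistency c."""
        self._values.append(provisional)
        return median(self._values)


history = ConsistencyHistory(n=3)
for provisional in (1.0, 1.0, 0.0, 0.0):
    c = history.update(provisional)
```

With n = 3, a single frame in which the provisional consistency drops to 0 leaves the current consistency at 1.0; only when the drop persists into the next frame does the median follow it.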

In step S202, the weight is calculated based on the consistency determined in step S201. Specifically, letting c be the consistency, the weight w is calculated as in Equation 8.

Here, c_th is the consistency threshold at which the weight becomes 0.

As described above, in the fourth embodiment, past consistencies are used to suppress large changes in consistency, which makes the position and orientation more accurate and more robust.

The determination of consistency is not limited to the method using the median of past consistencies described above; any method that can determine a plausible current consistency from past consistencies may be used. For example, the average of past consistencies may be used as the current consistency. Alternatively, a polynomial may be fitted to the past consistencies, and the current consistency may be the average of the provisional consistency predicted by the fitted polynomial and the provisional consistency obtained from the difference between the luminance values of the selected feature point and the input image.

The weight calculation is likewise not limited to Equation 8 above; any calculation that derives the weight from the consistency may be used. For example, the consistency c may be used directly as the weight. Alternatively, the weight may be calculated on the assumption that the consistency c and the weight w follow an exponential function such as Equation 9, or a sigmoid function.
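As a rough illustration of the exponential and sigmoid alternatives, the following sketch maps a consistency to a weight. Equation 9 is not reproduced in this text, so the functional forms here are assumptions rather than the equations of the embodiment: the consistency c is assumed to be normalized to the range 0 to 1, and the parameters k, midpoint, and steepness are hypothetical tuning values.

```python
import math


def exponential_weight(c: float, k: float = 5.0) -> float:
    """Weight that decays exponentially as c falls below 1.

    Assumes c lies in [0, 1]; k (hypothetical) sets the decay rate.
    """
    return math.exp(-k * (1.0 - c))


def sigmoid_weight(c: float, midpoint: float = 0.5, steepness: float = 10.0) -> float:
    """Logistic weight: near 0 for low c, 0.5 at the midpoint, near 1 for high c."""
    return 1.0 / (1.0 + math.exp(-steepness * (c - midpoint)))
```

Unlike a hard threshold, both functions vary smoothly with c, so a small change in consistency produces only a small change in weight.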

Although the fourth embodiment has been described with a configuration that does not use the evaluation value E, the weight may instead be calculated as in Equations 3 and 4, using an evaluation value based on the evaluation of the presence state of features as described in the first to third embodiments. In addition, with the apparatus configuration of the fourth embodiment, the current consistency may be determined without using past consistencies. This case corresponds, for example, to the first to third embodiments with the feature evaluation unit 140 and its processing (S104) omitted.

＜Other Embodiments＞
The configurations shown in the first to fourth embodiments are merely examples, and the present invention is not limited to them. Modifications of the first to fourth embodiments are described below.

The imaging unit 180 is not limited to the color camera described above; any device may be used as long as the consistency between the three-dimensional map and the image captured by the imaging unit 180 can be determined. For example, it may be a camera that captures grayscale images, a camera that captures range images, or a camera that captures a color image and a range image simultaneously.

The three-dimensional map held by the three-dimensional information holding unit 110 is not limited to one held as the set of keyframe images described above; any map that can represent the three-dimensional coordinates of feature points in real space may be used. For example, it may be held as a set of feature points without holding keyframe images. Alternatively, it may be held as a set of all frame images.

The data structure of the feature points held in the three-dimensional map is not limited to that of FIG. 3; any structure may be used as long as the position and orientation of the imaging unit can be calculated by matching against the input image. For example, three-dimensional coordinates referenced to the world coordinate system may be given as an attribute instead of image coordinates and a depth value. The time at which a feature point was registered in the three-dimensional map may also be given as an attribute, for example so that feature points older than a certain age are no longer used.

The method of creating the initial three-dimensional map read at initialization is not limited to Klein's method described above; any method that creates a map capable of representing the three-dimensional coordinates of feature points in real space may be used. For example, the three-dimensional map may be created by measuring the real space with a laser scanner. If a three-dimensional model of the real space created by CAD or the like is available, the three-dimensional map may be created by extracting features from that model.

The method of calibrating the intrinsic parameters of the imaging unit read at initialization is not limited to Zhang's method described above; any method that can calibrate the intrinsic parameters may be used. For example, Tsai's method, which uses points with known three-dimensional coordinates (Non-Patent Document 4), or Hartley's method, which uses a camera undergoing purely rotational motion (Non-Patent Document 5), may be used.

The method of calculating the initial position and initial orientation of the imaging unit 180 at initialization is not limited to Kato's method described above; any method that can obtain the position and orientation of the imaging unit 180 at the start of processing may be used. For example, the initial position and orientation may be specified in advance, and processing may be started after the imaging unit 180 has been fixed at that position and orientation. Alternatively, an image of a specific pattern may be attached to the imaging unit 180, and the position and orientation of the imaging unit may be obtained by recognizing that pattern with a separate fixed camera whose position and orientation in the world coordinate system are known.

The cases in which the consistency is lowered are not limited to those described above, in which a feature point on the three-dimensional map is a point on a moving object or is occluded by a moving object and not observed in the input image; any case in which a difference arises between a feature point on the three-dimensional map and the input image is applicable. For example, the luminance value may change because the lighting environment has changed, or the feature point may be occluded by a stationary object that is not registered in the three-dimensional map.

The determination of the consistency (the provisional consistency in the fourth embodiment) is not limited to the method described above, which is based on the difference in luminance value between a feature point on the three-dimensional map and the input image; any method based on the degree of agreement between the three-dimensional map and the real space may be used. For example, a hand or a person may be detected in the input image, and the consistency of feature points on the three-dimensional map that project into the detected hand or person region may be lowered. The consistency may also be made higher as the difference in color (each component of the RGB or HSV color space) between a feature point on the three-dimensional map and the input image becomes smaller, and lower as that difference becomes larger. When the input image is a range image, the consistency may be made higher as the difference in distance value between a feature point on the three-dimensional map and the input image becomes smaller, and lower as that difference becomes larger. The consistency may also be made higher as the distance between the projected position of a selected feature point of the three-dimensional map on the input image and the position of the feature point associated with it on the input image becomes smaller. Furthermore, when two or more of a grayscale image, a color image, and a range image are available as input, the consistency may be determined by combining the luminance difference, the color difference, and the distance difference between the feature points on the three-dimensional map and the input images.

In the first to third embodiments, the consistency may also be determined based on past consistencies (history), as in the fourth embodiment. For example, the median or the average of past consistencies may be used as the current consistency. Alternatively, a polynomial may be fitted to the past consistencies, and the current consistency may be the average of the provisional consistency predicted by the fitted polynomial and the provisional consistency obtained from the difference between the feature points used for the position and orientation calculation and the input image.

The evaluation of the number of feature points in the first to third embodiments is not limited to the method based on the number of selected feature points described above; any method that can judge whether the number of feature points used for the position and orientation calculation is large or small may be used. For example, the evaluation value may be made higher as more feature points are detected in the input image and lower as fewer are detected. This is because the more feature points are detected in the input image, the more selected feature points from the three-dimensional map can be expected. The evaluation value may also be made higher as both the number of selected feature points and the number of feature points detected in the input image increase, and lower as both decrease.

The evaluation of the density of the feature point distribution in the second and third embodiments is not limited to the method described above, in which the input image is divided into four regions in a grid pattern and the feature points in each region are counted; any method that can judge how dense or sparse the feature point distribution is within a given range may be used. For example, the input image may be divided by concentric circles of different radii centered on the center of the input image, and the evaluation value of the feature points in each divided region may be made higher as the number of feature points in that region increases and lower as it decreases. Alternatively, the evaluation value of a given feature point may be made higher as the number of feature points within a specific radius of that point increases, and lower as that number decreases.
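The radius-based variant mentioned above can be sketched by counting, for each feature point, the other feature points within a given radius. The function name is hypothetical, and the mapping from a neighbor count to an evaluation value is left open, since the text only requires the evaluation value to increase with the count. The sketch uses a simple O(n²) loop; a spatial index would be preferable for large point sets.

```python
from typing import List, Sequence, Tuple


def neighbor_counts(points: Sequence[Tuple[float, float]], radius: float) -> List[int]:
    """For each point, count the other points lying within `radius` of it."""
    r2 = radius * radius
    counts = []
    for i, (xi, yi) in enumerate(points):
        n = 0
        for j, (xj, yj) in enumerate(points):
            # Compare squared distances to avoid a square root.
            if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                n += 1
        counts.append(n)
    return counts
```

An isolated feature point receives a count of 0 and would therefore be given a low evaluation value, while a point inside a cluster receives a high one.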

The weight is not limited to the continuous values described above and may be expressed as a binary or quantized value. For example, letting c be the consistency and c_th, c_th1, and c_th2 (c_th1 > c_th2) be consistency thresholds at which the weight changes, the weight w may be calculated as in Equation 10 or Equation 11.
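A binary weight and a three-level quantized weight of the kind described above might look like the following sketch. Equations 10 and 11 are not reproduced in this text, so the output levels (1, 0.5, and 0), the handling of values exactly at a threshold, and the function names are all assumptions made for illustration.

```python
def binary_weight(c: float, c_th: float) -> float:
    """Two-level weight: 1 at or above the threshold c_th, otherwise 0."""
    return 1.0 if c >= c_th else 0.0


def three_level_weight(c: float, c_th1: float, c_th2: float, mid: float = 0.5) -> float:
    """Three-level weight using two thresholds with c_th1 > c_th2.

    The intermediate level `mid` is a hypothetical value.
    """
    if c_th1 <= c_th2:
        raise ValueError("c_th1 must be greater than c_th2")
    if c >= c_th1:
        return 1.0
    if c >= c_th2:
        return mid
    return 0.0
```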

The weight calculation in the first to third embodiments is not limited to Equations 3 and 4 above; any calculation that derives the weight from the consistency and the evaluation value of the presence state of features may be used. For example, the sum or the product of the consistency and the evaluation value may be used as the weight. Alternatively, letting c be the consistency and E' be the normalized evaluation value, the weight w may be calculated as in Equation 12, in which the consistency threshold at which the weight becomes 0 varies according to the evaluation value.

The weight need not be set to 0 when the consistency is below the threshold; it is sufficient that the weight decreases as the consistency decreases. For example, as shown in FIG. 7, the weight may be made larger as the consistency becomes higher and smaller as the consistency becomes lower, while, for low consistency, a low evaluation value yields a larger weight than a high evaluation value does. In other words, the lower the evaluation value, the smaller the change in weight for a given change in consistency.

The position and orientation calculation is not limited to Klein's method described above; any method that calculates the position and orientation based on the three-dimensional map and the input image may be used. For example, the method of Pirchheim et al. (Non-Patent Document 6) or the method of Engel et al. (Non-Patent Document 7) may be used. Like Klein's method, the method of Pirchheim et al. calculates the position and orientation that minimize the Euclidean distance, in the image coordinate system, between feature points on the three-dimensional map and the feature points on the input image associated with them by feature descriptor comparison. When the present invention is implemented with this method, the influence of the feature points on the three-dimensional map on the position and orientation calculation is therefore controlled by multiplying the Euclidean distance by the weight, as in Equation 5. The method of Engel et al. calculates the position and orientation that minimize the difference between the luminance value of a feature point on the three-dimensional map and the luminance value of the input image at the image coordinates obtained by projecting that feature point onto the input image. When the present invention is implemented with this method, the influence of the feature points is therefore controlled by multiplying the luminance difference by the weight.

The predicted position and orientation of the current frame used when extracting the selected feature point group are not limited to the position and orientation of the previous frame; any value expected to be close to the position and orientation of the current frame may be used. For example, a motion model such as constant velocity or constant acceleration may be assumed, and the position and orientation of the previous frame may be updated according to that model. Alternatively, a separate sensor that measures position or orientation may be attached to the imaging unit, and the predicted position and orientation of the current frame may be obtained from the sensor measurements.
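The motion-model prediction mentioned above can be sketched for the position component as follows. This is a minimal sketch that assumes equally spaced frames and predicts position only; predicting the orientation would require composing rotations and is not shown. The function names are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def predict_constant_velocity(prev: Vec3, prev2: Vec3) -> Vec3:
    """Predict the next position from the last two frames.

    prev is the position at frame t-1 and prev2 at frame t-2, so the
    per-frame velocity is (prev - prev2) and the prediction is 2*prev - prev2.
    """
    x, y, z = (2.0 * a - b for a, b in zip(prev, prev2))
    return (x, y, z)


def predict_constant_acceleration(prev: Vec3, prev2: Vec3, prev3: Vec3) -> Vec3:
    """Predict the next position from the last three frames.

    With velocity v = prev - prev2 and acceleration a = prev - 2*prev2 + prev3,
    the prediction prev + v + a equals 3*prev - 3*prev2 + prev3.
    """
    x, y, z = (3.0 * a - 3.0 * b + c for a, b, c in zip(prev, prev2, prev3))
    return (x, y, z)
```

The predicted position would then take the place of the previous frame's position when selecting the feature points expected to be visible in the current frame.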

The timing of each process is not limited to that shown in FIG. 2 or FIG. 6; any timing may be used as long as the weight is reflected in the position and orientation calculation. For example, the consistency determination (S103, S201), the feature evaluation (S104), and the weight calculation (S105, S202) may be processed in parallel with the position and orientation calculation (S106). The consistency determination may also be performed after the feature evaluation.

The features held by the three-dimensional map are not limited to points; any feature may be used as long as it allows the three-dimensional map to be collated with the real space. For example, the coordinates in the world coordinate system of both end points of an object edge may be held as a feature. Alternatively, for a marker on which a specific pattern is drawn and from which the position and orientation of the imaging unit can be calculated by imaging it, the position and orientation of the marker in the world coordinate system and its pattern may be held as a feature.

The consistency is not limited to the continuous values described above and may be expressed as a binary or quantized value. For example, let s be the absolute value of the difference in luminance between a feature point on the three-dimensional map and the corresponding feature point on the input image, and let s_th, s_th1, and s_th2 (s_th1 > s_th2) be thresholds on that absolute difference at which the consistency changes. The consistency c may then be determined as in Equation 13 or Equation 14.
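Equations 13 and 14 are not reproduced in this section, so the following is only a sketch of what a binary and a three-level quantized consistency could look like under the stated thresholds. The output levels (1.0, 0.5, 0.0), the handling of values exactly at a threshold, and the function names are assumptions for illustration, not the patent's definitions.

```python
def binary_consistency(s, s_th):
    """Two-level consistency: high when the luminance difference is below s_th.

    Assumption: levels 1.0 and 0.0, with s == s_th treated as inconsistent.
    """
    return 1.0 if s < s_th else 0.0

def three_level_consistency(s, s_th1, s_th2):
    """Three-level consistency using two thresholds with s_th1 > s_th2.

    Assumption: levels 1.0, 0.5 and 0.0 from the smallest to the largest difference.
    """
    if s_th1 <= s_th2:
        raise ValueError("s_th1 must be greater than s_th2")
    if s >= s_th1:
        return 0.0
    if s >= s_th2:
        return 0.5
    return 1.0
```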

The consistency is not limited to one that is lowered when a difference arises between a feature point on the three-dimensional map and the input image. For example, it may instead be raised when such a difference arises. In that case, the higher the consistency, the smaller the weight.

The feature evaluation value is not limited to one that is higher as the number of feature points is larger and in regions where the feature points are more densely distributed, and lower as the number of feature points is smaller and in regions where they are more sparsely distributed. For example, it may instead be lower as the number of feature points is larger and in denser regions, and higher as the number is smaller and in sparser regions. In that case, the higher the feature evaluation value, the more weight is given even to feature points with low consistency. However, if the consistency is defined so that it is raised when a difference arises between a feature point on the three-dimensional map and the input image, then the higher the feature evaluation value, the more weight is given even to feature points with high consistency.

The consistency evaluation is likewise not limited to feature points on the three-dimensional map. When the three-dimensional map was generated in a static environment, for example, its feature points were not detected from moving objects. In that case, the consistency is evaluated for, and weights are given to, the feature points on the input image that correspond to the feature points of the three-dimensional map.

<Effect>
According to the first embodiment, the larger the number of feature points used for the position and orientation calculation, the more the weighting is restricted to feature points that are unlikely to be on a moving object; the smaller the number, the more weight is also given to feature points that are likely to be on a moving object. Adjusting the weights according to the number of feature points used for the position and orientation calculation in this way allows the position and orientation to be calculated with high accuracy and high robustness.

According to the second embodiment, in regions where the feature points used for the position and orientation calculation are densely distributed, weight is given to feature points that are unlikely to be on a moving object, and in regions where they are sparsely distributed, weight is also given to feature points that are likely to be on a moving object. Preventing a biased distribution of the features used for the position and orientation calculation in this way makes it possible to calculate the position and orientation with high accuracy and high robustness.

According to the third embodiment, the larger the number of feature points used for the position and orientation calculation, and the denser their distribution in a region, the more the weight is given to feature points that are unlikely to be on a moving object. Conversely, the smaller the number of feature points used for the position and orientation calculation, and the sparser their distribution in a region, the more weight is also given to feature points that are likely to be on a moving object. The position and orientation are thus calculated with high accuracy and high robustness, both by adjusting the weights according to the number of feature points used and by preventing a biased distribution of the features used.

Further, according to the fourth embodiment, using past consistency values to suppress large changes in the consistency makes it possible to improve the accuracy and robustness of the position and orientation.

<Definition>
The three-dimensional information holding unit 110 is an example of a configuration that holds a three-dimensional map in which three-dimensional information of features existing in an image and position and orientation information of the imaging unit at the time the image was captured are recorded. It may be anything that holds features allowing the three-dimensional map to be collated with the real space. For example, the coordinates in the world coordinate system of feature points, such as both end points of an object edge, may be held as features. Alternatively, for a marker on which a specific pattern is drawn and from which the position and orientation of the imaging unit can be calculated by imaging it, the position and orientation of the marker in the world coordinate system and its pattern may be held as a feature.

The consistency determination unit 130 is an example of a configuration that determines, for the features recorded in the three-dimensional map, their consistency with the input image. It may be anything that determines the consistency based on the degree of agreement between the three-dimensional map and the real space (the input image). In the embodiments above, the consistency is determined to be higher as the difference in luminance value between a feature point on the three-dimensional map and the input image is smaller, and lower as that difference is larger, but the determination is not limited to this. For example, the consistency may be determined to be higher as the difference in color (each component of the RGB or HSV color space) between a feature point on the three-dimensional map and the input image is smaller, and lower as that difference is larger. Likewise, the consistency may be determined to be higher as the difference in distance value between a feature point on the three-dimensional map and the input image is smaller, and lower as that difference is larger. Alternatively, a hand or a person may be detected in the input image, and the consistency of feature points on the three-dimensional map that are projected into the detected hand or person region may be determined to be low, because such regions belong to moving objects.
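Two of the options described above, a luminance-based consistency and a lowered consistency inside a detected hand or person region, can be sketched as follows. The linear mapping, the 0 to 255 luminance range, the mask layout (a row-major grid of booleans indexed as mask[y][x]), and the function names are all assumptions made for illustration; the embodiment's actual formula is not given in this section.

```python
def luminance_consistency(map_luminance, image_luminance, max_difference=255.0):
    """Continuous consistency in [0, 1]: 1.0 for identical luminance, falling linearly.

    Assumption: luminance values lie in the range 0 to max_difference.
    """
    difference = abs(map_luminance - image_luminance)
    return max(0.0, 1.0 - difference / max_difference)

def masked_consistency(consistency, projected_pixel, moving_mask, low_value=0.0):
    """Force a low consistency when the projected point falls in a detected moving region.

    Assumption: moving_mask[y][x] is True where a hand or person was detected,
    and projected_pixel holds integer (x, y) coordinates inside the image.
    """
    x, y = projected_pixel
    if moving_mask[y][x]:
        return low_value
    return consistency
```

The same structure would apply to a color or distance difference by substituting the compared quantity and its range.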

The feature evaluation unit 140 is an example of a configuration that calculates an evaluation value indicating the result of evaluating the existence state of the corresponding features between the three-dimensional map and the input image. The feature evaluation unit 140 may be anything that evaluates the feature points used for the position and orientation calculation based on the number or distribution of features on the three-dimensional map and/or on the input image. As one example, in the first embodiment the evaluation value is calculated to be higher as the number of feature points on the three-dimensional map and/or on the input image is larger, and lower as that number is smaller. As another example, in the second embodiment the evaluation value is calculated to be higher for regions in which the feature points on the three-dimensional map and/or on the input image are densely distributed, and lower for regions in which they are sparsely distributed.
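A count-based and a density-based evaluation value of the kind described above might be computed as in the sketch below. The normalization to the range 0 to 1, the saturation constants, the uniform grid used to measure density, and the function names are assumptions for illustration and are not taken from the embodiments.

```python
def count_evaluation(num_points, saturation_count=100):
    """Evaluation value that grows with the number of feature points, capped at 1.0."""
    return min(1.0, num_points / saturation_count)

def density_evaluation(points, image_width, image_height,
                       cols=4, rows=4, saturation_per_cell=10):
    """Per-cell evaluation values: dense cells score high, sparse cells score low.

    Assumption: points are (x, y) image coordinates inside the image, and
    density is measured by counting points in each cell of a cols x rows grid.
    """
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        col = min(cols - 1, int(x * cols / image_width))
        row = min(rows - 1, int(y * rows / image_height))
        counts[row][col] += 1
    return [[min(1.0, n / saturation_per_cell) for n in row_counts]
            for row_counts in counts]
```

A feature point would then take the evaluation value of the cell it falls in, so that points in sparsely populated regions are treated more leniently when weights are calculated.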

The weight calculation unit 150 is an example of a configuration that calculates, based on the consistency and the evaluation value, a weight that controls the influence of the features on the three-dimensional map on the calculation of the position and orientation of the imaging unit 180. The weight calculation unit 150 may be anything that calculates the weight using the consistency and/or the feature evaluation value. For example, the sum or the product of the consistency and the evaluation value may be used as the weight. Alternatively, the weight may be given a value greater than 0 when the consistency is equal to or greater than a predetermined threshold and set to 0 when the consistency is less than the predetermined threshold, with the predetermined threshold raised as the evaluation value increases. Alternatively, the weight may be made large when the consistency is high and small when the consistency is low, and, when the consistency is low, the weight may be made larger for a lower evaluation value than for a higher evaluation value.
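The threshold-based variant described above can be sketched as follows. The linear dependence of the threshold on the evaluation value, the default constants, the choice to return the consistency itself as the non-zero weight, and the function name are assumptions for illustration only.

```python
def threshold_weight(consistency, evaluation,
                     base_threshold=0.2, threshold_gain=0.6):
    """Weight that is zero below a threshold which rises with the evaluation value.

    Assumptions: consistency and evaluation lie in [0, 1], base_threshold is
    positive, and the non-zero weight is the consistency itself.
    """
    threshold = base_threshold + threshold_gain * evaluation
    return consistency if consistency >= threshold else 0.0
```

With these defaults, a feature point with consistency 0.5 keeps its weight when few features are available (evaluation 0.0) but is rejected when many are available (evaluation 1.0), which matches the behavior stated for the first embodiment.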

The position and orientation calculation unit may be anything that calculates the position and orientation based on the three-dimensional map and the input image. For example, the method of Klein, the method of Pirchheim et al., or the method of Engel et al. may be used.

The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

1: information processing apparatus, 110: three-dimensional information holding unit, 120: image input unit, 130: consistency determination unit, 140: feature evaluation unit, 150: weight calculation unit, 160: position and orientation calculation unit, 180: imaging unit

Claims

Holding means for holding a three-dimensional map in which three-dimensional information of features existing in the image and position and orientation information of the imaging unit at the time of capturing the image are recorded;
Input means for inputting an image captured by the imaging unit;
Determining means for determining consistency with the input image for the features recorded in the three-dimensional map;
Evaluation means for calculating an evaluation value indicating a result of evaluation of the existence state of the feature associated between the three-dimensional map and the input image;
Weight calculation means for calculating, based on the consistency and the evaluation value, a weight for controlling the influence of the features of the three-dimensional map on the calculation of the position and orientation of the imaging unit;
An information processing apparatus comprising: calculation means for calculating the position and orientation of the imaging unit when the input image is captured using the three-dimensional map and the weight.

The information processing apparatus according to claim 1, wherein the evaluation means calculates the evaluation value based on the number of the corresponding features.

The information processing apparatus according to claim 1, wherein the evaluation means calculates the evaluation value based on a density distribution of the corresponding features in the input image.

The information processing apparatus according to claim 1, wherein the evaluation means calculates the evaluation value based on the number of the corresponding features and a density distribution of the corresponding features in the input image.

The information processing apparatus according to claim 1, wherein the weight calculation means gives the weight a value greater than 0 when the consistency is equal to or greater than a predetermined threshold, sets the weight to 0 when the consistency is less than the predetermined threshold, and raises the predetermined threshold as the evaluation value increases.

The information processing apparatus according to any one of claims 1 to 4, wherein the weight calculation means makes the weight larger as the consistency is higher, and, as the consistency is lower, makes the weight larger when the evaluation value is low than when the evaluation value is high.

The information processing apparatus according to any one of claims 1 to 6, wherein the determination means determines the consistency based on a difference between the three-dimensional map and the input image with respect to the corresponding features.

The information processing apparatus according to claim 7, wherein the difference is at least one of a luminance value difference, a color difference, and a distance value difference between a feature point of the three-dimensional map and the projection position of that feature point on the input image.

The information processing apparatus according to claim 1, wherein the determination means holds a history of the consistency and determines the consistency based on the history.

Holding means for holding a three-dimensional map in which three-dimensional information of features existing in the image and position and orientation information of the imaging unit at the time of capturing the image are recorded;
An image input means for inputting an image captured by the imaging unit;
Determination means for acquiring, for the features recorded in the three-dimensional map, consistency with the input image, and determining the consistency of the features based on the acquired consistency and a held history of consistency;
Weight calculation means for calculating, based on the determined consistency, a weight for controlling the influence of the features of the three-dimensional map on the calculation of the position and orientation of the imaging unit;
An information processing apparatus comprising: calculation means for calculating the position and orientation of the imaging unit at the time of imaging the input image based on the three-dimensional map and the weight.

The information processing apparatus according to claim 1, wherein the weight calculation means calculates the weight as a continuous value.

The information processing apparatus according to any one of claims 1 to 11, wherein the three-dimensional map is, among a plurality of three-dimensional maps registered in advance, the three-dimensional map in which the position and orientation closest to the position and orientation most recently calculated for the imaging unit are recorded.

A control method of an information processing apparatus having a holding unit that holds a three-dimensional map in which three-dimensional information of features existing in an image and position and orientation information of an imaging unit at the time of imaging the image are recorded,
An input step of inputting an image captured by the imaging unit;
A determination step for determining consistency with the input image for features recorded in the three-dimensional map;
An evaluation step of calculating an evaluation value indicating a result of evaluation of the existence state of the feature associated between the three-dimensional map and the input image;
A weight calculation step of calculating, based on the consistency and the evaluation value, a weight for controlling the influence of the features of the three-dimensional map on the calculation of the position and orientation of the imaging unit;
A control method for an information processing apparatus, comprising: a calculation step of calculating the position and orientation of the imaging unit at the time of imaging the input image using the three-dimensional map and the weight.

A control method of an information processing apparatus having a holding unit that holds a three-dimensional map in which three-dimensional information of features existing in an image and position and orientation information of an imaging unit at the time of imaging the image are recorded,
An image input step of inputting an image captured by the imaging unit;
A determination step of acquiring, for the features recorded in the three-dimensional map, consistency with the input image, and determining the consistency of the features based on the acquired consistency and a held history of consistency;
A weight calculation step of calculating, based on the determined consistency, a weight for controlling the influence of the features of the three-dimensional map on the calculation of the position and orientation of the imaging unit;
A control method for an information processing apparatus, comprising: a calculation step of calculating the position and orientation of the imaging unit at the time of imaging the input image based on the three-dimensional map and the weight.

A program for causing a computer to execute each step of the control method according to claim 13 or 14.