JP6922348B2

JP6922348B2 - Information processing equipment, methods, and programs

Info

Publication number: JP6922348B2
Application number: JP2017072447A
Authority: JP
Inventors: 厚憲茂木; 村瀬　太一; 太一村瀬; 博一加藤; 貴史武富
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-03-31
Filing date: 2017-03-31
Publication date: 2021-08-18
Anticipated expiration: 2037-03-31
Also published as: JP2018173882A

Description

The disclosed technology relates to an information processing device, a method, and a program.

An image processing device for accurately detecting an observation position is known. This image processing device generates normal information of the scene at the observation position and estimates the observation position based on the generated normal information.

A method for merging multiple maps for computer-vision-based tracking is also known. This method performs simultaneous localization and mapping using multiple maps of a scene from multiple mobile devices to generate Simultaneous Localization and Mapping (SLAM) maps. The SLAM maps are then shared among the multiple mobile devices.

A technique is also known for estimating the position and orientation of a terminal based on images captured by a camera mounted on the terminal and on a loop formed from the movement locus of the terminal.

International Publication No. 2016/181687
Japanese National Publication of International Patent Application No. 2016-500885

Raúl Mur-Artal, J. M. M. Montiel, "ORB-SLAM: A Versatile and Accurate Monocular SLAM System", IEEE Transactions on Robotics, Vol. 31, No. 5, October 2015

However, the movement locus of a terminal equipped with a camera may not form a loop. In that case, the estimation error of the position and orientation of the imaging device provided in the terminal is likely to increase.

In one aspect, the disclosed technology aims to reduce the estimation error of the position and orientation of the imaging device.

In one embodiment of the disclosed technology, an information processing device acquires images captured by an imaging device mounted on a terminal. The information processing device estimates the position and orientation of the imaging device based on the acquired images and, when a predetermined index is detected in an acquired image, estimates the position and orientation of the imaging device with respect to the index. When the index is detected in the image, the information processing device estimates, based on a loop, the position and orientation of the imaging device at the time each of the images was captured. The loop is formed from the positions of the imaging device estimated from each of the images acquired so far and the estimated position of the imaging device with respect to the index.

In one aspect, the disclosed technology has the effect of reducing the estimation error of the position and orientation of the imaging device.

A schematic block diagram of the information processing device according to the first embodiment.
An explanatory diagram for explaining the estimation error of the position and orientation of the camera.
A diagram showing an example of the keyframe table.
A diagram showing an example of the map point table.
A diagram showing an example of feature points and the feature amounts corresponding to them.
A diagram showing an example of an index and a pattern.
An explanatory diagram for explaining the optimization of the position and orientation of the camera at the time a keyframe image was captured.
A diagram for explaining the loop formed in the present embodiment.
A block diagram showing the schematic configuration of a computer that functions as the control unit of the information processing device according to the first embodiment.
A flowchart showing an example of the posture estimation process in the first embodiment.
A flowchart showing an example of the map generation process in the first embodiment.
A flowchart showing an example of the optimization process in the first embodiment.
A schematic block diagram of the information processing device according to the second embodiment.
A block diagram showing the schematic configuration of a computer that functions as the control unit of the information processing device according to the second embodiment.
A flowchart showing an example of the optimization process in the second embodiment.
A schematic block diagram of the information processing device according to the third embodiment.
A diagram showing an example of the display screen displayed on the display device.
A block diagram showing the schematic configuration of a computer that functions as the control unit of the information processing device according to the third embodiment.
A flowchart showing an example of the display control process in the third embodiment.
A schematic block diagram of the information processing device according to the fourth embodiment.
A diagram for explaining the loop formed in the fourth embodiment.
A block diagram showing the schematic configuration of a computer that functions as the control unit of the information processing device according to the fourth embodiment.
A flowchart showing an example of the posture estimation process in the fourth embodiment.
A flowchart showing an example of the optimization process in the fourth embodiment.

Hereinafter, an example of an embodiment of the disclosed technology will be described in detail with reference to the drawings.

<First Embodiment>

FIG. 1 is a schematic diagram showing a configuration example of the information processing device 10.

As shown in FIG. 1, the information processing device 10 of the present embodiment includes a camera 12 and a control unit 14. The camera 12 is an example of the imaging device of the disclosed technology. The camera 12 can be mounted on a moving body such as a vehicle or carried by a person, and its position can change as it is transported in either way. Both the camera 12 and the control unit 14 may be included in the information processing device 10, or the control unit 14 may be mounted in the information processing device 10 while the camera 12 is a separate device capable of communicating with the control unit 14. In the present embodiment, a case where the information processing device 10 is mounted on a vehicle is described as an example.

The camera 12 sequentially captures images of the surroundings of the vehicle.

The control unit 14 includes a data storage unit 15, an image acquisition unit 16, a feature point extraction unit 18, a posture estimation unit 20, a map generation unit 22, an index detection unit 24, an optimization unit 26, and an adjustment unit 28. The data storage unit 15 is an example of the storage unit of the disclosed technology. The optimization unit 26 is an example of the estimation unit of the disclosed technology.

The data storage unit 15 stores map information representing the environment surrounding the vehicle. The map information is generated based on images captured by the camera 12 mounted on the vehicle. The map information is described below.

As an example, consider a case where a camera mounted on a mobile terminal or the like moves around a building R, as shown in 2A of FIG. 2. 2A of FIG. 2 shows the area around the building R as seen from above. In 2A of FIG. 2, the movement of the camera produces the movement locus L1. While the camera moves along the movement locus L1, the camera 12 mounted on the terminal sequentially captures images of its surroundings. Among the sequentially captured images, those that satisfy a predetermined condition are stored in the data storage unit 15 as keyframe images.

Feature points are extracted from each image. A feature point is, for example, an edge point in the image that represents the shape of an object existing in the target area. Map points representing the three-dimensional coordinates corresponding to the feature points in the image are also generated. For example, a map point M as shown in 2B of FIG. 2 is generated for a feature point F. The feature point F extracted from the image 2C is associated with the map point M. The position and orientation of the camera 12 mounted on the terminal are therefore estimated sequentially according to the correspondence between the map points M of the map information stored in the data storage unit 15 and the feature points F extracted from the image 2C.

However, when the position and orientation of the camera 12 are estimated in an area for which no map information is stored in the data storage unit 15, no loop is formed in the movement locus of the camera unless multiple images captured at the same location can be acquired. As a result, an optimization process such as the one shown in Reference 1 below cannot be performed. The estimated position and orientation of the camera 12 then follow a movement locus L2 that differs from the true movement locus L1, as shown in 2B of FIG. 2. In the movement locus L2, which represents the estimation result, the estimation error of the position and orientation of the camera 12 increases.

Reference 1: Raúl Mur-Artal, J. M. M. Montiel, "ORB-SLAM: A Versatile and Accurate Monocular SLAM System", IEEE Transactions on Robotics, Vol. 31, No. 5, October 2015

In the present embodiment, therefore, predetermined indexes are installed in the environment, and the position and orientation of the camera 12 are optimized each time an index is detected. This is described in detail below.

The data storage unit 15 stores map information representing each keyframe image, the position and orientation of the camera 12 at the time each keyframe image was captured, and the map points, which are the three-dimensional coordinates of the feature points of each keyframe image.

Specifically, the map information is expressed as a keyframe table and a map point table, and both tables are stored in the data storage unit 15 as the map information.

The keyframe table shown in FIG. 3 stores, in association with one another, a keyframe ID identifying the keyframe, the position and orientation of the camera 12, the keyframe image, the feature points of the keyframe image, and the map point IDs corresponding to the feature points. For example, as shown in FIG. 3, the position and orientation of the camera 12 corresponding to the keyframe ID "001" is the six-dimensional real value (0.24, 0.84, 0.96, 245.0, 313.9, 23.8). Of these six values, (0.24, 0.84, 0.96) represents the orientation of the camera 12 and (245.0, 313.9, 23.8) represents the three-dimensional position of the camera. Each row of the keyframe table represents one keyframe.

The keyframe image (24, 46, …) corresponding to the keyframe ID "001" in the keyframe table of FIG. 3 represents the pixel value of each pixel of the keyframe image. The feature points "(11,42), (29,110)…" corresponding to the keyframe ID "001" represent the pixel positions of the feature points in the keyframe image. The map point IDs "3, 5, 9, 32…" corresponding to the keyframe ID "001" represent the map point ID associated with each feature point. These map point IDs correspond to the map point IDs in the map point table.

The map point table shown in FIG. 4 stores, in association with one another, a map point ID identifying the map point, the three-dimensional position coordinates (X [m], Y [m], Z [m]) of the map point, and the feature amount of the map point. The feature amount in the map point table of FIG. 4 is, for example, the Oriented FAST and Rotated BRIEF (ORB) feature described in Reference 2; an ORB feature amount is expressed as a 32-dimensional feature amount representing 0 or 1.

Reference 2: E. Rublee et al., "ORB: An efficient alternative to SIFT or SURF", In Proc. of International Conference on Computer Vision, pp. 2564-2571, 2011.
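As an illustration of the two tables described above, the following is a minimal Python sketch of one keyframe row and one map point row. The class and field names are hypothetical, not taken from the source, and the sample values are only the leading entries shown for keyframe ID "001" (the source elides the rest, and the descriptor length is not enforced here).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MapPoint:
    """One row of the map point table (FIG. 4). Field names are assumptions."""
    map_point_id: int
    xyz: Tuple[float, float, float]   # 3D position (X, Y, Z) in meters
    descriptor: Tuple[int, ...]       # feature amount, e.g. an ORB descriptor


@dataclass
class Keyframe:
    """One row of the keyframe table (FIG. 3). Field names are assumptions."""
    keyframe_id: str
    pose: Tuple[float, ...]           # 6 values: orientation (3) then position (3)
    image: List[int]                  # pixel values of the keyframe image
    feature_points: List[Tuple[int, int]] = field(default_factory=list)
    map_point_ids: List[int] = field(default_factory=list)  # one ID per feature point

    @property
    def orientation(self) -> Tuple[float, ...]:
        return self.pose[:3]

    @property
    def position(self) -> Tuple[float, ...]:
        return self.pose[3:]


# Leading values of the "001" row; the remaining entries are elided in the source.
kf = Keyframe("001", (0.24, 0.84, 0.96, 245.0, 313.9, 23.8), [24, 46],
              [(11, 42), (29, 110)], [3, 5])
print(kf.orientation, kf.position)
```

Keeping the feature point list and the map point ID list the same length preserves the one-to-one association between feature points and map points that the keyframe table describes.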

In the present embodiment, the control unit 14 of the information processing device 10 has a posture estimation function, a map generation function, and an optimization function. The functional units corresponding to each function are described below.

The internal parameters of the camera 12 are acquired in advance by calibration based on, for example, the method described in Reference 3. The internal parameters of the camera 12 include, for example, a matrix containing the focal length and the optical center, and distortion coefficients (for example, five-dimensional).

Reference 3: Z. Zhang et al., "A flexible new technique for camera calibration", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330-1334, 2000.

[Posture estimation function]

The image acquisition unit 16 sequentially acquires the images captured by the camera 12. It then converts each captured image into a grayscale image and outputs the grayscale image.

The feature point extraction unit 18 extracts feature points from the grayscale image output by the image acquisition unit 16, for example by using the method described in Reference 1 above. The feature point extraction unit 18 then calculates a feature amount for each feature point. For example, when the ORB described in Reference 2 above is used as the feature amount, a 32-dimensional feature amount representing 0 or 1 is extracted.

FIG. 5 shows an example of the data structure of the feature amounts corresponding to the feature points. As shown in FIG. 5, a feature point ID identifying the feature point, the pixel coordinates u [pixel] and v [pixel] representing the position of the feature point, and the feature amount are associated with one another.

The posture estimation unit 20 estimates the position and orientation of the camera 12 based on the feature points extracted by the feature point extraction unit 18 and their corresponding feature amounts. When no map points have been obtained yet, the posture estimation unit 20 generates initial map points by using, for example, the method described in "IV Automatic Map Initialization" of Reference 1 above.

Specifically, the posture estimation unit 20 first acquires the feature points extracted by the feature point extraction unit 18 from grayscale images captured from two arbitrary viewpoints. Next, the posture estimation unit 20 associates the feature points between the grayscale images captured from the two viewpoints, and obtains the position and orientation of the camera 12 at the second viewpoint relative to the position and orientation of the camera 12 at the first viewpoint. The posture estimation unit 20 sets the position and orientation of the camera 12 at the first viewpoint as the origin of the world coordinate system, and sets the position and orientation of the camera 12 at the second viewpoint as the initial value of the position and orientation of the camera 12.

The posture estimation unit 20 also calculates the map points representing the three-dimensional coordinates corresponding to the feature points by triangulation, based on the position and orientation of the camera 12 at the second viewpoint relative to those at the first viewpoint.

Next, as the camera 12 moves with the vehicle, the posture estimation unit 20 sequentially estimates the position and orientation of the camera 12 by using the map points. For example, the posture estimation unit 20 first assumes that the vehicle equipped with the camera 12 follows a predetermined motion model (for example, constant-velocity motion). The posture estimation unit 20 then associates the feature points of the grayscale image extracted by the feature point extraction unit 18 with the map points stored in the map point table of the data storage unit 15.

More specifically, the posture estimation unit 20 projects the map points stored in the map point table of the data storage unit 15 onto the grayscale image, and associates each feature point of the grayscale image with a map point.

For example, the posture estimation unit 20 estimates the position and orientation of the camera by using the PnP algorithm described in Reference 4 below. The PnP algorithm is a method that calculates the position and orientation of the camera 12 that minimize the distance between the map points projected onto the grayscale image and the feature points, by means of a nonlinear optimization algorithm such as the Levenberg-Marquardt method.

Reference 4: V. Lepetit et al., "EPnP: An Accurate O(n) Solution to the PnP Problem", International Journal of Computer Vision, Vol. 81, No. 2, pp. 155-166, 2008.
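The quantity that the pose estimation above minimizes can be sketched as follows. This is only an illustration of the reprojection error for a candidate pose, assuming a simple pinhole camera without lens distortion; it does not implement the Levenberg-Marquardt iteration itself, and the function names and parameters (`rotation`, `translation`, `fx`, `fy`, `cx`, `cy`) are hypothetical.

```python
def project(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates (pinhole model, no distortion)."""
    # Camera-frame coordinates: Xc = R * Xw + t
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def reprojection_error(map_points, feature_points, rotation, translation,
                       fx, fy, cx, cy):
    """Sum of squared pixel distances between projected map points and features."""
    total = 0.0
    for point, (u, v) in zip(map_points, feature_points):
        pu, pv = project(point, rotation, translation, fx, fy, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
error = reprojection_error(
    [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)],   # map points (world coordinates)
    [(320.0, 240.0), (573.0, 244.0)],     # matched feature points (pixels)
    identity, (0.0, 0.0, 0.0), 500.0, 500.0, 320.0, 240.0,
)
print(error)
```

A nonlinear optimizer would repeatedly adjust `rotation` and `translation` to drive this value down; the pose with the smallest error is taken as the camera's position and orientation.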

The posture estimation unit 20 also determines whether to store the grayscale image output by the image acquisition unit 16 as a keyframe image, for example according to the following criteria. When all of criteria (1) to (3) below are satisfied, the posture estimation unit 20 stores the grayscale image in the data storage unit 15 as a keyframe image, together with the position and orientation of the camera 12 and the associations between feature points and map points.

(1) A certain number of frames (for example, 20 frames) have passed since the previous keyframe image was stored.
(2) Among the keyframe images stored so far, the number of corresponding points between the feature points of the nearest keyframe image, that is, the one closest to the position of the grayscale image, and the feature points of the grayscale image is at least a certain number (for example, 50 points).
(3) The number of corresponding points has decreased by a certain ratio (for example, 90%) compared with the nearest keyframe image among the keyframe images stored so far.
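Criteria (1) to (3) above can be sketched as a single predicate. The function and parameter names are hypothetical, and criterion (3) is read here as "the number of matches has fallen below 90% of the nearest keyframe's point count", which is one possible interpretation of the source wording rather than a confirmed one.

```python
def should_store_keyframe(frames_since_last, matches, reference_points,
                          min_gap=20, min_matches=50, ratio=0.9):
    """Return True when a grayscale image qualifies as a new keyframe.

    frames_since_last: frames elapsed since the previous keyframe was stored
    matches:           corresponding points with the nearest keyframe
    reference_points:  point count of the nearest keyframe (assumed baseline)
    """
    enough_gap = frames_since_last >= min_gap          # criterion (1)
    enough_matches = matches >= min_matches            # criterion (2)
    view_changed = matches < ratio * reference_points  # criterion (3), assumed reading
    return enough_gap and enough_matches and view_changed


print(should_store_keyframe(25, 80, 100))
```

Requiring all three conditions avoids storing near-duplicate keyframes while still guaranteeing enough overlap with the nearest keyframe for triangulation.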

[Map generation function]

When the posture estimation unit 20 stores a new keyframe image in the data storage unit 15, the map generation unit 22 calculates the map point of each feature point of the new keyframe image by triangulation. For example, the map generation unit 22 generates the map points of the new keyframe image by the method described in Reference 5, based on the new keyframe image and the keyframe images previously stored in the data storage unit 15.

Reference 5: R. I. Hartley et al., "Triangulation", Computer Vision and Image Understanding, Vol. 68, No. 2, pp. 146-157, 1997.

Specifically, the map generation unit 22 selects, from the keyframe images previously stored in the data storage unit 15, the nearest keyframe image, that is, the one closest to the position of the new keyframe image. Next, the map generation unit 22 identifies, by epipolar search, the feature points in the nearest keyframe image that correspond to the feature points included in the new keyframe image. Epipolar search is a process that uses the geometric constraint between two viewpoints and finds the corresponding point by searching only along the epipolar line in the second viewpoint on which the feature point of the first viewpoint must lie.
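The restriction of the search range to the epipolar line can be sketched as follows. The sketch assumes a known 3x3 fundamental matrix `F` with the convention that a point in the first view maps to the line `F x` in the second view, and it accepts candidates within a small pixel tolerance of the line; the names and the tolerance are hypothetical, and descriptor matching among the surviving candidates is omitted.

```python
import math


def epipolar_line(F, point):
    """Line (a, b, c) in the second view, a*u + b*v + c = 0, for a first-view point."""
    u, v = point
    x = (u, v, 1.0)
    return tuple(sum(F[i][j] * x[j] for j in range(3)) for i in range(3))


def distance_to_line(line, point):
    """Perpendicular pixel distance from a point to the line (a, b, c)."""
    a, b, c = line
    u, v = point
    return abs(a * u + b * v + c) / math.hypot(a, b)


def epipolar_candidates(F, point, candidates, max_distance=1.0):
    """Keep only second-view feature points lying close to the epipolar line."""
    line = epipolar_line(F, point)
    return [p for p in candidates if distance_to_line(line, p) <= max_distance]


# Example F for a pure sideways translation with identity intrinsics:
# corresponding points then share the same image row.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
print(epipolar_candidates(F, (1, 3), [(5, 3), (7, 10), (2, 3.5)]))
```

Searching along one line instead of the whole image reduces both the matching cost and the chance of a false correspondence.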

The map generation unit 22 then calculates the map points of the new keyframe image by triangulation, based on the feature points associated between the new keyframe image and the nearest keyframe image. For the triangulation, the method described in "5.1 Linear Triangulation" of Reference 5 above can be used, for example.
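To make the idea of triangulating a map point from two views concrete, the following sketch uses the midpoint method, which is a simpler technique swapped in here for illustration and is not the linear triangulation cited above. It assumes each matched feature point has already been converted into a viewing ray (camera center plus direction) in world coordinates; all names are hypothetical.

```python
def triangulate_midpoint(center1, dir1, center2, dir2):
    """Midpoint of the shortest segment between two viewing rays.

    Each ray is center + s * direction. This is the midpoint method,
    not the linear (DLT) triangulation referenced in the text.
    """
    def dot(p, q):
        return sum(pi * qi for pi, qi in zip(p, q))

    w = tuple(p - q for p, q in zip(center1, center2))
    a, b, c = dot(dir1, dir1), dot(dir1, dir2), dot(dir2, dir2)
    d, e = dot(dir1, w), dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom   # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom   # parameter of the closest point on ray 2
    p1 = tuple(ci + s * di for ci, di in zip(center1, dir1))
    p2 = tuple(ci + t * di for ci, di in zip(center2, dir2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Two cameras one unit apart, both observing the point (0.5, 0, 2).
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
print(point)
```

With noisy measurements the two rays do not intersect exactly, which is why the midpoint of their closest approach (or a least-squares solution, as in the linear method) is used as the map point.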

The map generation unit 22 then stores the map points of the new keyframe image in the data storage unit 15.

Based on the map information stored in the data storage unit 15, the adjustment unit 28 corrects the map points so that the sum, over all keyframe images, of the reprojection errors between the feature points and the map points on each keyframe image is minimized.

Specifically, the adjustment unit 28 acquires each keyframe stored in the keyframe table of the data storage unit 15, the map points associated with each keyframe, and the internal parameters of the camera 12. The adjustment unit 28 then corrects the position and orientation of the camera 12 at the time each keyframe image was captured, and the coordinates of the map points associated with the feature points of each keyframe image, so that the reprojection error between the feature points and the map points on the keyframe images is minimized. As the algorithm for minimizing this reprojection error, a nonlinear optimization algorithm such as the Levenberg-Marquardt method described in Reference 6 below can be used. The processing performed by the adjustment unit 28 is also called bundle adjustment.

Reference 6: B. Triggs et al., "Bundle Adjustment - A Modern Synthesis", In Proc. of International Workshop on Vision Algorithms: Theory and Practice, pp. 298-392, 1999.

The adjustment unit 28 then replaces the position and orientation of the camera 12 at the time each keyframe image was captured, as stored in the map information in the data storage unit 15, with the corrected position and orientation. The adjustment unit 28 likewise replaces the coordinates of the map points associated with each keyframe image in the stored map information with the corrected map point coordinates.

[Optimization function]

The index detection unit 24 detects whether a new keyframe image stored in the data storage unit 15 contains an index whose shape and size are predetermined. The indexes are installed in advance in the target area through which the camera moves. Multiple indexes are installed, and the relative positions and orientations between the indexes are known. For example, an index 1 as shown in FIG. 6 is installed in the environment in advance. As shown in FIG. 6, the index 1 contains a predetermined pattern 2.

The index detection unit 24 determines whether an index is present in the grayscale image output by the image acquisition unit 16, for example by using the method described in Reference 7. Specifically, the index detection unit 24 binarizes the grayscale image output by the image acquisition unit 16. The index detection unit 24 then detects the index by estimating the positions of the coordinates of its four corners.

参考文献７：H. Kato et al., "Marker tracking and HMD calibration for a video-based augmented reality conferencing system", In Proc. of IEEE and ACM International Workshop on Augmented Reality (IWAR), pp.85-94, 1999. Reference 7: H. Kato et al., "Marker tracking and HMD calibration for a video-based augmented reality conferencing system", In Proc. of IEEE and ACM International Workshop on Augmented Reality (IWAR), pp. 85-94, 1999.
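
As a rough illustration of the binarization that precedes the four-corner estimation in index detection, the following Python sketch thresholds a grayscale image held as nested lists and takes the bounding box of the dark region as a crude stand-in for the four corners. The fixed threshold of 128, the axis-aligned bounding box, and the function names `binarize` and `bounding_corners` are assumptions made for illustration; neither this description nor Reference 7 is being quoted here.

```python
def binarize(gray, threshold=128):
    """Return a 0/1 image: 1 where the pixel is darker than the threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def bounding_corners(binary):
    """Return four (x, y) corners of the box enclosing all 1-pixels, or None."""
    cells = [(x, y) for y, row in enumerate(binary)
             for x, v in enumerate(row) if v]
    if not cells:
        return None  # no dark region, so no index candidate
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return [(min(xs), min(ys)), (max(xs), min(ys)),
            (max(xs), max(ys)), (min(xs), max(ys))]


# Toy 4x4 image: a dark 2x2 square on a bright background.
gray = [
    [200, 200, 200, 200],
    [200,  30,  40, 200],
    [200,  35,  25, 200],
    [200, 200, 200, 200],
]
binary = binarize(gray)
corners = bounding_corners(binary)
```

A real detector would fit lines to the contour of the region to recover a perspective-distorted quadrilateral; the bounding box only works for a fronto-parallel, axis-aligned index.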

姿勢推定部２０は、指標検出部２４によって新たなキーフレーム画像から指標が検出された場合、指標を含むキーフレーム画像に基づいて、指標に対するカメラ１２の位置及び姿勢を推定する。例えば、姿勢推定部２０は、上記参考文献７の「4. Position and pose estimation of markers」に従って、指標に対するカメラ１２の位置及び姿勢を推定する。 When the index is detected from the new keyframe image by the index detection unit 24, the posture estimation unit 20 estimates the position and posture of the camera 12 with respect to the index based on the keyframe image including the index. For example, the posture estimation unit 20 estimates the position and posture of the camera 12 with respect to the index according to “4. Position and pose estimation of markers” in Reference 7.

そして、最適化部２６は、姿勢推定部２０によって推定された新たなキーフレーム画像が撮像されたときの指標に対するカメラ１２の位置及び姿勢に基づいて、データ記憶部１５に格納されたマップ情報を補正する。 Then, the optimization unit 26 corrects the map information stored in the data storage unit 15 based on the position and orientation of the camera 12 with respect to the index at the time the new keyframe image was captured, as estimated by the posture estimation unit 20.

本実施形態におけるマップ情報の補正について、具体的に説明する。 The correction of the map information in the present embodiment will be specifically described.

例えば、図７の７Ａに示されるように、各キーフレーム画像が撮像されたときのカメラの位置及び姿勢（ａ，ｂ，ｃ）が得られている場合を例に説明する。この場合、ａはスタート地点のキーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢を表す。また、ｃは新たなキーフレーム画像が撮像されたときのカメラの位置及び姿勢を表す。ｂはａとｃとの間に位置するキーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢を表す。また、Ｘは、姿勢推定部２０によって得られた、指標１に対するカメラ１２の位置及び姿勢を表す。 For example, as shown in 7A of FIG. 7, a case where the position and posture (a, b, c) of the camera when each keyframe image is captured is obtained will be described as an example. In this case, a represents the position and orientation of the camera 12 when the keyframe image of the starting point is captured. Further, c represents the position and orientation of the camera when a new keyframe image is captured. b represents the position and orientation of the camera 12 when the keyframe image located between a and c is captured. Further, X represents the position and posture of the camera 12 with respect to the index 1 obtained by the posture estimation unit 20.

本実施形態では、図７の７Ｂに示されるように、カメラ１２の位置及び姿勢ａと、指標１に対するカメラ１２の位置及び姿勢Ｘとに基づき、各キーフレーム画像におけるカメラ１２の位置及び姿勢Ｙを得る。 In this embodiment, as shown in 7B of FIG. 7, the position and orientation Y of the camera 12 for each keyframe image are obtained based on the position and orientation a of the camera 12 and the position and orientation X of the camera 12 with respect to the index 1.

具体的には、図７の７Ｃに示されるように、カメラ１２の位置及び姿勢（ａ，ｂ，ｃ）と、指標１に対するカメラ１２の位置及び姿勢Ｘと、カメラ１２の位置及び姿勢ａでの他の指標に対するカメラ１２の位置及び姿勢とを含むループが形成される。このとき、ループを表すグラフを再計算することにより、補正されたカメラ１２の位置及び姿勢Ｙを得る。これにより、補正前の移動軌跡Ｓが移動軌跡Ｐとなり、実世界と対応するマップ情報が得られる。 Specifically, as shown in 7C of FIG. 7, a loop is formed that includes the positions and orientations (a, b, c) of the camera 12, the position and orientation X of the camera 12 with respect to the index 1, and the position and orientation of the camera 12 with respect to another index at the position and orientation a of the camera 12. The corrected position and orientation Y of the camera 12 are then obtained by recalculating the graph representing the loop. As a result, the movement locus S before correction becomes the movement locus P, and map information corresponding to the real world is obtained.

ここで、カメラの位置及び姿勢から形成されるループについてより詳細に説明する。 Here, the loop formed from the position and orientation of the camera will be described in more detail.

図８に示されるように、各キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢（ａ，ｂ，ｃ）と、新たなキーフレーム画像が撮像されたときの指標（１Ａ，１Ｂ）に対するカメラ１２の位置及び姿勢（Ｘ１，Ｘ２）とを含むループが形成される。ただし、指標（１Ａ，１Ｂ）間の相対的な位置及び姿勢は既知とする。 As shown in FIG. 8, a loop is formed that includes the positions and orientations (a, b, c) of the camera 12 when each keyframe image was captured and the positions and orientations (X1, X2) of the camera 12 with respect to the indexes (1A, 1B) when the new keyframe image was captured. Here, the relative position and orientation between the indexes (1A, 1B) are assumed to be known.

図８に示されるように、キーフレーム画像から指標１Ａが検出された場合、指標１Ａに対するカメラ１２の位置及び姿勢Ｘ１が推定される。なお、ａはキーフレーム画像として格納される際に推定されたカメラ１２の位置及び姿勢である。 As shown in FIG. 8, when the index 1A is detected from the keyframe image, the position and posture X1 of the camera 12 with respect to the index 1A are estimated. Note that a is the position and orientation of the camera 12 estimated when stored as a keyframe image.

また、新たなキーフレーム画像から指標１Ｂが検出された場合、指標１Ｂに対するカメラ１２の位置及び姿勢Ｘ２が推定される。なお、ｃはキーフレーム画像として格納される際に推定されたカメラ１２の位置及び姿勢である。 When the index 1B is detected from the new keyframe image, the position and orientation X2 of the camera 12 with respect to the index 1B are estimated. Note that c is the position and orientation of the camera 12 estimated when stored as a keyframe image.

そして、指標（１Ａ，１Ｂ）間の相対的な位置及び姿勢と、指標１Ａに対するカメラ１２の位置及び姿勢Ｘ１と、各キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢ｂと、指標１Ｂに対するカメラ１２の位置及び姿勢Ｘ２とからループＺが形成される。これにより、以下の参考文献８に記載のPose Graph最適化を行うことが可能となる。 Then, a loop Z is formed from the relative position and orientation between the indexes (1A, 1B), the position and orientation X1 of the camera 12 with respect to the index 1A, the position and orientation b of the camera 12 when each keyframe image was captured, and the position and orientation X2 of the camera 12 with respect to the index 1B. This makes it possible to perform the Pose Graph optimization described in Reference 8 below.

参考文献８：Ra´ul Mur-Artal and Juan D. Tard´os,"Fast Relocalisation and Loop Closing in Keyframe-Based SLAM",2014 IEEE International Conference on Robotics & Automation (ICRA) May 31 - June 7, 2014. Hong Kong, China Reference 8: Raúl Mur-Artal and Juan D. Tardós, "Fast Relocalisation and Loop Closing in Keyframe-Based SLAM", 2014 IEEE International Conference on Robotics & Automation (ICRA), May 31 - June 7, 2014, Hong Kong, China.
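
To make the idea of the loop Z concrete, the sketch below composes the edges of such a loop and reports how far the result is from a consistent closure. It is a deliberately simplified planar model: poses are `(x, y, theta)` tuples rather than full three-dimensional poses, and the names `compose`, `inverse`, `ab`, `bc`, and `AB` as well as all numeric values are hypothetical. It only measures the inconsistency that a pose graph optimization would then reduce.

```python
import math


def compose(p, q):
    """Map pose q, given in the frame of pose p, into p's parent frame."""
    x, y, th = p
    qx, qy, qth = q
    return (x + math.cos(th) * qx - math.sin(th) * qy,
            y + math.sin(th) * qx + math.cos(th) * qy,
            th + qth)


def inverse(p):
    """Return the pose that undoes p."""
    x, y, th = p
    c, s = math.cos(th), math.sin(th)
    return (-(c * x + s * y), -(-s * x + c * y), -th)


# Frame of index 1A is used as the reference frame (an assumption).
X1 = (0.0, 0.0, 0.0)    # camera at a, relative to index 1A
ab = (1.0, 0.0, 0.0)    # SLAM estimate of the motion a -> b
bc = (1.1, 0.05, 0.0)   # SLAM estimate of the motion b -> c (with drift)
AB = (2.0, 0.0, 0.0)    # known pose of index 1B relative to index 1A
X2 = (0.0, 0.0, 0.0)    # camera at c, relative to index 1B

slam_c = compose(compose(X1, ab), bc)   # camera at c according to SLAM
marker_c = compose(AB, X2)              # camera at c according to the indexes

# If the loop were consistent, this residual would be (0, 0, 0).
residual = compose(inverse(slam_c), marker_c)
```

With these toy numbers the residual is about (-0.1, -0.05, 0), which is the accumulated drift between the SLAM trajectory and the pose implied by the two indexes.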

具体的には、まず、最適化部２６は、新たなキーフレーム画像から指標が検出された場合に、データ記憶部１５に格納されたマップ情報を取得する。次に最適化部２６は、前回、キーフレーム画像から指標が検出されたときの、指標に対するカメラ１２の位置及び姿勢と、過去のキーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢の各々と、を取得する。最適化部２６は、更に、新たなキーフレーム画像が撮像されたときの指標に対するカメラ１２の位置及び姿勢と、指標間の相対的な位置及び姿勢とを取得して、ループを形成し、形成されるループに基づき、マップ情報を補正する。 Specifically, when an index is detected from a new keyframe image, the optimization unit 26 first acquires the map information stored in the data storage unit 15. Next, the optimization unit 26 acquires the position and orientation of the camera 12 with respect to the index at the time an index was last detected from a keyframe image, and each of the positions and orientations of the camera 12 when the past keyframe images were captured. The optimization unit 26 further acquires the position and orientation of the camera 12 with respect to the index when the new keyframe image was captured and the relative position and orientation between the indexes, forms a loop from these, and corrects the map information based on the formed loop.

より詳細には、最適化部２６は、以下の参考文献８に記載のPose Graph最適化により、新たなキーフレーム画像の周辺のキーフレーム画像におけるカメラ１２の位置及び姿勢の各々と、新たなキーフレーム画像におけるカメラ１２の位置及び姿勢とを補正する。 More specifically, using the Pose Graph optimization described in Reference 8, the optimization unit 26 corrects each of the positions and orientations of the camera 12 for the keyframe images around the new keyframe image, as well as the position and orientation of the camera 12 for the new keyframe image.
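
The effect of such a correction can be illustrated with a much simpler stand-in than the Pose Graph optimization of Reference 8. The sketch below assumes a translation-only planar chain of keyframes with equally weighted odometry edges, the first keyframe held fixed, and the last keyframe pinned to the position implied by the index. Under those assumptions the least-squares solution spreads the loop error evenly along the chain. The function name `distribute_drift` and the numbers are hypothetical.

```python
def distribute_drift(positions, corrected_last):
    """Spread the end-point error linearly over a chain of (x, y) positions.

    positions: keyframe positions in capture order, first one trusted.
    corrected_last: position of the last keyframe implied by the index.
    """
    n = len(positions) - 1
    if n < 1:
        raise ValueError("need at least two keyframes")
    ex = corrected_last[0] - positions[-1][0]
    ey = corrected_last[1] - positions[-1][1]
    # Keyframe i receives the fraction i/n of the total error.
    return [(x + ex * i / n, y + ey * i / n)
            for i, (x, y) in enumerate(positions)]


# Keyframes a, b, c as estimated by SLAM; the index says c is at (2.0, 0.0).
before = [(0.0, 0.0), (1.0, 0.0), (2.1, 0.05)]
after = distribute_drift(before, (2.0, 0.0))
```

Here keyframe a stays in place, c moves onto the position given by the index, and b absorbs half of the correction. Rotations and unequal edge weights, which a full pose graph optimizer handles, are ignored in this sketch.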

そして、最適化部２６は、各キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢の補正に応じて、各キーフレーム画像の特徴点に対応するマップ点の座標を座標変換する。具体的には、最適化部２６は、各キーフレーム画像が撮像されたときの補正前のカメラ１２の位置及び姿勢と、補正前の各キーフレーム画像の特徴点に対応するマップ点との間の相対的関係が維持されるように、マップ点の座標を座標変換する。 Then, the optimization unit 26 transforms the coordinates of the map points corresponding to the feature points of each keyframe image according to the correction of the position and orientation of the camera 12 when each keyframe image was captured. Specifically, the optimization unit 26 transforms the coordinates of the map points so that the relative relationship between the uncorrected position and orientation of the camera 12 when each keyframe image was captured and the uncorrected map points corresponding to the feature points of that keyframe image is maintained.
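
One way to read "maintaining the relative relationship" is that a map point keeps the same coordinates in the camera frame before and after the pose correction. The following planar sketch expresses that reading: the point is mapped into the frame of the uncorrected camera pose and then back out through the corrected pose. Poses are `(x, y, theta)` tuples, and the function names and values are hypothetical.

```python
import math


def to_camera(pose, pt):
    """Express world point pt in the frame of the camera pose."""
    x, y, th = pose
    dx, dy = pt[0] - x, pt[1] - y
    c, s = math.cos(th), math.sin(th)
    return (c * dx + s * dy, -s * dx + c * dy)


def to_world(pose, pt_cam):
    """Express a camera-frame point in world coordinates."""
    x, y, th = pose
    c, s = math.cos(th), math.sin(th)
    return (x + c * pt_cam[0] - s * pt_cam[1],
            y + s * pt_cam[0] + c * pt_cam[1])


def move_map_point(old_pose, new_pose, pt):
    """Move pt so that it looks the same from new_pose as it did from old_pose."""
    return to_world(new_pose, to_camera(old_pose, pt))


old_pose = (2.1, 0.05, 0.0)   # camera pose before correction
new_pose = (2.0, 0.0, 0.0)    # camera pose after correction
moved = move_map_point(old_pose, new_pose, (3.1, 1.05))
```

In this example the point moves to about (3.0, 1.0), so it still lies one unit ahead of and one unit to the side of the camera.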

調整部２８は、最適化部２６によって補正された、各キーフレーム画像の特徴点へのマップ点の再投影誤差を最小化するように、各キーフレーム画像におけるカメラ１２の位置及び姿勢並びに各キーフレーム画像に対応付けられたマップ点の座標を補正する。そして、調整部２８は、データ記憶部１５に格納されたマップ情報のうちの各キーフレーム及び各キーフレーム画像に対応付けられたマップ点の座標を、補正された各キーフレーム画像及び各キーフレーム画像に対応付けられたマップ点の座標に置き換える。 The adjusting unit 28 corrects the position and orientation of the camera 12 for each keyframe image and the coordinates of the map points associated with each keyframe image so as to minimize the reprojection error of the map points, as corrected by the optimization unit 26, onto the feature points of each keyframe image. Then, the adjusting unit 28 replaces each keyframe and the coordinates of the map points associated with each keyframe image in the map information stored in the data storage unit 15 with the corrected keyframe and the corrected coordinates of the map points.

情報処理装置１０の制御部１４は、例えば、図９に示すコンピュータ５０で実現することができる。コンピュータ５０はＣＰＵ５１、一時記憶領域としてのメモリ５２、及び不揮発性の記憶部５３を備える。また、コンピュータ５０は、カメラ１２、表示装置、及び入出力装置等（図示省略）が接続される入出力interface（Ｉ／Ｆ）５４、及び記録媒体５９に対するデータの読み込み及び書き込みを制御するread/write（Ｒ／Ｗ）部５５を備える。また、コンピュータ５０は、インターネット等のネットワークに接続されるネットワークＩ／Ｆ５６を備える。ＣＰＵ５１、メモリ５２、記憶部５３、入出力Ｉ／Ｆ５４、Ｒ／Ｗ部５５、及びネットワークＩ／Ｆ５６は、バス５７を介して互いに接続される。 The control unit 14 of the information processing device 10 can be realized by, for example, the computer 50 shown in FIG. 9. The computer 50 includes a CPU 51, a memory 52 as a temporary storage area, and a non-volatile storage unit 53. The computer 50 also includes an input/output interface (I/F) 54 to which the camera 12, a display device, an input/output device, and the like (not shown) are connected, and a read/write (R/W) unit 55 that controls reading and writing of data from and to a recording medium 59. Further, the computer 50 includes a network I/F 56 connected to a network such as the Internet. The CPU 51, the memory 52, the storage unit 53, the input/output I/F 54, the R/W unit 55, and the network I/F 56 are connected to each other via a bus 57.
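
The quantity minimized by the adjusting unit can be written down compactly. The sketch below evaluates the summed squared reprojection error for one keyframe under an ideal pinhole camera without lens distortion. The world-to-camera convention for `R` and `t`, the intrinsics tuple `(fx, fy, cx, cy)`, and the sample numbers are assumptions for illustration; the minimization itself (bundle adjustment) is not shown.

```python
def project(R, t, K, pw):
    """Project world point pw into the image with a pinhole model."""
    # Camera-frame coordinates: pc = R * pw + t (world-to-camera convention).
    pc = [sum(R[i][j] * pw[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return (fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy)


def total_reprojection_error(R, t, K, observations):
    """Sum of squared pixel distances between projections and feature points.

    observations: list of (map_point_xyz, (u, v)) pairs for one keyframe.
    """
    total = 0.0
    for pw, (u, v) in observations:
        pu, pv = project(R, t, K, pw)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity rotation
t = [0.0, 0.0, 0.0]
K = (500.0, 500.0, 320.0, 240.0)
observations = [
    ((0.2, -0.1, 2.0), (371.0, 215.0)),  # projects to (370, 215)
    ((0.0, 0.0, 1.0), (320.0, 242.0)),   # projects to (320, 240)
]
error = total_reprojection_error(R, t, K, observations)
```

Bundle adjustment would vary the camera poses and map point coordinates of all keyframes to drive the sum of such errors down.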

記憶部５３は、Hard Disk Drive（ＨＤＤ）、solid state drive（ＳＳＤ）、フラッシュメモリ等によって実現できる。記憶媒体としての記憶部５３には、コンピュータ５０を情報処理装置１０の制御部１４として機能させるための情報処理プログラム６０が記憶されている。情報処理プログラム６０は、画像取得プロセス６２と、特徴点抽出プロセス６３と、姿勢推定プロセス６４と、マップ生成プロセス６５と、指標検出プロセス６６と、最適化プロセス６７と、調整プロセス６８とを有する。また、記憶部５３は、データ記憶部１５を構成する情報が記憶されるデータ記憶領域６９を有する。 The storage unit 53 can be realized by a hard disk drive (HDD), a solid state drive (SSD), a flash memory, or the like. The storage unit 53 as a storage medium stores an information processing program 60 for causing the computer 50 to function as the control unit 14 of the information processing device 10. The information processing program 60 includes an image acquisition process 62, a feature point extraction process 63, a posture estimation process 64, a map generation process 65, an index detection process 66, an optimization process 67, and an adjustment process 68. Further, the storage unit 53 has a data storage area 69 in which information constituting the data storage unit 15 is stored.

ＣＰＵ５１は、情報処理プログラム６０を記憶部５３から読み出してメモリ５２に展開し、情報処理プログラム６０が有するプロセスを順次実行する。ＣＰＵ５１は、画像取得プロセス６２を実行することで、図１に示す画像取得部１６として動作する。また、ＣＰＵ５１は、特徴点抽出プロセス６３を実行することで、図１に示す特徴点抽出部１８として動作する。また、ＣＰＵ５１は、姿勢推定プロセス６４を実行することで、図１に示す姿勢推定部２０として動作する。また、ＣＰＵ５１は、マップ生成プロセス６５を実行することで、図１に示すマップ生成部２２として動作する。また、ＣＰＵ５１は、指標検出プロセス６６を実行することで、図１に示す指標検出部２４として動作する。また、ＣＰＵ５１は、最適化プロセス６７を実行することで、図１に示す最適化部２６として動作する。また、ＣＰＵ５１は、調整プロセス６８を実行することで、図１に示す調整部２８として動作する。また、ＣＰＵ５１は、データ記憶領域６９から情報を読み出して、データ記憶部１５をメモリ５２に展開する。これにより、情報処理プログラム６０を実行したコンピュータ５０が、情報処理装置１０の制御部１４として機能することになる。そのため、ソフトウェアである情報処理プログラム６０を実行するプロセッサはハードウェアである。 The CPU 51 reads the information processing program 60 from the storage unit 53, expands the information processing program 60 into the memory 52, and sequentially executes the processes included in the information processing program 60. The CPU 51 operates as the image acquisition unit 16 shown in FIG. 1 by executing the image acquisition process 62. Further, the CPU 51 operates as the feature point extraction unit 18 shown in FIG. 1 by executing the feature point extraction process 63. Further, the CPU 51 operates as the posture estimation unit 20 shown in FIG. 1 by executing the posture estimation process 64. Further, the CPU 51 operates as the map generation unit 22 shown in FIG. 1 by executing the map generation process 65. Further, the CPU 51 operates as the index detection unit 24 shown in FIG. 1 by executing the index detection process 66. Further, the CPU 51 operates as the optimization unit 26 shown in FIG. 1 by executing the optimization process 67. Further, the CPU 51 operates as the adjustment unit 28 shown in FIG. 1 by executing the adjustment process 68. Further, the CPU 51 reads information from the data storage area 69 and expands the data storage unit 15 into the memory 52. As a result, the computer 50 that executes the information processing program 60 functions as the control unit 14 of the information processing device 10. 
Therefore, the processor that executes the information processing program 60, which is software, is hardware.

なお、情報処理プログラム６０により実現される機能は、例えば半導体集積回路、より詳しくはApplication Specific Integrated Circuit（ＡＳＩＣ）等で実現することも可能である。 The function realized by the information processing program 60 can also be realized by, for example, a semiconductor integrated circuit, more specifically, an Application Specific Integrated Circuit (ASIC) or the like.

次に、本実施形態に係る情報処理装置１０の作用について説明する。情報処理装置１０は、姿勢推定処理とマップ生成処理と最適化処理とを実行する。情報処理装置１０を搭載した端末が移動を開始し、カメラ１２がカメラの周辺の画像の撮像を開始すると、情報処理装置１０の制御部１４によって、図１０に示す姿勢推定処理が実行される。また、同様に、情報処理装置１０の制御部１４によって、図１１に示すマップ生成処理と、図１２に示す最適化処理とが実行される。以下、各処理について詳述する。 Next, the operation of the information processing device 10 according to the present embodiment will be described. The information processing device 10 executes a posture estimation process, a map generation process, and an optimization process. When the terminal equipped with the information processing device 10 starts moving and the camera 12 starts capturing images of its surroundings, the control unit 14 of the information processing device 10 executes the posture estimation process shown in FIG. 10. Similarly, the control unit 14 of the information processing device 10 executes the map generation process shown in FIG. 11 and the optimization process shown in FIG. 12. Each process is described in detail below.

＜姿勢推定処理＞ <Posture estimation process>

ステップＳ１００において、画像取得部１６は、カメラ１２によって撮像された画像を取得する。次に、画像取得部１６は、カメラ１２によって撮像された画像をグレースケール画像へ変換する。そして、画像取得部１６は、グレースケール画像を出力する。 In step S100, the image acquisition unit 16 acquires the image captured by the camera 12. Next, the image acquisition unit 16 converts the image captured by the camera 12 into a grayscale image. Then, the image acquisition unit 16 outputs a grayscale image.

ステップＳ１０２において、特徴点抽出部１８は、上記ステップＳ１００で出力されたグレースケール画像から、特徴点を抽出する。そして、特徴点抽出部１８は、各特徴点に対して、特徴量を計算する。 In step S102, the feature point extraction unit 18 extracts feature points from the grayscale image output in step S100. Then, the feature point extraction unit 18 calculates the feature amount for each feature point.

ステップＳ１０４において、姿勢推定部２０は、上記ステップＳ１０２で抽出された特徴点及び特徴点に対応する特徴量に基づいて、カメラ１２の位置及び姿勢を推定する。 In step S104, the posture estimation unit 20 estimates the position and posture of the camera 12 based on the feature points extracted in step S102 and the feature quantities corresponding to the feature points.

ステップＳ１０６において、姿勢推定部２０は、上記ステップＳ１００で出力されたグレースケール画像をキーフレーム画像として格納するか否かを判定する。グレースケール画像をキーフレーム画像として格納すると判定した場合には、ステップＳ１０８へ進む。一方、グレースケール画像をキーフレーム画像として格納しないと判定した場合には、ステップＳ１００へ戻る。 In step S106, the posture estimation unit 20 determines whether or not to store the grayscale image output in step S100 as a keyframe image. If it is determined that the grayscale image is stored as the keyframe image, the process proceeds to step S108. On the other hand, if it is determined that the grayscale image is not stored as the keyframe image, the process returns to step S100.

ステップＳ１０８において、姿勢推定部２０は、上記ステップＳ１００で出力されたグレースケール画像を、キーフレーム画像としてデータ記憶部１５へ格納する。また、上記ステップＳ１０４で推定された、カメラ１２の位置及び姿勢をデータ記憶部１５へ格納する。 In step S108, the posture estimation unit 20 stores the grayscale image output in step S100 in the data storage unit 15 as a keyframe image. Further, the position and orientation of the camera 12 estimated in step S104 are stored in the data storage unit 15.
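
The control flow of steps S100 to S108 can be sketched as a loop over incoming frames. In the sketch below every processing stage is passed in as a callable, because this passage does not fix the feature extractor, the pose estimator, or the keyframe criterion; the parameter names `to_gray`, `extract`, `estimate`, and `is_keyframe`, and the toy callables used in the example, are all hypothetical.

```python
def run_pose_estimation(frames, to_gray, extract, estimate, is_keyframe):
    """Process frames in order and return the keyframes that were stored."""
    keyframes = []  # stands in for the data storage unit 15
    for frame in frames:
        gray = to_gray(frame)                    # step S100
        features = extract(gray)                 # step S102
        pose = estimate(features)                # step S104
        if is_keyframe(gray, pose, keyframes):   # step S106
            # Step S108: store the image together with the estimated pose.
            keyframes.append({"image": gray, "pose": pose})
        # Otherwise fall through to the next frame (return to step S100).
    return keyframes


# Toy stand-ins: frames are integers, every second one becomes a keyframe.
stored = run_pose_estimation(
    frames=[1, 2, 3, 4],
    to_gray=lambda f: f,
    extract=lambda g: [g],
    estimate=lambda feats: (float(feats[0]), 0.0, 0.0),
    is_keyframe=lambda gray, pose, kfs: gray % 2 == 0,
)
```

Passing the stages as arguments keeps the sketch independent of any particular feature or pose estimation method.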

＜マップ生成処理＞ <Map generation process>

ステップＳ２００において、マップ生成部２２は、姿勢推定処理によって新たなキーフレーム画像がデータ記憶部１５へ格納されたか否かを判定する。キーフレーム画像がデータ記憶部１５へ格納された場合、ステップＳ２０２へ進む。一方、キーフレーム画像がデータ記憶部１５へ格納されていない場合、ステップＳ２００へ戻る。 In step S200, the map generation unit 22 determines whether or not a new keyframe image is stored in the data storage unit 15 by the posture estimation process. When the keyframe image is stored in the data storage unit 15, the process proceeds to step S202. On the other hand, if the keyframe image is not stored in the data storage unit 15, the process returns to step S200.

ステップＳ２０２において、マップ生成部２２は、姿勢推定処理によってデータ記憶部１５へ格納された新たなキーフレーム画像と、前回までにデータ記憶部１５へ格納されたキーフレーム画像とに基づき、新たなキーフレーム画像のマップ点を生成する。 In step S202, the map generation unit 22 generates map points for the new keyframe image based on the new keyframe image stored in the data storage unit 15 by the posture estimation process and the keyframe images previously stored in the data storage unit 15.

ステップＳ２０４において、マップ生成部２２は、上記ステップＳ２０２で生成された新たなキーフレーム画像のマップ点をデータ記憶部１５へ格納する。 In step S204, the map generation unit 22 stores the map points of the new keyframe image generated in step S202 in the data storage unit 15.

ステップＳ２０６において、調整部２８は、データ記憶部１５に格納されたマップ情報に基づいて、全てのキーフレーム画像についての、キーフレーム画像上での特徴点とマップ点との間の再投影誤差の総和が最小となるように、マップ点を補正する。そして、調整部２８は、データ記憶部１５に格納されたマップ情報のうちの各キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢を、補正された位置及び姿勢に置き換える。また、調整部２８は、データ記憶部１５に格納されたマップ情報のうちの各キーフレーム画像に対応付けられたマップ点を、補正されたマップ点に置き換える。 In step S206, based on the map information stored in the data storage unit 15, the adjusting unit 28 corrects the map points so that the sum, over all keyframe images, of the reprojection errors between the feature points and the map points on the keyframe images is minimized. Then, the adjusting unit 28 replaces the position and orientation of the camera 12 at the time each keyframe image was captured, in the map information stored in the data storage unit 15, with the corrected position and orientation. Further, the adjusting unit 28 replaces the map points associated with each keyframe image in the map information stored in the data storage unit 15 with the corrected map points.

＜最適化処理＞ <Optimization process>

ステップＳ３００において、指標検出部２４は、姿勢推定処理によってデータ記憶部１５に格納された新たなキーフレーム画像に指標が含まれているか否かを判定する。新たなキーフレーム画像に指標が含まれている場合には、ステップＳ３０２へ進む。 In step S300, the index detection unit 24 determines whether or not the index is included in the new keyframe image stored in the data storage unit 15 by the posture estimation process. If the new keyframe image contains an index, the process proceeds to step S302.

ステップＳ３０２において、姿勢推定部２０は、指標を含む新たなキーフレーム画像に基づいて、指標に対するカメラ１２の位置及び姿勢を推定する。 In step S302, the posture estimation unit 20 estimates the position and posture of the camera 12 with respect to the index based on a new keyframe image including the index.

ステップＳ３０４において、最適化部２６は、データ記憶部１５に格納された過去のキーフレーム画像におけるカメラ１２の位置及び姿勢の各々と、上記ステップＳ３０２で得られた指標に対するカメラ１２の位置及び姿勢とからループを形成する。そして、最適化部２６は、形成されるループに基づき、Pose Graph最適化により、過去のキーフレーム画像におけるカメラ１２の位置及び姿勢の各々と、新たなキーフレーム画像におけるカメラ１２の位置及び姿勢とを補正する。 In step S304, the optimization unit 26 forms a loop from each of the positions and orientations of the camera 12 for the past keyframe images stored in the data storage unit 15 and the position and orientation of the camera 12 with respect to the index obtained in step S302. Then, based on the formed loop, the optimization unit 26 uses Pose Graph optimization to correct each of the positions and orientations of the camera 12 for the past keyframe images and the position and orientation of the camera 12 for the new keyframe image.

ステップＳ３０６において、最適化部２６は、上記ステップＳ３０４で得られた、各キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢の補正に応じて、各キーフレームのマップ点の座標を座標変換する。例えば、ループに基づく位置及び姿勢の補正はLoop Closure最適化を利用する事ができる。 In step S306, the optimization unit 26 transforms the coordinates of the map points of each keyframe according to the correction, obtained in step S304, of the position and orientation of the camera 12 when each keyframe image was captured. For example, loop closure optimization can be used for the loop-based correction of the position and orientation.

ステップＳ３０８において、調整部２８は、上記ステップＳ３０６で得られたマップ点の各キーフレーム画像の特徴点への再投影誤差を最小化するように、各キーフレーム画像におけるカメラ１２の位置及び姿勢と、各キーフレーム画像のマップ点を補正する。そして、調整部２８は、データ記憶部１５に格納されたマップ情報のうちの各キーフレーム及び各キーフレームに対応付けられたマップ点を、補正された各キーフレーム及び各キーフレームに対応付けられたマップ点に置き換える。 In step S308, the adjusting unit 28 corrects the position and orientation of the camera 12 for each keyframe image and the map points of each keyframe image so as to minimize the reprojection error of the map points obtained in step S306 onto the feature points of each keyframe image. Then, the adjusting unit 28 replaces each keyframe and the map points associated with each keyframe in the map information stored in the data storage unit 15 with the corrected keyframe and the corrected map points associated with it.

以上説明したように、本実施形態に係る情報処理装置は、カメラによって撮像された画像に基づいて、カメラの位置及び姿勢を推定する。そして、撮像された画像から予め定められた指標が検出された場合に、カメラの位置及び姿勢の各々と、指標に対するカメラの位置及び姿勢とから形成されるループに基づいて、キーフレーム画像の各々を撮像したときのカメラの位置及び姿勢の各々を推定する。これにより、カメラの位置及び姿勢の推定誤差を低減させることができる。 As described above, the information processing device according to the present embodiment estimates the position and orientation of the camera based on the images captured by the camera. When a predetermined index is detected from a captured image, the device estimates each of the positions and orientations of the camera at the time each keyframe image was captured, based on a loop formed from each of the camera positions and orientations and the position and orientation of the camera with respect to the index. This makes it possible to reduce the estimation error of the position and orientation of the camera.

また、指標が検出される毎に、キーフレーム画像の各々を撮像したときのカメラの位置及び姿勢の各々の最適化を行うことにより、高頻度で最適化を行うことができる。 Further, each time the index is detected, the position and orientation of the camera when each of the keyframe images is captured is optimized, so that the optimization can be performed with high frequency.

また、高頻度で最適化が行われることにより、調整部によって行われるバンドル調整の収束までの時間が減少し、局所解への収束を回避することができる。 Further, since the optimization is performed with high frequency, the time until the bundle adjustment performed by the adjustment unit converges can be reduced, and the convergence to the local solution can be avoided.

＜第２の実施形態＞ <Second embodiment>

次に、第２の実施形態について説明する。第２の実施形態では、カメラによって撮像された画像から指標が検出された場合に、指標の検出結果に応じて、指標を含む画像の信頼度を算出する。そして、信頼度が予め設定された閾値より大きい場合に、キーフレーム画像の各々が撮像されたときのカメラの位置及び姿勢の各々を補正する点が第１の実施形態と異なる。 Next, the second embodiment will be described. In the second embodiment, when an index is detected from an image captured by the camera, the reliability of the image including the index is calculated according to the detection result of the index. The second embodiment differs from the first embodiment in that the positions and orientations of the camera at the time each keyframe image was captured are corrected when the reliability is greater than a preset threshold value.

図１３に、第２の実施形態の情報処理装置２１０の構成例を示す。第２の実施形態の情報処理装置２１０は、図１３に示されるように、カメラ１２と、制御部２１４とを備える。 FIG. 13 shows a configuration example of the information processing device 210 of the second embodiment. As shown in FIG. 13, the information processing device 210 of the second embodiment includes a camera 12 and a control unit 214.

制御部２１４は、データ記憶部１５と、画像取得部１６と、特徴点抽出部１８と、姿勢推定部２０と、マップ生成部２２と、指標検出部２４と、最適化部２２６と、調整部２８と、信頼度算出部２２５とを備える。 The control unit 214 includes a data storage unit 15, an image acquisition unit 16, a feature point extraction unit 18, a posture estimation unit 20, a map generation unit 22, an index detection unit 24, an optimization unit 226, an adjustment unit 28, and a reliability calculation unit 225.

信頼度算出部２２５は、データ記憶部１５に格納されたキーフレーム画像から指標が検出された場合に、指標の検出結果に応じて信頼度を算出する。 When the index is detected from the keyframe image stored in the data storage unit 15, the reliability calculation unit 225 calculates the reliability according to the detection result of the index.

例えば、信頼度算出部２２５は、キーフレーム画像から検出された指標の平面の法線と、カメラ１２の光軸とのなす角θを算出する。そして、信頼度算出部２２５は、なす角θが、θ１≦θ≦θ２を満たす場合には、角度に関する信頼度を高くする。一方、なす角θが、θ１≦θ≦θ２を満たさない場合には、角度に関する信頼度を低くする。θ１とθ２とは予め設定され、例えば、θ１＝π／１８、θ２＝４π／９である。指標とカメラとの光軸間のなす角θが大きすぎる場合又は小さすぎる場合は、指標の４隅の検出点の誤差がカメラの位置及び姿勢推定に大きな影響を及ぼすようになり（例えば参考文献９を参照）、推定精度が悪化するため、なす角θに応じて信頼度を算出する。 For example, the reliability calculation unit 225 calculates the angle θ formed by the normal of the plane of the index detected from the keyframe image and the optical axis of the camera 12. When the angle θ satisfies θ1 ≦ θ ≦ θ2, the reliability calculation unit 225 sets a high reliability regarding the angle. On the other hand, when the angle θ does not satisfy θ1 ≦ θ ≦ θ2, it sets a low reliability regarding the angle. θ1 and θ2 are preset, for example, θ1 = π/18 and θ2 = 4π/9. If the angle θ between the index and the optical axis of the camera is too large or too small, errors in the detected points at the four corners of the index strongly affect the estimation of the position and orientation of the camera (see, for example, Reference 9), and the estimation accuracy deteriorates. The reliability is therefore calculated according to the angle θ.

参考文献９：Y. Uematsu et al., "Improvement of Accuracy for 2D Marker-Based Tracking Using Particle Filter", In Proc. of IEEE International Conference on Artificial Reality and Telexistence(ICAT), pp.183-189, 2007. Reference 9: Y. Uematsu et al., "Improvement of Accuracy for 2D Marker-Based Tracking Using Particle Filter", In Proc. of IEEE International Conference on Artificial Reality and Telexistence (ICAT), pp. 183-189, 2007.
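
The angle test can be sketched as follows, using the example bounds θ1 = π/18 and θ2 = 4π/9 given in the text. The concrete reliability values 1.0 and 0.0, the use of the absolute dot product to fold the angle into the range from 0 to π/2, and the function names are assumptions made for this illustration.

```python
import math

THETA1 = math.pi / 18       # lower bound from the text (10 degrees)
THETA2 = 4 * math.pi / 9    # upper bound from the text (80 degrees)


def angle_between(normal, axis):
    """Angle in [0, pi/2] between the index normal and the optical axis."""
    dot = abs(sum(a * b for a, b in zip(normal, axis)))
    norm = math.sqrt(sum(a * a for a in normal)) * \
        math.sqrt(sum(a * a for a in axis))
    # Clamp to guard against rounding slightly above 1.0.
    return math.acos(min(1.0, dot / norm))


def angle_reliability(normal, axis, high=1.0, low=0.0):
    """Return `high` when theta1 <= theta <= theta2, otherwise `low`."""
    theta = angle_between(normal, axis)
    return high if THETA1 <= theta <= THETA2 else low


optical_axis = (0.0, 0.0, 1.0)
head_on = angle_reliability((0.0, 0.0, 1.0), optical_axis)
oblique = angle_reliability(
    (math.sin(math.radians(45)), 0.0, math.cos(math.radians(45))),
    optical_axis)
grazing = angle_reliability(
    (math.sin(math.radians(85)), 0.0, math.cos(math.radians(85))),
    optical_axis)
```

With these inputs the head-on view (0 degrees) and the grazing view (85 degrees) fall outside the interval and receive the low value, while the 45 degree view receives the high value.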

また、信頼度算出部２２５は、カメラ１２と指標との間の距離ｄに応じて、距離に関する信頼度を算出する。信頼度算出部２２５は、距離ｄが大きいほど信頼度が低くなるように、かつ距離ｄが小さいほど信頼度が高くなるように、距離に関する信頼度を算出する。カメラ１２と指標との間の距離が大きくなると、４隅の検出点の同定精度が悪化し、推定精度が悪化するため、距離に応じて信頼度を算出する。 Further, the reliability calculation unit 225 calculates the reliability regarding the distance according to the distance d between the camera 12 and the index. The reliability calculation unit 225 calculates the reliability with respect to the distance so that the larger the distance d, the lower the reliability, and the smaller the distance d, the higher the reliability. As the distance between the camera 12 and the index increases, the identification accuracy of the detection points at the four corners deteriorates, and the estimation accuracy deteriorates. Therefore, the reliability is calculated according to the distance.
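
The text only requires that the distance reliability decrease as the distance d grows. One function with that property is sketched below; the specific form 1 / (1 + d / d_ref), the reference distance `d_ref`, and the function name are assumptions, not something stated in this description.

```python
def distance_reliability(d, d_ref=1.0):
    """Reliability in (0, 1] that decreases monotonically with distance d.

    d_ref is the distance at which the reliability drops to 0.5.
    """
    if d < 0 or d_ref <= 0:
        raise ValueError("d must be >= 0 and d_ref must be > 0")
    return 1.0 / (1.0 + d / d_ref)


near = distance_reliability(0.5)
far = distance_reliability(3.0)
```

Any other monotonically decreasing function, such as a clipped linear ramp, would satisfy the description equally well.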

また、信頼度算出部２２５は、キーフレーム画像から検出された指標に含まれるパターンと、予め登録されたパターンの一致度を、一致に関する信頼度として算出する。パターンの一致度が低いと、異なる指標と認識される可能性が大きくなるため、パターンの一致度に応じて信頼度を算出する。 Further, the reliability calculation unit 225 calculates the degree of coincidence between the pattern included in the index detected from the keyframe image and the pattern registered in advance as the degree of reliability regarding the match. If the degree of pattern matching is low, the possibility of being recognized as a different index increases, so the reliability is calculated according to the degree of pattern matching.
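
A simple measure of the degree of coincidence is the fraction of cells that agree between the detected pattern and the registered pattern. The sketch below assumes that both patterns have already been rectified and sampled into binary grids of equal size; the grid representation and the function name are hypothetical, and rotated views of the index are not handled.

```python
def pattern_match_degree(detected, registered):
    """Fraction of grid cells on which the two binary patterns agree."""
    if len(detected) != len(registered) or any(
            len(a) != len(b) for a, b in zip(detected, registered)):
        raise ValueError("patterns must have the same shape")
    total = sum(len(row) for row in registered)
    same = sum(1 for row_d, row_r in zip(detected, registered)
               for d, r in zip(row_d, row_r) if d == r)
    return same / total


registered = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
detected = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 0],   # one cell read incorrectly
    [1, 0, 0, 1],
]
degree = pattern_match_degree(detected, registered)
```

Here 15 of the 16 cells agree, so the degree of coincidence is 0.9375.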

最適化部２２６は、信頼度算出部２２５によって算出された信頼度に応じて、過去のキーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢の各々と、新たなキーフレーム画像が撮像されたときの指標に対するカメラ１２の位置及び姿勢とを補正する。 The optimization unit 226 corrects, according to the reliability calculated by the reliability calculation unit 225, each of the positions and orientations of the camera 12 when the past keyframe images were captured and the position and orientation of the camera 12 with respect to the index when the new keyframe image was captured.

例えば、最適化部２２６は、信頼度算出部２２５によって算出された、角度に関する信頼度、距離に関する信頼度、及び一致に関する信頼度の少なくとも１つが閾値以上である場合に、キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢の補正を行う。または、最適化部２２６は、信頼度算出部２２５により算出された、角度に関する信頼度、距離に関する信頼度、及び一致に関する信頼度の全てが閾値以上である場合、キーフレーム画像が撮像されたときのカメラ１２の位置及び姿勢の補正を行うようにしてもよい。 For example, the optimization unit 226 corrects the position and orientation of the camera 12 at the time the keyframe image was captured when at least one of the reliability regarding the angle, the reliability regarding the distance, and the reliability regarding the match, as calculated by the reliability calculation unit 225, is equal to or greater than a threshold value. Alternatively, the optimization unit 226 may correct the position and orientation of the camera 12 at the time the keyframe image was captured only when all of the reliability regarding the angle, the reliability regarding the distance, and the reliability regarding the match are equal to or greater than the threshold value.

情報処理装置２１０の制御部２１４は、例えば、図１４に示すコンピュータ５０で実現することができる。コンピュータ５０の記憶媒体としての記憶部５３には、コンピュータ５０を情報処理装置２１０の制御部２１４として機能させるための情報処理プログラム２６０が記憶されている。情報処理プログラム２６０は、画像取得プロセス６２と、特徴点抽出プロセス６３と、姿勢推定プロセス６４と、マップ生成プロセス６５と、指標検出プロセス６６と、信頼度算出プロセス２６６と、最適化プロセス２６７と、調整プロセス６８とを有する。また、記憶部５３は、データ記憶部１５を構成する情報が記憶されるデータ記憶領域６９を有する。 The control unit 214 of the information processing device 210 can be realized by, for example, the computer 50 shown in FIG. 14. The storage unit 53 as a storage medium of the computer 50 stores an information processing program 260 for causing the computer 50 to function as the control unit 214 of the information processing device 210. The information processing program 260 includes an image acquisition process 62, a feature point extraction process 63, a posture estimation process 64, a map generation process 65, an index detection process 66, a reliability calculation process 266, an optimization process 267, and an adjustment process 68. Further, the storage unit 53 has a data storage area 69 in which information constituting the data storage unit 15 is stored.
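
The two gating variants described above, correcting when at least one reliability reaches its threshold or only when all of them do, differ only in how the individual comparisons are combined. The sketch below shows both; the dictionary keys, the threshold values, and the function name `should_correct` are hypothetical.

```python
def should_correct(reliabilities, thresholds, mode="any"):
    """Decide whether to run the pose correction.

    mode="any": at least one reliability is >= its threshold.
    mode="all": every reliability is >= its threshold.
    """
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    passed = [reliabilities[name] >= thresholds[name] for name in thresholds]
    return any(passed) if mode == "any" else all(passed)


thresholds = {"angle": 0.5, "distance": 0.4, "match": 0.9}
reliabilities = {"angle": 1.0, "distance": 0.25, "match": 0.9375}

lenient = should_correct(reliabilities, thresholds, mode="any")
strict = should_correct(reliabilities, thresholds, mode="all")
```

In this example the distance reliability is below its threshold, so the lenient variant proceeds with the correction while the strict variant skips it.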

ＣＰＵ５１は、情報処理プログラム２６０を記憶部５３から読み出してメモリ５２に展開し、情報処理プログラム２６０が有するプロセスを順次実行する。ＣＰＵ５１は、画像取得プロセス６２を実行することで、図１３に示す画像取得部１６として動作する。また、ＣＰＵ５１は、特徴点抽出プロセス６３を実行することで、図１３に示す特徴点抽出部１８として動作する。また、ＣＰＵ５１は、姿勢推定プロセス６４を実行することで、図１３に示す姿勢推定部２０として動作する。また、ＣＰＵ５１は、マップ生成プロセス６５を実行することで、図１３に示すマップ生成部２２として動作する。また、ＣＰＵ５１は、指標検出プロセス６６を実行することで、図１３に示す指標検出部２４として動作する。また、ＣＰＵ５１は、信頼度算出プロセス２６６を実行することで、図１３に示す信頼度算出部２２５として動作する。また、ＣＰＵ５１は、最適化プロセス２６７を実行することで、図１３に示す最適化部２２６として動作する。また、ＣＰＵ５１は、調整プロセス６８を実行することで、図１３に示す調整部２８として動作する。また、ＣＰＵ５１は、データ記憶領域６９から情報を読み出して、データ記憶部１５をメモリ５２に展開する。これにより、情報処理プログラム２６０を実行したコンピュータ５０が、情報処理装置２１０の制御部２１４として機能することになる。そのため、ソフトウェアである情報処理プログラム２６０を実行するプロセッサはハードウェアである。 The CPU 51 reads the information processing program 260 from the storage unit 53, expands it in the memory 52, and sequentially executes the processes included in the information processing program 260. The CPU 51 operates as the image acquisition unit 16 shown in FIG. 13 by executing the image acquisition process 62. Further, the CPU 51 operates as the feature point extraction unit 18 shown in FIG. 13 by executing the feature point extraction process 63. Further, the CPU 51 operates as the posture estimation unit 20 shown in FIG. 13 by executing the posture estimation process 64. Further, the CPU 51 operates as the map generation unit 22 shown in FIG. 13 by executing the map generation process 65. Further, the CPU 51 operates as the index detection unit 24 shown in FIG. 13 by executing the index detection process 66. Further, the CPU 51 operates as the reliability calculation unit 225 shown in FIG. 13 by executing the reliability calculation process 266. Further, the CPU 51 operates as the optimization unit 226 shown in FIG. 13 by executing the optimization process 267. Further, the CPU 51 operates as the adjustment unit 28 shown in FIG. 13 by executing the adjustment process 68. Further, the CPU 51 reads information from the data storage area 69 and expands the data storage unit 15 into the memory 52. As a result, the computer 50 that executes the information processing program 260 functions as the control unit 214 of the information processing device 210. Therefore, the processor that executes the information processing program 260, which is software, is hardware.

The functions realized by the information processing program 260 can also be realized by, for example, a semiconductor integrated circuit, more specifically an ASIC or the like.

Next, the operation of the information processing device 210 in the second embodiment will be described. The information processing device 210 executes the optimization process shown in FIG. 15.

<Optimization process>
Steps S300 to S302 and steps S304 to S308 are executed in the same manner as in the first embodiment.

In step S403, the reliability calculation unit 225 calculates a reliability according to the detection result of the index in the keyframe image.

In step S404, the optimization unit 226 determines whether or not the reliability calculated in step S403 is equal to or greater than a threshold value. If the reliability is equal to or greater than the threshold value, the process proceeds to step S304. If the reliability is less than the threshold value, the process returns to step S300.
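The gating performed in steps S403 and S404 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `keyframes_passing_reliability`, the `reliability_of` callback, and the dictionary layout of a detection are all assumptions, because this passage does not state how the reliability is computed.

```python
from typing import Callable, Iterable, Iterator


def keyframes_passing_reliability(
    detections: Iterable[dict],
    reliability_of: Callable[[dict], float],
    threshold: float,
) -> Iterator[dict]:
    """Yield only the index detections whose reliability reaches the threshold."""
    for detection in detections:
        reliability = reliability_of(detection)  # corresponds to step S403
        if reliability >= threshold:             # corresponds to step S404
            yield detection                      # would continue to step S304
        # Otherwise nothing is yielded and the next keyframe is awaited,
        # which mirrors the return to step S300.


# Hypothetical usage: reliability taken as the fraction of the image
# covered by the detected index (an assumed measure, for illustration only).
detections = [{"id": 1, "area_ratio": 0.02}, {"id": 2, "area_ratio": 0.30}]
accepted = list(
    keyframes_passing_reliability(detections, lambda d: d["area_ratio"], 0.1)
)
```

Passing the reliability measure as a callback keeps the threshold test separate from whatever detection quality measure is actually used.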

As described above, in the second embodiment, when an index is detected from an image captured by the camera, the information processing device 210 calculates the reliability of the keyframe image containing the index according to the detection result of the index. Then, when the reliability is greater than a preset threshold value, the information processing device 210 corrects the position and posture of the camera at the time each keyframe image was captured. Using the reliability of the index detection in this way allows the position and posture of the camera at the time each keyframe image was captured to be corrected accurately.

<Third embodiment>
Next, a third embodiment will be described. The third embodiment differs from the first and second embodiments in that the display device is controlled so that a preset object is superimposed on the display device according to the estimated position and posture of the camera.

FIG. 16 shows a configuration example of the information processing device 310 of the third embodiment. As shown in FIG. 16, the information processing device 310 of the third embodiment includes a camera 12, a control unit 314, and a display device 326. The third embodiment is described using an example in which the information processing device 310 is mounted on an information terminal. The user operates the information terminal and views the screen displayed on the display device.

The control unit 314 includes a data storage unit 15, an image acquisition unit 16, a feature point extraction unit 18, a posture estimation unit 320, a map generation unit 22, an index detection unit 324, an optimization unit 26, an adjustment unit 28, an initial position estimation unit 319, and a display control unit 325.

The data storage unit 15 of the third embodiment stores map information generated in advance.

The initial position estimation unit 319 estimates the initial position and initial posture of the camera 12 by the Relocalization described in Reference 8 above, based on the feature points extracted by the feature point extraction unit 18, the feature amounts corresponding to those feature points, and the map information stored in the data storage unit 15.

Specifically, the initial position estimation unit 319 searches for the keyframe image most similar to the image acquired by the image acquisition unit 16, based on the feature points and corresponding feature amounts extracted by the feature point extraction unit 18 and the feature points and corresponding feature amounts in the map information. The initial position estimation unit 319 then matches feature points between the acquired image and that most similar keyframe image. Next, based on the pairs of feature points and map points in the most similar keyframe image, the initial position estimation unit 319 associates the feature points in the acquired image with map points. Finally, the initial position estimation unit 319 estimates the initial position and initial posture of the camera 12 by the PnP algorithm described in Reference 4 above.
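The search, matching, and association steps described above can be sketched as follows, stopping just before the PnP step. Everything here is an illustrative assumption rather than the method of the cited references: feature amounts are modelled as small binary descriptors stored as integers and compared by Hamming distance, "most similar" is taken to mean the keyframe with the most descriptor matches, and the names `match_features` and `relocalization_correspondences` are hypothetical.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_features(
    query: List[int], candidate: List[int], max_dist: int = 2
) -> List[Tuple[int, int]]:
    """Pair each query descriptor with its nearest candidate descriptor."""
    pairs = []
    if not candidate:
        return pairs
    for qi, q in enumerate(query):
        ci = min(range(len(candidate)), key=lambda i: hamming(q, candidate[i]))
        if hamming(q, candidate[ci]) <= max_dist:
            pairs.append((qi, ci))
    return pairs


def relocalization_correspondences(
    query_desc: List[int], keyframes: List[dict]
) -> List[Tuple[int, Point3D]]:
    """Return (query feature index, map point) pairs from the best keyframe.

    Each keyframe is a dict with 'descriptors' (list of int) and
    'map_points' (dict from feature index to a 3D map point).
    """
    best_pairs: List[Tuple[int, int]] = []
    best_kf = None
    for kf in keyframes:
        pairs = match_features(query_desc, kf["descriptors"])
        if len(pairs) > len(best_pairs):  # keep the most similar keyframe
            best_pairs, best_kf = pairs, kf
    if best_kf is None:
        return []
    map_points: Dict[int, Point3D] = best_kf["map_points"]
    # Transfer each keyframe feature's map point to the matched query feature.
    return [(qi, map_points[ci]) for qi, ci in best_pairs if ci in map_points]
```

The resulting 2D-to-3D correspondences are the kind of input a PnP solver expects; the solver itself is left out of the sketch.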

The index detection unit 324 further detects whether or not the grayscale image output by the image acquisition unit 16 contains an index.

When the index detection unit 324 detects an index, the posture estimation unit 320 estimates the position and posture of the camera 12 with respect to the index based on the grayscale image containing the index. When the index detection unit 324 does not detect an index, the posture estimation unit 320 estimates the position and posture of the camera 12 based on the feature points extracted by the feature point extraction unit 18 and the feature amounts corresponding to those feature points. The posture estimation unit 320 may estimate the position and posture of the camera 12 by using, for example, the method described in Reference 10 below.

Reference 10: Japanese Unexamined Patent Publication No. 2015-158461

The display control unit 325 controls the display device 326 so that a preset object is superimposed on the display device 326, based on the position and posture of the camera 12 estimated by the posture estimation unit 320.

For example, the display device 326 displays a display screen D captured by the camera 12, as shown in FIG. 17. The display control unit 325 controls the display device 326 so that the object G is superimposed on the display device 326, based on the position and posture of the camera 12 estimated by the posture estimation unit 320.
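One common way to place an object on the screen from an estimated camera position and posture is pinhole projection, sketched below. The text does not specify how the display control unit 325 computes the overlay position, so this is an assumed approach: the function name `project_point`, the world-to-camera convention for `rotation` and `translation`, and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are all illustrative.

```python
from typing import Optional, Sequence, Tuple


def project_point(
    point_world: Sequence[float],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float]]:
    """Project a 3D point into pixel coordinates with a pinhole camera model.

    rotation (3x3) and translation (3) are assumed to map world coordinates
    into camera coordinates: p_cam = R * p_world + t.
    """
    cam = [
        sum(rotation[i][k] * point_world[k] for k in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0.0:
        return None  # the point is behind the camera and cannot be drawn
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


# Hypothetical usage: camera at the world origin, looking along +Z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((1.0, 0.0, 2.0), identity, (0.0, 0.0, 0.0),
                      fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Returning `None` for points behind the camera lets the caller skip drawing instead of placing the object at a meaningless screen location.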

The control unit 314 of the information processing device 310 can be realized by, for example, the computer 50 shown in FIG. 18. The computer 50 includes a CPU 51, a memory 52 as a temporary storage area, and a non-volatile storage unit 53. The computer 50 also includes an input/output I/F 54, to which the camera 12, the display device 326, and input/output devices and the like (not shown) are connected, and an R/W unit 55 that controls reading and writing of data to and from a recording medium 59.

The storage unit 53 can be realized by an HDD, an SSD, a flash memory, or the like. The storage unit 53, which serves as a storage medium, stores an information processing program 360 for causing the computer 50 to function as the control unit 314 of the information processing device 310. The information processing program 360 includes an image acquisition process 62, a feature point extraction process 63, a posture estimation process 364, a map generation process 65, and an index detection process 366. The information processing program 360 also includes an optimization process 67, an adjustment process 68, an initial position estimation process 370, and a display control process 371. The storage unit 53 also has a data storage area 69 that stores the information constituting the data storage unit 15.

The CPU 51 reads the information processing program 360 from the storage unit 53, expands it in the memory 52, and sequentially executes the processes included in the information processing program 360. The CPU 51 operates as the image acquisition unit 16 shown in FIG. 16 by executing the image acquisition process 62. In the same way, the CPU 51 operates as the feature point extraction unit 18 by executing the feature point extraction process 63, as the posture estimation unit 320 by executing the posture estimation process 364, as the map generation unit 22 by executing the map generation process 65, as the index detection unit 324 by executing the index detection process 366, as the optimization unit 26 by executing the optimization process 67, as the adjustment unit 28 by executing the adjustment process 68, as the initial position estimation unit 319 by executing the initial position estimation process 370, and as the display control unit 325 by executing the display control process 371, each as shown in FIG. 16. The CPU 51 also reads information from the data storage area 69 and expands the data storage unit 15 in the memory 52. As a result, the computer 50 that executes the information processing program 360 functions as the control unit 314 of the information processing device 310. The processor that executes the information processing program 360, which is software, is therefore hardware.

The functions realized by the information processing program 360 can also be realized by, for example, a semiconductor integrated circuit, more specifically an ASIC or the like.

Next, the operation of the information processing device 310 according to the third embodiment will be described. The information processing device 310 executes a posture estimation process, a map generation process, an optimization process, and a display control process. The posture estimation process, the map generation process, and the optimization process are the same as those in the first or second embodiment. The display control process shown in FIG. 19 is described in detail below.

<Display control process>
Upon receiving an instruction signal indicating that the display control process is to be executed, the information processing device 310 executes the display control process shown in FIG. 19.

In step S500, the initial position estimation unit 319 acquires the map information stored in the data storage unit 15.

In step S502, the image acquisition unit 16 acquires an initial image captured by the camera 12. The image acquisition unit 16 then converts the captured image into a grayscale image and outputs the grayscale image.

In step S504, the feature point extraction unit 18 extracts feature points from the grayscale image output in step S502. The feature point extraction unit 18 then calculates a feature amount for each feature point.

In step S506, the initial position estimation unit 319 estimates the initial position and initial posture of the camera 12 based on the feature points extracted in step S504, the feature amounts corresponding to those feature points, and the map information acquired in step S500.

In step S508, the image acquisition unit 16 acquires an image captured by the camera 12. The image acquisition unit 16 then converts the captured image into a grayscale image and outputs the grayscale image.

In step S510, the index detection unit 324 determines whether or not the grayscale image output in step S508 contains an index. If the grayscale image contains an index, the process proceeds to step S512. If the grayscale image does not contain an index, the process proceeds to step S514.

In step S512, the posture estimation unit 320 estimates the position and posture of the camera 12 with respect to the index based on the grayscale image containing the index.

In step S514, the feature point extraction unit 18 extracts feature points from the grayscale image output in step S508. The feature point extraction unit 18 then calculates a feature amount for each feature point.

In step S516, the posture estimation unit 320 estimates the position and posture of the camera 12 based on the feature points extracted in step S514 and the feature amounts corresponding to those feature points.

In step S518, the display control unit 325 controls the display device 326 so that a preset object is superimposed on the display device 326, based on the position and posture of the camera 12 estimated in step S516.

In step S520, the display control unit 325 determines whether or not a stop signal for the display control process has been received. If the stop signal has been received, the display control process ends. If the stop signal has not been received, the process returns to step S508.
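The per-frame loop of steps S508 to S520 can be sketched as follows. This is a structural illustration under stated assumptions: the detector, the two pose estimators, the renderer, and the stop check are passed in as hypothetical callbacks, and the sketch renders with whichever pose was estimated for the frame, which is an assumption about how the two branches rejoin at step S518.

```python
from typing import Any, Callable, Iterable, Optional


def display_control_loop(
    frames: Iterable[Any],
    detect_index: Callable[[Any], Optional[Any]],
    pose_from_index: Callable[[Any, Any], Any],
    pose_from_features: Callable[[Any], Any],
    render: Callable[[Any, Any], None],
    stop_requested: Callable[[], bool],
) -> int:
    """Run the display control loop and return the number of frames shown."""
    shown = 0
    for gray in frames:                          # step S508: next grayscale image
        index = detect_index(gray)               # step S510: is an index present?
        if index is not None:
            pose = pose_from_index(gray, index)  # step S512
        else:
            pose = pose_from_features(gray)      # steps S514 and S516
        render(gray, pose)                       # step S518: superimpose the object
        shown += 1
        if stop_requested():                     # step S520: stop signal received?
            break
    return shown


# Hypothetical usage with stub callbacks that only record what was rendered.
log = []
count = display_control_loop(
    frames=["f0", "f1", "f2"],
    detect_index=lambda g: "m" if g == "f1" else None,
    pose_from_index=lambda g, m: "index-pose",
    pose_from_features=lambda g: "feature-pose",
    render=lambda g, p: log.append((g, p)),
    stop_requested=lambda: len(log) >= 2,
)
```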

As described above, in the third embodiment, the information processing device 310 controls the display device so that the object is superimposed on the display device according to the estimated position and posture of the camera. In addition, each time an index is detected, the position and posture of the camera at the time each keyframe image was captured are optimized, so optimization is performed frequently. As a result, the object can be displayed at an appropriate location on the display screen according to the accurately estimated position and posture of the camera.

<Fourth embodiment>
Next, a fourth embodiment will be described. The fourth embodiment differs from the first to third embodiments in that the position and posture of the camera in the keyframe images are estimated when an index corresponding to a previously detected index is detected from an image captured by the camera.

In the first embodiment, the relative positions and postures between the plurality of indices need to be known. In the fourth embodiment, they do not need to be known.

FIG. 20 shows a configuration example of the information processing device 410 of the fourth embodiment. As shown in FIG. 20, the information processing device 410 of the fourth embodiment includes a camera 12 and a control unit 414.

The control unit 414 includes a data storage unit 15, an image acquisition unit 16, a feature point extraction unit 18, a posture estimation unit 420, a map generation unit 22, an index detection unit 424, an optimization unit 426, and an adjustment unit 28.

[Posture estimation process]

The index detection unit 424 further detects whether or not the grayscale image output by the image acquisition unit 16 contains an index.

The posture estimation unit 420 estimates the position and posture of the camera 12 based on the feature points extracted by the feature point extraction unit 18 and the feature amounts corresponding to those feature points. In addition, when the index detection unit 424 detects an index, the posture estimation unit 420 estimates the position and posture of the camera 12 with respect to the index based on the grayscale image containing the index.

When the index detection unit 424 detects an index, the posture estimation unit 420 stores the grayscale image containing the index in the data storage unit 15 as a keyframe image. Although a grayscale image containing an index is stored as a keyframe image whenever an index is detected, not every image stored as a keyframe image in the data storage unit 15 necessarily contains an index. For example, as in the first embodiment, keyframe images that satisfy a predetermined condition are also stored in the data storage unit 15.

The posture estimation unit 420 also stores the position and posture of the camera 12 with respect to the index in the data storage unit 15 together with the keyframe image. When storing a keyframe image, the posture estimation unit 420 stores an index ID, which represents identification information of the index, in the data storage unit 15 together with the keyframe image. For example, the index area image in the keyframe image can be used as the index ID. When an index with the same index ID is detected in consecutive frames, the posture estimation unit 420 does not store the grayscale image output by the image acquisition unit 16 as a keyframe image.
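The index-triggered storage rule can be sketched as follows. The sketch is illustrative only: the class name `KeyframeStore`, its `observe` method, and the use of a plain hashable value as the index ID are assumptions (the text allows the index area image itself to serve as the ID), and keyframes stored for other predetermined conditions are not modelled.

```python
from typing import Any, Hashable, List, Optional


class KeyframeStore:
    """Stores a keyframe when an index appears, skipping consecutive repeats."""

    def __init__(self) -> None:
        self.keyframes: List[dict] = []
        self._last_index_id: Optional[Hashable] = None

    def observe(
        self, image: Any, index_id: Optional[Hashable], pose_wrt_index: Any
    ) -> bool:
        """Process one frame; return True if it was stored as a keyframe.

        index_id is None when no index was detected in the frame.
        """
        if index_id is None:
            self._last_index_id = None  # the run of consecutive detections ends
            return False
        if index_id == self._last_index_id:
            return False                # same index in consecutive frames: skip
        self._last_index_id = index_id
        self.keyframes.append(
            {"image": image, "index_id": index_id, "pose": pose_wrt_index}
        )
        return True


# Hypothetical usage: index 1 is seen, seen again, lost, then seen once more.
store = KeyframeStore()
stored = [
    store.observe(f"img{i}", index_id, None)
    for i, index_id in enumerate([1, 1, None, 1, 2])
]
```

Because the run of consecutive detections is reset when the index disappears, the same index can produce a second keyframe when the camera later returns to it, which is the situation that allows a loop to be formed.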

When storing a grayscale image containing an index as a new keyframe image, the posture estimation unit 420 also determines whether or not that index is the same as an index contained in a keyframe image already stored in the data storage unit 15. Specifically, it determines whether or not the index area image contained in the new keyframe image is the same as an index ID stored in the data storage unit 15.

When the index area image contained in the new keyframe image is determined to be the same as an index ID stored in the data storage unit 15, the optimization unit 426 forms a loop based on the position and posture of the camera 12 at the time each keyframe image was captured. Specifically, the optimization unit 426 forms a loop by forming an edge between the position and posture of the camera 12 when the new keyframe image was captured and the position and posture of the camera 12 when the past keyframe image already stored in the data storage unit 15 was captured.

For example, as shown in FIG. 21, when index 1 is detected from a new keyframe image, the position and posture X1 of the camera 12 with respect to index 1 are estimated. Index 1 has also been detected from a past keyframe image already stored in the data storage unit 15, and the position and posture X2 of the camera 12 with respect to index 1 have been estimated.

In this case, a loop is formed from the position and posture X1 of the camera 12 with respect to index 1, the positions and postures b of the camera 12 at the time each keyframe image was captured, and the position and posture X2 of the camera 12 with respect to index 1. This makes it possible to perform the Pose Graph optimization described in Reference 8 above.

Accordingly, the optimization unit 426 corrects the position and posture of the camera 12 in each past keyframe image already stored in the data storage unit 15 by the Pose Graph optimization described in Reference 8 above.
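The edge that closes the loop can be illustrated by computing the relative pose between the two cameras from X1 and X2. The sketch below is a planar (2D) simplification with assumed conventions: each pose is a 3x3 homogeneous matrix giving the camera pose in the frame of index 1, and the helper names `se2`, `compose`, `inverse`, and `loop_edge` are hypothetical. The embodiment works with 3D positions and postures, and the Pose Graph optimization itself is not reproduced here.

```python
import math
from typing import List

Matrix = List[List[float]]


def se2(x: float, y: float, theta: float) -> Matrix:
    """Planar rigid transform as a 3x3 homogeneous matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product a * b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def inverse(t: Matrix) -> Matrix:
    """Inverse of a rigid transform: rotation R^T, translation -R^T * p."""
    c, s, x, y = t[0][0], t[1][0], t[0][2], t[1][2]
    return [[c, s, -(c * x + s * y)], [-s, c, s * x - c * y], [0.0, 0.0, 1.0]]


def loop_edge(x1_new: Matrix, x2_past: Matrix) -> Matrix:
    """Pose of the new camera expressed in the past camera's frame.

    Both inputs are camera poses in the frame of the same index, so the
    index frame cancels out: edge = inverse(X2) * X1.
    """
    return compose(inverse(x2_past), x1_new)


# Hypothetical usage: both cameras face the same way, 2 units apart along x.
edge = loop_edge(se2(3.0, 0.0, 0.0), se2(1.0, 0.0, 0.0))
```

Because both poses are measured against the same index, the relative pose between the two keyframes does not depend on where the index sits in the map, which is consistent with the fourth embodiment not requiring known relative poses between indices.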

The control unit 414 of the information processing device 410 can be realized by, for example, the computer 50 shown in FIG. 22. The storage unit 53, which serves as a storage medium of the computer 50, stores an information processing program 460 for causing the computer 50 to function as the control unit 414 of the information processing device 410. The information processing program 460 includes an image acquisition process 62, a feature point extraction process 63, a posture estimation process 464, a map generation process 65, an index detection process 466, an optimization process 467, and an adjustment process 68. The storage unit 53 also has a data storage area 69 that stores the information constituting the data storage unit 15.

The CPU 51 reads the information processing program 460 from the storage unit 53, expands it in the memory 52, and sequentially executes the processes included in the information processing program 460. The CPU 51 operates as the image acquisition unit 16 shown in FIG. 20 by executing the image acquisition process 62. In the same way, the CPU 51 operates as the feature point extraction unit 18 by executing the feature point extraction process 63, as the posture estimation unit 420 by executing the posture estimation process 464, as the map generation unit 22 by executing the map generation process 65, as the index detection unit 424 by executing the index detection process 466, as the optimization unit 426 by executing the optimization process 467, and as the adjustment unit 28 by executing the adjustment process 68, each as shown in FIG. 20. The CPU 51 also reads information from the data storage area 69 and expands the data storage unit 15 in the memory 52. As a result, the computer 50 that executes the information processing program 460 functions as the control unit 414 of the information processing device 410. The processor that executes the information processing program 460, which is software, is therefore hardware.

The functions realized by the information processing program 460 can also be realized by, for example, a semiconductor integrated circuit, more specifically an ASIC or the like.

Next, the operation of the information processing device 410 according to the fourth embodiment will be described. The information processing device 410 executes the posture estimation process shown in FIG. 23 and the optimization process shown in FIG. 24.

<Posture estimation process>
Steps S100 to S104 are executed in the same manner as in the first embodiment.

In step S606, the index detection unit 424 detects whether or not the grayscale image output in step S100 contains an index. If an index is detected in the grayscale image, the process proceeds to step S607. If no index is detected in the grayscale image, the process returns to step S100.

In step S607, the posture estimation unit 420 estimates the position and posture of the camera 12 with respect to the index based on the grayscale image containing the index.

In step S608, the posture estimation unit 420 stores the grayscale image output in step S100 in the data storage unit 15 as a keyframe image. The posture estimation unit 420 also stores the position and posture of the camera 12 with respect to the index estimated in step S607 in the data storage unit 15 together with the keyframe image. When storing the keyframe image, the posture estimation unit 420 further stores an index ID, which represents identification information of the index, in the data storage unit 15.

<Optimization process>
Steps S204 to S206 are executed in the same manner as in the first embodiment.

In step S700, when storing a grayscale image containing an index as a new keyframe image, the posture estimation unit 420 determines whether or not the index area image contained in the new keyframe image is the same as an index ID stored in the data storage unit 15. If an index ID identical to the index area image contained in the new keyframe image is stored in the data storage unit 15, the process proceeds to step S702. If no such index ID is stored in the data storage unit 15, the process of step S700 is repeated.

In step S702, the optimization unit 426 forms a loop by forming an edge between the position and posture of the camera 12 in the new keyframe image and the position and posture of the camera 12 in the past keyframe image already stored in the data storage unit 15. The optimization unit 426 then corrects the position and posture of the camera 12 in each past keyframe image already stored in the data storage unit 15 by the Pose Graph optimization described in Reference 8 above.
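The check in step S700 and the edge formation in step S702 can be sketched as a lookup over stored index IDs. The function name `find_loop_edges` and the list-of-IDs representation are assumptions for illustration, as is the choice to connect a new keyframe to the earliest stored keyframe with the same index ID; the text only requires that the past keyframe already be stored.

```python
from typing import Dict, Hashable, List, Optional, Tuple


def find_loop_edges(
    keyframe_index_ids: List[Optional[Hashable]],
) -> List[Tuple[int, int]]:
    """Return (past keyframe, new keyframe) pairs that share an index ID.

    keyframe_index_ids lists the index ID of each keyframe in storage order,
    with None for keyframes that contain no index.
    """
    first_seen: Dict[Hashable, int] = {}
    edges: List[Tuple[int, int]] = []
    for new_idx, index_id in enumerate(keyframe_index_ids):
        if index_id is None:
            continue
        if index_id in first_seen:              # step S700: same index ID stored
            edges.append((first_seen[index_id], new_idx))  # step S702: form edge
        else:
            first_seen[index_id] = new_idx      # first sighting, nothing to close
    return edges


# Hypothetical usage: index 1 and index 2 are each revisited once.
edges = find_loop_edges([1, None, 2, 1, 2])
```

Each returned pair marks where a loop edge would be added to the pose graph before the optimization is run.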

In the above description, each program is stored (installed) in the storage unit in advance, but the disclosed technology is not limited to this. The program according to the disclosed technology can also be provided in a form recorded on a recording medium such as a CD-ROM, a DVD-ROM, or a USB memory.

All documents, patent applications, and technical standards described in this specification are incorporated herein by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually indicated to be incorporated by reference.

Next, modifications of the above embodiments will be described.

In the first and second embodiments, the posture estimation unit 20 estimates the position and orientation of the camera 12 based on the feature points extracted from the image, the feature quantities corresponding to those feature points, and the map information stored in the data storage unit 15, but the embodiments are not limited to this. For example, when the index detection unit detects an index in the grayscale image, the position and orientation of the camera 12 with respect to the index may be estimated.

In each of the above embodiments, Pose Graph optimization and bundle adjustment are performed on all the keyframes stored in the data storage unit 15, but the embodiments are not limited to this. For example, Pose Graph optimization may be performed on keyframes for which bundle adjustment has not been performed. Likewise, bundle adjustment may be performed on keyframes for which Pose Graph optimization has not been performed. The combination of Pose Graph optimization and bundle adjustment may be changed as appropriate.
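One way to picture this modification is as a per-keyframe record of which optimizations have already been applied, with each optimizer selecting only the keyframes the other has not yet processed. The sketch below is an assumption-laden illustration: the `KeyframeState` class, its two flags, and both selection functions are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyframeState:
    keyframe_id: int
    bundle_adjusted: bool = False        # hypothetical flag
    pose_graph_optimized: bool = False   # hypothetical flag


def select_for_pose_graph(keyframes: List[KeyframeState]) -> List[KeyframeState]:
    """Keyframes for which bundle adjustment has not been performed."""
    return [kf for kf in keyframes if not kf.bundle_adjusted]


def select_for_bundle_adjustment(keyframes: List[KeyframeState]) -> List[KeyframeState]:
    """Keyframes for which Pose Graph optimization has not been performed."""
    return [kf for kf in keyframes if not kf.pose_graph_optimized]
```

Other combinations, such as running both optimizations on every keyframe as in the embodiments, amount to replacing these filters with ones that return all keyframes.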

In each of the above embodiments, a loop is formed from the position and orientation of the camera estimated from the keyframe images and the position and orientation with respect to the index obtained when the index is detected in an image, but the embodiments are not limited to this. For example, a loop may be formed from the position and orientation of the camera estimated from each of the images sequentially acquired by the image acquisition unit up to the previous time and the position and orientation with respect to the index obtained when the index is detected in an image, and the optimization may then be executed. In this case, the position and orientation estimated for each acquired image are optimized, not only those estimated from the keyframe images, so the movement trajectory of the camera can be estimated accurately. In this case, each time the image acquisition unit acquires an image, it is sufficient to determine whether the image includes an index.
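The per-image check described in this modification can be sketched as a generator that inspects every acquired image and reports a loop whenever an index seen earlier is detected again. The function `find_loops` and the `detect_index` callback are hypothetical; pairing each detection with the first frame in which the index appeared is also an assumption made for simplicity.

```python
from typing import Callable, Hashable, Iterable, Iterator, Optional, Tuple


def find_loops(
    frames: Iterable[object],
    detect_index: Callable[[object], Optional[Hashable]],
) -> Iterator[Tuple[int, int, Hashable]]:
    """Yield (first_frame_no, current_frame_no, index_id) when an index reappears.

    frames       : sequentially acquired images (any objects)
    detect_index : hypothetical detector returning an index ID, or None
    """
    first_seen = {}
    for frame_no, image in enumerate(frames):
        index_id = detect_index(image)  # checked on every acquired image
        if index_id is None:
            continue
        if index_id in first_seen:
            # All frames between the two detections belong to the loop.
            yield first_seen[index_id], frame_no, index_id
        else:
            first_seen[index_id] = frame_no
```

Each reported pair delimits the range of frames whose estimated positions and orientations would be corrected together, rather than only the keyframes in that range.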

In the third embodiment, the case where an object is displayed on the display device has been described as an example. Alternatively, the display device may be controlled so that additional information is superimposed on an image of a large-scale environment such as a factory or a plant, in order to support the work of an operator.

The following appendices are further disclosed with respect to each of the above embodiments.

(Appendix 1)
An information processing device including:
an image acquisition unit that acquires an image captured by an imaging device whose shooting position can change;
a posture estimation unit that estimates the position and orientation of the imaging device based on the image acquired by the image acquisition unit and, when a predetermined index is detected from the image acquired by the image acquisition unit, estimates the position and orientation of the imaging device with respect to the index; and
an estimation unit that, when the index is detected from the image acquired by the image acquisition unit, corrects each of the positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the positions of the imaging device estimated based on each of the images acquired up to the previous time and the position of the imaging device with respect to the index estimated by the posture estimation unit.

(Appendix 2)
The information processing device according to Appendix 1, wherein the estimation unit generates map points representing the three-dimensional coordinates of the positions corresponding to each of the feature points of the keyframe images, based on the estimation results of the position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the images acquired by the image acquisition unit, was captured.

(Appendix 3)
The information processing device according to Appendix 1 or Appendix 2, wherein the estimation unit corrects each of the positions and orientations, stored in a storage unit, of the imaging device at the time each of the keyframe images was captured, based on the estimation results of the position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the images acquired by the image acquisition unit, was captured.

(Appendix 4)
The information processing device according to any one of Appendix 1 to Appendix 3, wherein, for each of a plurality of the indices, the relative positions and orientations between the indices are known.

(Appendix 5)
The information processing device according to any one of Appendix 1 to Appendix 4, further including a reliability calculation unit that, when the index is detected from the image acquired by the image acquisition unit, calculates a reliability of the image including the index according to the detection result of the index,
wherein the estimation unit corrects each of the positions and orientations of the imaging device at the time each of the images was captured when the reliability calculated by the reliability calculation unit is larger than a preset threshold value.

(Appendix 6)
The information processing device according to any one of Appendix 1 to Appendix 5, further including a display control unit that controls a display device so that a preset object is superimposed and displayed on the display device according to the position and orientation of the imaging device estimated by the posture estimation unit.

(Appendix 7)
The information processing device according to any one of Appendix 1 to Appendix 6, wherein the estimation unit corrects each of the positions and orientations of the imaging device at the time each of the images was captured when an index corresponding to the previously detected index is detected from the image acquired by the image acquisition unit.

(Appendix 8)
An information processing program for causing a computer to execute processing including:
acquiring an image captured by an imaging device whose shooting position changes as it moves;
estimating the position and orientation of the imaging device based on the acquired image and, when a predetermined index is detected from the acquired image, estimating the position and orientation of the imaging device with respect to the index; and
when the index is detected from the acquired image, correcting each of the positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the positions of the imaging device estimated based on each of the images acquired up to the previous time and the estimated position of the imaging device with respect to the index.

(Appendix 9)
The information processing program according to Appendix 8, wherein map points representing the three-dimensional coordinates of the positions corresponding to each of the feature points of the keyframe images are generated based on the estimation results of the position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the acquired images, was captured.

(Appendix 10)
The information processing program according to Appendix 8 or Appendix 9, wherein each of the positions and orientations, stored in a storage unit, of the imaging device at the time each of the keyframe images was captured is corrected based on the estimation results of the position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the acquired images, was captured.

(Appendix 11)
The information processing program according to any one of Appendix 8 to Appendix 10, wherein, for each of a plurality of the indices, the relative positions and orientations between the indices are known.

(Appendix 12)
The information processing program according to any one of Appendix 8 to Appendix 11, wherein:
when the index is detected from the acquired image, a reliability of the image including the index is further calculated according to the detection result of the index; and
when the calculated reliability is larger than a preset threshold value, each of the positions and orientations of the imaging device at the time each of the images was captured is corrected.

(Appendix 13)
The information processing program according to any one of Appendix 8 to Appendix 12, wherein a display device is controlled so that a preset object is superimposed and displayed on the display device according to the estimated position and orientation of the imaging device.

(Appendix 14)
The information processing program according to any one of Appendix 8 to Appendix 13, wherein each of the positions and orientations of the imaging device at the time each of the images was captured is corrected when an index corresponding to the previously detected index is detected from the acquired image.

(Appendix 15)
An information processing method for causing a computer to execute processing including:
acquiring an image captured by an imaging device whose shooting position changes as it moves;
estimating the position and orientation of the imaging device based on the acquired image and, when a predetermined index is detected from the acquired image, estimating the position and orientation of the imaging device with respect to the index; and
when the index is detected from the acquired image, correcting each of the positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the positions of the imaging device estimated based on each of the images acquired up to the previous time and the estimated position of the imaging device with respect to the index.

(Appendix 16)
The information processing method according to Appendix 15, wherein map points representing the three-dimensional coordinates of the positions corresponding to each of the feature points of the keyframe images are generated based on the estimation results of the position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the acquired images, was captured.

(Appendix 17)
The information processing method according to Appendix 15 or Appendix 16, wherein each of the positions and orientations, stored in a storage unit, of the imaging device at the time each of the keyframe images was captured is corrected based on the estimation results of the position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the acquired images, was captured.

(Appendix 18)
The information processing method according to any one of Appendix 15 to Appendix 17, wherein, for each of a plurality of the indices, the relative positions and orientations between the indices are known.

(Appendix 19)
The information processing method according to any one of Appendix 15 to Appendix 18, wherein:
when the index is detected from the acquired image, a reliability of the image including the index is further calculated according to the detection result of the index; and
when the calculated reliability is larger than a preset threshold value, each of the positions and orientations of the imaging device at the time each of the images was captured is corrected.

(Appendix 20)
A storage medium storing an information processing program for causing a computer to execute processing including:
acquiring an image captured by an imaging device whose shooting position changes as it moves;
estimating the position and orientation of the imaging device based on the acquired image and, when a predetermined index is detected from the acquired image, estimating the position and orientation of the imaging device with respect to the index; and
when the index is detected from the acquired image, correcting each of the positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the positions of the imaging device estimated based on each of the images acquired up to the previous time and the estimated position of the imaging device with respect to the index.

1, 1A, 1B Index
10, 210, 310, 410 Information processing device
12 Camera
14, 214, 314, 414 Control unit
15 Data storage unit
16 Image acquisition unit
18 Feature point extraction unit
20, 320, 420 Posture estimation unit
22 Map generation unit
24, 324, 424 Index detection unit
26, 226, 426 Optimization unit
28 Adjustment unit
225 Reliability calculation unit
319 Initial position estimation unit
325 Display control unit
326 Display device
50 Computer
51 CPU
53 Storage unit
59 Recording medium
60, 260, 360, 460 Information processing program
M Map point

Claims

An information processing device including:
an image acquisition unit that acquires an image captured by an imaging device whose shooting position can change;
a posture estimation unit that estimates, based on the image acquired by the image acquisition unit, a first position, which is the position of the imaging device, and its orientation and, when a predetermined index is detected from the image acquired by the image acquisition unit, estimates a second position, which is the position of the imaging device with respect to the index, and its orientation; and
an estimation unit that, when the index is detected from the image acquired by the image acquisition unit, corrects each of the first positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the first positions of the imaging device estimated based on each of the images acquired up to the previous time and the second position.

The information processing device according to claim 1, wherein the estimation unit generates map points representing the three-dimensional coordinates of the positions corresponding to each of the feature points of the keyframe images, based on the estimation results of the first position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the images acquired by the image acquisition unit, was captured.

The information processing device according to claim 1 or 2, wherein the estimation unit corrects each of the first positions and orientations, stored in a storage unit, of the imaging device at the time each of the keyframe images was captured, based on the estimation results of the first position and orientation of the imaging device at the time each keyframe image satisfying a predetermined condition, among the images acquired by the image acquisition unit, was captured.

The information processing device according to any one of claims 1 to 3, wherein, for each of a plurality of the indices, the relative positions and orientations between the indices are known.

The information processing device according to any one of claims 1 to 4, further including a reliability calculation unit that, when the index is detected from the image acquired by the image acquisition unit, calculates a reliability of the image including the index according to the detection result of the index,
wherein the estimation unit corrects each of the first positions and orientations of the imaging device at the time each of the images was captured when the reliability calculated by the reliability calculation unit is larger than a preset threshold value.

The information processing device according to any one of claims 1 to 5, further including a display control unit that controls a display device so that a preset object is superimposed and displayed on the display device according to the position and orientation of the imaging device estimated by the posture estimation unit.

The information processing device according to any one of claims 1 to 6, wherein the estimation unit corrects each of the first positions and orientations of the imaging device at the time each of the images was captured when an index corresponding to the previously detected index is detected from the image acquired by the image acquisition unit.

An information processing program for causing a computer to execute processing including:
acquiring an image captured by an imaging device whose shooting position changes as it moves;
estimating, based on the acquired image, a first position, which is the position of the imaging device, and its orientation and, when a predetermined index is detected from the acquired image, estimating a second position, which is the position of the imaging device with respect to the index, and its orientation; and
when the index is detected from the acquired image, correcting each of the first positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the first positions of the imaging device estimated based on each of the images acquired up to the previous time and the second position.

An information processing method for causing a computer to execute processing including:
acquiring an image captured by an imaging device whose shooting position changes as it moves;
estimating, based on the acquired image, a first position, which is the position of the imaging device, and its orientation and, when a predetermined index is detected from the acquired image, estimating a second position, which is the position of the imaging device with respect to the index, and its orientation; and
when the index is detected from the acquired image, correcting each of the first positions and orientations of the imaging device at the time each of the images was captured, based on a loop formed from each of the first positions of the imaging device estimated based on each of the images acquired up to the previous time and the second position.