JP2015203680A

JP2015203680A - Information processing device, method, and program

Info

Publication number: JP2015203680A
Application number: JP2014084715A
Authority: JP
Inventors: Sonoko Miyatani; Masakazu Fujiki; Masahiro Suzuki
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2014-04-16
Filing date: 2014-04-16
Publication date: 2015-11-16
Anticipated expiration: 2034-04-16
Also published as: JP6425405B2

Abstract

PROBLEM TO BE SOLVED: To provide a method for selecting, in position and orientation estimation using a plurality of images, the images that contribute to improved performance. SOLUTION: An information processing device acquires a plurality of images of a target object; executes, as first execution means, at least one of recognition processing of the target object and position and orientation estimation processing of the target object based on information on at least one of the acquired images; executes, as second execution means, at least one of the recognition processing and the position and orientation estimation processing based on information on at least two of the acquired images; generates first performance information on the processing executed by the first execution means; generates second performance information on the processing executed by the second execution means; and causes display means to display the first performance information and the second performance information.

Description

The present invention relates to a method for recognizing an object in an image and estimating its position and orientation based on an image obtained by imaging the object, and for displaying the result.

At production sites, automation by robots is being promoted to improve production efficiency. When a robot picks or assembles parts, it must recognize the target part and also acquire its position and orientation. Methods that use images of the target part captured by a camera have been developed for this purpose.

Patent Document 1 discloses a method for improving recognition accuracy by using images captured by a plurality of cameras. In this method, a plurality of stereo image pairs are selected from the images captured by the cameras, and recognition of the target object is performed multiple times using the distance values calculated for each stereo image pair, thereby improving recognition accuracy.

Patent Document 1: JP 2000-94374 A

However, the images that form a stereo image pair may include images containing a lot of noise. In addition, using a plurality of images increases the processing time, so the task may not be completed within a predetermined time.
The present invention has been made in view of the above problems, and its object is to select, from a plurality of images, images suitable for deriving a position and orientation in recognition or position and orientation estimation that uses a plurality of images.

The information processing apparatus according to the present invention includes, for example: image acquisition means for acquiring a plurality of images of a target object; first execution means for executing at least one of recognition processing of the target object and position and orientation estimation processing of the target object based on information on at least one of the acquired images; second execution means for executing at least one of the recognition processing of the target object and the position and orientation estimation processing of the target object based on information on at least two of the acquired images; first performance information generation means for generating first performance information on the processing executed by the first execution means; second performance information generation means for generating second performance information on the processing executed by the second execution means; and display control means for causing display means to display the first performance information and the second performance information.

In recognition or position and orientation estimation using a plurality of images, images suitable for deriving the position and orientation can be selected from the plurality of images.

FIG. 1 is a diagram showing the configuration of the system in the first embodiment.
FIG. 2 is a diagram showing a configuration example of a system that uses the information processing apparatus of the present invention.
FIG. 3 is a diagram explaining the processing procedure of the first embodiment.
FIG. 4 is a diagram showing an example of the GUI in the first embodiment.
FIG. 5 is a diagram showing an example of the GUI in Modification 1.
FIG. 6 is a diagram showing the configuration of the system in the second embodiment.
FIG. 7 is a diagram explaining the processing procedure of the second embodiment.
FIG. 8 is a diagram showing an example of the GUI in the second embodiment.
FIG. 9 is a diagram showing the configuration of the system in the third embodiment.
FIG. 10 is a diagram explaining the processing procedure of the third embodiment.
FIG. 11 is a diagram showing an example of the GUI in the third embodiment.
FIG. 12 is a diagram showing the hardware configuration of the information processing apparatus in the present invention.

Before describing the embodiments of the present invention, the hardware configuration on which the information processing apparatus of each embodiment is implemented will be described with reference to FIG. 12.

FIG. 12 is a hardware configuration diagram of the information processing apparatus in the present embodiment. In the figure, a CPU 1210 comprehensively controls the devices connected via a bus 1200. The CPU 1210 reads and executes processing steps and programs stored in a read-only memory (ROM) 1220. The operating system (OS), the processing programs according to the present embodiment, device drivers, and the like are stored in the ROM 1220, temporarily loaded into a random access memory (RAM) 1230, and executed by the CPU 1210 as appropriate. The input I/F 1240 receives input signals from external devices (such as a display device or an operation device) in a format that the information processing apparatus 1 can process. The output I/F 1250 outputs signals to an external device (display device) in a format that the display device can process.

Each of the functional units described in the embodiments is realized by the CPU 1210 loading a program stored in the ROM 1220 into the RAM 1230 and executing processing according to the flowcharts described later. When hardware is configured as an alternative to software processing using the CPU 1210, arithmetic units and circuits corresponding to the processing of each functional unit described here may be configured instead.

(First embodiment)
The first embodiment describes a method for presenting to the user the performance of position and orientation estimation when the position and orientation of a target object are estimated based on images of the target object captured by a plurality of cameras. More specifically, position and orientation estimation is performed both on each captured image individually and on a plurality of images together, and information on the performance of each estimation is displayed on a graphical user interface (hereinafter, GUI). This makes it easy to check and compare the performance of position and orientation estimation performed with each image and with a plurality of images, and to identify the image or camera that is causing a decrease or an improvement in performance. A camera identified as a cause of degraded performance can then be removed or repositioned. As a result, position and orientation estimation can be adapted to varying conditions such as the shape of the target object, how it is placed, and how ambient light strikes it.

FIG. 1 shows the configuration of the system in the present embodiment. The information processing apparatus 100 includes an image acquisition unit 101, a first measurement unit 102, a second measurement unit 103, a first performance information generation unit 104, a second performance information generation unit 105, a first output unit 106, and a second output unit 107. The information processing apparatus 100 is connected to an external display device 120 via an interface.

FIG. 2 is a diagram showing the configuration and arrangement of the apparatus used to capture the images acquired by the image acquisition unit 101, which consists of a projector 1 and a plurality of cameras 2 to 5. FIG. 2 shows a case with one projector and four cameras. The cameras are connected to the information processing apparatus 100 via the image acquisition unit 101.

The image acquisition unit 101 acquires luminance images of the target object captured by the cameras 2 to 5. However, any other method may be used as long as images of the target object captured from a plurality of viewpoints can be acquired. For example, images of the target object captured by a plurality of imaging devices or at a plurality of viewpoint positions may first be stored in an external storage device connected to the information processing apparatus 100 and then input from there. The acquired luminance images may be color images or monochrome images.

The first measurement unit 102 estimates the position and orientation of the target object based on each individual image acquired by the image acquisition unit 101. The specific method will be described later.

The second measurement unit 103 estimates the position and orientation of the target object based on a plurality of the images acquired by the image acquisition unit 101. The specific method will be described later.

The first performance information generation unit 104 generates performance information (first performance information) on the position and orientation estimation processing executed by the first measurement unit 102. The performance information consists of the accuracy of the position and orientation estimation and the processing time it requires. The first performance information generation unit 104 sends the generated first performance information to the first output unit 106.

The second performance information generation unit 105 generates performance information (second performance information) on the position and orientation estimation performed by the second measurement unit 103. As with the information generated by the first performance information generation unit 104, the performance information consists of the accuracy of the position and orientation estimation and the processing time it requires. The second performance information generation unit 105 sends the generated second performance information to the second output unit 107.

The first output unit 106 outputs the performance information generated by the first performance information generation unit 104 and causes the display device 120 to display it. In this respect, the first output unit 106 functions as a display control unit. The user can thus check (view) the first performance information on the display device 120.

The second output unit 107 outputs the performance information generated by the second performance information generation unit 105 and causes the display device 120 to display it. In this respect, the second output unit 107 functions as a display control unit. The user can thus check (view) the second performance information on the display device 120.

The display device 120 is, for example, a liquid crystal display or a CRT display. The display device 120 has a display screen, on which it displays the performance information received from the first output unit 106 and the second output unit 107. In the present embodiment, a GUI is displayed on the display screen of the display device 120, and the first performance information and the second performance information are displayed in areas of the GUI. This makes it easier for the user to view the first performance information and the second performance information.

FIG. 3 is a flowchart showing the processing procedure of the present embodiment. The processing of the present embodiment is described in detail below with reference to the flowchart of FIG. 3.

(Step S301)
In step S301, the image acquisition unit 101 acquires images of the target object. The images acquired by the image acquisition unit 101 are luminance images of the target object illuminated with the pattern projected by the projector 1. The image acquisition unit 101 sends the acquired images to the first measurement unit 102 and the second measurement unit 103.

(Step S302)
In step S302, the first measurement unit 102 estimates the position and orientation of the target object based on each of the images acquired by the image acquisition unit 101 in step S301. Specifically, the position and orientation of the target object are estimated using the method disclosed in Non-Patent Document 1 below.

Non-Patent Document 1: David A. Simon, Martial Hebert, Takeo Kanade, "Real-time 3-D Pose Estimation Using a High-Speed Range Sensor", Proc. 1994 IEEE International Conference on Robotics and Automation (ICRA '94), pp. 2235-2241, 1994.

Specifically, the position and orientation of the target object are estimated by model fitting, in which a three-dimensional model of the target object, represented by surfaces or by a set of three-dimensional points, is fitted to the distance values obtained by measuring the target object. The distance values of the target object are calculated by associating the pattern projected by the projector 1 with the projection pattern detected in each image and applying the principle of triangulation. That is, in this step the first measurement unit 102 estimates the position and orientation of the target object for each image (for each pair of the projector 1 and one of the cameras 2 to 5). The first measurement unit 102 sends the estimated position and orientation data of the target object to the first performance information generation unit 104.
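As an illustration of the triangulation step only, the following minimal sketch computes one distance value for a rectified projector-camera pair. The rectified geometry, the function name `depth_from_disparity`, and all parameter names are assumptions made for this example; the publication does not specify the calibration model or how pattern correspondences are found.

```python
def depth_from_disparity(focal_px, baseline_m, x_camera_px, x_projector_px):
    """Return the depth (in metres) of one pattern point.

    Assumes a rectified projector-camera pair, so a projected pattern
    column and its detection in the camera image differ only by a
    horizontal disparity. This geometry is an assumption of the sketch.
    """
    disparity = x_camera_px - x_projector_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the pair")
    # Standard triangulation relation for rectified geometry: z = f * b / d.
    return focal_px * baseline_m / disparity
```

Repeating this for every matched pattern point in one image yields the set of distance values to which the three-dimensional model is fitted.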

(Step S303)
In step S303, the first performance information generation unit 104 generates performance information on the position and orientation estimation performed in step S302. The first performance information generated in the present embodiment consists of the accuracy of the position and orientation estimation and the processing time it requires. Although the present embodiment uses estimation accuracy and processing time as the performance information, any other information on the performance of estimating the position and orientation of the target object may be used. For example, it may be the estimated position and orientation itself, or information on whether the estimation calculation failed (the success or failure of the position and orientation estimation).

The accuracy of the position and orientation estimation is expressed as the degree of fitting between the distance values of the target object and the three-dimensional model of the target object. The degree of fitting is, for example, the maximum of the distances from the distance values of the target object to the corresponding surfaces or three-dimensional points of the three-dimensional shape model. In this case, a smaller maximum distance indicates a more accurate position and orientation estimate.
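The degree of fitting described above can be sketched as follows. This is a minimal illustration that assumes the model is a set of three-dimensional points already transformed by the estimated position and orientation, and that each measured point corresponds to its nearest model point; the function name `fitting_degree` is hypothetical.

```python
import math


def fitting_degree(measured_points, model_points):
    """Maximum distance from each measured point to its nearest model point.

    A smaller value indicates a better fit. Nearest-point correspondence
    by brute-force search is an assumption of this sketch.
    """
    return max(
        min(math.dist(p, q) for q in model_points)
        for p in measured_points
    )
```

The brute-force search is quadratic in the number of points; a spatial index would be the usual choice for real point clouds.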

The processing time is the time required for the position and orientation estimation processing in the first measurement unit 102. The time taken to capture the target object and the time taken to input the captured images to the image acquisition unit 101 may also be included in the processing time. The first performance information generation unit 104 sends the generated performance information to the first output unit 106.
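One way to obtain such a processing time is to wrap the estimation call with a monotonic clock, as in the sketch below. The helper name `timed` is hypothetical, and `func` stands in for whatever estimation routine is being measured.

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

To include image capture or image input in the processing time, the same wrapper would simply enclose those calls as well.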

(Step S304)
In step S304, the first output unit 106 outputs the first performance information on the position and orientation estimation generated in step S303, together with the images acquired by the image acquisition unit 101, to the display device 120 and causes the display device 120 to display them. The display device 120 receives the captured images and the first performance information and displays them in the display area 450. As described above, the first performance information in the present embodiment consists of the accuracy and processing time of the position and orientation estimation. In addition, to identify the object that was the target of the position and orientation estimation, an image is displayed in which a rectangular texture is drawn and superimposed near the position of the object on the captured image. The texture only needs to be drawn near the position of the object on the captured image, and it may have another shape, such as a circle or a star.

(Step S305)
In step S305, the second measurement unit 103 estimates the position and orientation of the target object based on the plurality of images acquired by the image acquisition unit 101 in step S301. The plurality of images may be all of the acquired images, or a combination of images determined in advance. The specific method is as follows. First, as in step S302, the distance values of the target object are calculated for each image. The distance values are calculated by associating the pattern projected by the projector 1 with the projection pattern identified in each image and applying the principle of triangulation. Next, using all of the calculated distance values, the position and orientation of the target object are estimated by the method disclosed in Non-Patent Document 1. That is, the distance values obtained from the cameras at the plurality of viewpoints are all used (integrated) to represent the three-dimensional structure of the object, and the position and orientation are estimated by fitting the model of the object to that three-dimensional structure. In the present embodiment, the position and orientation of the target object are estimated using all of the distance values calculated for each image (for each pair of the projector 1 and one of the cameras 2 to 5). However, the present invention is not limited to this, and any other method that estimates the position and orientation from distance values obtained by integrating the per-image distance values may be used. For example, the position and orientation of the target object may be estimated from the full set of distance values after thinning them so that their density in three-dimensional space becomes uniform. In addition, excluding from the alignment any distance value that deviates greatly from the average of the distance values within a predetermined region makes the position and orientation estimation more robust.
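The integration, thinning, and outlier exclusion described in step S305 can be sketched as below. This is an illustration under stated assumptions: points are `(x, y, z)` tuples already expressed in a common coordinate frame, thinning keeps one point per cubic voxel, and the whole point set is treated as a single "predetermined region" for the outlier test. All function names and parameters are hypothetical.

```python
import math
from statistics import mean


def merge_point_sets(per_camera_points):
    """Integrate the per-image distance values into one point list."""
    return [p for points in per_camera_points for p in points]


def thin_by_voxel(points, voxel_size):
    """Keep the first point seen in each cubic voxel so density is roughly uniform."""
    kept = {}
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        kept.setdefault(key, p)
    return list(kept.values())


def reject_depth_outliers(points, max_deviation):
    """Drop points whose z value deviates from the mean z by more than max_deviation."""
    mean_z = mean(p[2] for p in points)
    return [p for p in points if abs(p[2] - mean_z) <= max_deviation]
```

The resulting point list would then be passed to the same model-fitting routine used for a single image, which is why the two kinds of estimation can be compared on equal terms.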

(Step S306)
In step S306, the second performance information generation unit 105 generates second performance information on the position and orientation estimation of the target object performed in step S305 based on the plurality of images. As with the information described in step S303, the second performance information in the present embodiment consists of the accuracy of the position and orientation estimation and the processing time it requires. The second output unit 107 outputs the second performance information generated by the second performance information generation unit 105 to the display device 120.

(Step S307)
In step S307, the second output unit 107 outputs the performance information on the position and orientation estimation of the target object generated in step S306, together with the images acquired by the image acquisition unit 101, to the display device 120. The display device 120 receives the captured images and the second performance information and displays them in the display area 460. As described above, the performance information consists of the accuracy and processing time of the position and orientation estimation. In addition, to indicate the object that was the target of the position and orientation estimation, an image is displayed in which a rectangular texture is drawn and superimposed near the position of the object on the captured image. The shape of the texture is not limited to a rectangle and may be, for example, a circle or a star.

FIG. 4 shows an example of the display as a GUI 400 on the display screen of the display device 120.

The display device 120 displays the images 410 to 413 acquired by the camera 2, the camera 3, the camera 4, and the camera 5, together with the accuracy and processing time 420 to 423 of the position and orientation estimation performed on each image. In addition, to identify the object 401 that was the target of the position and orientation estimation, elliptical textures 430 to 433 are drawn and superimposed near the position of the object on each captured image.

The display device 120 also displays the accuracy and processing time 424 of the position and orientation estimation performed using all of the images acquired by the camera 2, the camera 3, the camera 4, and the camera 5. In addition, to identify the object 401 that was the target of the position and orientation estimation, an elliptical texture 434 is drawn and superimposed near the object on an image 414 formed by stitching together the images acquired by the cameras 2 to 5. The displayed image may instead be any one of the images acquired by the cameras 2 to 5, or no image may be displayed.

Note that the target of the position and orientation estimation need not be a single object; it may be a plurality of objects arranged in rows or piled in a heap.

As described above, executing the series of processing from step S301 to step S307 makes it possible to present to the user information on the performance of the position and orientation estimation performed on each captured image and on a plurality of images. The performance of the estimation performed for each image and the performance of the estimation performed using a plurality of images can thus be checked and compared easily.
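The comparison that the GUI enables can be mimicked in text form with the sketch below. It is only an illustration: the `PerformanceInfo` record, the row format, and the rule of flagging a camera whose fitting error exceeds that of the combined estimate are all assumptions of this example. In the embodiment, the judgment is left to the user viewing the displayed values.

```python
from dataclasses import dataclass


@dataclass
class PerformanceInfo:
    label: str                # e.g. "camera 2" or "cameras 2-5" (hypothetical labels)
    fitting_error: float      # smaller means a more accurate estimate
    processing_time_s: float  # seconds spent on the estimation


def format_rows(per_image, combined):
    """Return one text row per estimation, per-image results first."""
    rows = list(per_image) + [combined]
    return [
        f"{r.label:<12} error={r.fitting_error:.2f} time={r.processing_time_s:.2f}s"
        for r in rows
    ]


def worse_than_combined(per_image, combined):
    """Labels of per-image estimates whose error exceeds the combined estimate's."""
    return [r.label for r in per_image if r.fitting_error > combined.fitting_error]
```

A camera flagged this way would be a candidate for removal or repositioning, in line with the use described for the first embodiment.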

<Variations of the position and orientation estimation method>
The present embodiment has described a method that estimates the position and orientation of the target object using distance values calculated from the images. However, the present invention is not limited to this, and any other method may be used as long as the position and orientation of the target object can be estimated from the images.

For example, the position and orientation of the target object may be estimated using features extracted from the images. In that case, the position and orientation of the object are estimated by, for example, matching image features detected in the acquired images against learning images of the target object captured from a plurality of viewpoints.

Alternatively, the position and orientation of the target object may be estimated using both the distance values calculated from the images and image features extracted from the images. For example, an accurate position and orientation of the target object can be estimated by fitting a three-dimensional model of the target object, represented by a plurality of surfaces and line segments, simultaneously to the distance values and to edge features extracted from the images.

<Variations of the images>
In the present embodiment, the images acquired by the image acquisition unit 101 have been described as luminance images of the target object with a pattern projected on it by the projector. However, when the position and orientation are estimated using features extracted from the images, luminance images of the target object without a projected pattern may be used. Alternatively, both luminance images of the target object with the pattern projected and luminance images of the target object without the pattern may be input.

<Variations of position and orientation estimation accuracy>
In the present embodiment, the position and orientation estimation accuracy is defined as the maximum distance from the distance values of the target object to the corresponding surfaces or three-dimensional points held by the three-dimensional shape model. However, the accuracy is not limited to this, and any other index that indicates the accuracy of position and orientation estimation may be used. For example, the mean or variance of those distances may be used. Alternatively, the model points corresponding to the distance points may be projected onto the image based on the estimated position and orientation, and the mean or variance of the resulting distances on the image may be used. Alternatively, the maximum, mean, or variance of the distances between the features extracted from the image and the corresponding features of the training data may be used. Alternatively, position and orientation estimation of the target object may be performed multiple times, and the maximum or variance of the resulting positions and orientations may be used.
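
The distance-based indices listed above can be illustrated with a short sketch. It assumes that the distances between the measured distance points and the corresponding model surfaces have already been computed; the function name `accuracy_indices` and the sample values are hypothetical and are not part of this description.

```python
from statistics import fmean, pvariance

def accuracy_indices(residuals):
    """Summarise point-to-model distances as maximum, mean and variance."""
    # Distances are treated as magnitudes, so signs are discarded.
    values = [abs(r) for r in residuals]
    return {
        "max": max(values),              # index used in the present embodiment
        "mean": fmean(values),           # alternative index
        "variance": pvariance(values),   # alternative index (population variance)
    }

# Hypothetical distances, in millimetres, between range points and the model.
residuals_mm = [0.2, 0.4, 0.1, 0.9]
indices = accuracy_indices(residuals_mm)
```

With these indices, a smaller value indicates a better fit between the model and the measured data.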

<Modification 1>
In the first embodiment, the accuracy and processing time of position and orientation estimation are displayed on the GUI as the performance information generated by the first performance information generation unit 104 and the second performance information generation unit 105. In this modification, computer graphics (hereinafter, CG) superimposed on the acquired image are displayed on the GUI. The superimposed CG is a three-dimensional model of the target object, represented by points, line segments, and surfaces, rendered based on the estimated position and orientation. By observing how well the target object in the image overlaps the CG superimposed on it, the user can visually judge the performance of position and orientation estimation.

FIG. 5 shows an example in which images with the CG of the target object drawn and superimposed on them are displayed on the GUI 400 of the display device 120. The display device 120 displays images in which CGs 434 to 437 of the target object are drawn over the images captured by camera 1, camera 2, camera 3, and camera 5.

The display device 120 also displays an image in which CG 438 of the target object is drawn over image 414, which is obtained by stitching together all the images captured by cameras 2 to 5.

By observing how well CGs 434 to 437 overlap the target objects in images 410 to 414, the user can visually judge the performance of position and orientation estimation. For example, CG 436 coincides with the target object in image 412, whereas CG 437 is displaced from the target object in image 413. From this, the accuracy of position and orientation estimation using image 412 can be judged to be high, while the accuracy using image 413 can be judged to be lower than that using image 412.

As described above, drawing the CG of the target object based on the estimated position and orientation and superimposing it on the image allows the user to visually judge the accuracy of position and orientation estimation of the target object.
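
Drawing the CG over a captured image requires mapping model points into that image using the estimated position and orientation. The following is a minimal sketch of that mapping, assuming a pinhole camera model without lens distortion. The function name `project_point`, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the sample values are hypothetical; this description does not specify the camera model.

```python
def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D model point to pixel coordinates (assumed pinhole model)."""
    # Transform from model coordinates to camera coordinates: p_cam = R * p + t.
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic scaling and offset.
    return (fx * xc / zc + cx, fy * yc / zc + cy)

# Hypothetical estimated pose: no rotation, object 2 units in front of the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((0.5, -0.25, 0.0), identity, (0.0, 0.0, 2.0),
                      fx=400.0, fy=400.0, cx=320.0, cy=240.0)
```

Projecting every vertex of the model in this way and connecting the results gives the outline that the user compares with the object in the image.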

<Modification 2>
In the first embodiment, the first measurement unit 102 and the second measurement unit 103 estimate the position and orientation of the target object. In this modification, the first measurement unit and the second measurement unit instead perform recognition processing that identifies the presence or absence and the type of the target object.

The processing procedure in this modification will be described with reference to FIG. 3.

(Step S301)
In step S301, the image acquisition unit 101 acquires images of the target object. Each acquired image is a luminance image of the target object.

(Step S302)
In step S302, the first measurement unit 102 recognizes the target object based on each image acquired by the image acquisition unit 101 in step S301. The recognition processing is performed by the method described in Non-Patent Document 2 below.

"SCHMID, C.; MOHR, R.: 'Local grayvalue invariants for image retrieval', IEEE PAMI 19, May 1997, pages 530-534".
Specifically, the target object is recognized by a template matching technique that uses features extracted from the image.

(Step S303)
In step S303, the first performance information generation unit 104 generates performance information on the recognition processing performed in step S302. In this modification, the performance information on recognition consists of the recognition accuracy and the processing time required for recognition. The recognition accuracy is the Hausdorff distance between the input image and the template image matched to it. The processing time required for recognition is the time taken by the recognition processing performed by the first measurement unit 102. The time taken to capture the target object and the time taken by the image acquisition unit 101 to acquire the captured image may also be included in the processing time.
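
The Hausdorff distance used as the recognition accuracy can be sketched as follows. The sketch assumes that the input image and the template image are each represented by a set of two-dimensional feature points, which this description does not state; the function names and coordinates are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical feature points (pixels) from the input image and the template.
input_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
template_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]
score = hausdorff(input_points, template_points)
```

A smaller distance means that the input and the template agree more closely, so in this sketch a lower value corresponds to higher recognition accuracy. The brute-force search is quadratic in the number of points and is meant only for illustration.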

(Step S304)
In step S304, the first output unit 106 causes the GUI provided on the display device 120 to display the first performance information on recognition generated in step S303. As described above, the performance information on recognition consists of the recognition accuracy and the processing time required for recognition. In addition, to indicate the object that was recognized, an image is displayed in which a rectangular texture is drawn and superimposed near the position of the object in the captured image. The texture only needs to be drawn near the position of the object in the captured image, and it may have another shape, such as a circle or a star. Other information may also be used as long as it indicates recognition performance. For example, the presence or absence of the target object may be used as the performance information. Alternatively, when a plurality of objects are recognized at the same time, the number of recognized objects may be used as the performance information.

(Step S305)
In step S305, the second measurement unit 103 recognizes the target object based on a plurality of images (two or more selected images) acquired by the image acquisition unit 101 in step S301. The specific method is as follows. First, image features of the target object are extracted from each image. Next, from all the extracted image features, the features that were extracted redundantly in more than one image are selected. Then, the target object is recognized from the selected features by the method disclosed in Non-Patent Document 2. Any other method that can recognize the target object based on image features extracted from a plurality of images may also be used. For example, the target object may be recognized using all the image features extracted from the plurality of images.
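
The feature selection in step S305 can be sketched as follows. The sketch assumes that features from different images have already been matched and assigned common identifiers, which is a simplification; the function name `select_shared_features` and the identifiers are hypothetical.

```python
from collections import Counter

def select_shared_features(features_per_image, min_images=2):
    """Return the features that were extracted in at least min_images images."""
    counts = Counter()
    for feature_ids in features_per_image.values():
        # A set ensures each feature is counted once per image.
        counts.update(set(feature_ids))
    return sorted(fid for fid, n in counts.items() if n >= min_images)

# Hypothetical matched feature identifiers for three camera images.
features_per_image = {
    "camera2": ["corner_a", "corner_b", "edge_c"],
    "camera3": ["corner_b", "edge_c", "edge_d"],
    "camera4": ["edge_c", "blob_e"],
}
shared = select_shared_features(features_per_image)
```

Setting `min_images=1` keeps every extracted feature, which corresponds to the alternative of recognizing the target object from all image features.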

(Step S306)
In step S306, the second performance information generation unit 105 generates second performance information on the recognition performed by the second measurement unit 103. The performance information on recognition is as described for step S303.

(Step S307)
In step S307, the second output unit 107 causes the display area 460 of the display device 120 to display the performance information on recognition generated in step S306 together with the captured image. As described above, the performance information on recognition consists of the recognition accuracy and the processing time required for recognition. In addition, to indicate the object that was recognized, an image is displayed in which a rectangular texture is drawn and superimposed near the position of the object in the captured image. The shape of the texture is not limited to a rectangle and may be, for example, a circle or a star.

As described above, by executing the series of processing in steps S301 to S307, the performance information on recognition performed for each image and the performance information on recognition performed using a plurality of images can be displayed on the GUI. This allows the user to check and compare the recognition performance obtained with each individual image and with a plurality of images.

<Modification 3>
In the first embodiment and Modifications 1 and 2, the performance of recognition or position and orientation estimation of the target object, performed based on each image and on a plurality of images, is displayed. In addition, for a camera that causes performance degradation, a GUI may be displayed that prompts the user to calibrate the parameters representing the characteristics and placement of that camera, or to remove that camera or its image. A camera is regarded as causing performance degradation when the position and orientation estimation performed based on its image does not satisfy a predetermined performance, or when its performance is the lowest among the cameras.
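
The two criteria for identifying a camera that causes performance degradation can be sketched as follows. The sketch assumes that performance is expressed as an error value for which smaller is better; the function name `degrading_cameras`, the limit, and the sample values are hypothetical.

```python
def degrading_cameras(error_by_camera, max_allowed_error=None):
    """Return the cameras to flag for calibration or removal."""
    if max_allowed_error is not None:
        # Criterion 1: cameras that do not satisfy the predetermined performance.
        return sorted(c for c, e in error_by_camera.items() if e > max_allowed_error)
    # Criterion 2: the single camera with the lowest performance (largest error).
    return [max(error_by_camera, key=error_by_camera.get)]

# Hypothetical per-camera estimation errors in millimetres.
errors_mm = {"camera2": 0.3, "camera3": 0.4, "camera4": 1.8, "camera5": 0.9}
flagged_by_limit = degrading_cameras(errors_mm, max_allowed_error=0.5)
flagged_worst = degrading_cameras(errors_mm)
```

The GUI prompt described above would then be shown for each flagged camera.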

(Second Embodiment)
In the first embodiment, a method has been described for displaying the performance of position and orientation estimation performed based on each image and the performance of position and orientation estimation performed based on a plurality of images. In the present embodiment, a method is described for displaying the performance of position and orientation estimation and, in addition, selecting the images used for position and orientation estimation from among a plurality of input images. This allows the user to select the images used for position and orientation estimation while checking the estimation performance, which enables highly accurate position and orientation estimation. Position and orientation estimation can thus be adapted to changing conditions such as the shape and placement of the target object and the way ambient light falls on it. The processing time required for position and orientation estimation can also be kept within the time allowed by the takt time of the assembly work.

FIG. 6 shows the system configuration in the present embodiment. The information processing apparatus 600 in the present embodiment includes an image selection unit 610 in addition to the configuration of the information processing apparatus 100 described with reference to FIG. 1 in the first embodiment. The other components, from the image acquisition unit 601 to the second output unit 607, have the same functions as the image acquisition unit 101 to the second output unit 107, so only the image selection unit 610 is described here. The display device 620 connected to the information processing apparatus 600 has functions equivalent to those of the display device 120 described in the first embodiment. The information processing apparatus 600 is also connected to an external operation device 611. The operation device 611 is a mouse or a keyboard, through which the user can give instructions to the image selection unit 610. When the display screen of the display device 620 has a touch panel function, the touch panel serves as the operation device.

The image selection unit 610 selects, from among the plurality of images acquired by the image acquisition unit 601, the images used for processing in the second measurement unit. In the present embodiment, as in the first embodiment, the second measurement unit estimates the position and orientation of the target object, so the image selection unit 610 selects the images used for position and orientation estimation.

FIG. 7 is a flowchart showing the processing procedure of the present embodiment. The processing in steps S701 to S704, S706, and S707 is the same as that described in the first embodiment, and its description is omitted here.

(Step S708)
In step S708, the image selection unit 610 determines whether to perform image selection. This determination may be made by the user through the operation device 611, or according to a predetermined rule, for example that image selection is always performed a predetermined number of times. If the image selection unit 610 determines in step S708 that images are to be selected, the processing proceeds to step S709. Otherwise, the processing ends.

(Step S709)
In step S709, the image selection unit 610 selects the images used for position and orientation estimation based on the user's operation of the operation device.

FIG. 8 shows an example of the GUI 800 in the second embodiment, displayed on the display screen of the display device 620.

The display device 620 displays the position and orientation estimation accuracy and the processing time required for estimation, 820 to 823, obtained from each of the images 810 to 813 captured by camera 2, camera 3, camera 4, and camera 5.

When the user operates the selection buttons 824 to 827 provided on the GUI 800 through the operation device, the image selection unit 610 selects, for each image captured by a camera, whether that image is used for position and orientation estimation. In the example of FIG. 8, the images captured by camera 4 and camera 5 are not used, and the images captured by camera 2 and camera 3 are used. In the present embodiment, whether to use an image for position and orientation estimation is set individually for each image input to the image acquisition unit 601. However, any other setting method may be used as long as the images used for position and orientation estimation can be selected. For example, whether to use the images for position and orientation estimation may be set collectively for all the candidate images.

The display device 620 displays the accuracy of, and the processing time required for, position and orientation estimation (824) performed based on the images captured by camera 2 and camera 3, which the user selected for use with the selection buttons 824 to 827.

(Step S705)
In step S705, the second measurement unit 603 estimates the position and orientation of the target object based on the images selected in step S709.

<Image selection variations (selection method)>
In the present embodiment, whether to use an image for position and orientation estimation is set for each image acquired by the image acquisition unit 601. In addition, the processing method for position and orientation estimation may be selected for each image. For example, the processing method may be selected from among a method using image features, a method using distance values, and a method using both image features and distance values. This makes it possible to adapt position and orientation estimation to changing conditions such as the shape and placement of the target object and the way ambient light falls on it. The processing method may also be set collectively for all the candidate images.

<Modification 2-1>
In the first embodiment and Modifications 1-1 to 1-3, the first measurement unit performs position and orientation estimation based on each individual image. However, this is not a limitation: the first measurement unit may perform position and orientation estimation based on two or more images (it suffices that one or more images are used). In this case, the first performance information generation unit generates performance information on the position and orientation estimation performed based on the two or more images, and the display device 620 displays that performance information.

The display device 620 displays the estimation accuracy and processing time 820 of position and orientation estimation performed based on the images captured by camera 2 and camera 3. It likewise displays the estimation accuracy and processing time 821 for the images captured by camera 3 and camera 4, the estimation accuracy and processing time 822 for the image captured by camera 4, and the estimation accuracy and processing time 823 for the images captured by camera 4 and camera 5. The captured images 810 to 813 are also displayed. Images 810, 811, and 813 are each obtained by stitching captured images together. For example, image 810 is obtained by stitching together the images captured by cameras 2 and 3.

The image selection unit 610 selects the cameras to be used according to the user's operation of the selection buttons 824 to 827.

The display device 620 displays the accuracy and processing time 824 of position and orientation estimation performed based on the images captured by camera 3, camera 4, and camera 5. It also displays image 814, obtained by stitching together the images captured by camera 3, camera 4, and camera 5.

Note that, for the captured images 810, 811, 813, and 814, the individual images that make up each stitched image may be displayed in place of the stitched image. For example, either the image captured by camera 2 or the image captured by camera 3 is displayed as image 810. Alternatively, the images need not be displayed at all.

(Third Embodiment)
In the first embodiment, a method has been described for displaying the performance of position and orientation estimation performed based on each image and the performance of position and orientation estimation performed based on a plurality of images. In the second embodiment, a method has been described for selecting the images used for position and orientation estimation from among a plurality of input images. In the third embodiment, a method is described in which a condition on position and orientation estimation is input and the performance of the position and orientation estimations that satisfy the input condition is displayed. This enables position and orientation estimation that satisfies the condition desired by the user. Position and orientation estimation can also be adapted to changing conditions such as the shape and placement of the target object and the way ambient light falls on it.

FIG. 9 shows the configuration of the information processing apparatus 900 in the present embodiment. The information processing apparatus 900 includes a condition acquisition unit 910 in addition to the configuration described with reference to FIG. 1 in the first embodiment. The components other than the condition acquisition unit 910, the second measurement unit 903, the second performance information generation unit 905, and the second output unit 907 are the same as those described in the first embodiment, and their description is omitted here.

The condition acquisition unit 910 acquires a condition on position and orientation estimation. In the present embodiment, the acquired condition is the lower limit of the position and orientation estimation accuracy.

The second measurement unit 903 generates a plurality of combinations of images to be used for position and orientation estimation from the plurality of input images, and performs position and orientation estimation for each generated combination. The position and orientation estimation method is as described in the first embodiment.

The second performance information generation unit 905 generates performance information for each of the position and orientation estimation results produced by the second measurement unit 903. It then determines whether the generated performance information satisfies the condition acquired by the condition acquisition unit 910. Specifically, it determines whether the accuracy of each position and orientation estimation falls within the acquired estimation accuracy threshold. Only the position and orientation measurement results whose accuracy is determined to fall within the threshold are sent to the second output unit 907.

Of the performance information generated by the second performance information generation unit 905, the second output unit 907 outputs only the performance information on position and orientation estimations that the second performance information generation unit 905 determined to satisfy the input condition.
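
The processing of the second measurement unit 903 and the second performance information generation unit 905 can be sketched as follows. The sketch assumes that accuracy is expressed as an error value for which smaller is better, so that a result satisfies the condition when its error does not exceed the threshold. The function names, the `placeholder_estimate` stand-in for the actual position and orientation estimation, and all numeric values are hypothetical.

```python
from itertools import combinations

def candidate_combinations(image_ids, min_size=2):
    """Yield every combination of at least min_size images."""
    for size in range(min_size, len(image_ids) + 1):
        yield from combinations(image_ids, size)

def combinations_meeting_threshold(image_ids, estimate_error, max_error):
    """Evaluate each combination and keep those whose error is within the threshold."""
    results = {combo: estimate_error(combo)
               for combo in candidate_combinations(image_ids)}
    return {combo: err for combo, err in results.items() if err <= max_error}

# Hypothetical per-camera errors, used only by the stand-in estimator below.
per_camera_error = {"camera2": 0.3, "camera3": 0.4, "camera4": 1.8, "camera5": 0.9}

def placeholder_estimate(combo):
    """Stand-in for the real estimation: the worst camera dominates the error."""
    return max(per_camera_error[c] for c in combo)

image_ids = ["camera2", "camera3", "camera4", "camera5"]
passing = combinations_meeting_threshold(image_ids, placeholder_estimate, max_error=0.5)
```

Because the number of combinations grows exponentially with the number of images, a practical system would likely limit the combination sizes that are evaluated.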

FIG. 10 is a flowchart showing the processing procedure in the present embodiment. Steps S1001 to S1006 are the same as steps S301 to S306 described in the first embodiment, and their description is omitted here.

(Step S1007)
In step S1007, the condition acquisition unit 910 acquires the threshold for position and orientation estimation accuracy through the GUI. In the present embodiment, the value entered by the user through the operation device is acquired. The condition acquisition unit 910 sends the acquired condition to the second output unit 907.

(Step S1008)
In step S1008, the second output unit 907 acquires the position and orientation measurement accuracy generated in step S1006 and determines whether that accuracy falls within the threshold acquired in step S1007. It then selects the performance information of the position and orientation estimations determined to fall within the threshold and outputs it to the display device 920.

FIG. 11 shows an example of the GUI 1100 provided for the display device 920 and the condition acquisition unit 910. From among the performance information on the plurality of position and orientation estimations determined to satisfy the condition, the display device 920 displays the performance information designated with the performance information display switching button 1128. To identify the images used for the position and orientation estimation whose performance information is displayed, information 1124 to 1127 indicating whether each image was used is also displayed. In this example, the performance information of position and orientation estimation using cameras B and C is displayed in the display area 1107.

The condition acquisition unit 910 acquires the estimation accuracy condition that the user enters, through the operation device, in the input box 1129 provided on the GUI 1100.

The display device 920 displays the position and orientation estimation accuracy and the processing time required for estimation, 1120 to 1123, obtained from each of the images 1110 to 1113 captured by camera 2, camera 3, camera 4, and camera 5.

In the display area 1107, the display device 920 displays the accuracy of, and the processing time required for, position and orientation estimation performed based on the images captured by cameras 3 and 4.

<Variations of input conditions>
In the present embodiment, the input condition is the lower limit of the estimation accuracy. However, any other condition on position and orientation estimation may be used. For example, an upper limit on the processing time required for position and orientation estimation may be specified. In this case, the condition is judged to be satisfied when the processing time required for position and orientation estimation is below the specified upper limit. Alternatively, both the lower limit of the estimation accuracy and the upper limit of the processing time may be input as conditions.
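
A condition that combines an accuracy limit and a processing time limit can be sketched as follows. As an assumption, accuracy is again expressed as an error value for which smaller is better, and either limit may be left unset. The class name `Condition`, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Condition:
    """Optional limits entered by the user; None means the limit is not applied."""
    max_error: Optional[float] = None   # accuracy limit (smaller error is better)
    max_time_s: Optional[float] = None  # upper limit on processing time, in seconds

    def is_met(self, error: float, time_s: float) -> bool:
        if self.max_error is not None and error > self.max_error:
            return False
        # The processing time must be strictly below the specified upper limit.
        if self.max_time_s is not None and time_s >= self.max_time_s:
            return False
        return True

both_limits = Condition(max_error=0.5, max_time_s=1.0)
time_only = Condition(max_time_s=1.0)
```

Only the results for which `is_met` returns true would then be passed on for display.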

(Other embodiments)
The present invention can also be realized by executing the following processing: software (a program) that implements the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the program.

<Effect of each embodiment>
The first embodiment described a method in which position and orientation estimation is performed based on each captured image or on a plurality of images, and information on the performance of each estimation is displayed on the GUI. This makes it easy to check and compare the performance of the estimations performed using each image and using a plurality of images, and to identify the image or camera that causes performance to degrade or improve. It also becomes possible to remove or reposition a camera identified as a cause of performance degradation.

The second embodiment described a method that displays the performance of the position and orientation estimations performed based on each captured image or on a plurality of images, and selects, from the plurality of input images, the images to be used for position and orientation estimation. This allows the user to select the images used for position and orientation estimation while checking the performance, so that high-performance position and orientation estimation can be carried out. The processing time required for position and orientation estimation can also be kept within the time allotted to it by the takt time of the assembly work.
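In the second embodiment the selection itself is made by the user. The sketch below only illustrates the comparison a user would make when reading the displayed values: among camera combinations whose processing time fits a budget derived from the takt time, pick the most accurate one. The function name, the data layout, and the larger-is-better accuracy score are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical measurements: camera combination -> (accuracy score, seconds).
Results = Dict[FrozenSet[str], Tuple[float, float]]


def pick_combination(results: Results,
                     time_budget_s: float) -> Optional[FrozenSet[str]]:
    """Return the most accurate combination whose time fits the budget."""
    # Keep only combinations whose processing time is within the budget.
    feasible = {k: v for k, v in results.items() if v[1] <= time_budget_s}
    if not feasible:
        return None
    # Prefer higher accuracy; break ties with the shorter processing time.
    return max(feasible, key=lambda k: (feasible[k][0], -feasible[k][1]))
```

Returning `None` when nothing fits the budget leaves it to the caller to relax the budget or change the camera arrangement.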

The third embodiment described a method that acquires a condition relating to position and orientation estimation and displays the performance of the position and orientation estimations that satisfy the acquired condition. This enables position and orientation estimation that satisfies the conditions desired by the user.

<Definition>
The image acquisition means of the present invention may receive images from a camera that is connected to the information processing apparatus 100 and captures the target object, or may receive images of the target object captured from a plurality of viewpoints that were temporarily stored in an external storage device connected to the information processing apparatus 100.

The first and second measurement units of the present invention may perform recognition or position and orientation estimation based on distance values calculated from the images. Alternatively, they may perform recognition or position and orientation estimation based on image features extracted from the images, or based on both the distance values calculated from the images and the image features extracted from the images.

The first and second performance information generation units of the present invention generate performance information on the position and orientation estimation performed by the first and second measurement units. This performance information may be any information that indicates the performance of the position and orientation estimation, for example the accuracy or the processing time of the estimation. Alternatively, it may be the estimated position and orientation, the presence or absence of the target object, or whether or not the calculation for position and orientation estimation or recognition failed.
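One way to hold the kinds of performance information listed above is a single record with optional fields. The sketch below is a hypothetical layout, not a structure defined in this document; the field names and the one-line summary format are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PerformanceInfo:
    """Hypothetical record for one recognition or pose-estimation run."""
    images_used: Tuple[str, ...]              # labels of the images or cameras
    succeeded: bool                           # False if the calculation failed
    object_found: Optional[bool] = None       # presence or absence of the object
    accuracy: Optional[float] = None          # estimation accuracy, if computed
    time_s: Optional[float] = None            # processing time in seconds
    pose: Optional[Tuple[float, ...]] = None  # estimated position and orientation

    def summary(self) -> str:
        """Format the available fields as one line for display."""
        parts = ["+".join(self.images_used)]
        if not self.succeeded:
            parts.append("failed")
            return " | ".join(parts)
        if self.accuracy is not None:
            parts.append(f"accuracy={self.accuracy:.3f}")
        if self.time_s is not None:
            parts.append(f"time={self.time_s:.2f}s")
        return " | ".join(parts)
```

Keeping every field optional except the success flag lets the same record serve runs that report only a time, only a pose, or only a failure.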

The first and second output units of the present invention display the performance information generated by the first and second performance information generation units, respectively, in display areas of the display device. As the display method, the numerical values of the accuracy or the processing time of the position and orientation estimation may be shown on the GUI, or a three-dimensional model representing the shape of the target object may be rendered according to the estimated position and orientation and superimposed on the input image.

DESCRIPTION OF SYMBOLS
101 Image acquisition unit
102 First measurement unit
103 Second measurement unit
104 First performance information generation unit
105 Second performance information generation unit
106 First output unit
107 Second output unit

Claims

An information processing apparatus comprising:
image acquisition means for acquiring a plurality of images obtained by imaging a target object;
first execution means for executing at least one of recognition processing of the target object and position and orientation estimation processing of the target object based on at least one of the acquired images;
second execution means for executing at least one of the recognition processing of the target object and the position and orientation estimation processing of the target object based on at least two of the acquired plurality of images;
first performance information generation means for generating first performance information relating to the processing executed by the first execution means;
second performance information generation means for generating second performance information relating to the processing executed by the second execution means; and
display control means for causing a display means to display the first performance information and the second performance information.

The information processing apparatus according to claim 1, further comprising image selection means for selecting, from the plurality of images acquired by the image acquisition means, at least two images to be used for processing by the second execution means,
wherein the second position and orientation estimation means executes at least one of the recognition processing of the target object and the position and orientation estimation processing of the target object based on the images selected by the image selection means.

The information processing apparatus according to claim 1, further comprising means for selecting, for the images acquired by the image acquisition means, a processing method to be used by the second position and orientation estimation means,
wherein the second measurement means performs the processing based on the selected processing method.

4. The information processing apparatus according to claim 1, wherein the first performance information includes at least one of the accuracy, the processing time, and the success or failure of the processing executed by the first execution means.

5. The information processing apparatus according to claim 1, wherein the second performance information includes at least one of the accuracy, the processing time, and the success or failure of the processing executed by the second execution means.

The information processing apparatus according to claim 1, further comprising condition acquisition means for acquiring a condition relating to the processing executed by at least one of the first execution means and the second execution means,
wherein the display control means causes the display means to display performance information that satisfies the acquired condition.

The information processing apparatus according to claim 6, wherein the condition acquired by the condition acquisition means includes at least one of the accuracy and the processing time of the processing executed by at least one of the first execution means and the second execution means.

The information processing apparatus according to claim 1, wherein the image acquisition means acquires images captured by a plurality of imaging apparatuses arranged at different positions.

The information processing apparatus according to claim 1, wherein the image acquisition means acquires an image including a pattern projected by a projection apparatus.

The information processing apparatus according to claim 1, further comprising the display means.

The information processing apparatus according to any one of claims 1 to 9, wherein the display control means further causes the display means to display an image in which a three-dimensional model of the target object is drawn based on a result of the processing by the first execution means or the second execution means.

The information processing apparatus according to claim 1, further comprising a projection device that projects a pattern and a plurality of imaging devices that capture images including the projected pattern,
wherein the image acquisition means acquires the plurality of images, captured by the imaging devices, that include the projected pattern.

An information processing method comprising:
an image acquisition step of acquiring a plurality of images obtained by imaging a target object;
a first execution step of executing at least one of recognition processing of the target object and position and orientation estimation processing of the target object based on information about at least one of the acquired images;
a second execution step of executing at least one of the recognition processing of the target object and the position and orientation estimation processing of the target object based on information about at least two of the acquired plurality of images;
a first performance information generation step of generating first performance information relating to the processing executed in the first execution step;
a second performance information generation step of generating second performance information relating to the processing executed in the second execution step; and
a display control step of causing a display means to display the first performance information and the second performance information.

A program that, when executed by a computer, causes the computer to function as each means of the information processing apparatus according to any one of claims 1 to 12.