JP2014178967A

JP2014178967A - Three-dimensional object recognition device and three-dimensional object recognition method

Info

Publication number: JP2014178967A
Application number: JP2013053561A
Authority: JP
Inventors: Tomohiro Nakamichi; 朋弘仲道
Original assignee: 3D Media Co Ltd
Current assignee: Kyoto Robotics Corp
Priority date: 2013-03-15
Filing date: 2013-03-15
Publication date: 2014-09-25
Anticipated expiration: 2033-03-15
Also published as: JP6198104B2

Abstract

PROBLEM TO BE SOLVED: To provide a three-dimensional object recognition device and a three-dimensional object recognition method that prevent false recognition of the position and attitude of an object and improve recognition accuracy while also improving processing speed. SOLUTION: A three-dimensional object recognition device stores a three-dimensional model representing the contour and surface shape of an object in three-dimensional model storing means, acquires an image by imaging the object from a predetermined direction, measures a three-dimensional point group indicating three-dimensional coordinates of the surface of the object, and extracts edges of the object from the acquired image or the three-dimensional measurement result. It then evaluates the position and attitude of the object using a contour evaluation value, determined from the extracted edges and the three-dimensional contour points indicating the contour shape of the three-dimensional model in all positions and attitudes, and a point group evaluation value, determined from the three-dimensional measurement result and the three-dimensional face points indicating the surface shape of the three-dimensional model in all positions and attitudes.

Description

The present invention relates to a three-dimensional object recognition device and a three-dimensional object recognition method for recognizing a three-dimensional object of known shape that is the recognition target.

To enable a robot arm to handle parts accurately on a production line, three-dimensional object recognition devices have been developed in recent years that individually recognize parts piled in a heap and determine the position and orientation of each part.

Conventionally, one such three-dimensional object recognition device extracts features such as edges, that is, contours, of the recognition object from an image captured by a camera from a predetermined direction, calculates for each pixel of the captured image the distance to the nearest edge, and recognizes the position and orientation of the recognition object by projecting a three-dimensional model representing its contour shape onto the captured image and matching it. A three-dimensional object recognition device has also been proposed that improves processing speed by creating a distance map, in which each pixel of the captured image stores the distance to the nearest edge as its pixel value, and referring to this distance map (see, for example, Patent Document 1). Other three-dimensional object recognition devices recognize the position and orientation of the recognition object using a three-dimensional point group indicating the three-dimensional coordinates of points on its surface.

JP 2010-205095 A (Japanese Unexamined Patent Application Publication No. 2010-205095)

However, methods that recognize the position and orientation of the recognition object from contours, such as the three-dimensional object recognition device of Patent Document 1, perform recognition on a two-dimensional image. If a similar-looking image region exists, the object may therefore be erroneously recognized at a location where no recognition object is actually present, and if no contour can be extracted from the image, the position and orientation cannot be recognized at all. Processing also takes longer when these problems occur. Methods that recognize the position and orientation using a three-dimensional point group, on the other hand, leave ambiguity in the positioning of contour portions.

The present invention has been made in view of the above problems, and its object is to provide a three-dimensional object recognition device and a three-dimensional object recognition method that suppress misrecognition of the position and orientation of a recognition object, improve recognition accuracy, and improve processing speed.

To achieve the above object, the three-dimensional object recognition device according to claim 1 comprises: three-dimensional model storage means for storing a three-dimensional model representing the contour and surface shape of a recognition object; image acquisition means for acquiring an image by photographing the recognition object from a predetermined direction; three-dimensional measurement means for measuring a three-dimensional point group indicating the three-dimensional coordinates of points on the surface of the recognition object; edge extraction means for extracting edges of the recognition object from the image acquired by the image acquisition means or from the three-dimensional measurement result obtained by the three-dimensional measurement means; and position and orientation evaluation means for evaluating the position and orientation of the recognition object using a contour evaluation value, obtained from the edges extracted by the edge extraction means and three-dimensional contour points indicating the contour shape of the three-dimensional model at every position and orientation, and a point group evaluation value, obtained from the three-dimensional measurement result obtained by the three-dimensional measurement means and three-dimensional surface points indicating the surface shape of the three-dimensional model at every position and orientation.

In the three-dimensional object recognition device according to claim 2, the position and orientation evaluation means adjusts the ratio between the contour evaluation value and the point group evaluation value.

The three-dimensional object recognition device according to claim 3 comprises: distance map storage means for storing a distance map in which each pixel of the image acquired by the image acquisition means stores, as an image value, the distance to the nearest of the edges extracted by the edge extraction means and the direction of that nearest edge; and corresponding coordinate map storage means for storing a corresponding coordinate map in which each image coordinate, on the image acquired by the image acquisition means, of the three-dimensional point group measured by the three-dimensional measurement means stores the corresponding coordinate, that is, the image coordinate that corresponds to it in an image obtained when the recognition object is imaged from a direction different from the predetermined direction. The position and orientation evaluation means evaluates the position and orientation of the recognition object using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at every position and orientation onto the distance map and matching them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at every position and orientation onto the corresponding coordinate map and matching them.

The three-dimensional object recognition device according to claim 4 comprises: image pyramid creation means for creating an image pyramid having a plurality of images in which the resolution of the image acquired by the image acquisition means is reduced at different ratios; and position and orientation optimization means for optimizing the position and orientation of the recognition object using, as an initial value, the highest evaluation value obtained by the position and orientation evaluation means. The edge extraction means extracts edges of the recognition object from the image at each resolution in the image pyramid created by the image pyramid creation means. The distance map storage means stores a distance map for each resolution, in which each pixel of the image at that resolution in the image pyramid stores, as an image value, the distance to the nearest of the edges extracted from that image by the edge extraction means and the direction of that nearest edge. The corresponding coordinate map storage means stores a corresponding coordinate map for each resolution, in which each image coordinate, on the image at that resolution in the image pyramid, of the three-dimensional point group measured by the three-dimensional measurement means stores the corresponding coordinate, that is, the image coordinate that corresponds to it in an image of the same resolution obtained when the recognition object is imaged from a direction different from the predetermined direction. The position and orientation evaluation means evaluates the position and orientation of the recognition object using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at every position and orientation onto the lowest-resolution distance map and matching them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at every position and orientation onto the lowest-resolution corresponding coordinate map and matching them. The position and orientation optimization means takes the highest evaluation value obtained by the position and orientation evaluation means as the initial value and optimizes it using a distance map and a corresponding coordinate map of higher resolution than those used by the position and orientation evaluation means, continuing until a preset accuracy is reached or the evaluation of the position and orientation using the highest-resolution distance map and corresponding coordinate map is completed.

The three-dimensional object recognition device according to claim 5 comprises display means for displaying the evaluation result of the position and orientation of the recognition object obtained by the position and orientation evaluation means or the position and orientation optimization means.

The three-dimensional object recognition method according to claim 6 comprises: a three-dimensional model storage step of storing, in three-dimensional model storage means, a three-dimensional model representing the contour and surface shape of a recognition object; an image acquisition step of acquiring an image by photographing the recognition object from a predetermined direction; a three-dimensional measurement step of measuring a three-dimensional point group indicating the three-dimensional coordinates of points on the surface of the recognition object; an edge extraction step of extracting edges of the recognition object from the image acquired in the image acquisition step or from the three-dimensional measurement result obtained in the three-dimensional measurement step; and a position and orientation evaluation step of evaluating the position and orientation of the recognition object using a contour evaluation value, obtained from the edges extracted in the edge extraction step and three-dimensional contour points indicating the contour shape of the three-dimensional model at every position and orientation, and a point group evaluation value, obtained from the three-dimensional measurement result obtained in the three-dimensional measurement step and three-dimensional surface points indicating the surface shape of the three-dimensional model at every position and orientation.

In the three-dimensional object recognition method according to claim 7, the position and orientation evaluation step adjusts the ratio between the contour evaluation value and the point group evaluation value.

The three-dimensional object recognition method according to claim 8 comprises: a distance map storage step of storing, in distance map storage means, a distance map in which each pixel of the image acquired in the image acquisition step stores, as an image value, the distance to the nearest of the edges extracted in the edge extraction step and the direction of that nearest edge; and a corresponding coordinate map storage step of storing, in corresponding coordinate map storage means, a corresponding coordinate map in which each image coordinate, on the image acquired in the image acquisition step, of the three-dimensional point group measured in the three-dimensional measurement step stores the corresponding coordinate, that is, the image coordinate that corresponds to it in an image obtained when the recognition object is imaged from a direction different from the predetermined direction. The position and orientation evaluation step evaluates the position and orientation of the recognition object using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at every position and orientation onto the distance map and matching them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at every position and orientation onto the corresponding coordinate map and matching them.

The three-dimensional object recognition method according to claim 9 comprises: an image pyramid creation step of creating an image pyramid having a plurality of images in which the resolution of the image acquired in the image acquisition step is reduced at different ratios; and a position and orientation optimization step of optimizing the position and orientation of the recognition object using, as an initial value, the highest evaluation value obtained in the position and orientation evaluation step. The edge extraction step extracts edges of the recognition object from the image at each resolution in the image pyramid created in the image pyramid creation step. The distance map storage step stores, in the distance map storage means, a distance map for each resolution, in which each pixel of the image at that resolution in the image pyramid stores, as an image value, the distance to the nearest of the edges extracted from that image in the edge extraction step and the direction of that nearest edge. The corresponding coordinate map storage step stores, in the corresponding coordinate map storage means, a corresponding coordinate map for each resolution, in which each image coordinate, on the image at that resolution in the image pyramid, of the three-dimensional point group measured in the three-dimensional measurement step stores the corresponding coordinate, that is, the image coordinate that corresponds to it in an image of the same resolution obtained when the recognition object is imaged from a direction different from the predetermined direction. The position and orientation evaluation step evaluates the position and orientation of the recognition object using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at every position and orientation onto the lowest-resolution distance map and matching them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at every position and orientation onto the lowest-resolution corresponding coordinate map and matching them. The position and orientation optimization step takes the highest evaluation value obtained in the position and orientation evaluation step as the initial value and optimizes it using a distance map and a corresponding coordinate map of higher resolution than those used in the position and orientation evaluation step, continuing until a preset accuracy is reached or the evaluation of the position and orientation using the highest-resolution distance map and corresponding coordinate map is completed.

The three-dimensional object recognition method according to claim 10 comprises a display step of displaying the evaluation result of the position and orientation of the recognition object obtained in the position and orientation evaluation step or the position and orientation optimization step.

According to the inventions of claims 1 and 6, the position and orientation of the recognition object are evaluated using an evaluation value that fuses the contour evaluation value, which uses edges in the image, with the point group evaluation value, which uses the three-dimensional point group. Misrecognition of the position and orientation of the recognition object can therefore be suppressed, and recognition accuracy and processing speed can be improved.

According to the inventions of claims 2 and 7, the ratio between the contour evaluation value and the point group evaluation value can be adjusted. Even when edges cannot be extracted from the image, the position and orientation of the recognition object can therefore be recognized by lowering the ratio of the contour evaluation value and raising the ratio of the point group evaluation value.
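As an illustration of the ratio adjustment described above, the following minimal sketch blends the two evaluation values with a single weight. The linear blend, the function name `fused_score`, and the parameter `contour_weight` are assumptions made for illustration; the source states only that the ratio between the two values is adjustable, not how they are combined.

```python
def fused_score(contour_score: float, point_score: float,
                contour_weight: float = 0.5) -> float:
    """Blend a contour evaluation value and a point group evaluation value.

    contour_weight is a hypothetical parameter in [0, 1]: the share given to
    the contour evaluation value. The remainder goes to the point group value.
    """
    if not 0.0 <= contour_weight <= 1.0:
        raise ValueError("contour_weight must be in [0, 1]")
    return contour_weight * contour_score + (1.0 - contour_weight) * point_score


# When no edges can be extracted, the contour share can be lowered to zero so
# that only the point group evaluation value decides the result.
score_without_edges = fused_score(0.0, 0.8, contour_weight=0.0)
```

With `contour_weight=0.0` the result depends on the point group evaluation value alone, which corresponds to the case in which edges cannot be extracted from the image.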

According to the inventions of claims 3 and 8, the position and orientation of the recognition object are evaluated using an evaluation value that fuses two values. The first is the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at every position and orientation onto a distance map and matching them; in the distance map, each pixel of the image acquired by the image acquisition means stores, as an image value, the distance to the nearest of the edges extracted by the edge extraction means and the direction of that nearest edge. The second is the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at every position and orientation onto a corresponding coordinate map and matching them; in the corresponding coordinate map, each image coordinate, on the image acquired by the image acquisition means, of the three-dimensional point group measured by the three-dimensional measurement means stores the corresponding coordinate, that is, the image coordinate that corresponds to it in an image obtained when the recognition object is imaged from a different direction. Misrecognition of the position and orientation of the recognition object can therefore be suppressed, and recognition accuracy and processing speed can be further improved.
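The distance map described above can be sketched as follows. This is a brute-force illustration rather than the patented implementation: the names `build_distance_map` and `mean_edge_distance`, the `(x, y, angle)` edge representation, and the use of a mean distance as a stand-in score are all assumptions. The source does not give the formula that turns distance and direction into a contour evaluation value.

```python
import math


def build_distance_map(width, height, edges):
    """Return a height x width grid of (distance, angle) tuples.

    edges is a list of (x, y, angle_rad) edge pixels (hypothetical format).
    Each cell stores the distance to the nearest edge pixel and that edge
    pixel's direction, so later lookups cost O(1) per projected point.
    """
    if not edges:
        raise ValueError("at least one edge pixel is required")
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # Brute-force nearest-edge search; fine for a small illustration.
            ex, ey, angle = min(
                edges, key=lambda e: (e[0] - x) ** 2 + (e[1] - y) ** 2
            )
            row.append((math.hypot(ex - x, ey - y), angle))
        grid.append(row)
    return grid


def mean_edge_distance(dmap, projected_points):
    """Average nearest-edge distance over projected contour points (x, y).

    A stand-in score for illustration only: smaller means a closer match.
    """
    return sum(dmap[y][x][0] for x, y in projected_points) / len(projected_points)


# A vertical edge along x = 2 in a 5 x 3 image, with direction pi / 2.
edges = [(2, y, math.pi / 2) for y in range(3)]
dmap = build_distance_map(5, 3, edges)
```

Because the map is built once per image, each candidate pose needs only table lookups for its projected contour points, which is the source of the speed-up the text attributes to the distance map.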

According to the inventions of claims 4 and 9, an image pyramid is created that has a plurality of images in which the resolution of the image acquired by the image acquisition means is reduced at different ratios. The position and orientation of the recognition object are then evaluated using an evaluation value that fuses two values. The first is the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at every position and orientation onto the lowest-resolution distance map and matching them; the distance map for each resolution stores, in each pixel of the image at that resolution in the image pyramid, the distance to the nearest of the edges extracted from that image by the edge extraction means and the direction of that nearest edge as an image value. The second is the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at every position and orientation onto the lowest-resolution corresponding coordinate map and matching them; the corresponding coordinate map for each resolution stores, at each image coordinate of the three-dimensional point group measured by the three-dimensional measurement means on the image at that resolution in the image pyramid, the corresponding coordinate, that is, the image coordinate that corresponds to it in an image of the same resolution obtained when the recognition object is imaged from a different direction. Because the position and orientation of the recognition object are then optimized using the highest evaluation value as the initial value, processing speed can be further improved.
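The coarse-to-fine strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the score being maximized is plain pixel intensity standing in for the fused pose evaluation value, the pyramid halves the resolution at every level by 2 x 2 averaging, and the names `downsample`, `build_pyramid`, and `coarse_to_fine_peak` are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging 2 x 2 blocks (img: rows of floats)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img, levels):
    """pyramid[0] is the full-resolution image, pyramid[-1] the coarsest."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def coarse_to_fine_peak(pyramid):
    """Search exhaustively at the coarsest level, then refine level by level."""
    coarse = pyramid[-1]
    best = max(
        ((x, y) for y in range(len(coarse)) for x in range(len(coarse[0]))),
        key=lambda p: coarse[p[1]][p[0]],
    )
    # Each finer level only examines a small window around the scaled-up result.
    for level in reversed(pyramid[:-1]):
        cx, cy = best[0] * 2, best[1] * 2
        candidates = [
            (x, y)
            for y in range(cy - 1, cy + 3)
            for x in range(cx - 1, cx + 3)
            if 0 <= y < len(level) and 0 <= x < len(level[0])
        ]
        best = max(candidates, key=lambda p: level[p[1]][p[0]])
    return best


image = [[0.0] * 8 for _ in range(8)]
image[5][6] = 1.0  # single bright pixel at x = 6, y = 5
peak = coarse_to_fine_peak(build_pyramid(image, 3))
```

Only the coarsest level is searched exhaustively; every finer level evaluates a fixed-size neighborhood, so the total work grows far more slowly than an exhaustive search at full resolution.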

According to the inventions of claims 5 and 10, the evaluation result of the position and orientation of the recognition object can be grasped easily and visually.

A schematic diagram showing an example of the configuration of a three-dimensional object recognition device according to an embodiment of the present invention.
A flowchart showing the flow of processing by the three-dimensional object recognition device according to the embodiment of the present invention.
A schematic explanatory diagram for describing an image pyramid.
A schematic explanatory diagram for describing a distance map.
A schematic explanatory diagram for describing a corresponding coordinate map.
A schematic explanatory diagram for describing an example of how a corresponding coordinate map is created.
An explanatory diagram for describing an example of the contour evaluation value, in which (a) shows the relationship between the distance to a contour (edge) and the evaluation value, and (b) shows the relationship between the angle difference and the evaluation value.

A three-dimensional object recognition device 1 according to an embodiment of the present invention is described below with reference to the drawings. As shown in FIG. 1, the three-dimensional object recognition device 1 recognizes the position and orientation of a recognition object 3 having a three-dimensional shape placed on a work table 2. It comprises a three-dimensional sensor 4 for acquiring an image of the recognition object 3 and measuring a three-dimensional point group indicating the three-dimensional coordinates of points on the surface of the recognition object 3, a robot arm 5 for gripping the recognition object 3, and a computer 6 that controls the operation of the robot arm 5 based on the image and point group data input from the three-dimensional sensor 4.

The three-dimensional sensor 4 has a function of acquiring an image of the recognition object 3 (image acquisition means) and a function of measuring a three-dimensional point group indicating the three-dimensional coordinates of points on the surface of the recognition object 3 (three-dimensional measurement means), and conventionally known three-dimensional measurement techniques can be applied to it. For example, the three-dimensional sensor 4 may comprise light projecting means (not shown) that projects pattern light onto the recognition object 3 and a stereo camera (not shown), consisting of a base camera and a reference camera installed at different positions, that images the recognition object 3 onto which the pattern light is projected. Corresponding pixels are identified between the plural images captured by the stereo camera, and the principle of triangulation is applied to the difference in position (parallax) between a pixel on the base image and the associated pixel on the reference image. This measures the distance from the base camera to the point on the measured object corresponding to that pixel, and the three-dimensional point group of the recognition object 3 is acquired in this way. The number of three-dimensional sensors 4 is not particularly limited; one sensor or two or more sensors may be used as long as the image of the recognition object 3 can be acquired and the three-dimensional point group can be measured. The function of acquiring the image of the recognition object 3 and the function of performing three-dimensional measurement may also be provided separately.
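The triangulation step described above can be illustrated for the simplest case. The sketch assumes a rectified, parallel stereo pair and a pinhole camera model, neither of which is specified in the source; the function names and the parameters `focal_px`, `baseline_m`, `cx`, and `cy` are hypothetical. Under those assumptions the depth follows from the parallax as Z = f * B / d.

```python
def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth along the optical axis of the base camera: Z = f * B / d.

    Assumes rectified, parallel cameras (an assumption, not from the source).
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def backproject(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """3D point in the base camera frame for base-image pixel (u, v).

    (cx, cy) is the principal point in pixels (hypothetical parameters).
    """
    z = depth_from_disparity(focal_px, baseline_m, disparity_px)
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


# A pixel at the principal point with 50 px of parallax, f = 1000 px, B = 0.1 m.
point = backproject(640, 360, 50.0, 1000.0, 0.1, 640.0, 360.0)
```

Repeating the back-projection for every matched pixel pair yields a three-dimensional point group of the kind the text describes.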

As shown in FIG. 1, the computer 6 has an image memory 7 that stores the three-dimensional measurement results, image data, and the like obtained by the three-dimensional sensor 4; a hard disk 8 that stores a processing program and the like for recognizing the recognition target object 3; a RAM (Random Access Memory) 9 that temporarily stores the processing program read from the hard disk 8; a CPU (Central Processing Unit) 10 that performs three-dimensional recognition processing according to the processing program; a display unit 11 that displays the image data stored in the image memory 7, the recognition results obtained by the CPU 10, and the like; an operation unit 12 consisting of a mouse, a keyboard, and the like; and a system bus 13 that connects these units to one another. In the present embodiment, the processing program for recognizing the recognition target object 3 is stored on the hard disk 8, but it may instead be stored on a computer-readable storage medium (not shown) and read out from that medium.

The flow of processing by the three-dimensional object recognition device 1 is described below with reference to the flowchart of FIG. 2. In the three-dimensional object recognition device 1 according to the present embodiment, as shown in FIG. 2, a three-dimensional model for recognizing the recognition target object 3 is first created offline and stored in the three-dimensional model storage means 14 (S101). The three-dimensional model contains the three-dimensional shape information of the recognition target object 3, including three-dimensional contour point data representing its contour shape and three-dimensional surface point data representing its surface shape, and is created using three-dimensional CAD or the like. Here, three-dimensional models in every orientation (three degrees of freedom) are created offline in advance, over the entire range of orientations that is possible given the position of the three-dimensional sensor 4, and stored in the three-dimensional model storage means 14. The method of creating the three-dimensional model is not particularly limited; any conventionally known method may be used as long as the model contains the three-dimensional shape information of the recognition target object 3.

Next, the three-dimensional sensor 4 measures the three-dimensional point group indicating the three-dimensional coordinates of points on the surface of the recognition target object 3 and acquires an image of the recognition target object 3 (S102). When the original image of the recognition target object 3 is input from the three-dimensional sensor 4, the image pyramid creation means 21 creates an image pyramid from the input original image (S103) and stores it in the image memory 7 shown in FIG. 1.

As shown in FIG. 3, the image pyramid 17 consists of a plurality of images 18A to 18C in which the resolution of the original image 18 acquired by the three-dimensional sensor 4 is reduced by different ratios. For example, as shown in FIG. 3, when an original image 18 with n pixels in each of the vertical and horizontal directions is input, the CPU 10 treats the original image as the highest-resolution image and repeatedly reduces it by 1/2, creating a first pyramid image 18A with n/2 pixels in each direction, a second pyramid image 18B with n/4 pixels in each direction, and a third pyramid image 18C with n/8 pixels in each direction. FIG. 3 shows an image pyramid 17 with three reduced-resolution images 18A to 18C, but the number of levels is not particularly limited and may be changed as appropriate according to the size of the input image and other factors.
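The repeated 1/2 reduction can be sketched as follows. The text does not say how pixels are combined when the image is halved, so this example assumes simple 2x2 block averaging and image dimensions divisible by 2 to the power of the number of levels; the function names are hypothetical.

```python
def halve(image):
    """Halve an image in both directions by averaging each 2x2 block.

    `image` is a list of equal-length rows of numbers. Block averaging is
    an assumption made for this sketch, not the patent's stated method.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(width // 2)
        ]
        for r in range(height // 2)
    ]


def build_pyramid(original, levels=3):
    """Return [original, 1/2, 1/4, ...] with `levels` reduced images."""
    pyramid = [original]
    for _ in range(levels):
        pyramid.append(halve(pyramid[-1]))
    return pyramid


if __name__ == "__main__":
    img = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    print([len(level) for level in build_pyramid(img)])
```

With `levels=3` the list corresponds to the original image 18 followed by the images 18A, 18B, and 18C.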

Next, the edge extraction means 22 extracts the edges of the recognition target object from each of the images 18 to 18C at each resolution in the image pyramid 17 created by the image pyramid creation means 21 (S104). The edge extraction method is not particularly limited, and any conventionally known method may be used; for example, edges may be extracted from differences in the depth values or surface normal directions obtained by three-dimensional measurement, or from a luminance image.

Next, the CPU 10 uses the edges extracted by the edge extraction means 22 from the image at each resolution to create a distance map for each resolution, and stores the maps in the distance map storage means 15 (S105). As shown in FIG. 4, the distance map 19 is obtained by extracting the edges 20a (the hatched portions) of the recognition target object from each of the images 18 to 18C at each resolution in the image pyramid 17 created by the image pyramid creation means 21, and recording in each pixel 20 of the image, as its pixel value, the distance to the nearest edge point among the extracted edges 20a and the direction of that nearest edge point. The distance map 19 is created, for example, by the method described in "ユークリッド距離変換アルゴリズムの効率化" (Improving the Efficiency of the Euclidean Distance Transform Algorithm), Toshihiro Kato, Tomio Hirata, Toyofumi Saito, and Kenji Kise, IEICE Transactions (電子情報通信学会論文誌), Vol. J78, No. 12, pp. 1750-1757, December 1995; a detailed description of the method is omitted here. Because a distance map 19 is created and stored for each resolution, the distance from a pixel 20 to the nearest edge 20a is obtained simply by looking up that pixel 20, which speeds up the processing in the position and orientation evaluation means 23 and the optimization means 24 described later.
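To make the content of a distance map concrete, the following sketch builds one by brute force. It is deliberately not the efficient Euclidean distance transform cited in the text; it only shows what each pixel stores. The representation of edges as a dictionary from pixel position to gradient direction, and all names, are assumptions for illustration.

```python
import math


def build_distance_map(height, width, edges):
    """Return a height x width grid of (distance, direction) tuples.

    `edges` maps the (row, col) of each edge pixel to its gradient
    direction in radians. Each output cell holds the Euclidean distance to
    the nearest edge pixel and that edge pixel's direction. This is an
    O(pixels * edges) illustration, not an optimized transform.
    """
    if not edges:
        raise ValueError("at least one edge pixel is required")
    distance_map = []
    for r in range(height):
        row = []
        for c in range(width):
            (er, ec), direction = min(
                edges.items(),
                key=lambda item: (item[0][0] - r) ** 2 + (item[0][1] - c) ** 2,
            )
            row.append((math.hypot(er - r, ec - c), direction))
        distance_map.append(row)
    return distance_map


if __name__ == "__main__":
    dmap = build_distance_map(1, 5, {(0, 0): 0.0, (0, 4): 1.0})
    print(dmap[0])
```

Once the map exists, scoring a projected model point costs a single lookup, which is the speed benefit the paragraph above describes.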

As shown in FIGS. 5 and 6, the CPU 10 also creates a corresponding coordinate map 25 for each resolution and stores the maps in the corresponding coordinate map storage means 16 (S106). In the corresponding coordinate map 25, each image coordinate (μ_i, ν_i), on the image I at each resolution in the image pyramid 17 created by the image pyramid creation means 21, of the three-dimensional point group measured by the three-dimensional sensor 4a stores the corresponding coordinate (μ_k, ν_k), that is, the image coordinate corresponding to (μ_i, ν_i) on the image K at the same resolution that would be obtained if the three-dimensional sensor 4b imaged the recognition target object 3 from a direction different from that of the three-dimensional sensor 4a.

The map is created as follows. Because the three-dimensional point for each pixel (the three-dimensional point group) has already been measured by the three-dimensional sensor 4a in S102, the three-dimensional point at image coordinate (μ_i, ν_i) is projected, for example as shown in FIG. 6, onto the image plane K of a view taken from a direction different from that of the three-dimensional sensor 4a, which yields the corresponding coordinate (μ_k, ν_k). This corresponding coordinate (μ_k, ν_k) is then stored at image coordinate (μ_i, ν_i). If the three-dimensional sensor 4b is rectified (made parallel) beforehand, ν_i = ν_k holds, which simplifies the processing. Repeating this for every pixel of the image I produces the corresponding coordinate map 25. FIG. 5 shows the corresponding coordinate map 25 conceptually, for the case in which the corresponding coordinate μ_k = 5 is stored at image coordinate (μ_i, ν_i) = (10, 20).

When a single three-dimensional sensor 4a is used both to acquire the image of the recognition target object and to perform the three-dimensional measurement, as in the present embodiment, the corresponding coordinate map 25 can be created by the above method by assuming, as shown in FIG. 6, that a second, virtual three-dimensional sensor 4b exists at a predetermined position different from that of the three-dimensional sensor 4a. In other words, all camera parameters of the three-dimensional sensor 4b, such as its focal length and principal point coordinates, are determined virtually. As shown in FIG. 6, a rotation matrix R and a translation vector t relate the first sensor coordinate system Xc = [Xc Yc Zc]^T of the three-dimensional sensor 4a to the second sensor coordinate system of the virtual three-dimensional sensor 4b; by determining these virtually as well, the coordinate system Xc' = [Xc' Yc' Zc']^T of the three-dimensional sensor 4b is obtained. Projecting the three-dimensional points onto the image K of this virtually configured three-dimensional sensor 4b then gives the corresponding coordinates (μ_k, ν_k). When the corresponding coordinate map 25 is created with a plurality of cameras, for example a stereo camera, the recognition target object is imaged with the stereo camera, and stereo matching gives the image coordinate (μ_k, ν_k) of one camera that corresponds to the image coordinate (μ_i, ν_i) of the other, from which the map is built.
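The construction with a virtual second sensor can be sketched in a few lines. The sketch assumes a pinhole model with a single focal length in pixels and the convention Xc' = R Xc + t for moving a point from the frame of sensor 4a into the frame of sensor 4b; the text does not state which convention it uses, and all names below are hypothetical.

```python
def to_second_sensor(point, rotation, translation):
    """Apply Xc' = R * Xc + t (assumed convention) to a 3D point."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def project(point, focal_px, cx, cy):
    """Pinhole projection of a 3D point in sensor coordinates to a pixel."""
    x, y, z = point
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def build_corresponding_map(points_by_pixel, rotation, translation,
                            focal_px, cx, cy):
    """Map each pixel (mu_i, nu_i) of sensor 4a to (mu_k, nu_k) in sensor 4b.

    `points_by_pixel` maps a pixel of sensor 4a to the 3D point measured at
    that pixel, expressed in the coordinate system of sensor 4a.
    """
    return {
        pixel: project(to_second_sensor(p, rotation, translation),
                       focal_px, cx, cy)
        for pixel, p in points_by_pixel.items()
    }


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    measured = {(25, 50): (0.5, 1.0, 2.0)}
    print(build_corresponding_map(measured, identity, (-0.25, 0.0, 0.0),
                                  100.0, 0.0, 0.0))
```

With an identity rotation and a purely horizontal translation, as in the usage above, the second sensor is parallel to the first and the vertical coordinate is unchanged, matching the remark that ν_i = ν_k for a rectified arrangement.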

Next, the position and orientation evaluation means 23 evaluates the position and orientation of the recognition target object 3 by matching three-dimensional models in every position and orientation (six degrees of freedom) against the lowest-resolution distance map among the distance maps 19 stored in the distance map storage means 15 in S105 and the lowest-resolution corresponding coordinate map among the corresponding coordinate maps 25 stored in the corresponding coordinate map storage means 16 in S106 (S107). The position and orientation evaluation means 23 performs this evaluation using, for example, the evaluation value calculation function of formula (2) below. Formula (2) evaluates the position and orientation of the recognition target object 3 with a single evaluation value that fuses the contour evaluation value, given by the first term on the right-hand side, with the point group evaluation value, given by the second term; the parameter α (0 ≤ α ≤ 1) adjusts the fusion ratio between the two. Formula (1) is the formula for projecting the i-th three-dimensional point in the model coordinate system onto the image of three-dimensional sensor k.

The evaluation value calculation function of formula (2) is the sum of a contour evaluation value and a point group evaluation value. The contour evaluation value is obtained by summing the evaluation values S_ki given by formula (3) below over all I_k points used for the contour evaluation and dividing by I_k, summing the results over all K three-dimensional sensors used for the contour evaluation and dividing by K, and multiplying by the parameter α. The point group evaluation value is obtained by summing the evaluation values S_lj given by formula (4) below over all J_l points used for the point group evaluation and dividing by J_l, summing the results over all L three-dimensional sensors used for the point group evaluation and dividing by L, and multiplying by (1 − α). When the same three-dimensional sensors are used for both the contour evaluation and the point group evaluation, as in the present embodiment, K = L.
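The formula images are not reproduced in this text. Read literally, the verbal description of formula (2) above corresponds to the following expression, given here only as a reconstruction from that description; the symbol E for the fused evaluation value is an assumption, since the text does not name it.

```latex
E = \alpha \cdot \frac{1}{K} \sum_{k=1}^{K} \frac{1}{I_k} \sum_{i=1}^{I_k} S_{ki}
  \;+\; (1 - \alpha) \cdot \frac{1}{L} \sum_{l=1}^{L} \frac{1}{J_l} \sum_{j=1}^{J_l} S_{lj},
  \qquad 0 \le \alpha \le 1
```

If every per-point value S_ki and S_lj lies between 0 and 1, each inner average and each sensor average also lies between 0 and 1, so E stays in that range for any α in [0, 1].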

The evaluation value S_ki of formula (3) is the product of two values: an evaluation value based on the distance to the nearest edge, obtained as in formula (5) below by projecting a three-dimensional contour point of the three-dimensional model onto the image of three-dimensional sensor k and looking up the distance map; and an evaluation value based on the difference between the gradient direction at the projected point and the gradient direction of the nearest edge, obtained as in formula (6) below.

In formula (5), a threshold τ_a is set for the distance between the projected point and the nearest edge, as shown for example in FIG. 7(a). When the distance to the nearest edge is small the function outputs a value close to 1, and the output approaches 0 as the distance grows. When the square of the distance to the nearest edge exceeds the square of the threshold τ_a, the output is 0. FIG. 7(a) shows an example with τ_a = 4.0, but the value of τ_a is not particularly limited and is set as appropriate according to the required recognition accuracy and other factors.

In formula (6), thresholds τ_β1 and τ_β2 are set for the inner product of the gradient direction at the projected point and the gradient direction of the nearest edge, as shown for example in FIG. 7(b). When the square of the difference from the gradient direction of the nearest edge is no greater than the square of τ_β1, the output is 1. When it is greater than the square of τ_β1 and smaller than the square of τ_β2, the output is close to 1 as the difference in gradient direction approaches τ_β1 and approaches 0 as the difference approaches τ_β2. When the square of the difference is equal to or greater than the square of τ_β2, the output is 0. FIG. 7(b) shows an example with τ_β1 = cos 20° and τ_β2 = cos 36°, but these values are not particularly limited and are set as appropriate according to the required recognition accuracy and other factors. Formulas (5) and (6) are only examples of functions for contour-based evaluation; the function used to evaluate contour similarity is not limited to them.
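Since formulas (5) and (6) themselves are not reproduced here, the following sketch shows only one possible pair of functions with the qualitative behavior described: a distance score that is 1 at zero distance and 0 beyond τ_a, and a direction score that is 1 inside an inner threshold, 0 beyond an outer threshold, and ramps between them. The quadratic falloff, the linear ramp, and the use of angle differences in degrees (20 and 36, rather than cosine thresholds) are all assumptions, as are the names.

```python
def distance_score(distance, tau_a=4.0):
    """1.0 at zero distance; 0.0 once distance**2 exceeds tau_a**2.

    The quadratic falloff in between is an assumed shape.
    """
    if distance * distance > tau_a * tau_a:
        return 0.0
    return 1.0 - (distance * distance) / (tau_a * tau_a)


def direction_score(angle_deg, inner_deg=20.0, outer_deg=36.0):
    """1.0 up to the inner threshold, 0.0 from the outer threshold on.

    The linear ramp between the thresholds is an assumed shape.
    """
    angle = abs(angle_deg)
    if angle <= inner_deg:
        return 1.0
    if angle >= outer_deg:
        return 0.0
    return (outer_deg - angle) / (outer_deg - inner_deg)


def contour_point_score(distance, angle_deg):
    """Per-point contour value: product of the two scores, as for S_ki."""
    return distance_score(distance) * direction_score(angle_deg)


if __name__ == "__main__":
    print(contour_point_score(2.0, 28.0))
```

Multiplying the two scores means a projected contour point contributes only when it is both near an edge and similarly oriented, which is how the product form of S_ki suppresses accidental matches against unrelated edges.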

The evaluation value S_lj of formula (4) is the product of two values: an evaluation value based on the distance between the corresponding coordinate, obtained as in formula (7) below by projecting a three-dimensional surface point of the three-dimensional model onto the image of three-dimensional sensor k and looking up the corresponding coordinate map, and the image coordinate obtained by projecting the same three-dimensional surface point onto the image of three-dimensional sensor l; and an evaluation value based on the difference in the inner product between the normal direction of the projected three-dimensional surface point on the model surface and the normal direction of the three-dimensional measurement point corresponding to the pixel onto which it is projected, obtained as in formula (8) below.

In formula (7), a threshold τ_b is set for the distance from the corresponding coordinate. When the distance from the corresponding coordinate is small the function outputs a value close to 1, and the output approaches 0 as the distance grows. When the square of the distance from the corresponding coordinate exceeds the square of the threshold τ_b, the output is 0.

In formula (8), thresholds τ_γ1 and τ_γ2 are set for the inner product of the normal direction of the projected three-dimensional surface point on the model surface and the normal direction of the three-dimensional measurement point corresponding to the pixel onto which it is projected. When the square of the difference in normal direction is no greater than the square of τ_γ1, the output is 1. When it is greater than the square of τ_γ1 and smaller than the square of τ_γ2, the output is close to 1 as the difference in normal direction approaches τ_γ1 and approaches 0 as the difference approaches τ_γ2. When the square of the difference is equal to or greater than the square of τ_γ2, the output is 0. Formulas (7) and (8) are only examples of functions for point-group-based evaluation; the function used to evaluate point group similarity is not limited to them.
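Once the per-point values S_ki and S_lj are available, the fusion described for formula (2) reduces to two nested averages and a weighted sum. The sketch below follows that verbal description; the list-of-lists input layout and the function names are assumptions made for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def fused_score(contour_scores, cloud_scores, alpha):
    """Weighted fusion of contour and point group evaluation values.

    contour_scores[k] holds the values S_ki for sensor k, and
    cloud_scores[l] holds the values S_lj for sensor l. `alpha` weights the
    contour term and (1 - alpha) weights the point group term.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    contour_term = mean([mean(per_sensor) for per_sensor in contour_scores])
    cloud_term = mean([mean(per_sensor) for per_sensor in cloud_scores])
    return alpha * contour_term + (1.0 - alpha) * cloud_term


if __name__ == "__main__":
    print(fused_score([[1.0, 0.5]], [[0.25, 0.25]], alpha=0.5))
```

Setting `alpha` to 0 makes the result depend on the point group values alone, which is the kind of adjustment that helps when edges cannot be extracted from the image.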

The position and orientation evaluation means 23 thus evaluates the position and orientation of the recognition target object using, for example, the evaluation value calculation function of formula (2), and takes the candidate with the highest evaluation value as the initial solution candidate. The present embodiment matches by projecting the three-dimensional model onto the image, but templates of the three-dimensional model already projected for the initial search may instead be created and used to evaluate the position and orientation of the recognition target object. This further improves the processing speed.

Next, the optimization means 24 optimizes the initial solution candidate obtained by the position and orientation evaluation means 23 using a higher-resolution distance map and corresponding coordinate map (S108), improving the accuracy of the recognized position and orientation. The optimization means 24 takes the position and orientation of the solution candidate obtained by the position and orientation evaluation means 23 as the initial value, projects the three-dimensional model in that position and orientation onto the higher-resolution image, and calculates the evaluation value from the distance map and the corresponding coordinate map in the same way as the position and orientation evaluation means 23. To maximize the evaluation value, a conventionally known technique such as the Marquardt method can be used. The optimization technique is not limited to this, and other conventionally known optimization methods may be used.

Based on the result optimized in S108, it is then determined whether the position and orientation satisfy the required accuracy (S109). If they do (S109: YES), the result is output as the final result (S110) and the processing ends. If they do not (S109: NO), it is determined whether any higher-resolution distance map and corresponding coordinate map remain for which the position and orientation have not yet been evaluated (S111). If no such unprocessed maps remain (S111: NO), the result at that point is output as the final result (S110) and the processing ends. If unprocessed maps remain (S111: YES), the processing returns to S108 and the same processing is performed with the remaining distance maps and corresponding coordinate maps, and this is repeated until no unprocessed maps remain. By processing with progressively higher-resolution distance maps and corresponding coordinate maps until the required accuracy is reached, the position and orientation of the recognition target object 3 can be recognized with higher accuracy.
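The control flow of steps S108 to S111 can be summarized as a short loop. In this sketch the optimizer and the accuracy check are passed in as callables, because the text leaves both open; `optimize`, `is_accurate`, and the ordering of `levels` from coarser to finer resolution are assumptions for illustration.

```python
def coarse_to_fine(levels, initial_pose, optimize, is_accurate):
    """Refine a pose over map levels ordered from lower to higher resolution.

    optimize(pose, level) -> (pose, score) stands in for S108, and
    is_accurate(pose, score) -> bool stands in for S109. Both are
    hypothetical placeholders supplied by the caller.
    """
    pose, score = initial_pose, None
    for level in levels:                      # S111: unprocessed maps remain
        pose, score = optimize(pose, level)   # S108: optimize at this level
        if is_accurate(pose, score):          # S109: required accuracy met
            break
    return pose, score                        # S110: output the final result


if __name__ == "__main__":
    # Toy stand-ins: the "pose" is a number that moves halfway to 8.0.
    result = coarse_to_fine(
        levels=[1, 2, 3, 4],
        initial_pose=0.0,
        optimize=lambda pose, level: ((pose + 8.0) / 2.0, level),
        is_accurate=lambda pose, score: abs(pose - 8.0) < 1.5,
    )
    print(result)
```

The loop stops early when the accuracy check passes, so the highest-resolution maps are only used when the coarser levels do not already give a sufficient result.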

Because the three-dimensional object recognition device 1 evaluates the position and orientation of the recognition target object 3 with an evaluation value that fuses the contour evaluation value, based on edges in the image, with the point group evaluation value, based on the three-dimensional point group, it suppresses erroneous recognition of the position and orientation of the recognition target object 3 and improves both recognition accuracy and processing speed. In addition, because adjusting the parameter α in formula (2) adjusts the fusion ratio between the contour evaluation value and the point group evaluation value, the position and orientation of the recognition target object can still be recognized when edges cannot be extracted from the image, by lowering the weight of the contour evaluation value and raising that of the point group evaluation value.

The embodiments of the present invention are not limited to the one described above and may be modified as appropriate without departing from the scope of the idea of the present invention.

The three-dimensional object recognition device and three-dimensional object recognition method according to the present invention can be used effectively as a technique for recognizing parts and the like on a production line. They can also be used effectively as a technique by which a service robot determines its own position and orientation inside a room or similar space.

Description of Symbols
1 Three-dimensional object recognition device
3 Recognition target object
4 Three-dimensional sensor
11 Display unit
14 Three-dimensional model storage means
15 Distance map storage means
16 Corresponding coordinate map storage means
17 Image pyramid
18 Image
19 Distance map
20 Pixel
21 Image pyramid creation means
22 Edge extraction means
23 Position and orientation evaluation means
24 Optimization means
25 Corresponding coordinate map

Claims

A three-dimensional object recognition device comprising:
three-dimensional model storage means for storing a three-dimensional model representing the contour and surface shape of a recognition target object;
image acquisition means for acquiring an image by imaging the recognition target object from a predetermined direction;
three-dimensional measurement means for measuring a three-dimensional point group indicating the three-dimensional coordinates of points on the surface of the recognition target object;
edge extraction means for extracting edges of the recognition target object from the image acquired by the image acquisition means or from the three-dimensional measurement result obtained by the three-dimensional measurement means; and
position and orientation evaluation means for evaluating the position and orientation of the recognition target object using a contour evaluation value, obtained on the basis of the edges extracted by the edge extraction means and three-dimensional contour points indicating the contour shape of the three-dimensional model in every position and orientation, and a point group evaluation value, obtained on the basis of the three-dimensional measurement result obtained by the three-dimensional measurement means and three-dimensional surface points indicating the surface shape of the three-dimensional model in every position and orientation.

The three-dimensional object recognition device according to claim 1, wherein the position and orientation evaluation means adjusts the ratio between the contour evaluation value and the point group evaluation value.

The three-dimensional object recognition device according to claim 1 or 2, further comprising:
distance map storage means for storing a distance map in which each pixel of the image acquired by the image acquisition means stores, as its pixel value, the distance to the nearest edge among the edges extracted by the edge extraction means and the direction of that nearest edge; and
corresponding coordinate map storage means for storing a corresponding coordinate map in which each image coordinate, on the image acquired by the image acquisition means, of the three-dimensional point group measured by the three-dimensional measurement means stores a corresponding coordinate, the corresponding coordinate being the image coordinate that corresponds to that image coordinate on an image obtained when the recognition target object is imaged from a direction different from the predetermined direction,
wherein the position and orientation evaluation means evaluates the position and orientation of the recognition target object using the contour evaluation value obtained by projecting the three-dimensional contour points of the three-dimensional model in every position and orientation onto the distance map and matching them, and the point group evaluation value obtained by projecting the three-dimensional surface points of the three-dimensional model in every position and orientation onto the corresponding coordinate map and matching them.

Image pyramid creation means for creating an image pyramid comprising a plurality of images obtained by reducing the resolution of the image acquired by the image acquisition means at different ratios; and
position and orientation optimization means for optimizing the position and orientation of the recognition object, using the position and orientation with the highest evaluation value obtained by the position and orientation evaluation means as an initial value, wherein
the edge extraction means extracts edges of the recognition object from the image of each resolution in the image pyramid created by the image pyramid creation means,
the distance map storage means stores a distance map for each resolution, in which each pixel of the image of that resolution in the image pyramid stores, as pixel values, the distance to the nearest edge among the edges extracted from that image by the edge extraction means and the direction of that nearest edge,
the corresponding coordinate map storage means stores a corresponding coordinate map for each resolution, in which, for each image coordinate of the three-dimensional point group measured by the three-dimensional measurement means on the image of that resolution in the image pyramid, corresponding coordinates are stored, the corresponding coordinates being the image coordinates that correspond to that image coordinate on an image of the same resolution obtained when the recognition object is imaged from a direction different from the predetermined direction,
the position and orientation evaluation means evaluates the position and orientation of the recognition object using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at any position and orientation onto the lowest-resolution distance map and collating them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at any position and orientation onto the lowest-resolution corresponding coordinate map and collating them, and
the position and orientation optimization means takes the position and orientation with the highest evaluation value obtained by the position and orientation evaluation means as the initial value and, using a distance map and a corresponding coordinate map of higher resolution than those used by the position and orientation evaluation means, performs optimization until a preset accuracy is reached or until evaluation of the position and orientation using the highest-resolution distance map and corresponding coordinate map is completed. The three-dimensional object recognition device according to any one of claims 1 to 3.
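
As a rough illustration of the image pyramid in this claim, the following Python builds a list of progressively smaller images. The claim only requires that the resolution be reduced at different ratios; halving by 2x2 block averaging, the nested-list image format, and the function names `downsample_half` and `build_pyramid` are assumptions made for this sketch.

```python
def downsample_half(image):
    """Halve the resolution by averaging 2x2 blocks.

    image: list of rows of numbers; an odd trailing row or column is dropped.
    """
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)]
            for r in range(h)]

def build_pyramid(image, levels):
    """Return [full resolution, 1/2, 1/4, ...] with at most `levels` entries.

    Stops early once an image is too small to be halved again.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break
        pyramid.append(downsample_half(current))
    return pyramid
```

In the arrangement the claim describes, the last (smallest) entry would feed the initial evaluation, and the earlier entries would be used in turn by the optimization.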

The three-dimensional object recognition device according to claim 1, further comprising display means for displaying the evaluation result of the position and orientation of the recognition object obtained by the position and orientation evaluation means or the position and orientation optimization means.

A three-dimensional model storage step for storing a three-dimensional model representing the contour and surface shape of the recognition object in the three-dimensional model storage means;
An image acquisition step of acquiring an image by photographing the recognition object from a predetermined direction;
A three-dimensional measurement step of measuring a three-dimensional point group indicating the three-dimensional coordinates of the points on the surface of the recognition object;
An edge extraction step of extracting an edge of the recognition object from the image acquired in the image acquisition step or the three-dimensional measurement result obtained in the three-dimensional measurement step;
A position and orientation evaluation step of evaluating the position and orientation of the recognition object using a contour evaluation value, obtained on the basis of the edge extracted in the edge extraction step and three-dimensional contour points indicating the contour shape of the three-dimensional model at any position and orientation, and a point group evaluation value, obtained on the basis of the three-dimensional measurement result obtained in the three-dimensional measurement step and three-dimensional surface points indicating the surface shape of the three-dimensional model at any position and orientation. A three-dimensional object recognition method characterized by comprising the above steps.
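
Evaluating the model "at any position and orientation" amounts to scoring a set of candidate poses and keeping the best one. The sketch below shows that outer loop only. It assumes a simplified pose of `(x, y, angle)` rather than a full three-dimensional position and orientation, and a caller-supplied `evaluate` function standing in for the combined evaluation value; both are hypothetical simplifications.

```python
import itertools

def best_pose(x_values, y_values, angle_values, evaluate):
    """Score every candidate (x, y, angle) and return (best_score, best_pose).

    evaluate(pose) -> float is a hypothetical callback; higher is better.
    """
    candidates = itertools.product(x_values, y_values, angle_values)
    return max(((evaluate(pose), pose) for pose in candidates),
               key=lambda scored: scored[0])
```

The number of candidates grows with the product of the sampling densities, so an exhaustive search of this kind is only practical when each evaluation is cheap.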

The three-dimensional object recognition method according to claim 6, wherein the position and orientation evaluation step adjusts the ratio between the contour evaluation value and the point group evaluation value.

A distance map storage step of storing, in distance map storage means, a distance map in which each pixel constituting the image acquired in the image acquisition step stores, as pixel values, the distance to the nearest edge among the edges extracted in the edge extraction step and the direction of that nearest edge; and
a corresponding coordinate map storage step of storing, in corresponding coordinate map storage means, a corresponding coordinate map in which, for each image coordinate of the three-dimensional point group measured in the three-dimensional measurement step on the image acquired in the image acquisition step, corresponding coordinates are stored, the corresponding coordinates being the image coordinates that correspond to that image coordinate on an image obtained when the recognition object is imaged from a direction different from the predetermined direction,
wherein, in the position and orientation evaluation step, the position and orientation of the recognition object is evaluated using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at any position and orientation onto the distance map and collating them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at any position and orientation onto the corresponding coordinate map and collating them. The three-dimensional object recognition method according to claim 6 or 7.
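
The collation against the corresponding coordinate map can be read as follows: a model surface point is projected into both views, and it agrees with the measurement if the map entry at its first-view pixel points to (nearly) the same second-view pixel. The Python below sketches that reading. It is an interpretation rather than the specification's procedure, and the dictionary layout of `corr_map`, the pre-projected input pairs, and the pixel `tolerance` are all hypothetical.

```python
def point_group_evaluation_value(corr_map, projections, tolerance=1.0):
    """Fraction of model surface points consistent with the corresponding coordinate map.

    corr_map: dict mapping (row, col) in the first image to the (row, col) of
        the same measured surface point in the second image; pixels without a
        three-dimensional measurement are absent (hypothetical layout).
    projections: list of ((row1, col1), (row2, col2)) pairs, the projections of
        each model surface point into the first and second image.
    tolerance: hypothetical per-axis pixel tolerance.
    """
    if not projections:
        return 0.0
    hits = 0
    for first, second in projections:
        measured = corr_map.get(first)
        if measured is None:
            continue  # no measurement at this pixel, counted as a miss
        if (abs(measured[0] - second[0]) <= tolerance
                and abs(measured[1] - second[1]) <= tolerance):
            hits += 1
    return hits / len(projections)
```

Comparing image coordinates in this way avoids a nearest-neighbor search in three-dimensional space for every candidate position and orientation.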

An image pyramid creation step of creating an image pyramid comprising a plurality of images obtained by reducing the resolution of the image acquired in the image acquisition step at different ratios; and
a position and orientation optimization step of optimizing the position and orientation of the recognition object, using the position and orientation with the highest evaluation value obtained in the position and orientation evaluation step as an initial value, wherein
in the edge extraction step, edges of the recognition object are extracted from the image of each resolution in the image pyramid created in the image pyramid creation step,
in the distance map storage step, a distance map for each resolution is stored in the distance map storage means, in which each pixel of the image of that resolution in the image pyramid stores, as pixel values, the distance to the nearest edge among the edges extracted from that image in the edge extraction step and the direction of that nearest edge,
in the corresponding coordinate map storage step, a corresponding coordinate map for each resolution is stored in the corresponding coordinate map storage means, in which, for each image coordinate of the three-dimensional point group measured in the three-dimensional measurement step on the image of that resolution in the image pyramid, corresponding coordinates are stored, the corresponding coordinates being the image coordinates that correspond to that image coordinate on an image of the same resolution obtained when the recognition object is imaged from a direction different from the predetermined direction,
in the position and orientation evaluation step, the position and orientation of the recognition object is evaluated using the contour evaluation value, obtained by projecting the three-dimensional contour points of the three-dimensional model at any position and orientation onto the lowest-resolution distance map and collating them, and the point group evaluation value, obtained by projecting the three-dimensional surface points of the three-dimensional model at any position and orientation onto the lowest-resolution corresponding coordinate map and collating them, and
in the position and orientation optimization step, the position and orientation with the highest evaluation value obtained in the position and orientation evaluation step is taken as the initial value and, using a distance map and a corresponding coordinate map of higher resolution than those used in the position and orientation evaluation step, optimization is performed until a preset accuracy is reached or until evaluation of the position and orientation using the highest-resolution distance map and corresponding coordinate map is completed. The three-dimensional object recognition method according to claim 6.
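
The coarse-to-fine optimization in this claim can be sketched as a loop that starts from the lowest-resolution result and refines it on successively finer maps, stopping at a preset accuracy or after the finest level. The claim does not name an optimizer; greedy coordinate ascent is swapped in here purely for illustration. The callback `score_at_level`, the step size of `2 ** level` pixels, and the use of `target_step` as a stand-in for the preset accuracy are all hypothetical.

```python
def refine_coarse_to_fine(initial_pose, score_at_level, num_levels, target_step=1.0):
    """Refine a pose from coarse maps to fine maps.

    initial_pose: tuple of numbers, the best pose found at the lowest resolution.
    score_at_level(level, pose) -> float, higher is better; level 0 is full
        resolution and num_levels - 1 is the lowest (hypothetical callback).
    target_step: hypothetical stand-in for the preset accuracy.
    """
    pose = tuple(initial_pose)
    # The lowest level was used for the initial evaluation, so start one level finer.
    for level in range(num_levels - 2, -1, -1):
        step = float(2 ** level)  # search step in full-resolution units
        improved = True
        while improved:
            improved = False
            best = score_at_level(level, pose)
            for axis in range(len(pose)):
                for delta in (-step, step):
                    candidate = pose[:axis] + (pose[axis] + delta,) + pose[axis + 1:]
                    value = score_at_level(level, candidate)
                    if value > best:
                        best, pose, improved = value, candidate, True
        if step <= target_step:
            break  # preset accuracy reached
    return pose
```

Starting each level from the previous level's result keeps the search local, so the expensive high-resolution maps are consulted only near the pose already judged most likely.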

The three-dimensional object recognition method according to claim 6, further comprising a display step of displaying the evaluation result of the position and orientation of the recognition object obtained in the position and orientation evaluation step or the position and orientation optimization step.