JP2010244413A

JP2010244413A - Method for recognizing object gripped by grip means

Info

Publication number: JP2010244413A
Application number: JP2009094175A
Authority: JP
Inventors: Munetaka Yamamoto; 宗隆山本; Masaki Takasan; 正己高三
Original assignee: Toyota Industries Corp
Current assignee: Toyota Industries Corp
Priority date: 2009-04-08
Filing date: 2009-04-08
Publication date: 2010-10-28

Abstract

PROBLEM TO BE SOLVED: To provide a method for recognizing an object gripped by a gripping means that improves the speed of recognizing an object having a hidden part.

SOLUTION: The method for recognizing the object 3 gripped by the gripping means includes: an image acquisition step of acquiring an image 1 of objects including a hand 2 and the object 3 gripped by the hand 2; an object detection step of detecting the objects shown in the image 1; a hand detection step of detecting the hand 2 shown in the image 1 and estimating the gripping posture of the hand 2; and an object determination step of identifying the object 3 (partial area A) in the image 1 from the information on the objects detected in the object detection step and the information on the hand 2 detected in the hand detection step, and recognizing the entire shape of the object 3 from information on the external appearance of the identified object 3 (partial area A) and information on the gripping posture of the hand 2 estimated in the hand detection step.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a method for recognizing an object gripped by a gripping means, where part of the object is hidden, for example by the gripping hand.

In a captured image of an object, some parts of the object are hidden by the object itself or by an occluding item on the object, and so do not appear in the image. Methods and apparatuses have been proposed that estimate the overall shape of an object, including such parts, from a captured image of the object having the hidden parts.

For example, Patent Document 1 describes an information processing apparatus that recognizes a partially hidden object from an image acquired by a camera.
The information processing apparatus of Patent Document 1 stores, as model images, images of the object itself supplied in advance from the camera. From each model image it generates an edge strength image composed of edge strengths, which indicate how strongly the pixel value changes with a change in position in the image. The apparatus also generates from the model image an edge image, which indicates the boundaries where pixel values cross a threshold. It then sets a plurality of feature points on the edges, that is, the boundaries in the generated edge image, and sets a feature extraction region in the vicinity of each feature point. Based on the edge strength image, the apparatus extracts the edge strength in the feature extraction region of each feature point as a model feature, associates the model features with the positional relationship of the feature points, and registers them in a model dictionary for each model image.

The information processing apparatus also takes a real image captured by the camera as a target image, generates an edge image from the target image, and sets a plurality of edge points on the generated edge image. It then extracts the edge strength at each edge point as a target feature.
The information processing apparatus then matches the model features registered in the model dictionary against the target features, identifies the object contained in the model image with the object contained in the target image, and thereby recognizes the object in the target image. In this way, the shape of the object in the target image, including its hidden region, is determined.

Patent Document 1: JP 2008-243175 A

However, the information processing apparatus of Patent Document 1 identifies the model image with the target image by matching all of the model features, which correspond to the plurality of feature points set for each of a large number of model images registered in the model dictionary, against the target features at the plurality of feature points set in the target image. The amount of processing needed to match model features against target features is therefore large, and identification of the model image with the target image is slow. Moreover, if the number of feature points set in each image is increased to improve the accuracy of identification, the amount of processing grows further, matching takes longer, and identification becomes slower still.

The present invention has been made to solve these problems, and its object is to provide a method for recognizing an object gripped by a gripping means that can improve the speed of recognition processing for a target object having a hidden part.

The method for recognizing an object gripped by a gripping means according to the present invention comprises: an image acquisition step of acquiring an image of objects including the gripping means and the target object gripped by the gripping means; an object detection step of detecting the objects shown in the image; a gripping means detection step of detecting the gripping means shown in the image and estimating the gripping posture of the gripping means; and an object determination step of identifying the target object in the image from the information on the objects detected in the object detection step and the information on the gripping means detected in the gripping means detection step, and recognizing the overall shape of the target object from information on the appearance of the identified target object and information on the gripping posture estimated in the gripping means detection step.

As a result, the spatial range formed on the inner (gripping) side of the gripping means can be calculated from the gripping posture estimated from the image of the gripping means. This spatial range limits the region that the part of the target object hidden by the gripping means can occupy. The appearance-related attributes of the target object are therefore constrained not only by the part shown in the image but also by the region that the hidden part can occupy, and the overall shape of the target object is recognized under these constraints. This reduces the amount of processing required to recognize the target object, so the method can improve the speed of recognition.

The gripping means detection step may further include calculating the region enclosed by the gripping means from the information on the estimated gripping posture, and identifying, from the enclosed region, a region that includes the part hidden by the gripping means in the image.
The object determination step may further include combining the identified region with the identified target object and recognizing the overall shape of the target object from the combined target object. Combining the enclosed region hidden by the gripping means with the information on the target object shown in the image limits the range of the attributes that describe the appearance of the whole target object, such as its area and perimeter. The target object is therefore constrained not only by the part shown in the image but also by these appearance attributes, and its overall shape is recognized.

The object determination step may further include accessing a database containing information on gripped target objects, searching the database only within the range that satisfies a condition given by the information on the combined target object, and performing matching against the database. Limiting the database search range to the range that satisfies this condition reduces the processing required for the search and improves the processing speed. In the object determination step, the information on the combined target object may include at least one of the area and the perimeter of the combined target object.
The object detection step may further include identifying the target object from among the objects shown in the image based on the information on the gripping means detected in the gripping means detection step. In that case, the object detection step uses information such as the position and shape of the gripping means to search the image starting near the gripping means, detects objects in its vicinity, and identifies the target object from the detected objects. This reduces the processing in the object detection step and the object determination step, and improves the processing speed of recognizing the target object gripped by the gripping means.

The object detection step may further include identifying the target object from among the objects shown in the image based on the information on the gripping posture estimated in the gripping means detection step. In that case, the object detection step uses the gripping posture information to search only the region that the gripping means can grip (the gripping side), detects objects in the image there, and identifies the target object from the detected objects. This reduces the processing in the object detection step and the object determination step, and improves the processing speed of recognizing the target object gripped by the gripping means.

The object detection step may further include identifying the target object from among the objects shown in the image based on both the information on the gripping means detected in the gripping means detection step and the information on the gripping posture estimated in the gripping means detection step. In that case, the object detection step uses the gripping posture information to limit the search range for objects in the image to the region that the gripping means can grip (the gripping side) and, within this search range, uses information such as the position of the gripping means to search starting near the gripping means and detect objects in its vicinity. The object detection step then identifies the target object from the detected objects. This reduces the processing in the object detection step and the object determination step, and improves the processing speed of recognizing the target object gripped by the gripping means.
The program according to the present invention is a program for causing a computer to execute each step of the above method for recognizing an object gripped by a gripping means.

According to the present invention, the method for recognizing an object gripped by a gripping means can improve the speed of recognition processing for a target object having a hidden part.

A diagram showing an image that contains a target object and a hand gripping the target object.
A diagram showing separately, for the image of FIG. 1, the regions detected by the recognition apparatus for an object gripped by a gripping means according to Embodiments 1 to 3 and 6 of the present invention.
A diagram showing each region of FIG. 2 individually.
A block diagram showing the configuration of the recognition apparatus for an object gripped by a gripping means according to Embodiment 1 of the present invention.
A diagram showing an example of the dictionary database.
A block diagram showing the configuration of the recognition apparatus for an object gripped by a gripping means according to Embodiment 2 of the present invention.
A block diagram showing the configuration of the recognition apparatus for an object gripped by a gripping means according to Embodiment 3 of the present invention.
A diagram showing separately, for the image of FIG. 1, the regions detected by the recognition apparatus for an object gripped by a gripping means according to Embodiment 4 of the present invention.
A block diagram showing the configuration of the recognition apparatus for an object gripped by a gripping means according to Embodiment 4 of the present invention.
A diagram showing separately, for the image of FIG. 1, the regions detected by the recognition apparatus for an object gripped by a gripping means according to Embodiment 5 of the present invention.
A block diagram showing the configuration of the recognition apparatus for an object gripped by a gripping means according to Embodiment 5 of the present invention.
A block diagram showing the configuration of the recognition apparatus for an object gripped by a gripping means according to Embodiment 6 of the present invention.

Embodiments of the present invention are described below with reference to the accompanying drawings.
Embodiment 1.
The configuration and operation of a recognition apparatus 101 for an object gripped by a gripping means according to Embodiment 1 of the present invention are described with reference to FIGS. 1 to 5. The following embodiments describe recognition of a target object that is gripped by a human hand. The processing in the recognition apparatus 101 is executed by a program installed in the recognition apparatus 101.

First, referring to FIGS. 1 and 4, the recognition apparatus 101 includes an image acquisition means 11, an object detection means 12a, a hand detection means 12b, an object determination means 13, and a dictionary 15 serving as a storage means. The image acquisition means 11 sends information to the object detection means 12a and the hand detection means 12b, and the object detection means 12a and the hand detection means 12b each send information to the object determination means 13. The hand detection means 12b and the object determination means 13 can access the dictionary 15.
The hand detection means 12b consists of a hand detection unit 12b1, a hand shape estimation unit 12b2, and a limiting condition generation unit 12b3. Information is sent from the hand detection unit 12b1 to the hand shape estimation unit 12b2, and from the hand shape estimation unit 12b2 to the limiting condition generation unit 12b3.
The object determination means 13 consists of an information combining unit 13a and an object recognition unit 13b, and information is sent from the information combining unit 13a to the object recognition unit 13b.

The image acquisition means 11 acquires an image of the outside and consists of a camera or the like. It acquires the two-dimensional image 1 shown in FIG. 1 and sends the acquired image 1 to the object detection means 12a and to the hand detection unit 12b1 of the hand detection means 12b.
The object detection means 12a detects features in the supplied image 1 and detects the shape of each object shown in the image 1. Features in the image 1 are detected, for example, by detecting edges, which are places where the pixel value changes sharply. The object detection means 12a sends the shape information of each object detected in the image 1 to the information combining unit 13a of the object determination means 13. It also detects the two-dimensional position of each object in the image 1 and sends this position information to the information combining unit 13a of the object determination means 13.
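The edge-based feature detection mentioned above can be illustrated with a minimal sketch. It assumes a greyscale image stored as a 2-D list and uses a simple forward-difference gradient with a fixed threshold; the function names `edge_strength` and `edge_pixels`, the sample image, and the threshold are hypothetical and are not taken from the patent.

```python
def edge_strength(image):
    """Return a grid of edge strengths for a 2-D list of grey levels.

    The strength at (y, x) is the sum of the absolute differences to the
    right-hand and lower neighbours (a forward-difference gradient).
    """
    height, width = len(image), len(image[0])
    strength = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx = abs(image[y][x + 1] - image[y][x]) if x + 1 < width else 0
            dy = abs(image[y + 1][x] - image[y][x]) if y + 1 < height else 0
            strength[y][x] = dx + dy
    return strength


def edge_pixels(image, threshold):
    """Return the set of (y, x) positions whose edge strength exceeds threshold."""
    strength = edge_strength(image)
    return {
        (y, x)
        for y, row in enumerate(strength)
        for x, value in enumerate(row)
        if value > threshold
    }


# A bright 2x2 object (grey level 200) on a dark background (grey level 10).
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
edges = edge_pixels(image, threshold=100)
```

With this toy image the detected edge pixels trace the outline of the bright square; a real detector would typically smooth the image first and use a more robust gradient operator.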

The hand detection unit 12b1 of the hand detection means 12b receives the image 1 from the image acquisition means 11, detects in it the human hand 2, which is the gripping means, and also detects the two-dimensional position of the hand 2. The hand detection unit 12b1 then sends the detected position information and shape information of the hand 2 to the hand shape estimation unit 12b2 in the same hand detection means 12b.
When the hand detection means 12b detects the hand 2, the hand 2 itself is found by extracting skin-colored regions in the image 1 and by texture matching against human skin. The two-dimensional shape of the hand 2 is then detected by matching the detected hand 2 against hand shape patterns registered in advance in the dictionary 15.
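The skin-color extraction step can be sketched as follows. The patent does not state which color rule is used, so this example assumes a commonly cited fixed RGB threshold rule for skin under uniform daylight; the names `is_skin`, `skin_mask`, and `bounding_box` are hypothetical. Texture matching and matching against the hand shape patterns in the dictionary 15 are not covered here.

```python
def is_skin(rgb):
    """Classify one (R, G, B) pixel with a fixed-threshold skin rule (an assumption)."""
    r, g, b = rgb
    return (
        r > 95 and g > 40 and b > 20
        and max(rgb) - min(rgb) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_mask(image):
    """Return a 2-D list of booleans marking skin-colored pixels."""
    return [[is_skin(pixel) for pixel in row] for row in image]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the True cells, or None if there are none."""
    cells = [(y, x) for y, row in enumerate(mask) for x, flag in enumerate(row) if flag]
    if not cells:
        return None
    ys = [y for y, _ in cells]
    xs = [x for _, x in cells]
    return (min(ys), min(xs), max(ys), max(xs))


SKIN = (220, 170, 140)       # example skin tone
BACKGROUND = (30, 90, 200)   # example non-skin color
image = [
    [BACKGROUND, BACKGROUND, BACKGROUND],
    [BACKGROUND, SKIN, SKIN],
    [BACKGROUND, BACKGROUND, SKIN],
]
mask = skin_mask(image)
hand_box = bounding_box(mask)
```

The bounding box gives a rough two-dimensional position for the hand candidate. Fixed thresholds are sensitive to lighting, which is one reason the source combines color extraction with texture and pattern matching.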

From the shape of the hand 2 in the two-dimensional image 1 detected by the hand detection unit 12b1, the hand shape estimation unit 12b2 estimates the hand and finger configuration of the hand 2, that is, its gripping posture, including information such as the orientation of the hand, the inclination angle of the hand, and the bending angles of the fingers. The hand shape estimation unit 12b2 then sends the estimated gripping posture information of the hand 2, together with the position information and shape information of the hand 2 supplied by the hand detection unit 12b1, to the limiting condition generation unit 12b3 in the same hand detection means 12b.
The gripping posture of the hand 2 can be estimated from a single two-dimensional image of the hand by using, for example, the method described in the study by Tanimoto et al. (a master's thesis of the Graduate School of Systems and Information Engineering, University of Tsukuba, titled "Real-time estimation of hand and finger shape from an image database using a self-propagating SOM for robot hand control", prepared by Takaaki Tanimoto et al. in March 2006 and since made public). In that study, joint angle information of the hand and hand images are acquired in advance in synchronization, contours are extracted from the images and converted into features, and a database is built with these features and angles as its data. For a real image of a hand, features are computed in the same way as when the database was built, and the joint angles of the hand, and hence its gripping posture, are estimated by comparing the resulting features with the features in the database.
Alternatively, a plurality of image acquisition means 11 can be used: the hand 2 is reconstructed in three dimensions from images of the hand 2 captured from different directions by these image acquisition means 11, and the gripping posture of the hand 2 is measured.

From the gripping posture information of the hand 2 estimated by the hand shape estimation unit 12b2, the limiting condition generation unit 12b3 bounds the range of the space formed by the fingers, palm, and other parts of the hand 2 on the side that grips the target object 3. Within this range, it then identifies the region that is hidden by the hand 2 and therefore not shown in the image 1.

To do so, the limiting condition generation unit 12b3 finds, among the cross sections of the space formed by the hand 2 that are parallel to the image 1, the cross section with the largest area or the cross section with the longest perimeter. For the cross section with the largest area, it projects the cross section onto the image 1, identifies the region where the space formed by the hand 2 overlaps the hand 2 in the image 1, and calculates the area of that region. This region is shown as region B in region diagram (b) of FIGS. 2 and 3. For the cross section with the longest perimeter, it likewise projects the cross section onto the image 1, identifies the region where the space formed by the hand 2 overlaps the hand 2 in the image 1, and calculates the perimeter of that region. This region is also shown as region B in region diagram (b) of FIGS. 2 and 3; in this embodiment it is identical to the region B obtained from the cross section with the largest area.
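The selection of the largest cross section and its overlap with the hand can be sketched on a pixel grid. This is a minimal illustration that assumes each cross section of the gripping space has already been projected onto the image plane and is given as a set of (row, column) cells; the function name `region_b_from_sections` and the sample data are hypothetical. The perimeter-based variant would select the cross section by outline length instead of cell count.

```python
def region_b_from_sections(sections, hand_cells):
    """Return the cells of the largest projected cross section that lie under the hand.

    sections   -- list of sets of (y, x) cells, one per cross section parallel
                  to the image, already projected onto the image plane
    hand_cells -- set of (y, x) cells covered by the hand in the image
    """
    largest = max(sections, key=len)   # cross section with the largest area
    return largest & hand_cells        # part of it hidden behind the hand


# Three cross sections of the space enclosed by the hand (toy data).
sections = [
    {(1, 1)},
    {(1, 1), (1, 2), (2, 1), (2, 2)},
    {(1, 1), (1, 2)},
]
hand_cells = {(1, 2), (2, 2), (1, 3), (2, 3)}

region_b = region_b_from_sections(sections, hand_cells)
region_b_area = len(region_b)   # area measured in pixels
```

Here the area of region B is simply the number of overlapping cells, which serves as the limiting condition passed on to the combining step.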

The area and perimeter of region B serve as limiting conditions that bound the region of the part of the target object 3 hidden by the hand 2 in the image 1.
The limiting condition generation unit 12b3 sends the area value or the perimeter value of region B, shown in region diagram (b) of FIGS. 2 and 3, to the information combining unit 13a of the object determination means 13. It also sends to the information combining unit 13a the position information and shape information of the hand 2 that were detected by the hand detection unit 12b1 and passed on via the hand shape estimation unit 12b2.
The processing in the hand detection means 12b and the processing in the object detection means 12a are performed in parallel.

The information combining unit 13a of the object determination means 13 combines the position information and shape information of each object in the image 1, including the hand 2, sent from the object detection means 12a, with the area value or perimeter value of region B [see region diagram (b) of FIGS. 2 and 3] and the position information and shape information of the hand 2 sent from the limiting condition generation unit 12b3.
Based on the position information and shape information of each object in the image 1 and the position information and shape information of the hand 2, the information combining unit 13a identifies the target object 3 gripped by the hand 2 from among the objects detected by the object detection means 12a. The region of the target object 3 identified in this way is shown as partial region A in region diagram (a) of FIGS. 2 and 3; partial region A represents the part of the target object 3 that appears in the image 1.
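The source does not spell out the rule by which the gripped object is picked from the detected objects. As one possible illustration, the sketch below assumes that each detected object (excluding the hand itself) and the hand are given as sets of (row, column) cells, and chooses the object with the most cells directly adjacent to the hand. The adjacency criterion and the names `touching_cells` and `pick_gripped_object` are assumptions made for this example only.

```python
NEIGHBOURS = ((0, 1), (0, -1), (1, 0), (-1, 0))


def touching_cells(object_cells, hand_cells):
    """Count the cells of an object that share an edge with a hand cell."""
    return sum(
        1
        for (y, x) in object_cells
        if any((y + dy, x + dx) in hand_cells for dy, dx in NEIGHBOURS)
    )


def pick_gripped_object(objects, hand_cells):
    """Return the label of the object touching the hand the most, or None."""
    best_label, best_count = None, 0
    for label, cells in objects.items():
        count = touching_cells(cells, hand_cells)
        if count > best_count:
            best_label, best_count = label, count
    return best_label


hand_cells = {(2, 2), (2, 3)}
objects = {
    "a": {(0, 0), (0, 1)},            # far from the hand
    "b": {(1, 2), (1, 3), (0, 2)},    # rests directly above the hand
}
gripped = pick_gripped_object(objects, hand_cells)
```

The cells of the chosen object correspond to partial region A, the visible part of the gripped object.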

Next, the information combining unit 13a combines the area value or perimeter value of partial region A [see region diagram (a) of FIGS. 2 and 3] with the area value or perimeter value of region B [see region diagram (b) of FIGS. 2 and 3] sent from the limiting condition generation unit 12b3.
To combine the areas, the information combining unit 13a adds the area of region B to the area of partial region A. To combine the perimeters, it adds the perimeter of region B to the perimeter of partial region A and then subtracts twice the length of the boundary portions B1, B2, and B3 [see region diagram (c) of FIGS. 2 and 3], which are the portions along which partial region A and region B adjoin each other.
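The combination rule can be checked on a pixel grid. The sketch below assumes that partial region A and region B are disjoint sets of (row, column) cells and measures perimeter as the number of exposed unit cell edges; the function names are hypothetical. The shared boundary is subtracted twice because it is counted once in the perimeter of each region but is not part of the outline of the combined shape.

```python
NEIGHBOURS = ((0, 1), (0, -1), (1, 0), (-1, 0))


def perimeter(cells):
    """Number of unit cell edges on the outline of a set of grid cells."""
    return sum(
        1
        for (y, x) in cells
        for dy, dx in NEIGHBOURS
        if (y + dy, x + dx) not in cells
    )


def shared_boundary(region_a, region_b):
    """Number of unit edges along which a cell of region_a touches a cell of region_b."""
    return sum(
        1
        for (y, x) in region_a
        for dy, dx in NEIGHBOURS
        if (y + dy, x + dx) in region_b
    )


def combine(region_a, region_b):
    """Return (virtual area, virtual perimeter) of two disjoint, adjoining regions."""
    area = len(region_a) + len(region_b)
    outline = (
        perimeter(region_a)
        + perimeter(region_b)
        - 2 * shared_boundary(region_a, region_b)
    )
    return area, outline


region_a = {(0, 0), (0, 1), (1, 0), (1, 1)}   # visible part: a 2x2 block
region_b = {(0, 2), (1, 2)}                   # hidden part: a 2x1 strip to its right
virtual_area, virtual_perimeter = combine(region_a, region_b)
```

In this toy case the two regions together form a 2x3 rectangle, so the result of the rule can be compared directly with the perimeter of the union.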

This yields the virtual area or virtual perimeter of the target object 3, obtained by combining partial region A with the limiting condition that bounds, by area or perimeter, the part of the target object 3 hidden by the hand 2 in the image 1. The information combining unit 13a sends the calculated virtual area or virtual perimeter of the target object 3 to the object recognition unit 13b in the same object determination means 13. It also sends the shape information of partial region A of the target object 3 to the object recognition unit 13b.

The object recognition unit 13b identifies the overall shape of the object 3 from the virtual area value or virtual outer peripheral length value of the object 3 sent from the information combining unit 13a.
To do so, the object recognition unit 13b accesses the database 15a, shown in FIG. 5, which is registered in the dictionary 15.
The database 15a tabulates, as the data for each article, a set consisting of the article's name, its area (projected area), its outer peripheral length (outer peripheral length of the projected surface), and its overall shape, and it contains data for many kinds of articles. Within the database 15a, the article data are arranged in ascending order of area or outer peripheral length.

When the virtual area value of the object 3 is sent from the information combining unit 13a, the object recognition unit 13b sets the search range in the database 15a to include only articles whose area is equal to or less than that virtual area value, and matches the overall shape of each article in the search range against the shape of partial region A of the object 3 [see region diagram (a) in FIG. 3] sent from the information combining unit 13a. In doing so, the object recognition unit 13b searches the articles in the search range in descending order of area (toward smaller areas), starting from the article whose area is closest to the virtual area of the object 3, and matches each article against the object 3.
For example, when the virtual area value of the object 3 is 7, the shapes of the articles whose area in the database 15a is 7 or less, such as a banana or a strawberry, are matched against the shape of partial region A.
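
The area-limited search can be sketched as below, assuming the database is held as a list sorted by ascending area. The names `Article`, `candidates_by_area`, `recognize`, and `shape_matches` are hypothetical, the shape matcher is left as a caller-supplied function because the text does not specify one, and the area values are made up for illustration rather than taken from FIG. 5.

```python
from bisect import bisect_right
from typing import Callable, List, NamedTuple, Optional, Sequence


class Article(NamedTuple):
    name: str
    area: float  # projected area, arbitrary units (illustrative values)


def candidates_by_area(db: Sequence[Article], virtual_area: float) -> List[Article]:
    """Articles with area <= virtual_area, closest (largest) area first.

    `db` must already be sorted by ascending area, as the text describes.
    """
    areas = [a.area for a in db]
    cut = bisect_right(areas, virtual_area)  # first index whose area exceeds the bound
    return list(reversed(db[:cut]))


def recognize(db: Sequence[Article], virtual_area: float,
              shape_matches: Callable[[Article], bool]) -> Optional[Article]:
    """Try candidates in descending order of area; return the first shape match."""
    for article in candidates_by_area(db, virtual_area):
        if shape_matches(article):
            return article
    return None


# Hypothetical database contents.
DB = [Article("strawberry", 2.0), Article("banana", 6.0), Article("melon", 20.0)]
print([a.name for a in candidates_by_area(DB, 7.0)])
```

Because the list is sorted, the upper bound is found by binary search and articles larger than the virtual area are never passed to the shape matcher.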

Likewise, when the virtual outer peripheral length value of the object 3 is sent from the information combining unit 13a, the object recognition unit 13b sets the search range in the database 15a to include only articles whose outer peripheral length is equal to or less than that value, and matches the overall shape of each article in the search range against the shape of partial region A of the object 3 in the same manner as the area-based matching described above.

As a result of this matching, the object recognition unit 13b recognizes an article whose shape is similar to the shape of partial region A of the object 3 as the object 3. The overall shape of the object 3, which has a hidden portion in the image 1, is thereby identified. When matching the overall shapes of the articles in the database 15a against the shape of partial region A of the object 3, the object recognition unit 13b can also identify the gripped posture of the object 3, such as its orientation and inclination.
Thus, when the recognition apparatus 101 matches the shape of partial region A of the object 3 against the shapes of the articles in the database 15a in the dictionary 15, it compares the area values or outer peripheral length values of the object 3 and of the articles in the database 15a to limit, and thereby reduce the number of, the articles to be matched. This reduces the processing amount for matching and improves the processing speed.

As described above, according to the recognition apparatus 101 of Embodiment 1 for an object gripped by gripping means, the object 3 can be recognized by performing: an image acquisition step of acquiring an image 1 of objects including the hand 2 and the object 3 gripped by the hand 2; an object detection step of detecting the objects shown in the image 1; a hand detection step of detecting the hand 2 shown in the image 1 and estimating the gripping posture of the hand 2; and an object determination step of identifying the object 3 (partial region A) in the image 1 from the information on the objects detected in the object detection step and the information on the hand 2 detected in the hand detection step, and recognizing the overall shape of the object 3 from information on the appearance of the identified object 3 (partial region A) and information on the gripping posture of the hand 2 estimated in the hand detection step.

This makes it possible to calculate, from the gripping posture estimated from the image of the hand 2 shown in the image 1, the spatial range formed on the inner (gripping) side of the hand 2. This spatial range limits the region that the portion of the object 3 hidden by the hand 2 can occupy. The appearance-related elements of the object 3 are therefore constrained not only by the portion shown in the image 1 (partial region A) but also by the region that the hidden portion of the object 3 can occupy, and the overall shape of the object 3 is recognized under these constraints. This reduces the processing amount required to recognize the object 3. The recognition apparatus 101 can therefore improve the processing speed of recognizing the object 3.

The image acquisition step and the object detection step are performed by the image acquisition means 11 and the object detection means 12a, respectively.
The hand detection step is performed by the hand detection means 12b. The hand detection means 12b can also calculate the region enclosed by the hand 2 from the information on the estimated gripping posture of the hand 2 and, from that enclosed region, identify a region (region B) that includes the portion hidden by the hand 2 in the image 1.

The object determination step is performed by the object determination means 13. The object determination means 13 can also combine the identified object 3 (partial region A) with the region (region B) corresponding to the portion hidden by the hand 2 and recognize the overall shape of the object 3 from the combined object 3. Combining the appearance information of the region corresponding to the portion hidden by the hand 2 (region B) with that of the object 3 shown in the image 1 (partial region A) limits the range of elements relating to the appearance of the object 3 as a whole, such as its area and outer peripheral length. The overall shape of the object 3 is therefore recognized under constraints imposed not only by the portion shown in the image 1 (partial region A) but also by these appearance elements of the object 3.

The object determination means 13 can also access the database 15a, which contains information on the gripped object 3, search the database only within the range that satisfies a condition given by the information on the combined object 3 (the virtual area or virtual outer peripheral length of the object 3), and perform matching against the database. Because the search range of the database 15a can be limited to the range of that condition, the processing amount required for the search is reduced and the processing speed improves.

A plurality of image acquisition means 11 may be provided. When a plurality of images are supplied by the plurality of image acquisition means 11, the object detection means 12a and the hand detection unit 12b1 can detect, based on those images, the positions in three-dimensional space of the objects and the hand 2 included in the image 1. The image acquisition means 11 may also be a stereo camera. An image captured by a stereo camera can indicate not only the planar positions of the objects and the hand 2 included in the image 1 but also their distances in depth; that is, their three-dimensional positions can be detected. When the information combining unit 13a of the object determination means 13 identifies the object 3 gripped by the hand 2, the positions of the hand 2 and the objects in the image 1 can then be compared in three dimensions, which reduces the processing amount required to identify the object 3 and improves the processing speed.
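
The text does not say how three-dimensional positions are obtained or compared. As one possible illustration, the sketch below assumes a rectified stereo pair and the standard pinhole relation Z = f·B/d (focal length f in pixels, baseline B, disparity d > 0). The function names, camera parameters, and distance threshold are all assumptions introduced here.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]


def back_project(u: float, v: float, disparity: float, focal_px: float,
                 baseline_m: float, cx: float, cy: float) -> Point3:
    """Pixel (u, v) with a positive disparity -> 3D point in the camera frame (metres)."""
    z = focal_px * baseline_m / disparity  # depth from disparity
    return ((u - cx) * z / focal_px, (v - cy) * z / focal_px, z)


def is_near_hand(hand: Point3, obj: Point3, max_dist_m: float) -> bool:
    """Compare hand and object positions in three dimensions."""
    return math.dist(hand, obj) <= max_dist_m


# Hypothetical camera parameters and detections.
CAM = dict(focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0)
hand = back_project(320.0, 240.0, 25.0, **CAM)
cup = back_project(330.0, 240.0, 25.0, **CAM)     # next to the hand, same depth
poster = back_project(330.0, 240.0, 10.0, **CAM)  # same pixel as the cup, far behind
print(is_near_hand(hand, cup, 0.3), is_near_hand(hand, poster, 0.3))
```

In this example the cup and the poster are equally close to the hand in the image plane, and only the depth value separates the gripped candidate from the background object.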

When the object recognition unit 13b of the object determination means 13 matches the overall shapes of the articles in the database 15a against the shape of partial region A of the object 3, the search range in the database 15a may be limited to include only articles whose area and outer peripheral length are both equal to or less than the virtual area value and the virtual outer peripheral length value of the object 3, respectively. This further reduces the amount of data in the database 15a used for matching and further improves the processing speed.
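
Applying both bounds at once can be sketched as a simple filter. The `Article` record, the `candidates` function, and every numeric value below are hypothetical and serve only to show how the second bound removes articles that the first bound alone would keep.

```python
from typing import List, NamedTuple, Sequence


class Article(NamedTuple):
    name: str
    area: float       # projected area (illustrative values)
    perimeter: float  # outer peripheral length of the projection (illustrative values)


def candidates(db: Sequence[Article], virtual_area: float,
               virtual_perimeter: float) -> List[Article]:
    """Keep articles within both bounds, ordered from largest area downward."""
    kept = [a for a in db
            if a.area <= virtual_area and a.perimeter <= virtual_perimeter]
    return sorted(kept, key=lambda a: a.area, reverse=True)


DB = [
    Article("strawberry", 2.0, 6.0),
    Article("banana", 6.0, 18.0),
    Article("plate", 6.5, 9.5),
    Article("melon", 20.0, 16.0),
]
print([a.name for a in candidates(DB, 7.0, 12.0)])
```

Here the elongated article passes the area bound but fails the outer peripheral length bound, so it is excluded before any shape matching takes place.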

Embodiment 2.
The recognition apparatus 102 according to Embodiment 2 of the present invention, for an object gripped by gripping means, is configured so that information is transmitted from the hand detection unit 12b1 of the recognition apparatus 101 of Embodiment 1 to the object detection means 12a.
In the following embodiments, reference numerals identical to those in the preceding figures denote identical or similar components, and their detailed description is omitted.

Referring to FIG. 6, in the recognition apparatus 102, as in Embodiment 1, information is transmitted from the image acquisition means 11 to the object detection means 22a and to the hand detection unit 22b1 of the hand detection means 22b.
In addition, information is transmitted from the hand detection unit 22b1 to the hand shape estimation unit 22b2 and the object detection means 22a.
Accordingly, referring also to FIG. 1, the hand detection unit 22b1 receives the image 1 from the image acquisition means 11 and sends the position information and shape information of the hand 2 detected in the image 1 to the hand shape estimation unit 22b2 and the object detection means 22a.

The object detection means 22a detects the shape and position of each object in the image 1 supplied from the image acquisition means 11, and it also receives the position information of the hand 2 from the hand detection unit 22b1. Based on this information, the object detection means 22a starts searching for objects from the vicinity of the hand 2 and detects the positions and shapes of the objects located near the hand 2. That is, outside the hand 2, the object detection means 22a expands the search range outward from near the periphery of the hand 2 and detects the positions and shapes of the objects found near that periphery.
Further, based on the position information and shape information of the detected objects and the position information and shape information of the hand 2 sent from the hand detection unit 22b1, the object detection means 22a identifies, from among the detected objects, the object 3 gripped by the hand 2. The object detection means 22a then sends the position information and shape information of the identified object 3 to the information combining unit 23a of the object determination means 23.
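
A hand-first search of this kind might look like the sketch below. It simplifies the expanding search to ranking candidate regions by their gap from the hand and discarding those beyond a cutoff. Representing the hand and the objects as axis-aligned bounding boxes, as well as the names `Box`, `gap`, `objects_near_hand`, and `max_gap`, are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def gap(a: Box, b: Box) -> float:
    """Shortest distance between two axis-aligned boxes (0 if they touch or overlap)."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def objects_near_hand(hand: Box, objects: Dict[str, Box], max_gap: float) -> List[str]:
    """Names of objects within max_gap of the hand, nearest first."""
    near = []
    for name, box in objects.items():
        d = gap(hand, box)
        if d <= max_gap:
            near.append((d, name))
    return [name for _, name in sorted(near)]


# Hypothetical detections in image coordinates.
hand = Box(10, 10, 20, 20)
objects = {
    "chair": Box(100, 100, 150, 150),
    "pen": Box(25, 10, 27, 20),
    "cup": Box(20, 12, 30, 18),
}
print(objects_near_hand(hand, objects, max_gap=10.0))
```

Objects far from the hand never reach the later identification step, which is the source of the reduced processing amount described for this embodiment.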

The hand shape estimation unit 22b2 and the limiting condition generation unit 22b3 of the hand detection means 22b operate in the same manner as the hand shape estimation unit 12b2 and the limiting condition generation unit 12b3 of Embodiment 1 and send to the information combining unit 23a the area value or outer peripheral length value of region B shown in region diagram (b) of FIGS. 2 and 3, together with the position information and shape information of the hand 2.
The information combining unit 23a of the object determination means 23 combines the position information and shape information of the object 3 sent from the object detection means 22a with the area value or outer peripheral length value of region B [see region diagram (b) in FIGS. 2 and 3] sent from the limiting condition generation unit 22b3.

Because the object 3 has already been identified in Embodiment 2, the information combining unit 23a combines the area value or outer peripheral length value of partial region A of the object 3 [see region diagram (a) in FIGS. 2 and 3] with the area value or outer peripheral length value of region B [see region diagram (b) in FIGS. 2 and 3] to calculate the virtual area value or virtual outer peripheral length value of the object 3, and sends this value to the object recognition unit 23b.
The other configurations and operations of the recognition apparatus 102 according to Embodiment 2 of the present invention are the same as those of Embodiment 1, and their description is omitted.

As described above, the recognition apparatus 102 of Embodiment 2 provides the same effects as the recognition apparatus 101 of Embodiment 1.
In the recognition apparatus 102, the object detection means 22a can also identify the object 3 from among the objects shown in the image 1 based on the information on the hand 2 detected by the hand detection unit 22b1 of the hand detection means 22b. In its processing, the object detection means 22a uses the information on the position, shape, and so on of the hand 2 to start the search for objects in the image 1 from the vicinity of the hand 2, limits the objects it detects to those near the hand 2, and then identifies the object 3 from among the detected objects. The object detection means 22a thus reduces the object search range and the number of objects to be detected compared with Embodiment 1. Because the processing amount in the object detection means 22a and the information combining unit 23a is reduced, the recognition apparatus 102 achieves a higher processing speed than the recognition apparatus 101 of Embodiment 1.

Embodiment 3.
The recognition apparatus 103 according to Embodiment 3 of the present invention, for an object gripped by gripping means, is configured so that information is transmitted from the hand shape estimation unit 12b2 of the recognition apparatus 101 of Embodiment 1 to the object detection means 12a.

Referring to FIG. 7, in the recognition apparatus 103, as in Embodiment 1, information is transmitted from the image acquisition means 11 to the object detection means 32a and to the hand detection unit 32b1 of the hand detection means 32b. Information is also transmitted from the hand detection unit 32b1 to the hand shape estimation unit 32b2.
In addition, information is transmitted from the hand shape estimation unit 32b2 to the limiting condition generation unit 32b3 and the object detection means 32a.

Referring also to FIG. 1, the hand shape estimation unit 32b2, as in Embodiment 1, receives from the hand detection unit 32b1 the position information and shape information of the hand 2 detected in the image 1 and estimates the gripping posture of the hand 2 from the shape information of the hand 2. The hand shape estimation unit 32b2 then sends the estimated gripping posture information of the hand 2, together with the position information and shape information of the hand 2, to the limiting condition generation unit 32b3 and the object detection means 32a.
Based on the information sent from the hand shape estimation unit 32b2, the limiting condition generation unit 32b3 calculates the area value or outer peripheral length value of region B shown in region diagram (b) of FIGS. 2 and 3 and sends the calculated value, together with the position information and shape information of the hand 2, to the information combining unit 33a of the object determination means 33.

The object detection means 32a detects the shape and position of each object in the image 1 supplied from the image acquisition means 11. In addition, the hand shape estimation unit 32b2 supplies the object detection means 32a with the gripping posture information of the hand 2 and the position information and shape information of the hand 2. When detecting the shapes and positions of objects in the image 1, the object detection means 32a can therefore use the position information of the hand 2 to start the search from the vicinity of the hand 2 and can further use the gripping posture information to restrict the search to the region that the hand 2 can grip (the grip-side region). That is, the object detection means 32a starts searching for objects from the vicinity of the hand 2 within the grippable region (grip-side region) of the hand 2 and detects the positions and shapes of the objects that lie in that region and near the hand 2. The grip-side region includes, for example, the region on the inner side of the fingers and palm of the hand 2 and excludes the region on the back side of the hand 2.
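
One way to picture the grip-side restriction is a half-plane test. The sketch below assumes, purely for illustration, that the gripping posture is summarized by a palm position and a direction vector `palm_normal` pointing out of the palm toward the gripping side. Neither this representation nor the function names come from the text.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def on_grip_side(palm: Point, palm_normal: Point, point: Point) -> bool:
    """True if the point lies on the palm side of the hand rather than the back side."""
    return ((point[0] - palm[0]) * palm_normal[0]
            + (point[1] - palm[1]) * palm_normal[1]) > 0.0


def grip_side_objects(palm: Point, palm_normal: Point,
                      centroids: Dict[str, Point], max_dist: float) -> List[str]:
    """Objects on the grip side and within max_dist of the palm, nearest first."""
    hits = []
    for name, c in centroids.items():
        d = math.dist(palm, c)
        if d <= max_dist and on_grip_side(palm, palm_normal, c):
            hits.append((d, name))
    return [name for _, name in sorted(hits)]


# Hypothetical scene: the palm faces the +x direction.
centroids = {"cup": (3.0, 1.0), "watch": (-2.0, 0.0), "lamp": (50.0, 0.0), "ball": (1.0, 0.0)}
print(grip_side_objects((0.0, 0.0), (1.0, 0.0), centroids, max_dist=10.0))
```

The object behind the hand is dropped by the side test and the distant one by the distance cutoff, so only objects that the hand could actually be gripping remain.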

Further, based on the detected positions and shapes of the objects and the position information and shape information of the hand 2 sent from the hand shape estimation unit 32b2, the object detection means 32a identifies, from among the detected objects, the object 3 gripped by the hand 2. The object detection means 32a then sends the position information and shape information of the identified object 3 to the information combining unit 33a of the object determination means 33.

The information combining unit 33a combines, in the same manner as in Embodiment 2, the position information and shape information of the object 3 sent from the object detection means 32a with the area value or outer peripheral length value of region B [see region diagram (b) in FIGS. 2 and 3] sent from the limiting condition generation unit 32b3, calculates the virtual area value or virtual outer peripheral length value of the object 3, and sends this value to the object recognition unit 33b.
The other configurations and operations of the recognition apparatus 103 according to Embodiment 3 of the present invention are the same as those of Embodiment 1, and their description is omitted.

As described above, the recognition apparatus 103 of Embodiment 3 provides the same effects as the recognition apparatus 101 of Embodiment 1.
In the recognition apparatus 103, the object detection means 32a can also identify the object 3 from among the objects shown in the image 1 based on the information on the position and shape of the hand 2 detected by the hand detection unit 32b1 of the hand detection means 32b and the information on the gripping posture of the hand 2 estimated by the hand shape estimation unit 32b2. In the processing of the object detection means 32a, the detection of object positions and shapes in the image 1 uses the gripping posture information and the position and shape of the hand 2 to start the search from the vicinity of the hand 2 within the grippable region (grip-side region) of the hand 2 and limits the detected objects to those that lie in that region and near the hand 2. The object detection means 32a then identifies the object 3 from among the detected objects. The object search range and the number of objects to be detected are thus reduced compared with Embodiment 2. Because the processing amount in the object detection means 32a is reduced, the recognition apparatus 103 achieves a higher processing speed than the recognition apparatus 102 of Embodiment 2.

Embodiment 4.
The recognition apparatus 104 according to Embodiment 4 of the present invention, for an object gripped by gripping means, modifies the region B calculated by the limiting condition generation unit 12b3 of the recognition apparatus 101 of Embodiment 1: region B is taken to be a cross section of the space itself that the hand 2 forms on the side gripping the object 3.

Referring to FIG. 9, the recognition apparatus 104, as in Embodiment 1, comprises the image acquisition means 11, the object detection means 42a, the hand detection means 42b, and the object determination means 43.
As in Embodiment 1, the limiting condition generation unit 42b3 of the hand detection means 42b receives, from the hand shape estimation unit 42b2 of the same hand detection means 42b, the position information and shape information of the hand 2 (see FIG. 1) detected in the image 1 (see FIG. 1) by the hand detection unit 42b1, together with the gripping posture information of the hand 2 estimated by the hand shape estimation unit 42b2.

Referring also to FIG. 8, the limiting condition generation unit 42b3 determines, from the gripping posture information of the hand 2, the range of the space formed by the hand 2 on the side gripping the object 3. From among the cross sections of this space that are parallel to the image 1 (see FIG. 1), the limiting condition generation unit 42b3 then finds the cross section with the largest cross-sectional area or the one with the largest outer peripheral length. The limiting condition generation unit 42b3 projects this cross section onto the image 1; the projected region is shown as region Bd in FIG. 8. In Embodiment 4, the cross section with the largest cross-sectional area and the one with the largest outer peripheral length are taken to be the same. The limiting condition generation unit 42b3 further detects the position and shape of region Bd, calculates the cross-sectional area value or outer peripheral length value of region Bd, and sends this information to the information combining unit 43a of the object determination means 43. At the same time, the limiting condition generation unit 42b3 sends the position information and shape information of the hand 2, received from the hand shape estimation unit 42b2, to the information combining unit 43a.
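
Selecting the largest cross section can be sketched as follows, assuming the space formed by the hand is available as a stack of boolean occupancy masks, one per depth slice parallel to the image. This voxel representation and the names `slice_area`, `slice_perimeter`, and `largest_cross_section` are assumptions; the text does not specify how the space is represented.

```python
from typing import Callable, List, Tuple

Mask = List[List[bool]]


def slice_area(mask: Mask) -> int:
    """Number of occupied cells in one slice."""
    return sum(1 for row in mask for cell in row if cell)


def slice_perimeter(mask: Mask) -> int:
    """Number of cell edges that border an empty cell or the grid boundary."""
    h, w = len(mask), len(mask[0])
    edges = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w and mask[ny][nx]):
                    edges += 1
    return edges


def largest_cross_section(volume: List[Mask],
                          measure: Callable[[Mask], int] = slice_area) -> Tuple[int, Mask]:
    """Index and mask of the slice that maximizes the chosen measure."""
    z = max(range(len(volume)), key=lambda i: measure(volume[i]))
    return z, volume[z]


T, F = True, False
volume = [
    [[F, F, F], [F, T, F], [F, F, F]],  # single cell
    [[F, T, F], [T, T, T], [F, T, F]],  # plus shape
    [[F, F, F], [T, T, T], [F, F, F]],  # row of three
]
z, region_bd = largest_cross_section(volume)
print(z, slice_area(region_bd), slice_perimeter(region_bd))
```

In this toy volume the same slice maximizes both measures, which mirrors the assumption stated for Embodiment 4. Under an orthographic projection the selected mask can be used directly as region Bd in image coordinates.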

Based on the position information and shape information of each object in the image 1 (see FIG. 1) detected and sent by the object detection means 42a, the position information and shape information of the hand 2, and the position information and shape information of region Bd, the information combining unit 43a identifies, from among the objects detected by the object detection means 42a, the object 3 gripped by the hand 2. The information combining unit 43a then identifies the region obtained by removing from the region of the identified object 3 the part that overlaps region Bd, and detects the position and shape of this region. This region of the object 3, with the overlap with region Bd removed, is the partial region Ad shown in FIG. 8.

Next, the information combining unit 43a calculates the area or the outer peripheral length of the partial region Ad, and combines this value with the area or the outer peripheral length of the region Bd sent from the limiting condition generation unit 42b3 to calculate the virtual area or the virtual outer peripheral length of the object 3. That is, the virtual area of the object 3 is the area of the whole region formed by combining the partial region Ad and the region Bd, and the virtual outer peripheral length of the object 3 is the outer peripheral length of that whole combined region.
The information combining unit 43a then sends the calculated virtual area or virtual outer peripheral length of the object 3, together with the shape information of the partial region Ad of the object 3, to the object recognition unit 43b.
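As a rough illustration of the virtual area and virtual outer peripheral length, the combined region can be treated as the union of two pixel sets. The pixel-set representation and the edge-counting measure of peripheral length are assumptions made for this sketch; the text does not specify how either value is measured.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, col), assumed representation


def area(region: Set[Pixel]) -> int:
    """Area measured as the number of pixels in the region."""
    return len(region)


def peripheral_length(region: Set[Pixel]) -> int:
    """Count unit pixel edges that border a pixel outside the region.
    Simplification: edges around interior holes are counted as well."""
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return sum((r + dr, c + dc) not in region
               for (r, c) in region
               for (dr, dc) in neighbours)


def virtual_measures(region_ad: Set[Pixel], region_bd: Set[Pixel]) -> Tuple[int, int]:
    """Virtual area and virtual peripheral length of the combined region."""
    combined = region_ad | region_bd
    return area(combined), peripheral_length(combined)


# Hypothetical example: a 2x2 block and a 2x3 block sharing one column.
region_ad = {(r, c) for r in range(2) for c in range(0, 2)}
region_bd = {(r, c) for r in range(2) for c in range(1, 4)}
virtual_area, virtual_length = virtual_measures(region_ad, region_bd)
```

Taking the union before measuring means that pixels shared by both regions are counted once, which matches the description of the virtual values as properties of the whole combined region.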

The other configuration and operation of the recognition device 104 for an object gripped by gripping means according to Embodiment 4 of the present invention are the same as in Embodiment 1, and their description is omitted.
Thus, the recognition device 104 of Embodiment 4 provides the same effects as the recognition device 101 of Embodiment 1.
In addition, in the recognition device 104, the limiting condition generation unit 42b3 of the hand detection means 42b calculates the area or the outer peripheral length of the region Bd, which is a cross section of the space formed by the hand 2 on the side that grips the object 3, and uses this value as the limiting condition for identifying the overall shape of the object 3. This reduces fine-grained processing in the limiting condition generation unit 42b3, such as calculating the region where the hand 2 and the region Bd overlap, or calculating the lengths of the finger edges of the hand 2 in order to obtain the length of the outer periphery of the hand 2 within the region Bd. Because the processing load in the limiting condition generation unit 42b3 is reduced, the recognition device 104 can operate faster than the recognition device 101 of Embodiment 1.

Embodiment 5.
The recognition device 105 for an object gripped by gripping means according to Embodiment 5 of the present invention is the recognition device 101 of Embodiment 1 modified so that information is sent from the object detection means 12a to the limiting condition generation unit 12b3.

Referring to FIG. 11, in the recognition device 105, as in Embodiment 1, information is sent from the image acquisition means 11 to the object detection means 52a and to the hand detection unit 52b1 of the hand detection means 52b. Information is further sent from the hand detection unit 52b1 to the hand shape estimation unit 52b2, and from the hand shape estimation unit 52b2 to the limiting condition generation unit 52b3. The object detection means 52a sends information to the limiting condition generation unit 52b3 and to the information combining unit 53a of the object determination means 53.
As in Embodiment 1, the hand shape estimation unit 52b2 sends to the limiting condition generation unit 52b3 the position information and shape information of the hand 2 (see FIG. 1) detected in the image 1 (see FIG. 1) by the hand detection unit 52b1, together with the gripping posture information of the hand 2 estimated by the hand shape estimation unit 52b2. The object detection means 52a sends the position information and shape information of each object detected in the image 1 to the limiting condition generation unit 52b3 and to the information combining unit 53a.

Referring also to FIG. 10, the limiting condition generation unit 52b3 identifies, as in Embodiment 1, the object 3 gripped by the hand 2 from among the objects detected by the object detection means 52a, based on the position information and shape information of each object in the image 1 (see FIG. 1) sent from the object detection means 52a and the position information and shape information of the hand 2 sent from the hand shape estimation unit 52b2. The identified object 3 is indicated by the partial region A in FIG. 10.
The limiting condition generation unit 52b3 also calculates the region B shown in FIG. 10 from the information sent from the hand shape estimation unit 52b2. The region B is the same as the region B in Embodiment 1.

Next, the limiting condition generation unit 52b3 detects the positions and shapes of the partial region A and the region B and, based on this position information and shape information, identifies the region C shown in FIG. 10, which is the region enclosed by the partial region A (object 3) and the region B.
The limiting condition generation unit 52b3 then detects the boundary line C1 between the region C and the partial region A. It extends both ends of the boundary line C1, following the curvature of the boundary line C1, into the interior of the region B until they reach the outer peripheral portion B4 of the region B, thereby setting extension line portions C2 and C3 at the two ends of the boundary line C1. The extension line portions C2 and C3 may instead be straight lines extending in the tangential direction of the boundary line C1.

The limiting condition generation unit 52b3 then adopts, as a new region Be1, only the part of the region B that lies on the partial region A (object 3) side of the extension line portions C2 and C3, and deletes the region Be2, which is the part of the region B on the side opposite the partial region A (object 3). The region Be1 is therefore limited to a part of the region B, and its shape and area are closer than those of the region B to the actual region of the object 3 hidden by the hand 2.
The limiting condition generation unit 52b3 further calculates the area or the outer peripheral length of the region Be1 and sends this information to the information combining unit 53a of the object determination means 53. At the same time, it sends the position information and shape information of the partial region A to the information combining unit 53a.
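The division of the region B into Be1 and Be2 can be illustrated with a strongly simplified sketch. Here the boundary line C1 and its extension line portions C2 and C3 are replaced by one straight line through two given points, which corresponds only loosely to the straight-line variant mentioned in the text; the function names, the pixel-set representation, and the use of a reference point inside the partial region A are assumptions.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, col), assumed representation


def side(p: Pixel, q: Pixel, x: Pixel) -> int:
    """Cross product (q - p) x (x - p); its sign tells on which side of
    the line through p and q the point x lies (0 means on the line)."""
    return (q[0] - p[0]) * (x[1] - p[1]) - (q[1] - p[1]) * (x[0] - p[0])


def split_region_b(region_b: Set[Pixel],
                   line_start: Pixel,
                   line_end: Pixel,
                   point_in_a: Pixel) -> Tuple[Set[Pixel], Set[Pixel]]:
    """Split region B by a straight line. Pixels on the same side as
    point_in_a (or on the line itself) form Be1; the rest form Be2."""
    ref = side(line_start, line_end, point_in_a)
    be1 = {px for px in region_b
           if side(line_start, line_end, px) * ref >= 0}
    return be1, region_b - be1


# Hypothetical example: a 4x4 region B cut along row 2, with the
# partial region A lying below it (larger row index).
region_b = {(r, c) for r in range(4) for c in range(4)}
be1, be2 = split_region_b(region_b, (2, 0), (2, 3), point_in_a=(5, 1))
```

In this sketch the reference point must not lie on the dividing line, since every pixel would then be kept. A curved extension that follows the curvature of C1 would need a different side test.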

The information combining unit 53a calculates the area or the outer peripheral length of the partial region A from the shape information of the partial region A sent from the limiting condition generation unit 52b3, and combines this value with the area or the outer peripheral length of the region Be1 sent from the limiting condition generation unit 52b3 to calculate the virtual area or the virtual outer peripheral length of the object 3. That is, the virtual area of the object 3 is the area of the whole region formed by combining the partial region A and the region Be1, and the virtual outer peripheral length of the object 3 is the outer peripheral length of that whole combined region. The virtual area or virtual outer peripheral length calculated in this way is closer to the actual area or outer peripheral length of the whole object 3 than the value obtained using the region B as in Embodiment 1.
The calculated virtual area or virtual outer peripheral length of the object 3 is sent by the information combining unit 53a to the object recognition unit 53b together with the shape information of the partial region A of the object 3.

The other configuration and operation of the recognition device 105 for an object gripped by gripping means according to Embodiment 5 of the present invention are the same as in Embodiment 1, and their description is omitted.
Thus, the recognition device 105 of Embodiment 5 provides the same effects as the recognition device 101 of Embodiment 1.
In addition, because the virtual area or virtual outer peripheral length of the object 3 in the recognition device 105 is calculated with the region B of Embodiment 1 limited to its part Be1, it is closer to the actual area or outer peripheral length of the whole object 3 than the value calculated in Embodiment 1. This narrows the search range of articles in the database 15a (see FIG. 5), which reduces the processing load when the object recognition unit 53b of the object determination means 53 performs matching against the database 15a. The recognition device 105 can therefore operate faster than the recognition device 101 of Embodiment 1.
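The effect of a tighter virtual area on the search range can be sketched as follows. This section does not state the exact search criterion, so the sketch assumes that the visible area of the partial region A is a lower bound and the virtual area an upper bound on the article area; the `Article` record and all values are hypothetical.

```python
from typing import List, NamedTuple


class Article(NamedTuple):
    """Hypothetical database record: article name and full area."""
    name: str
    area: float


def search_range(database: List[Article],
                 visible_area: float,
                 virtual_area: float) -> List[Article]:
    """Keep articles whose area lies between the visible area (assumed
    lower bound) and the virtual area (assumed upper bound)."""
    return [a for a in database if visible_area <= a.area <= virtual_area]


# Hypothetical database and measurements.
database = [Article("pen", 30), Article("cup", 120),
            Article("bottle", 200), Article("box", 400)]

with_region_b = search_range(database, visible_area=100, virtual_area=450)
with_region_be1 = search_range(database, visible_area=100, virtual_area=250)
```

With the smaller upper bound obtained from Be1, fewer candidates remain for shape matching, which is the reduction in processing load described above.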

Embodiment 6.
The recognition device 106 for an object gripped by gripping means according to Embodiment 6 of the present invention is the recognition device 101 of Embodiment 1 modified so that the limiting condition generation unit 12b3, which in Embodiment 1 sends the area or the outer peripheral length of the region B to the information combining unit 13a, sends the position information and shape information of the region B instead.

Referring to FIG. 12, the recognition device 106 comprises, as in Embodiment 1, the image acquisition means 11, the object detection means 62a, the hand detection means 62b, and the object determination means 63.
The limiting condition generation unit 62b3 of the hand detection means 62b receives the position information and shape information of the hand 2 (see FIG. 1) detected by the hand detection unit 62b1, and the posture information of the hand 2 estimated by the hand shape estimation unit 62b2.
Referring also to FIGS. 2 and 3, the limiting condition generation unit 62b3 calculates, as in Embodiment 1, the region B shown in FIG. 2 and region diagram (b) of FIG. 3 from the information it receives. The limiting condition generation unit 62b3 then detects the position and shape of the region B and sends this position information and shape information to the information combining unit 63a of the object determination means 63. At the same time, it sends the position information and shape information of the hand 2 to the information combining unit 63a.

The information combining unit 63a identifies the object 3 gripped by the hand 2 from among the objects detected by the object detection means 62a, based on the position information and shape information of each object in the image 1 (see FIG. 1) detected and sent by the object detection means 62a and the position information and shape information of the hand 2 sent from the limiting condition generation unit 62b3. The identified object 3 is indicated by the partial region A in FIG. 2 and region diagram (a) of FIG. 3.
The information combining unit 63a then combines the partial region A [see region diagram (a) of FIG. 3] with the region B [see region diagram (b) of FIG. 3] sent from the limiting condition generation unit 62b3, based on their position information and shape information (see FIG. 2). It then calculates the virtual area or the virtual outer peripheral length of the region formed by combining the partial region A and the region B.
The information combining unit 63a sends the calculated virtual area or virtual outer peripheral length to the object recognition unit 63b. At the same time, it also sends the shape information of the partial region A to the object recognition unit 63b.
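Combining two regions from their position information and shape information can be sketched by first placing each shape at its position in the image and then taking the union. Representing the shape as pixels in local coordinates and the position as a (row, column) offset is an assumption made for this sketch, as are the function names.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, col), assumed representation


def place(shape: Set[Pixel], position: Pixel) -> Set[Pixel]:
    """Translate a shape given in local coordinates to image coordinates
    using its position (assumed to be a row/column offset)."""
    dr, dc = position
    return {(r + dr, c + dc) for (r, c) in shape}


def combine(shape_a: Set[Pixel], position_a: Pixel,
            shape_b: Set[Pixel], position_b: Pixel) -> Set[Pixel]:
    """Combine partial region A and region B in image coordinates."""
    return place(shape_a, position_a) | place(shape_b, position_b)


# Hypothetical example: two 2x2 shapes placed one column apart.
square = {(0, 0), (0, 1), (1, 0), (1, 1)}
combined = combine(square, (0, 0), square, (0, 1))
virtual_area = len(combined)  # area measured in pixels
```

Because the combination is done on positioned regions rather than on precomputed area values, overlapping pixels are counted once in the virtual area.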

The other configuration and operation of the recognition device 106 for an object gripped by gripping means according to Embodiment 6 of the present invention are the same as in Embodiment 1, and their description is omitted.
Thus, the recognition device 106 of Embodiment 6 provides the same effects as the recognition device 101 of Embodiment 1.

In the object determination means 13 to 63 of Embodiments 1 to 6, the virtual area or the virtual outer peripheral length of the object 3, obtained by combining the limiting condition with the partial region A or Ad of the object 3, is used to limit the search range in the database 15a, and the shapes of the articles within the search range are matched against the shape of the object 3; however, the invention is not limited to this. Items for the softness (feel) and the color of each article may be added to the database 15a, and the search range of articles in the database 15a may be further limited by the softness (feel) detected from the object 3 and the color of the object 3 detected from the image 1. This further reduces the processing load when matching the shapes of the articles in the database 15a against the shape of the object 3, so that the processing speed of the recognition devices 101 to 106 can be improved.
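A further restriction of the search range by color and softness could look like the sketch below. The record fields, the exact-match rule for color, the numeric softness scale from 0 to 1, and the tolerance value are all hypothetical; the text only states that such items may be added to the database 15a.

```python
from typing import List, NamedTuple


class Article(NamedTuple):
    """Hypothetical database record with the additional items."""
    name: str
    color: str
    softness: float  # assumed scale: 0.0 (hard) to 1.0 (soft)


def refine(candidates: List[Article],
           color: str,
           softness: float,
           tolerance: float = 0.1) -> List[Article]:
    """Keep candidates whose color matches and whose softness is within
    the given tolerance of the detected value (assumed criteria)."""
    return [a for a in candidates
            if a.color == color and abs(a.softness - softness) <= tolerance]


# Hypothetical candidates left after limiting by virtual area.
candidates = [Article("sponge", "yellow", 0.8),
              Article("lemon", "yellow", 0.3),
              Article("towel", "white", 0.8)]

remaining = refine(candidates, color="yellow", softness=0.75)
```

Shape matching would then run only on the articles in `remaining`.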

In the recognition devices 101 to 106 of Embodiments 1 to 6, the gripping means for the object 3 is a human hand 2, but it is not limited to this and may be a robot hand.
In Embodiments 1 to 6, the overall shape of the object 3 is recognized when part of it is hidden in the image 1 by the hand 2, that is, by the gripping means, but the invention is not limited to this. The method by which the recognition devices 101 to 106 recognize the object 3 can also be applied when the object 3 is hidden by something other than gripping means, such as a cloth.

1: image; 2: hand (gripping means); 3: object; 11: image acquisition means; 12a, 22a, 32a, 42a, 52a, 62a: object detection means; 12b, 22b, 32b, 42b, 52b, 62b: hand detection means; 13, 23, 33, 43, 53, 63: object determination means; 15a: database; 101, 102, 103, 104, 105, 106: recognition device for an object gripped by gripping means.

Claims

A method for recognizing an object gripped by gripping means, comprising:
an image acquisition step of acquiring an image that includes the object gripped by the gripping means and the gripping means;
an object detection step of detecting an object shown in the image;
a gripping means detection step of detecting the gripping means shown in the image and estimating a gripping posture of the gripping means; and
an object determination step of identifying the object in the image from information on the object detected in the object detection step and information on the gripping means detected in the gripping means detection step, and recognizing the overall shape of the object from information on the appearance of the identified object and information on the gripping posture of the gripping means estimated in the gripping means detection step.

The method for recognizing an object gripped by gripping means according to claim 1, wherein the gripping means detection step further includes:
calculating a region enclosed by the gripping means from the information on the estimated gripping posture of the gripping means; and
identifying, from the region enclosed by the gripping means, a region that includes the portion hidden by the gripping means in the image.

The method for recognizing an object gripped by gripping means according to claim 2, wherein the object determination step further includes:
combining the identified region with the identified object; and
recognizing the overall shape of the object from the combined object.

The method for recognizing an object gripped by gripping means according to claim 3, wherein the object determination step further includes:
accessing a database containing information on the object to be gripped, searching the database only within the range that satisfies a condition given by the information on the combined object, and performing matching against the database.

The method for recognizing an object gripped by gripping means according to claim 4, wherein, in the object determination step,
the information on the combined object includes at least one of the area and the outer peripheral length of the combined object.

The method for recognizing an object gripped by gripping means according to any one of claims 1 to 5, wherein the object detection step further includes:
identifying the object from among the objects shown in the image based on the information on the gripping means detected in the gripping means detection step.

The method for recognizing an object gripped by gripping means according to claim 1, wherein the object detection step further includes:
identifying the object from among the objects shown in the image based on the information on the gripping posture of the gripping means estimated in the gripping means detection step.

The method for recognizing an object gripped by gripping means according to any one of claims 1 to 5, wherein the object detection step further includes:
identifying the object from among the objects shown in the image based on the information on the gripping means detected in the gripping means detection step and the information on the gripping posture of the gripping means estimated in the gripping means detection step.

A program for causing a computer to execute each step of the method for recognizing an object gripped by gripping means according to any one of claims 1 to 8.