JPH0719834A

JPH0719834A - Recognizing apparatus for object

Info

Publication number: JPH0719834A
Application number: JP5164343A
Authority: JP
Inventors: Takashi Kimoto; 隆木本; Daiki Masumoto; 大器増本; Hiroshi Yamakawa; 宏山川; Shigemi Osada; 茂美長田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1993-07-02
Filing date: 1993-07-02
Publication date: 1995-01-20

Abstract

PURPOSE: To recognize a three-dimensional object efficiently by active sensing. CONSTITUTION: A corresponding point determination unit 14 estimates corresponding points from an internal model 10. A coordinate transformation unit 15 then translates and rotates them based on an initially set translation amount 16 and rotation transformation amount 20, and a sensor projection unit 26 predicts the sensor information. A recognition processing unit 25 computes the error between the predicted sensor information and the observed sensor information, corrects the translation amount 16 and the rotation transformation amount 20 by back propagation so as to reduce the error, and thereby determines the coordinate transformation between the internal model 10 and an observation target object 34. At the same time, the error is propagated to the corresponding point determination unit 14 to correct the correspondence relationship, so that the corresponding points are estimated. Furthermore, a visual sensor 30 is actively moved by an actuator 32, and the coordinate transformation and the corresponding points are determined so as to be consistent across the plurality of images obtained.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an object recognition device used as an active sensing technology in robots, industrial equipment, audiovisual recognition, and the like.

[0002]

[Prior Art] A sensing technology such as computer vision, for example, acquires a two-dimensional image of a three-dimensional object to be observed with a visual sensor such as a television camera and infers the three-dimensional structure of the feature quantities of the observed object from this two-dimensional image. It then recognizes the object by matching the result against a model of the observed object held as knowledge.

[0003]

[Problems to Be Solved by the Invention] In such conventional object recognition, however, it is generally difficult to recover the three-dimensional structure of the observed object uniquely from a single two-dimensional image, as is clear from the fact that objects of different shapes can have the same two-dimensional appearance. The usual approach is therefore to acquire a plurality of images, using either stereo vision with several visual sensors or active sensing in which a visual sensor is actively moved, to reconstruct the three-dimensional structure of the observed object from these images, and then to match the result against a model of the observed object.

[0004] This approach has the problem that the device configuration and the processing become complicated as the number of sensors increases. There is therefore a demand for an efficient technique for reconstructing three-dimensional structure and for a technique for matching the reconstructed three-dimensional structure of the observation target object against a model. The present invention has been made in view of these conventional problems. Its object is to provide an object recognition device that can simultaneously reconstruct the three-dimensional structure of the observation target object from a plurality of images acquired by active sensing, determine the coordinate transformation between the internal model and the observation target object, and determine the correspondence between the feature quantities of the internal model and those of the observation target object.

[0005]

[Means for Solving the Problems] FIG. 1 illustrates the principle of the present invention. The object recognition device of the present invention basically comprises a corresponding point determination unit 14, a coordinate transformation unit 15, a sensor projection unit 26, and a recognition processing unit 25. The corresponding point determination unit 14 determines the correspondence between the feature quantities of the internal model 10 and the feature quantities of the observation target object 34. The coordinate transformation unit 15 includes a translation unit 18 and a rotation transformation unit 22, and rotates and translates the internal model 10 based on the translation amount 16 and the rotation transformation amount 20.

[0006] The sensor projection unit 26 generates sensor information from the internal model 10. The recognition processing unit 25 determines the correspondence relationship of the corresponding point determination unit 14, together with the translation amount 16 and the rotation transformation amount 20 for the coordinate transformation unit 15, so as to minimize the error energy that the error calculation unit 28 obtains between the sensor information of the observation target object 34 and the predicted sensor information derived from the internal model 10. In this way it identifies the observation target object 34 with the internal model 10.

[0007] Here, the recognition processing unit 25 identifies the observation target object 34 with the internal model 10 by determining the correspondence relationship of the corresponding point determination unit 14 and the translation amount 16 and rotation transformation amount 20 of the coordinate transformation unit 15 so as to minimize the error energy between the three-dimensional structure of the observation target object 34 and the three-dimensional structure predicted from the internal model 10. The corresponding point determination unit 14 is configured as a two-layer neural network that receives the feature quantities of the internal model of the observation target object 34 at its input layer units and outputs the feature quantities of the observation target object 34 from its output layer units. It estimates the correspondence relationship by correcting the weights of the neural network so as to minimize the error energy between the observed feature quantities of the observation target object 34 and the output feature quantities.

[0008] A constraint is imposed on this neural network so that the sum of the weights connected to each input unit and each output unit equals a fixed value. Specifically, a constraint is added so that the connection weights of each input and output unit take the value 0 or 1. The present invention further provides a visual sensor 30 that can be moved by an actuator 32 as the means for observing the observation target object 34. The recognition processing unit 25 is accordingly characterized in that, by operating the actuator 32 of the visual sensor 30, it identifies the three-dimensional structure of the observation target object 34, reconstructed from a plurality of observed images, with the internal model 10.

[0009]

[Operation] In the object recognition device of the present invention, the corresponding point determination unit 14 first estimates corresponding points from the internal model 10. The coordinate transformation unit 15 then translates and rotates these points based on the initially set translation amount 16 and rotation transformation amount 20, and the sensor projection unit 26 predicts the sensor information. The recognition processing unit 25 computes the error between the predicted and the observed sensor information in the error calculation unit 28 and back-propagates it so as to reduce the error. This corrects the translation amount 16 and the rotation transformation amount 20 and progressively determines the coordinate transformation between the internal model 10 and the observation target object 34.

[0010] At the same time, the error is propagated to the corresponding point determination unit 14 and the correspondence relationship is corrected, so that the corresponding points are progressively estimated. Furthermore, the visual sensor 30 is actively moved by the actuator 32, and the above coordinate transformation and corresponding points are determined so as to be consistent across the plurality of images acquired, whereby the observation target object 34 is identified with the internal model 10.

[0011]

[Embodiments]

<Table of contents>
1. Hierarchical sensory information processing model of the present invention
2. Intentional sensing
3. Recognition model
(1) Outline of the model
(2) Sensor information conversion model
(3) Permutation neural network
4. Object recognition
(1) Reconstruction of three-dimensional structure
(2) Correspondence with the internal model
5. Simulation experiments
(1) Reconstruction of three-dimensional structure
(2) Correspondence with the internal model

1. Hierarchical sensory information processing model

FIG. 2 outlines the hierarchical sensory information processing model applied to the object recognition device of the present invention. In this model, a plurality of autonomous processing units 36 are interconnected hierarchically on top of a sensor 30, which takes in information from the outside world 38, and an actuator 32, which acts on the outside world 38.

[0012] FIG. 3 outlines the processing unit 36 of FIG. 2. The processing unit 36 consists of three modules: a recognition module 42, a motor module 44, and a sensorimotor fusion module 46. The recognition module 42 converts sensory information 48 from the lower layer into higher-level sensory information 50 with a higher degree of abstraction and passes it to the upper layer. The motor module 44 converts a motor command 54 from the upper layer into a motor command 56 for the lower layer and acts on the lower-layer processing unit.

[0013] The hierarchical sensory information processing model applied to the present invention has two distinguishing features: a mechanism by which the processing unit 36 accepts a goal (intention) 52 from the upper layer and uses it effectively for recognition, and the sensorimotor fusion module 46, which organically couples recognition and movement. The sensorimotor fusion module 46 predicts the change in the sensory information 48 caused by movement and notifies the recognition module 42 of it as a prediction target 58. It also generates a predicted motor target 60 for the motor module 44 based on the sensory information 50 from the recognition module 42 and the goal (intention) 52 from the upper layer.

[0014] In conventional recognition, the basic information paths were only the vertical paths shown by the solid lines in FIG. 3. The model of the present invention additionally introduces information paths between recognition and movement, which realizes internal feedback that unifies the two. Each processing unit 36 can therefore operate autonomously based on the intention 52 from the upper layer, and a plurality of processing units 36 can operate cooperatively toward a goal.

2. Intentional sensing

With regard to activeness, the object recognition device of the present invention aims to realize sensor information processing with intention, that is, intentional sensing. This goes beyond the framework of conventional active sensing, in which the actuator that moves the sensor capturing the observed object is used merely as a part of sensing.

[0015] Unlike conventional passive recognition processing, intentional sensing does not treat recognition as separate from and independent of movement. It improves recognition accuracy by operating recognition and movement in parallel and cooperatively. Given a target for visual recognition, that is, a sensing intention, the device carries out sensing behavior that achieves this intention. For example, given the intention of measuring a certain object, it searches for the target object and tries to grasp its overall shape.

[0016] Because the object or behavior to be observed is specified as an intention in this way, appropriate sensing behavior that uses knowledge about the target can be taken, and improvements in recognition accuracy and recognition speed can be expected. FIG. 4 shows an embodiment of the actuator and sensor used for intentional sensing in the present invention. A multi-degree-of-freedom manipulator is used as the actuator 32, and a visual sensor such as a CCD camera is mounted at the tip of the manipulator. Using the actuator 32 equipped with such a visual sensor 30, various sensing behaviors, such as searching, grasping the overall shape, and gazing at ambiguous points, can be carried out on the observation target object 34, for example a cup.

[0017] The sensorimotor fusion module 46 used in the processing unit 36 of FIG. 3 is a sensorimotor fusion model that organically couples recognition and movement. It uses a hierarchical neural network as a forward model that predicts, from the sensor information and the motor command at a given time, the sensor information that will be obtained at the next time. Using error back propagation, this model realizes two functions: generating a motor command that brings the sensor information closer to an intended target value, and notifying the recognition module 42 of the predicted sensor information.
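The way a forward model can be used to generate a motor command may be illustrated with a minimal sketch. It is not the patent's network: a fixed linear forward model (the hypothetical `gain` matrix) stands in for the learned hierarchical neural network, and the names `forward_model`, `generate_command`, `rate`, and `steps` are assumptions made for illustration. The command is adjusted by gradient descent on the squared error between the predicted and the intended sensor values, which is the role back propagation plays in the text.

```python
def forward_model(sensor, command, gain):
    """Predict the next sensor reading from the current reading and a command.

    Assumption: a linear model, next = sensor + gain * command.
    """
    return [s + sum(g * c for g, c in zip(row, command))
            for s, row in zip(sensor, gain)]


def generate_command(sensor, target, gain, rate=0.5, steps=200):
    """Find a command whose predicted sensor reading approaches the target."""
    command = [0.0] * len(gain[0])
    for _ in range(steps):
        predicted = forward_model(sensor, command, gain)
        error = [p - t for p, t in zip(predicted, target)]
        # Gradient of E = 0.5 * sum(error^2) with respect to each command
        # component, propagated back through the linear forward model.
        grad = [sum(error[j] * gain[j][k] for j in range(len(gain)))
                for k in range(len(command))]
        command = [c - rate * g for c, g in zip(command, grad)]
    return command


GAIN = [[1.0, 0.0], [0.0, 0.5]]          # hypothetical sensor response
command = generate_command([0.0, 0.0], [1.0, -1.0], GAIN)
predicted = forward_model([0.0, 0.0], command, GAIN)
print(command, predicted)
```

The same predicted reading is what the module would report to the recognition module 42 as the prediction target.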

[0018] Further, the recognition module 42 of FIG. 3 is a recognition model that uses a hierarchical neural network and has a structure in which the intention 52 can be used for sensing. With this model, the intention 52 from the upper layer and the sensor information predicted by the sensorimotor fusion model 46, that is, the prediction target 58, can be used effectively for sensing. Using neural networks for the sensorimotor fusion module 46 and the recognition module 42 in this way is expected to make it possible to acquire complex sensor processing through learning and to build models with adaptive capability.

[0019] In the embodiments described below, the three-dimensional object recognition processing of the present invention is modeled with two processing elements: determination of the correspondence between the observation target object and the internal model, and projection from the internal model to sensor information. This yields a recognition model, that is, an object recognition device, that can accommodate changes to the internal model of the recognition target object and the addition or removal of sensors.

3. Recognition model

(1) Outline of the model

First, the basic idea of the recognition model used for object recognition in the present invention is described. As shown in FIG. 5, suppose that the observation target object 34 and the internal model 10 are each defined by feature points in different coordinate systems. The object-centered coordinate system is called Xobject and the internal model coordinate system is called Xmodel.

[0020] The sensor information is expressed in a two-dimensional sensor coordinate system U whose origin Uo is the point onto which the origin O of the object-centered coordinate system is projected. Based on the sensor information 62 in the sensor coordinate system U obtained from the observation target object 34, the observation target object 34 is identified and recognized by estimating two things: the correspondence between all feature points of the observation target object 34 and the internal model 10 (correspondence estimation), and the coordinate transformation that brings the object-centered coordinate system and the internal model coordinate system into agreement, that is, the translation amount X0 and the rotation transformation amount θ.

[0021] FIG. 6 shows the configuration of an embodiment that implements the recognition model. The permutation neural network 14, which functions as the corresponding point determination unit, estimates the correspondence between the feature points of the observation target object 34 and those of the internal model 10. The energy minimization unit 12 has as its basic element the sensor information conversion model 64, which converts feature points defined in the internal model coordinate system into the sensor coordinate system. Using an energy minimization technique, it estimates the corresponding points and determines the coordinate transformation, and thereby recognizes the observation target object.

[0022] FIG. 7 shows details of the sensor information conversion model 64 provided in the energy minimization unit 12 of FIG. 6. The conversion model 64 consists of a translation unit 18, a rotation transformation unit 22, and a sensor projection unit 26. FIG. 7 shows a single conversion model 64, whereas FIG. 6 shows three because three corresponding points are processed. In the processing of the recognition model embodiment of FIG. 6, when a predicted observation target object is given as the intention 52 from the upper layer, the feature points of the internal model 10 of the intended observation target object 34 are projected into the sensor coordinate system from one viewpoint to generate predicted sensor information. This processing is carried out by the translation unit 18, the rotation transformation unit 22, and the sensor projection unit 26 of FIG. 7, and the predicted sensor information is passed to the error calculation unit 28.

[0023] Next, the error calculation unit 28 obtains the error between the predicted sensor information and the sensor information of the observation target object 34 actually observed by the visual sensor 30. This error is back-propagated, the correspondence between the observation target object 34 and the internal model 10 is estimated so as to minimize the error energy, and the coordinate transformation between the object-centered coordinate system and the internal model coordinate system, that is, the translation amount 16 and the rotation transformation amount 20, is updated.

[0024] Furthermore, the processing of the recognition model continues while the actuator 32 operates the visual sensor 30 in sensing behavior that changes the viewpoint on the observation target object 34. In this way the correspondence between the observation target object 34 and the internal model 10 and the coordinate transformation between their coordinate systems, that is, the translation amount 16 and the rotation transformation amount 20, are determined.

(2) Sensor information conversion model

FIG. 8 shows the function of the sensor information conversion model 64 provided in the energy minimization unit 12 of FIG. 6. The conversion model 64 predicts the sensor information from the internal model 10 of the recognition target object 34 by rotating an arbitrary point X in the internal model coordinate system Xmodel by the rotation angle θ to obtain X' and projecting X' onto the sensor coordinate system by central projection.

[0025] Now, let U = (u, v) be the point in the sensor coordinate system corresponding to X' = (x', y', z'), and let d be the distance from the sensor coordinate plane to X'. The central projection performed by the central projection transformation unit 26 can then be expressed as

[0026]

[Equation 1]

[0027] where

[0028]

[Equation 2]

[0029] Here, letting X0 be the translation amount of the coordinate system and θ the rotation angle (rotation transformation amount), the transformation between the coordinate systems is expressed as

[0030]

[Equation 3]

[0031] If the rotation angle θ of the coordinate system is described by Euler angles (α, β, γ), then

[0032]

[Equation 4]

[0033] where the individual rotation transformation matrices are

[0034]

[Equation 5]

[0035] Therefore, the conversion from the internal model 10 to sensor information is expressed by the following equation.

[0036]

[Equation 6]
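Because the equation images are not reproduced in this text, the chain of translation, Euler-angle rotation, and central projection can only be sketched under stated assumptions. The sketch below assumes that the translation X0 is applied before the rotation (following the order of units 18, 22, and 26), that the rotation is composed as Rz(γ)·Ry(β)·Rx(α), and that the projection divides by the depth d + z'. None of these conventions is confirmed by the source, and the function names are hypothetical.

```python
import math


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def convert_to_sensor(point, x0, euler, d):
    """Map a model point X to sensor coordinates U = (u, v).

    Assumed conventions: X' = Rz(gamma) Ry(beta) Rx(alpha) (X + X0),
    then u = d * x' / (d + z') and v = d * y' / (d + z').
    """
    alpha, beta, gamma = euler
    rotation = matmul(rot_z(gamma), matmul(rot_y(beta), rot_x(alpha)))
    translated = [p + t for p, t in zip(point, x0)]   # translation unit
    xr, yr, zr = matvec(rotation, translated)         # rotation unit
    depth = d + zr                                    # projection unit
    return (d * xr / depth, d * yr / depth)


print(convert_to_sensor((1.0, 2.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), 5.0))
```

With these conventions a point in the plane z' = 0 projects onto itself, which makes the sketch easy to check by hand.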

[0037] FIG. 9 shows an embodiment of the permutation neural network 14 used as the corresponding point determination unit of the present invention. The permutation neural network 14 is a two-layer network that determines the correspondence between the feature points of the observation target object 34, which are assigned to the output units 74 of the output layer 72, and the feature points of the internal model 10, which are assigned to the input units 68 of the input layer 66.

[0038] The correspondence is therefore established by correcting the connection weights Wij of the network so as to reduce the error between the predicted and the observed coordinate values of the feature points of the observation target object 34. Let n be the number of input units 68 and of output units 74, that is, the number of feature points of the observation target object 34 and of the internal model 10. Let IX(i) be the input value of the i-th input unit 68, that is, the feature point coordinates of the internal model 10; let OX(i) be the output value of the output unit 74, that is, the predicted feature point coordinates of the observation target object 34; and let X(i) be the actually observed feature point of the observation target object 34.

[0039] Let Wij be the weight between the i-th input unit 68 and the j-th output unit 74. If linear neurons are used as the output units 74, the output value of the j-th output unit 74 is

[0040]

[Equation 7]

[0041] Here, the squared error between the observed feature points X(i) and the predicted feature points OX(i) of the observation target object 34 is defined as the energy E:

[0042]

[Equation 8]
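The images for Equations 7 and 8 are not reproduced in this text. From the definitions given in the surrounding paragraphs, a linear output unit and a squared-error energy would plausibly take the following form; the factor 1/2 and the use of the Euclidean norm over the coordinate components are assumptions.

```latex
OX(j) = \sum_{i=1}^{n} W_{ij}\, IX(i),
\qquad
E = \frac{1}{2} \sum_{i=1}^{n} \bigl\lVert X(i) - OX(i) \bigr\rVert^{2}
```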

[0043] The update rule for the network weights is defined as follows.

[0044]

[Equation 9]

[0045] Here, ε, α_in, α_out, and β are positive constants. In equation (9), weight constraint terms are added to the first term on the right-hand side, which minimizes the energy E of equation (8), so that the trained network forms a permutation between the feature points of the internal model 10 and those of the observation target object 34. The second and third terms of equation (9) act as the weight limiting units 70 and 76. They implement the constraints, given by equations (10) and (11) below, that the sum of the weights connected to each input unit 68 and to each output unit 74 equals 1.

[0046]

[Equation 10]
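A weight update with these four kinds of terms can be sketched as follows. The exact form of equation (9) is not available in this text, so the sketch assumes simple penalty gradients: a data term from the squared-error energy, two terms pulling each input-unit sum and each output-unit sum toward 1, and a term w(1 − w)(1 − 2w) that pushes each weight toward 0 or 1, as stated for the weights in paragraph [0008]. The function names, the constant values, and the orthonormal toy feature points are assumptions.

```python
def forward(weights, inputs):
    """Linear output units: OX(j) = sum_i W[i][j] * IX(i)."""
    n, dim = len(inputs), len(inputs[0])
    return [[sum(weights[i][j] * inputs[i][k] for i in range(n))
             for k in range(dim)] for j in range(n)]


def update(weights, inputs, observed, eps=0.5, a_in=0.1, a_out=0.1, beta=0.1):
    """One synchronous weight update with the assumed constraint terms."""
    n, dim = len(inputs), len(inputs[0])
    outputs = forward(weights, inputs)
    in_sum = [sum(weights[i]) for i in range(n)]                         # per input unit
    out_sum = [sum(weights[i][j] for i in range(n)) for j in range(n)]   # per output unit
    new_weights = []
    for i in range(n):
        row = []
        for j in range(n):
            w = weights[i][j]
            # Negative gradient of E = 0.5 * sum |X(j) - OX(j)|^2.
            data = sum((observed[j][k] - outputs[j][k]) * inputs[i][k]
                       for k in range(dim))
            delta = (eps * data
                     - a_in * (in_sum[i] - 1.0)          # sum over outputs -> 1
                     - a_out * (out_sum[j] - 1.0)        # sum over inputs -> 1
                     - beta * w * (1.0 - w) * (1.0 - 2.0 * w))  # weight -> 0 or 1
            row.append(w + delta)
        new_weights.append(row)
    return new_weights


def learn_correspondence(inputs, observed, steps=200):
    n = len(inputs)
    weights = [[1.0 / n] * n for _ in range(n)]   # start from uniform weights
    for _ in range(steps):
        weights = update(weights, inputs, observed)
    return weights


MODEL = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
OBSERVED = [MODEL[2], MODEL[0], MODEL[1]]          # the same points, reordered
W = learn_correspondence(MODEL, OBSERVED)
for row in W:
    print([round(w, 3) for w in row])
```

A weight W[i][j] close to 1 is read as "model feature point i corresponds to observed feature point j". The constants were chosen so that this toy problem stays well conditioned; other feature point sets may need different values.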

[0047] Further, the fourth term on the right-hand side of equation (9) is a constraint term, given by equation (12), that drives each weight to 0 or 1.

4. Object recognition

The processing of the recognition model described so far is divided into two parts, and a three-dimensional object is recognized: processing that reconstructs the feature points of the observed object in the object-centered coordinate system, and processing that determines the correspondence between the reconstructed observation target object and the internal model.

(1) Reconstruction of three-dimensional structure

First, the processing that reconstructs the observation target object in the object-centered coordinate system Xobject is described. As shown in FIG. 10(a), the object-centered coordinate system Xobject is defined with the z-axis along the direction of the visual sensor 30 and the x- and y-axes in the plane perpendicular to the z-axis.

[0048] In this state, the visual sensor 30 is mounted on the manipulator serving as the actuator 32, as shown in FIG. 4, and is rotated about the origin O of the object-centered coordinate system at a constant distance d, as shown in FIG. 10(b). The coordinates in the sensor coordinate system U of each feature point of the observation target object at the rotational positions of the visual sensor 30, indicated by 30-1, 30-2, and 30-3, are acquired as a plurality of pieces of sensor information.

[0049] FIG. 11 shows an embodiment of the reconstruction model for three-dimensional information. Let n be the number of feature points and m the number of rotational movements, that is, the number of observations. The (m × n) feature point coordinates in the sensor coordinate system U obtained from the m observations are each associated with one of (m × n) sensor information conversion models 64. The coordinates in the object-centered coordinate system Xobject of the feature points of the observation target object 34 to be reconstructed are written X(i) = [x(i), y(i), z(i)] and are shared by the m sensor information conversion models that correspond to the same feature point. The translation X0 of every sensor information conversion model 64 is set to 0 in this case. Writing the rotation angle at time t as θ(t), with θ(0) = 0 at time t = 0, the rotation angles θ(0), ..., θ(m−1) of the visual sensor 30 relative to the object-centered coordinate system Xobject are known from the motor commands given to the manipulator. The distance d to the origin of the object-centered coordinate system Xobject is measured and set by means such as a distance sensor or stereo vision.

【００５０】Here, letting i index the feature points and t index the rotational movements,

【００５１】[0051]

【数１１】 [Equation 11]

【００５２】is taken to be the coordinates of the feature point in the two-dimensional sensor coordinate system U, as predicted at time t by the sensor information conversion model 64 from the feature point coordinates X(i) of the observation target object 34, and

【００５３】[0053]

【数１２】 [Equation 12]

【００５４】is taken to be the coordinates of the feature point observed by the visual sensor 30. The energy function E in the energy minimization unit 12 is defined as the squared error between the predicted sensor information and the observed sensor information,

【００５５】[0055]

【数１３】 [Equation 13]

【００５６】and the three-dimensional structure of the observation target object 34 is reconstructed by repeatedly updating the feature point coordinates X(i) of the reconstructed observation target object 340 in the object center coordinate system Xobject according to the following equation:

【００５７】[0057]

【数１４】 [Equation 14]

【００５８】Here, ε_X is a positive constant. Starting from an initial value such as X(i) = 0, X(i) is updated successively according to equation (16) so as to minimize the energy E of equation (15). This determines X = [X(0), ..., X(n-1)] whose predicted sensor information brings E close to 0. By minimizing the error energy of the sensor information acquired over the entire sensing behavior in this way, the three-dimensional structure of the observation target object 34 can be reconstructed with few contradictions.
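Expressions (15) and (16) are referred to here only by number, so the following sketch shows one form that is consistent with the surrounding description. It is an assumption, not the patent's own notation: the names U_p(i,t) and U(i,t) for the predicted and observed sensor coordinates, the factor 1/2, and the discrete-step form of the update are all introduced for illustration.

```latex
% Assumed notation: U_p(i,t) = predicted, U(i,t) = observed sensor coordinates
% of feature point i at observation t (n feature points, m observations).
E = \frac{1}{2} \sum_{t=0}^{m-1} \sum_{i=0}^{n-1}
    \left\| U_p(i,t) - U(i,t) \right\|^2

% Gradient step on the shared object-frame coordinates, with \varepsilon_X > 0.
X(i) \leftarrow X(i) - \varepsilon_X \frac{\partial E}{\partial X(i)}
```

Because every X(i) is shared by the m conversion models for that feature point, each update accumulates the error gradients from all observations of the same point.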

【００５９】Consider the number of sensing actions required for reconstruction. The n feature points X(i) = [x(i), y(i), z(i)], i = 0, ..., n-1, contribute 3n unknowns, while a single observation yields 2n equations. Reconstructing the three-dimensional structure therefore requires a number of sensing actions m that satisfies at least 2mn > 3n, that is, m ≥ 2.
(2) Correspondence with the internal model
Next, identification and recognition by associating the observation target object with the internal model, using the replacement neural network 14 shown in FIG. 9, is described. The replacement neural network 14 is used to determine the correspondence between the feature points of the observation target object 34 and those of the internal model 10, together with the coordinate transformation, that is, the parallel movement amount and the rotation conversion amount, that brings the object center coordinate system Xobject into agreement with the internal model coordinate system Xmodel. In this way the reconstructed observation target object is identified with the internal model and recognized.

【００６０】FIG. 12 shows the correspondence with the internal model. The feature point coordinates of the internal model 10 in the internal model coordinate system Xmodel are reordered by the replacement neural network 14 and then transformed by the parallel movement X0 and the rotation θ using the rotational transformation f_r of equation (4). This predicts the three-dimensional structure of the observation target object 340 in the object center coordinate system Xobject, and the prediction is passed to the error calculation unit 28. Denoting the predicted feature points by X_p(i) and the observed feature points by X(i), the energy function is defined as their squared error

【００６１】[0061]

【数１５】 [Equation 15]

【００６２】This energy E is minimized by correcting the parallel movement amount X0 and the rotation angle θ according to the following equation, which determines the rotational transformation and parallel movement that bring the feature point coordinates of the observation target object 34 and the internal model 10 into agreement.

【００６３】[0063]

【数１６】 [Equation 16]

【００６４】At the same time, the derivative of the energy with respect to the output value OX(i) of the replacement neural network 14,

【００６５】[0065]

【数１７】 [Equation 17]

【００６６】is computed and back-propagated into the replacement neural network 14, whose weight coefficients are corrected to determine the correspondence between the feature points of the observation target object 34 and those of the internal model 10.
4. Simulation experiment
The effectiveness of the recognition model that realizes the object recognition device of the present invention described above was evaluated by computer simulation, with the following results.
(1) Reconstruction of three-dimensional structure
A simulation experiment was performed in which the three-dimensional structure is reconstructed using the sensor information conversion model 64. As described above, the rotational transformation uses Euler angles, and the projection from the observation target object to the sensor information uses the central projection transformation. FIG. 13 shows the sensor information generated for the simulation experiment. Five feature points were given in the object center coordinate system:
P0 = (0.5, 0.5, 0.5)
P1 = (-0.5, 0.3, -0.2)
P2 = (-0.4, -0.5, 0.0)
P3 = (0.4, -0.3, 0.0)
P4 = (0.2, 0.9, -0.2)
The visual sensor was rotated three times about the origin of the object center coordinate system at a distance d = 2, in steps of Δα = 30°, Δβ = 30°, Δγ = -30°, and sensor information was generated. A sensor information conversion model 64 was associated with each of the resulting 15 feature point coordinates in the sensor coordinate system, and the three-dimensional structure of the observation target object was reconstructed by energy minimization.
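The reconstruction experiment described above can be approximated with a short NumPy sketch. Several details are assumptions made only for illustration and are not taken from the patent: the Euler convention Rz(α)·Ry(β)·Rx(γ), a central projection u = (q_x, q_y) / (d - q_z) with unit focal length, observation poses at 0, 1 and 2 steps of the stated increments, and the step size and update count. All function and variable names are hypothetical.

```python
import numpy as np

def euler_rotation(alpha, beta, gamma):
    """Rz(alpha) @ Ry(beta) @ Rx(gamma), angles in radians (assumed convention)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rz = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    return rz @ ry @ rx

def project(points, rot, d):
    """Assumed central projection of object-frame points (n, 3) for one pose."""
    q = points @ rot.T            # coordinates in the sensor-aligned frame
    depth = d - q[:, 2:3]         # sensor placed at distance d on the z axis
    return q[:, :2] / depth, depth

def energy_and_grad(points, rots, observed, d):
    """Squared reprojection error over all poses and its gradient w.r.t. points."""
    energy, grad = 0.0, np.zeros_like(points)
    for rot, u_obs in zip(rots, observed):
        u, depth = project(points, rot, d)
        r = u - u_obs
        energy += 0.5 * np.sum(r ** 2)
        # Chain rule through u = q_xy / (d - q_z).
        dq_xy = r / depth
        dq_z = np.sum(r * u, axis=1, keepdims=True) / depth
        grad += np.concatenate([dq_xy, dq_z], axis=1) @ rot  # back to object frame
    return energy, grad

def reconstruct(observed, rots, d, n_points, eps=0.3, n_updates=2000):
    """Gradient descent from X(i) = 0; eps and n_updates are illustrative."""
    x = np.zeros((n_points, 3))
    history = []
    for _ in range(n_updates):
        e, g = energy_and_grad(x, rots, observed, d)
        history.append(e)
        x -= eps * g
    return x, history

# Feature points and sensing parameters quoted from the simulation description.
TRUE_POINTS = np.array([[0.5, 0.5, 0.5], [-0.5, 0.3, -0.2], [-0.4, -0.5, 0.0],
                        [0.4, -0.3, 0.0], [0.2, 0.9, -0.2]])
D = 2.0
STEP = np.radians([30.0, 30.0, -30.0])
rots = [euler_rotation(*(t * STEP)) for t in range(3)]
observed = [project(TRUE_POINTS, rot, D)[0] for rot in rots]   # 3 x 5 = 15 coordinates

estimate, history = reconstruct(observed, rots, D, len(TRUE_POINTS))
max_error = np.max(np.abs(estimate - TRUE_POINTS))
print("max coordinate error:", max_error)
```

Each feature point is treated independently here, since the energy is a sum over points; the shared unknown is only the 3-vector X(i) seen from the three poses.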

【００６７】FIG. 14 plots the error on the vertical axis against the number of energy-minimization updates on the horizontal axis, showing the error of each of the X, Y, and Z coordinate values and the average error in the reconstructed object center coordinates. As FIG. 14 shows, the error falls within 0.01 in at most 300 updates, which indicates that the three-dimensional structure is reconstructed with high accuracy.
(2) Correspondence with the internal model
As shown in FIG. 12, the rotational transformation of the sensor information conversion model 64 is combined with the replacement neural network 14 to determine the correspondence between the feature points of the reconstructed observation target object 340 and those of the internal model 10, together with the parallel movement amount X0 and the rotation angle θ between the object center coordinate system Xobject and the model coordinate system Xmodel.
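The quantities determined in this step can be summarized in the following sketch. It assumes, purely for illustration, that the two-layer replacement neural network is linear with weights w_ij from input unit j to output unit i; the weight constraint terms (constant weight sums and weights driven to 0 or 1) are omitted, and the shared step size ε is a hypothetical simplification.

```latex
% Assumed notation: X_{model}(j) = j-th internal-model feature point,
% OX(i) = i-th network output, X_p(i) = predicted, X(i) = observed feature point.
OX(i) = \sum_{j} w_{ij}\, X_{model}(j), \qquad
X_p(i) = f_r\bigl(OX(i);\, X_0, \theta\bigr), \qquad
E = \frac{1}{2} \sum_{i} \left\| X_p(i) - X(i) \right\|^2

% Pose correction by gradient steps, and the gradient passed back to the weights.
X_0 \leftarrow X_0 - \varepsilon \frac{\partial E}{\partial X_0}, \qquad
\theta \leftarrow \theta - \varepsilon \frac{\partial E}{\partial \theta}, \qquad
\frac{\partial E}{\partial w_{ij}}
  = \frac{\partial E}{\partial OX(i)} \cdot X_{model}(j)
```

Under this reading, the same error signal corrects the pose parameters and, through the derivative with respect to OX(i), the weights that encode the feature point correspondence.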

【００６８】Here, ten internal models were generated by adding uniform random numbers in the range ±0.25 to the eight vertices (±0.5, ±0.5, ±0.5) of a rectangular parallelepiped in the internal model coordinate system. Each internal model was then moved in parallel by X0 = (0.1, 0.1, 0.1) and rotated, with α and β each varied in 5° steps over the ranges 0 < α < 90° and 0 < β < 90° and with γ = α. The result was used as the observation target object in the object center coordinate system.
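The test data described above could be generated along the following lines. This is a sketch under stated assumptions, not the procedure actually used: the Euler convention Rz(α)·Ry(β)·Rx(γ), the order "add X0, then rotate", and the random shuffling of feature points (used here to stand in for the unknown correspondence) are all hypothetical, as are the function names.

```python
import numpy as np
from itertools import product

def euler_rotation(alpha, beta, gamma):
    """Rz(alpha) @ Ry(beta) @ Rx(gamma), angles in radians (assumed convention)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rz = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    return rz @ ry @ rx

def make_internal_model(rng, half_size=0.5, noise=0.25):
    """Eight box vertices (+-0.5 each axis) perturbed by uniform noise in +-0.25."""
    corners = np.array(list(product((-half_size, half_size), repeat=3)))
    return corners + rng.uniform(-noise, noise, size=corners.shape)

def make_observed_object(model, alpha_deg, beta_deg, x0=(0.1, 0.1, 0.1), rng=None):
    """Shift by X0, rotate with gamma = alpha, and shuffle the point order."""
    rng = np.random.default_rng() if rng is None else rng
    alpha, beta = np.radians(alpha_deg), np.radians(beta_deg)
    rot = euler_rotation(alpha, beta, alpha)      # gamma = alpha as in the text
    perm = rng.permutation(len(model))            # hidden correspondence (assumption)
    observed = (model[perm] + np.asarray(x0)) @ rot.T
    return observed, perm, rot

rng = np.random.default_rng(0)
internal_models = [make_internal_model(rng) for _ in range(10)]
ANGLES_DEG = np.arange(5, 90, 5)                  # 5, 10, ..., 85 degrees
observed, perm, rot = make_observed_object(internal_models[0], 30.0, 45.0, rng=rng)
print(len(internal_models), "models,", len(ANGLES_DEG) ** 2, "poses per model")
```

Returning the permutation and rotation alongside the observed points makes it possible to check afterwards whether a recovered correspondence and pose match the ones used to generate the data.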

【００６９】The correspondence between feature points, the parallel movement amount X0, and the rotation angle θ were treated as unknown, and a simulation experiment was performed to determine the correspondence between the feature points of the observation target object and the internal model together with the parallel movement amount X0 and the rotation angle θ. The initial values of the weights of the replacement neural network were determined by the number of stages. FIG. 15 shows the experimental results for the correspondence with the internal model, and FIG. 16 shows the identification success rate averaged with respect to α. A trial is counted as a success when the correct correspondence, parallel movement amount X0, and rotation angle θ are determined within 100 updates. As FIG. 16 shows, when the rotation angle α between the object center coordinate system and the internal model coordinate system is within 40°, the correspondence, the parallel movement amount X0, and the rotation angle θ can be determined almost without fail.

【００７０】Needless to say, the present invention is not limited by the specific numerical values shown in the above embodiments.

【００７１】[0071]

[Effects of the Invention] As described above, according to the present invention, three operations are performed simultaneously: reconstruction of the three-dimensional structure of the observation target object from a plurality of images acquired by active sensing, determination of the coordinate transformation between the internal model and the observation target object, and determination of the correspondence between the feature amounts of the internal model and those of the observation target object. This realizes efficient recognition of three-dimensional objects.

【００７２】Furthermore, by actively moving the sensor with the actuator and determining a coordinate transformation and corresponding points that are consistent for the observation target object across the plurality of acquired images, the accuracy of identifying the observation target object with the internal model can be made sufficiently high.

[Brief description of drawings]

FIG. 1 is an explanatory diagram of the principle of the present invention.

FIG. 2 is an explanatory diagram of the hierarchical sensory information model realized by the present invention.

FIG. 3 is an explanatory diagram showing the unit configuration of the hierarchical sensory information model of FIG. 2.

FIG. 4 is a configuration diagram of an embodiment of the visual sensor and actuator used in the present invention.

FIG. 5 is an explanatory diagram showing the basic concept of the recognition model in the present invention.

FIG. 6 is a configuration diagram of an embodiment of the device configuration of the present invention.

FIG. 7 is a configuration diagram of an embodiment showing details of the energy minimization unit of FIG. 6.

FIG. 8 is a diagram of an embodiment of the sensor information conversion model of the present invention.

FIG. 9 is a configuration diagram of an embodiment of the replacement neural network of the present invention.

FIG. 10 is an explanatory diagram showing the reconstruction of a three-dimensional object in the present invention.

FIG. 11 is a configuration diagram of an embodiment showing the reconstruction model of a three-dimensional object in the present invention.

FIG. 12 is an explanatory diagram showing the correspondence with the internal model in the present invention.

FIG. 13 is an explanatory diagram of the sensor information, obtained by rotating the visual sensor, that is used in the simulation of the three-dimensional structure.

FIG. 14 is a characteristic diagram showing the simulation results for the number of updates of the replacement neural network and the error.

FIG. 15 is a three-dimensional graph showing the rotation angles α and β and the success rate obtained by simulating the correspondence with the internal model.

FIG. 16 is a graph showing the relationship between the rotation angle α of FIG. 15 and the average success rate.

[Explanation of symbols]

10: Internal model
12: Energy minimization unit
14: Corresponding point determination unit (replacement neural network)
15: Coordinate transformation unit
16: Parallel movement amount (X)
18: Parallel movement unit
20: Rotation conversion amount (θ)
22: Rotation conversion unit
25: Recognition processing unit
26: Sensor projection unit (central projection transformation unit)
28: Error calculation unit
30: Visual sensor
32: Actuator
34: Observation target object
36: Processing unit
38: Outside world
40: Object recognition module
42: Recognition module
44: Motion module
46: Sensorimotor fusion module
48, 50: Sensory information
52: Intention (target)
54, 56: Motion command
58, 60: Prediction target
64: Conversion model
66: Input layer
68: Input unit
70, 75: Weight limiting unit
72: Output layer
74: Output unit

Continuation of the front page
(72) Inventor: Shigemi Nagata (長田茂美), c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. An object recognition device comprising: a corresponding point determination unit (14) that determines the correspondence between feature amounts of an internal model (10) and feature amounts of an observation target object (34); a coordinate transformation unit (15) that rotates and translates the internal model (10) according to a parallel movement amount (16) and a rotation conversion amount (20); a sensor projection unit (26) that generates sensor information from the internal model (10); and a recognition processing unit (25) that identifies the internal model (10) with the observation target object (34) by determining the correspondence of the corresponding point determination unit (14) and the parallel movement amount (16) and rotation conversion amount (20) of the coordinate transformation unit (15) so as to minimize the error energy obtained by an error calculation unit (28) between the sensor information of the observation target object (34) and the sensor information predicted from the internal model (10).

2. The object recognition device according to claim 1, wherein the recognition processing unit (25) identifies the internal model (10) with the observation target object (34) by determining the correspondence of the corresponding point determination unit (14) and the parallel movement amount (16) and rotation conversion amount (20) of the coordinate transformation unit (15) so as to minimize the error energy between the three-dimensional structure of the observation target object (34) and the three-dimensional structure predicted from the internal model (10).

3. The object recognition device according to claim 1, wherein the corresponding point determination unit (14) is configured as a two-layer neural network that receives the feature amounts of the internal model at its input layer units and outputs the feature amounts of the observation target object (34) from its output layer units, and estimates the correspondence by correcting the weights of the neural network so as to minimize the error energy between the observed feature amounts of the observation target object (34) and the output feature amounts.

4. The object recognition device according to claim 3, wherein a constraint is added to the neural network such that the sum of the weights of the connections to each input and output unit is a constant value.

5. The object recognition device according to claim 3, wherein a constraint is added to the neural network such that the weight of each connection between input and output units becomes 0 or 1.

6. The object recognition device according to claim 2, wherein a visual sensor (30) that can be moved by an actuator (32) is provided as means for observing the observation target object (34), and the recognition processing unit (25) identifies with the internal model (10) the three-dimensional structure of the observation target object (34) reconstructed by moving the visual sensor (30) with the actuator (32).