JP2002342758A

JP2002342758A - Visual recognition system

Info

Publication number: JP2002342758A
Application number: JP2001145362A
Authority: JP
Inventors: Osamu Hasegawa (長谷川 修)
Original assignee: Individual
Current assignee: Individual
Priority date: 2001-05-15
Filing date: 2001-05-15
Publication date: 2002-11-29

Abstract

PROBLEM TO BE SOLVED: To provide a visual recognition system that functions satisfactorily even when the distance between the camera and the object to be recognized, or the orientation of the object, varies arbitrarily. SOLUTION: The visual recognition system comprises image capturing means 1, normalizing means 2 for generating a normalized image by adjusting an image to at least one reference distance, storage means 6, and recognizing means 7. In the learning process, the image capturing means captures a learning target together with its distance data, the normalizing means generates a normalized image of the captured learning target, and the normalized image is stored in the storage means in association with a concept such as a name or meaning. In the recognition process, the image capturing means captures a recognition target together with its distance data, the normalizing means generates a normalized image of the recognition target, and the recognizing means compares that normalized image with the normalized images of learning targets stored in the storage means, thereby recognizing the concept of the recognition target.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a system that enables a robot or a computer to visually recognize an object.

[0002]

2. Description of the Related Art

Conventionally, there have been systems in which a computer visually recognizes objects using a camera or the like. Such a visual recognition system is used, for example, to recognize parts on a conveyor line in a factory production line where processing must be performed according to the type of each part conveyed in succession. The principle of this recognition is briefly as follows. First, the appearance of each part conveyed on the production line is registered in advance as image information, and a part code is associated with each registered image.

[0003] The production line is equipped with a device such as a CCD camera that captures the appearance of the parts conveyed on it, and the image of a conveyed part taken by this camera is matched against the pre-registered image information. When the closest part, that is, the closest image information, is found at this matching stage, the part code corresponding to that image information is recognized as that of the conveyed part.

[0004] As described above, parts conveyed on a fixed conveyor line can always be captured at the same size if they are the same part. In other words, as long as the camera position is fixed, the image of a conveyed part and the image of the registered part can always be compared at the same size. Because the two can be compared at the same size, the type of part can be recognized.

[0005]

PROBLEMS TO BE SOLVED BY THE INVENTION

In such a conventional system, if the size of the part captured at recognition time does not match the size of the pre-registered image of the part, the type of part can no longer be recognized accurately. The system therefore does not function well in an environment where the distance between the part to be recognized and the camera is not constant.

[0006] Furthermore, for a part with a pattern on its surface, the pattern may no longer be clearly visible once the viewer is far from the part. For example, even a black-and-white striped pattern may look uniformly gray from a distance. The same holds through a camera: the surface pattern of an object far from the camera becomes hard to see. In other words, the captured image also differs depending on the distance between the object and the camera.

[0007] Therefore, if the distance at which the image was registered differs from the distance at which the image is captured at recognition time, recognizing the object becomes difficult. Moreover, if the orientation of the part relative to the camera differs, the part no longer matches the registered image. Thus even the same part can no longer be recognized once its orientation changes.

[0008] To solve these problems, one conceivable method is to capture and register, for every part in advance, images at every distance and in every orientation. With such a method, however, the number of images to be registered becomes astronomical.

[0009] In short, with the prior art it has been extremely difficult to recognize a large number of objects accurately under conditions where the distance between the camera and the recognition target, or the orientation of the target, varies arbitrarily. An object of the present invention is to provide a visual recognition system that functions well even under such conditions.

[0010]

MEANS FOR SOLVING THE PROBLEMS

The first invention comprises image capturing means, normalizing means for generating a normalized image in which an image is adjusted to one or more reference distances, storage means, and recognizing means. In the learning process, the image capturing means captures a learning target together with its distance data, the normalizing means generates a normalized image of the captured learning target, and a concept such as a name or meaning is associated with this normalized image and stored in the storage means. In the recognition process, the image capturing means captures a recognition target together with its distance data, the normalizing means generates a normalized image of the captured recognition target, and the recognizing means recognizes the concept of the recognition target by comparing the normalized image of the recognition target with the normalized images of learning targets stored in the storage means.

[0011] The second invention comprises image analyzing means for locally expanding the features of an image in basis functions, and feature vector generating means for generating a feature vector by summing the local basis-function coefficients for each basis function. In the learning process, the feature vector generating means generates a feature vector from the normalized image of the learning target, and the concept of the learning target is associated with this feature vector and stored in the storage means. In the recognition process, the feature vector generating means generates a feature vector from the normalized image of the recognition target, and the recognizing means recognizes the concept of the recognition target by comparing the feature vector of the recognition target with the feature vectors stored in the storage means.

[0012] The third invention comprises discriminant vector generating means for projecting a feature vector into a discriminant space, in which the features of a plurality of feature vectors are easier to discriminate, to obtain a discriminant vector. In the learning process, the discriminant vector generating means generates a discriminant vector of the learning target from its feature vector, and the storage means stores the discriminant vector in association with the concept of the learning target. In the recognition process, the discriminant vector generating means generates a discriminant vector from the feature vector of the recognition target, and the recognizing means recognizes the concept of the recognition target by comparing that discriminant vector with the discriminant vectors stored in the storage means.

[0013] The fourth invention comprises image capturing means, storage means, and recognizing means. In the learning process, the image capturing means captures a learning target together with its distance data, and the concept of the learning target is associated with the image of the learning target object and stored in the storage means. In the recognition process, the image capturing means captures a target together with its distance data and identifies the recognition target within the captured image on the basis of the distance data, and the recognizing means recognizes the concept of the recognition target by comparing the image of the identified recognition target with the images stored in the storage means.

[0014] The fifth invention is characterized in that, in the learning process, the image capturing means captures a target together with its distance data, identifies the learning target within the captured image on the basis of the distance data, and stores the image of the identified learning target in the storage means in association with a concept. In the fourth and fifth inventions, identifying the recognition target means cutting the image of the target to be recognized out of the background of the captured image, or deleting everything other than the cut-out image.

[0015] The sixth invention comprises normalizing means. In the learning process, the normalizing means generates a normalized image by normalizing the image of the learning target to one or more reference distances, and this normalized image is stored in the storage means in association with a concept. In the recognition process, the normalizing means generates a normalized image by normalizing the image of the recognition target to one or more reference distances, and the recognizing means compares the normalized image of the recognition target with the normalized images stored in the storage means and thereby recognizes the concept of the recognition target.

[0016] The seventh invention comprises image analyzing means for locally expanding the features of an image in basis functions, and feature vector generating means for generating a feature vector by summing the local basis-function coefficients for each basis function. In the learning process, the feature vector generating means generates a feature vector from the image of the learning target, and the concept of the learning target is associated with this feature vector and stored in the storage means. In the recognition process, the feature vector generating means generates a feature vector from the image of the recognition target, and the recognizing means compares the feature vector of the recognition target with the feature vectors stored in the storage means and thereby recognizes the concept of the recognition target.

[0017] The eighth invention is characterized in that the normalizing means has a function of adjusting the orientation of the recognition target in the image to a reference angle by image processing. The ninth invention is characterized in that the image analyzing means divides an image into a plurality of parts and analyzes each, the feature vector generating means creates a plurality of partial feature vectors, and recognition is performed on the basis of those partial feature vectors.

[0018] The image reading means, normalizing means, image analyzing means, feature vector generating means, discriminant vector generating means, and storage means used in the learning process may be the same as the corresponding means used in the recognition process, or may be provided separately. That is, the learning means and the recognizing means may be separate systems, and content learned in advance may be transplanted (copied) to a plurality of recognition systems for use.

[0019]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIGS. 1 to 10 show a first embodiment of the present invention. Recognizing an object requires a learning process and a recognition process. FIG. 1 shows the system for executing both processes together. As shown in FIG. 1, the visual recognition system of the first embodiment comprises image capturing means 1 for capturing images of learning targets and recognition targets, normalizing means 2 for normalizing the captured images, image analyzing means 3, feature vector generating means 4, discriminant vector generating means 5, storage means 6, and recognizing means 7. It also comprises an input unit 8 for inputting the concept of an object from outside, and an output unit 9 for outputting the results recognized by the visual recognition system to the outside.

[0020] The image capturing means 1 is capable of capturing texture image data of an object together with distance data to that object. In the first embodiment, the image capturing means 1 consists of a stereo camera and a data processing unit that processes the data input from the camera. Because the image capturing means 1 captures images with a stereo camera, it obtains distance data at the same time as the texture (the pattern on the object surface). The image capturing means 1 also calculates, from the distance data, a representative distance from the camera to the object.

[0021] The representative distance is the distance from the stereo camera to a representative point of the object whose image is captured. Because an object has a finite size, a point such as its center or the point nearest the camera is designated in advance as the representative point, and the distance from the stereo camera to that point is taken as the representative distance. Alternatively, the distances from the camera to a plurality of points on the object surface may be obtained and their average used as the representative distance.
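The three ways of choosing a representative distance described above can be sketched as follows. This is only an illustration: the function name `representative_distance`, the `mode` strings, and the convention that the "center" sample is selected by an index are assumptions, not part of the patent.

```python
from statistics import mean


def representative_distance(distances, mode="mean", center_index=None):
    """Return one representative camera-to-object distance.

    distances: per-point distances to the object surface (hypothetical
    output of the stereo camera's data processing unit).
    mode: "nearest" (point closest to the camera), "center" (a point
    chosen in advance as the object's center), or "mean" (average over
    all measured points).
    """
    if not distances:
        raise ValueError("at least one distance sample is required")
    if mode == "nearest":
        return min(distances)
    if mode == "center":
        # Assumption: the caller knows which sample lies at the center;
        # if not given, the middle sample of the list is used.
        if center_index is None:
            center_index = len(distances) // 2
        return distances[center_index]
    if mode == "mean":
        return mean(distances)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical distances in metres to four points on the object surface.
samples = [1.95, 2.00, 2.05, 2.10]
nearest = representative_distance(samples, mode="nearest")
average = representative_distance(samples, mode="mean")
```

Whichever rule is chosen, it would need to be the same in the learning process and the recognition process so that the normalized images remain comparable.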

[0022] The normalizing means 2 normalizes the captured image. Normalization here means the processing performed, when images of objects at various distances are captured, to adjust those distances to a reference distance and to align the tilt; it is described in more detail with reference to FIGS. 2 and 3. In FIG. 2, the representative distance from the stereo camera 10 to the object 11 is S. In the first embodiment, two reference distances S1 and S2 are set.

[0023] First, the image capturing means 1 captures an image of the object 11. The image capturing means 1 calculates the representative distance S and inputs it to the normalizing means 2 together with the image data. The normalizing means 2 normalizes the input image data with respect to the reference distances S1 and S2. That is, to obtain the image that would result if the object 11 were placed at the reference distance S1, which is closer to the stereo camera 10 than the actual distance, the actually captured image is enlarged according to the ratio of the reference distance S1 to the representative distance S, and the result is the first normalized image 11a. Likewise, the image reduced according to the ratio of the reference distance S2 to the representative distance S is the second normalized image 11b.
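A minimal sketch of distance normalization, under stated assumptions: a pinhole camera model, in which apparent size is inversely proportional to distance, so the magnification applied here is S / S_ref (greater than 1 for a reference distance nearer than S, less than 1 for a farther one). The function names and the nearest-neighbor resampling are illustrative choices, not the patent's method.

```python
def scale_factor(representative_distance, reference_distance):
    """Magnification that maps an image taken at the representative
    distance to the size it would have at the reference distance
    (pinhole assumption: apparent size is proportional to 1 / distance)."""
    return representative_distance / reference_distance


def resize_nearest(image, factor):
    """Resize a 2-D list of gray levels by nearest-neighbor sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    return [
        [
            image[min(src_h - 1, int(y / factor))][min(src_w - 1, int(x / factor))]
            for x in range(dst_w)
        ]
        for y in range(dst_h)
    ]


def normalize_to_reference_distances(image, s, reference_distances):
    """Return one normalized image per reference distance."""
    return [resize_nearest(image, scale_factor(s, ref)) for ref in reference_distances]


# Hypothetical 4 x 4 image captured at S = 2.0 m, normalized to
# S1 = 1.0 m (enlarged) and S2 = 4.0 m (reduced).
captured = [[row * 4 + col for col in range(4)] for row in range(4)]
image_11a, image_11b = normalize_to_reference_distances(captured, 2.0, [1.0, 4.0])
```

A practical system would presumably use a smoother interpolation when enlarging and low-pass filtering when reducing; nearest-neighbor sampling is used here only to keep the sketch dependency-free.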

[0024] This normalization procedure is the same in the learning system and in the recognition system. Although the first embodiment sets two reference distances S1 and S2 and creates first and second normalized images, the number is not limited to two. As explained later, the recognition rate can be raised by setting the reference distances appropriately. The number of reference distances is chosen according to the size, number, and other properties of the objects to be recognized.

[0025] The normalizing means 2 also normalizes the orientation of the object to be recognized. If the orientation of the object relative to the camera 10 differs when the image is captured, even the same object may look different. Normalization that aligns the orientation is therefore needed to make the appearance uniform. In the first embodiment, the captured image is rotated by image processing such as an affine transformation so that the object surface is orthogonal to the optical axis of the camera 10. That is, a plane parallel to the imaging plane of the camera is the reference plane in the first embodiment.

[0026] As a specific processing method, the orientation (tilt) of the surface is adjusted while measuring the distances to a plurality of points on the object surface so that those distances become equal, or the image of the object is rotated to adjust its orientation so that the sum of the local surface normal vectors becomes parallel to the optical axis, and a normalized image is thereby generated. For example, an image of a wall viewed obliquely is converted into an image of the wall viewed head-on. For an object with multiple faces, such as a cube, a reference plane is determined for each face. A cylinder such as a juice can is converted into an image of the can viewed directly from the side. For objects of more complex shape, a plurality of reference planes are generally determined on the same basis. Rotation about the optical axis of the camera, however, is not performed. For simplicity, the following description uses the flat plate-shaped objects 12 and 13 of FIGS. 3(a) and 3(b) as examples. First, the processing for the case where the object 12 in the captured image is tilted with respect to the imaging plane of the camera 10 is described.

[0027] In FIGS. 3(a) and 3(b), a plane parallel to the imaging plane of the camera 10 is taken as the x-y plane. FIG. 3(a) shows the case where the object 12 in the captured image is tilted by an angle α about the x axis with respect to the x-y plane. When such an image of the object 12 is captured, the normalizing means 2 rotates the object 12 by the angle α in the direction of the arrow to generate a normalized image 12a. FIG. 3(b) shows the case where the object 13 in the captured image is tilted by an angle β about the y axis with respect to the x-y plane. When such an object 13 is captured, the normalizing means 2 rotates the object 13 by the angle β in the direction of the arrow to generate a normalized image 13a.

[0028] If the captured image is tilted about both the x axis and the y axis, performing both a rotation about the x axis and a rotation about the y axis yields a normalized image parallel to the x-y plane, that is, to the imaging plane. In this way the normalizing means 2 normalizes the distance and the tilt of the object to generate a normalized image. The distance processing and the tilt processing may be performed in either order.
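The combined rotation about the x and y axes can be illustrated on 3-D surface points rather than on pixels. The sketch below is an assumption-laden illustration, not the patent's image-processing procedure: it assumes the tilt angles α and β are already known, that the plate was tilted first about x and then about y, and that the rotation is taken about the centroid of the measured points. The function names are hypothetical.

```python
import math


def rotate_x(p, theta):
    """Rotate point p = (x, y, z) by theta radians about the x axis."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (x, y * c - z * s, y * s + z * c)


def rotate_y(p, theta):
    """Rotate point p = (x, y, z) by theta radians about the y axis."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + z * s, y, -x * s + z * c)


def level_plate(points, alpha, beta):
    """Undo a tilt of alpha about x followed by beta about y.

    The inverse rotations are applied in reverse order (y first, then x)
    about the centroid, so the plate ends up parallel to the x-y plane
    at its original mean depth.
    """
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    leveled = []
    for p in points:
        q = tuple(p[i] - centroid[i] for i in range(3))
        q = rotate_y(q, -beta)
        q = rotate_x(q, -alpha)
        leveled.append(tuple(q[i] + centroid[i] for i in range(3)))
    return leveled
```

After leveling, every point of a flat plate has the same z value, which matches the criterion in the text that the distances to a plurality of surface points become equal.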

[0029] However, the normalizing means 2 need not perform both the distance alignment and the tilt alignment. Normalizing the distance alone already makes recognition far more accurate than in the prior art. When only distance normalization is performed, the distance to a single point on the object may be used as the representative distance; adjusting the tilt, however, requires distance data at a plurality of points. In that case the image capturing means 1 must capture distance data for a plurality of points.

[0030] The image analyzing means 3 in FIG. 1 analyzes the normalized image generated as described above. The method of analyzing a normalized image is described here with reference to FIG. 4. FIG. 4 shows a normalized image 14 of a bag, as displayed on a display. It is drawn as a line drawing here to simplify the explanation. The image is analyzed by overlaying mask patterns on this display while scanning them across it.

[0031] The mask patterns are defined to extract local features of the normalized image; an example is the set of patterns shown in FIG. 5. FIG. 5 shows a set of 25 mask patterns, but the individual patterns are not limited to these, nor is the number of patterns in a set limited to 25. Any set of mask patterns that expands a digital image in basis functions may be used.

[0032] The individual mask patterns m1 to m25 in FIG. 5 are square and each has a different pattern. The 25 mask patterns m1 to m25 are overlaid at each position while being moved across the image 14. Specifically, at a given position the mask patterns m1 to m25 are overlaid in turn, and a value indicating how much of each of the mask patterns m1 to m25 the underlying image contains is calculated as the element amount of that mask pattern. For example, for the mask pattern m24, the product of the three gray-level values of the image portions corresponding to the three gray squares forming the L shape is the element amount of the mask pattern m24 at that position. The element amounts of all the mask patterns m1 to m25 are calculated in this way. This processing is performed at each position while the mask patterns are moved. The movement pitch is normally one pixel; as the pitch is increased, recognition accuracy declines gradually while processing speed improves.
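The per-position calculation can be sketched as below. The three masks are illustrative stand-ins, not the actual patterns m1 to m25 of FIG. 5, which are not reproduced in the text; each mask is assumed to be a list of (row, column) offsets inside a 3 × 3 window, and the element amount is the product of the gray levels at those offsets, as described for m24. The names `MASKS`, `element_amounts_at`, and `scan_positions` are hypothetical.

```python
from math import prod

# Illustrative masks only: offsets (dy, dx) of the active cells inside a
# 3 x 3 window. The real set in FIG. 5 contains 25 patterns.
MASKS = [
    [(1, 1)],                    # center cell alone
    [(1, 1), (1, 2)],            # center and its right neighbor
    [(0, 1), (1, 1), (1, 2)],    # three cells forming an L shape
]


def element_amounts_at(image, top, left, masks=MASKS):
    """Element amount of every mask with the window's top-left corner at
    (top, left): the product of the gray levels under the active cells."""
    return [prod(image[top + dy][left + dx] for dy, dx in mask) for mask in masks]


def scan_positions(image, pitch=1, window=3):
    """Top-left corners visited when the window is moved with the given
    pitch (1 pixel is the usual setting in the text)."""
    height, width = len(image), len(image[0])
    return [
        (top, left)
        for top in range(0, height - window + 1, pitch)
        for left in range(0, width - window + 1, pitch)
    ]


# Hypothetical 4 x 4 binary image containing a small L-shaped stroke.
image = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
amounts = [element_amounts_at(image, top, left) for top, left in scan_positions(image)]
```

Increasing `pitch` reduces the number of visited positions, which is the speed-for-accuracy trade-off the paragraph describes.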

[0033] The mask patterns in FIG. 5 are drawn large for the sake of explanation; this embodiment uses a set of 3 × 3 pixel mask patterns. The size of the mask patterns can, however, be set freely according to the kind of features to be captured. For example, a set of small mask patterns extracts fine (high-frequency) features, and large mask patterns extract coarse (low-frequency) features. Recognizing that a bag is made of crocodile leather, for instance, calls for smaller mask patterns than recognizing whether the object is a bag at all.

[0034] If a plurality of mask pattern sets of different sizes are stored in the image analyzing means 3 and the analysis is performed with all of them, features at every scale, from large to small, can be captured. The more features are available, the easier it becomes to discriminate the target. What sizes of mask pattern sets to provide, and how many, is determined according to the features and the number of the objects to be recognized. One conceivable approach is for a human to evaluate the results of recognition simulations that the image analyzing means 3 performs with particular mask pattern sets, and for the image analyzing means 3 to select the optimal set automatically according to that evaluation.

[0035] The feature vector generating means 4 in FIG. 1 totals, for each mask pattern, the results of the analysis of the normalized image 11a performed by the image analyzing means 3. In other words, it tallies how much of the feature of each mask pattern the normalized image 11a contains. The totals are represented as in FIG. 6, in which the horizontal axis shows the type of mask pattern and the vertical axis shows the element amount of each pattern. A 25-dimensional feature vector A1 is generated with each mask pattern as one dimension. In the first embodiment, each mask pattern corresponds to a basis function of the present invention. The feature vector A1 is a quantification of the features of the normalized image 11a.
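The totaling step amounts to summing the per-position element amounts separately for each mask pattern. A minimal sketch, assuming the per-position results are available as a list of equal-length lists (one entry per mask pattern); the function name `feature_vector` and the sample values are hypothetical.

```python
def feature_vector(per_position_amounts):
    """Sum the element amounts over all scan positions, mask by mask.

    per_position_amounts: one list per scan position, each holding one
    element amount per mask pattern (25 entries for the set in FIG. 5).
    Returns a vector with one component per mask pattern.
    """
    if not per_position_amounts:
        raise ValueError("no scan positions were analyzed")
    dims = len(per_position_amounts[0])
    if any(len(row) != dims for row in per_position_amounts):
        raise ValueError("every position must report the same number of masks")
    return [sum(row[k] for row in per_position_amounts) for k in range(dims)]


# Hypothetical element amounts for three masks at four scan positions.
per_position = [
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 0],
]
vector = feature_vector(per_position)
```

Because the sum runs over all positions, the resulting vector does not record where in the image a pattern occurred, only how strongly it occurred overall.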

[0036] The discrimination vector generation means 5 converts the feature vector A generated by the feature vector generation means 4 into a discrimination vector. A discrimination vector is a vector with a reduced number of dimensions, obtained by weighting those dimensions of the feature vector that are effective for distinguishing objects and setting new axes as discrimination axes.

[0037] For example, consider a case in which the normalized images of two different objects must be distinguished. First, n images of each object are captured under subtly varying conditions such as changes in the light source; let A be the set of feature vectors of the first object and B the set of feature vectors of the second object. The feature vector groups A and B consist of 25-dimensional vectors generated by the feature vector generation means 4, but to simplify the explanation of the discrimination vector they are treated here as two-dimensional vectors, and the within-class variance of A and of B is assumed to be sufficiently small. The feature vectors A and B can then be plotted on a two-dimensional (x-y) plane, as shown in FIG. 7. A discrimination axis L that separates the end-point groups a and b of the vector groups A and B well is then calculated by discriminant analysis or a similar method. In this way, an axis (space) that better separates A and B is obtained.
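One common way to compute such an axis is Fisher's linear discriminant, sketched below for the two-dimensional case described here. The source only says "discriminant analysis or a similar method", so the choice of Fisher's criterion, the function name `discriminant_axis`, and the sample points are assumptions made for illustration.

```python
import numpy as np

def discriminant_axis(A, B):
    """Unit vector L that separates point groups A and B (Fisher's criterion)."""
    mean_a, mean_b = A.mean(axis=0), B.mean(axis=0)
    # Within-class scatter: how much each group spreads around its own mean.
    Sw = (A - mean_a).T @ (A - mean_a) + (B - mean_b).T @ (B - mean_b)
    # Fisher's direction is Sw^-1 (mean_b - mean_a).
    w = np.linalg.solve(Sw, mean_b - mean_a)
    return w / np.linalg.norm(w)

# Hypothetical end points of the feature vectors of two objects.
A = np.array([[1.0, 2.0], [1.2, 2.1], [0.9, 1.8], [1.1, 2.2]])
B = np.array([[3.0, 0.5], [3.1, 0.7], [2.9, 0.4], [3.2, 0.6]])

L = discriminant_axis(A, B)
proj_a, proj_b = A @ L, B @ L  # one-dimensional coordinates along L
print(proj_a.max() < proj_b.min())  # the two groups do not overlap on L
```

Projecting onto L reduces each two-dimensional feature vector to a single coordinate, which is the dimensionality reduction that the discrimination vector provides.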

[0038] The space defined by the discrimination axes described above is called the discrimination space. Such a space can also be constructed by unsupervised data analysis methods such as principal component analysis and kernel principal component analysis, or by supervised methods such as kernel discriminant analysis, support vector machines, and neural networks. The discrimination vector generation means 5 stores the formula for converting a feature vector into a discrimination vector in the discrimination space, for example a transformation matrix.
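As a hedged illustration of a stored transformation matrix, the sketch below builds one with principal component analysis, one of the alternatives named in this paragraph, and applies it to a feature vector. The function names `pca_matrix` and `to_discrimination_vector`, the three-dimensional sample data, and the decision to subtract the mean before projecting are assumptions, not details from the source.

```python
import numpy as np

def pca_matrix(features, n_axes):
    """Return the mean and a (dims x n_axes) transformation matrix from PCA."""
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, strongest first.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_axes].T

def to_discrimination_vector(feature, mean, W):
    """Project a feature vector into the reduced space."""
    return W.T @ (np.asarray(feature, dtype=float) - mean)

# Hypothetical feature vectors that vary along a single direction.
direction = np.array([1.0, 2.0, 2.0]) / 3.0
features = np.array([t * direction for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]) + 5.0

mean, W = pca_matrix(features, n_axes=1)
c = to_discrimination_vector(features[4], mean, W)
print(W.shape, c.shape)
```

Only `mean` and `W` need to be stored in order to convert later feature vectors in the same way.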

[0039] In the learning process, the storage means 6 in FIG. 1 stores each discrimination vector generated by the discrimination vector generation means 5 in association with the concept of the object. The concept of the object is input from outside by keyboard, mouse, or voice. The storage means 6 also stores the group of labeled feature vectors used for learning, each associated with the concept of its object. At recognition time, the recognition means 7 recognizes an object by comparing the discrimination vector of the object to be recognized with the discrimination vectors stored in the storage means 6. Information on the recognized object is output to the outside via the output unit 9. For example, it can be output to an external control means (not shown) so that other devices are controlled according to the recognition result.

[0040] The procedure for recognizing an object with the visual recognition system of FIG. 1 is described below. First, the learning process is described with reference to the flowchart of FIG. 8. The learning process is the process by which the system is first made to memorize various objects. In step 1, the image capturing means 1 of FIG. 1 captures a texture image of each object to be learned together with distance data. In this step, a plurality of images is read for each of the objects to be memorized, while conditions such as the light source are varied little by little.

[0041] In step 2, the normalizing means 2 normalizes each texture image with respect to the reference distance and the reference angle to create a normalized image. If a plurality of reference distances is set for normalization, this step is repeated so that a plurality of normalized images is generated for the image group of each object. In step 3, the image analysis means 3 expands each normalized image using the mask patterns. In step 4, the feature vector generation means 4 generates feature vectors from the results of the image analysis. Each of steps 1 to 4 above processes a plurality of objects, but each step may instead process only one object, with steps 1 to 4 repeated once for each object to be learned.

[0042] In step 5, the discrimination vector generation means 5 generates a discrimination space from the feature vectors generated up to step 4. It then calculates and stores a transformation matrix, which is the formula for projecting a feature vector onto the discrimination space. If a plurality of normalized images was generated for one object in step 2, a discrimination space is generated in step 5 for each reference distance used for normalization. In step 6, the discrimination vector generation means 5 applies this formula to all of the feature vectors to generate discrimination vectors. In step 7, each discrimination vector is stored in the storage means 6 in association with the concept of the object input from outside. This completes the initial learning process.
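Steps 5 to 7 can be sketched for more than two objects with multi-class linear discriminant analysis. The sketch is illustrative only: the source does not fix the analysis method, and the function name `learn`, the small regularization term `eps`, the three-dimensional sample features, and the list used as a stand-in for the storage means 6 are all assumptions.

```python
import numpy as np

def learn(features, labels, n_axes, eps=1e-6):
    """Build a transformation matrix W and a list of (vector, concept) pairs."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    dims = X.shape[1]
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((dims, dims))  # within-class scatter
    Sb = np.zeros((dims, dims))  # between-class scatter
    for concept in np.unique(y):
        Xc = X[y == concept]
        class_mean = Xc.mean(axis=0)
        Sw += (Xc - class_mean).T @ (Xc - class_mean)
        diff = (class_mean - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Step 5: axes that maximise between-class over within-class scatter.
    # eps keeps the system solvable when Sw is singular (few samples).
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + eps * np.eye(dims), Sb))
    order = np.argsort(-vals.real)[:n_axes]
    W = vecs[:, order].real
    # Steps 6 and 7: project every feature vector and store it with its concept.
    memory = [(W.T @ x, str(label)) for x, label in zip(X, y)]
    return W, memory

# Hypothetical feature vectors for three objects, three images each.
features = np.array([
    [0.1, 0.0, 0.1], [-0.1, 0.1, 0.0], [0.0, -0.1, 0.2],
    [5.1, 0.0, 0.1], [4.9, 0.1, 0.0], [5.0, -0.1, 0.2],
    [0.1, 5.0, 0.1], [-0.1, 5.1, 0.0], [0.0, 4.9, 0.2],
])
labels = ["cup"] * 3 + ["bag"] * 3 + ["box"] * 3

W, memory = learn(features, labels, n_axes=2)
print(W.shape, len(memory))
```

When several reference distances are used, `learn` would be called once per reference distance, giving one matrix and one set of stored vectors for each.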

[0043] Next, the additional learning process, in which a new object is learned, is described with reference to the flowchart of FIG. 9. Steps 11 to 14 are almost the same as steps 1 to 4 of FIG. 8, so a detailed description is omitted. In short, in steps 11 to 14 the feature vectors of the images to be learned are generated from the texture image and distance data of the object to be learned captured by the image capturing means 1.

[0044] In step 15, the discrimination vector generation means 5 retrieves the learned feature vectors stored in the storage means 6, combines them with the feature vectors generated in step 14, determines a new discrimination space from all of these feature vectors, and generates a new transformation matrix. In step 16, the transformation matrix is stored in the discrimination vector generation means 5. This formula is therefore updated every time learning is performed.

[0045] In step 17, the discrimination vector generation means 5 converts the feature vectors into discrimination vectors using the generated transformation matrix. In step 18, the concept of the object to be learned, input from outside, is stored in the storage means 6 in association with the discrimination vector generated in step 17 and the feature vector generated in step 14. Steps 11 to 18 are carried out every time additional learning is performed.
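The additional learning of steps 15 to 18 can be sketched as below. The dictionary `store` is a hypothetical stand-in for the storage means 6, and `fit_space` stands for whatever analysis builds the transformation matrix. The placeholder `first_two_axes` is not a real discriminant analysis, and re-projecting the previously learned vectors with the new matrix is an assumption that the source does not state explicitly.

```python
import numpy as np

def add_learning(store, new_features, new_label, fit_space):
    """Pool old and new feature vectors, refit the space, and re-project."""
    # Step 15: combine the learned feature vectors with the new ones.
    store["features"] = np.vstack([store["features"], new_features])
    store["labels"] = store["labels"] + [new_label] * len(new_features)
    # Step 16: keep the new transformation matrix.
    store["W"] = fit_space(store["features"], store["labels"])
    # Steps 17 and 18: convert to discrimination vectors and store them
    # next to the feature vectors and labels (all vectors are re-projected).
    store["discrimination"] = store["features"] @ store["W"]
    return store

def first_two_axes(features, labels):
    """Placeholder fit that simply keeps the first two feature dimensions."""
    return np.eye(features.shape[1])[:, :2]

store = {
    "features": np.array([[1.0, 0.0, 2.0], [1.1, 0.1, 2.0]]),
    "labels": ["cup", "cup"],
}
new_bag = np.array([[4.0, 3.0, 0.0], [4.2, 3.1, 0.0]])
store = add_learning(store, new_bag, "bag", first_two_axes)
print(store["discrimination"].shape)
```

Keeping the raw feature vectors in `store` is what makes the refit possible, which is the reason the next paragraph gives for storing them.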

[0046] The feature vectors are stored in the storage means 6 in step 7 of FIG. 8 and step 18 of FIG. 9 because the discrimination vector generation means 5 needs the learned feature vectors when it generates a new formula. This completes the learning process. In this first embodiment, the image capturing means 1, normalizing means 2, image analysis means 3, feature vector generation means 4, discrimination vector generation means 5, storage means 6, and input unit 8 of FIG. 1 constitute the learning system.

[0047] Next, the recognition process is described with reference to the flowchart of FIG. 10. In step 21, the image capturing means 1 reads the texture of the object to be recognized together with distance data. In step 22, the normalizing means 2 normalizes the texture. Even if a plurality of reference distances is set, step 22 first generates a normalized image for only one reference distance. In step 23, the image analysis means 3 expands the normalized image. In step 24, the feature vector generation means 4 generates a feature vector.

[0048] In step 25, the discrimination vector generation means 5 generates a discrimination vector by converting the feature vector generated in step 24 with the formula generated in step 16 of FIG. 8. In step 26, the recognition means 7 compares the discrimination vector of the object to be recognized, generated in step 25, with the learned discrimination vectors stored in the storage means 6 and calculates the distance between each learned discrimination vector and the discrimination vector of the object to be recognized. The distance calculated here is the distance between the end points of the vectors.
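The distance between end points in step 26 is the ordinary Euclidean distance between two vectors. A minimal sketch, with the hypothetical name `endpoint_distances` and made-up sample vectors:

```python
import numpy as np

def endpoint_distances(query, learned):
    """Euclidean distance from the query vector to every learned vector."""
    return np.linalg.norm(np.asarray(learned) - np.asarray(query), axis=1)

query = np.array([0.0, 0.0])
learned = np.array([[3.0, 4.0], [1.0, 0.0]])
d = endpoint_distances(query, learned)
print(d)
```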

[0049] In step 27, the discrimination vector generation means 5 identifies, from the calculated distances between discrimination vectors, the learned discrimination vector closest to the discrimination vector of the object to be recognized. In step 28, it is determined whether that distance is equal to or less than a set value. The set value is the criterion for how small the difference between the discrimination vector of the object to be recognized and a learned discrimination vector must be, that is, how close the two must be, for them to be recognized as the same object. If the distance is equal to or less than the set value, the process proceeds to step 29, where the identified discrimination vector and the distance are stored. If the distance exceeds the set value in step 28, the process proceeds to step 30.

[0050] In step 30, it is determined whether another reference distance exists, that is, whether any of the set reference distances has not yet been used for normalization. If there is another reference distance, the process returns to step 22, where a normalized image is created at that reference distance, and the subsequent steps are repeated from step 22. In this case, when the learned discrimination vector closest to that of the object to be recognized is identified in step 27, the distance to the discrimination vector stored in step 29 is also taken into account, and the closest discrimination vector overall is selected. As a result, by step 29 only the discrimination vector that is closest to that of the object to be recognized and whose distance is equal to or less than the set value is retained.

[0051] If there is no other reference distance in step 30, the process proceeds to step 31. In step 31, it is determined whether an identified discrimination vector exists. If not, the process proceeds to step 33, where the object is judged unrecognizable, and that result is output (step 34). If an identified vector exists in step 31, the process proceeds to step 32, where the concept associated in the storage means 6 with the identified discrimination vector is output as the concept of the object to be recognized (step 34). This completes the recognition process.
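The loop over reference distances in steps 22 to 34 can be summarised in a short sketch. It is an illustration under stated assumptions rather than the patented procedure itself: `feature_for` is a hypothetical callable standing in for steps 22 to 24 (normalize at one reference distance, analyze, build the feature vector), `spaces` is a hypothetical mapping from each reference distance to its transformation matrix, learned discrimination vectors, and concepts, and `None` stands for the "unrecognizable" result.

```python
import numpy as np

def recognize(feature_for, spaces, threshold):
    """Return the concept of the closest learned vector, or None."""
    best = None  # (distance, concept) retained in step 29
    for ref, (W, learned, concepts) in spaces.items():  # steps 22 and 30
        query = W.T @ feature_for(ref)                   # steps 23 to 25
        dist = np.linalg.norm(learned - query, axis=1)   # step 26
        i = int(np.argmin(dist))                         # step 27
        # Steps 28 and 29: keep the match only if it is within the set
        # value and closer than any match kept so far.
        if dist[i] <= threshold and (best is None or dist[i] < best[0]):
            best = (float(dist[i]), concepts[i])
    # Steps 31 to 34: output the concept, or None if nothing qualified.
    return None if best is None else best[1]

identity = np.eye(2)
spaces = {
    1.0: (identity, np.array([[0.0, 0.0], [10.0, 10.0]]), ["cup", "bag"]),
    2.0: (identity, np.array([[5.0, 5.0], [9.0, 9.0]]), ["cup", "bag"]),
}
feature_for = lambda ref: np.array([9.0, 9.5])

result = recognize(feature_for, spaces, threshold=1.0)
strict = recognize(feature_for, spaces, threshold=0.1)
print(result, strict)
```

In this example the match at the first reference distance is just outside the set value, while the second reference distance yields a match within it, so the loop over reference distances changes the outcome.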

[0052] In this first embodiment, the image capturing means 1, normalizing means 2, image analysis means 3, feature vector generation means 4, discrimination vector generation means 5, recognition means 7, and output unit 9 of FIG. 1 constitute the recognition system of the present invention. In FIG. 1, the flow of data in the learning system is indicated by solid arrows and the flow of data in the recognition system by broken arrows.

[0053] In the system of the first embodiment, the images to be learned are normalized in the learning process, so normalizing the captured image in the recognition process as well allows the discrimination vectors to be compared accurately. That is, accurate recognition is possible even if the actual position of the object differs from its position in the images read during learning. It is therefore unnecessary to capture images at every possible distance during learning, as was required conventionally.

[0054] When the captured image is a color image, the image analysis means 3 first separates the image into color components and then applies the masks to each component image to perform the decomposition. Color separation here means, for example, separating a color image into three images for blue, green, and red, the three primary colors of light. The components are not limited to these three primary colors, however; the image may instead be separated into the components of other color systems, such as hue, saturation, and lightness (luminance). Saturation, for example, is little affected by the brightness of the environment in which the image is captured, and this property can be exploited.
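The color separation described here can be sketched as follows. The example assumes the image is stored as a height x width x 3 array in red, green, blue order, and uses the HSV definition of saturation, (max - min) / max; the function names and the pixel values are hypothetical. It also shows, for one pixel, that halving the brightness leaves the saturation unchanged.

```python
import numpy as np

def split_rgb(image):
    """Separate a colour image into its red, green, and blue component images."""
    return image[..., 0], image[..., 1], image[..., 2]

def saturation(image):
    """HSV saturation per pixel: (max - min) / max, or 0 for black pixels."""
    rgb = image.astype(float)
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    safe_mx = np.where(mx > 0, mx, 1.0)  # avoid dividing by zero
    return np.where(mx > 0, (mx - mn) / safe_mx, 0.0)

bright = np.array([[[200, 100, 50]]], dtype=np.uint8)
dim = bright // 2  # the same surface under half the illumination

r, g, b = split_rgb(bright)
print(saturation(bright), saturation(dim))
```

Each component image returned by `split_rgb`, or the saturation image, would then be analyzed with the mask patterns in the same way as a monochrome image.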

[0055] In this first embodiment, the image capturing means 1 may be of any type as long as it can capture image data and distance data. In the first embodiment a stereo camera is used so that the texture image and the distance data can be captured simultaneously, but if the distance data is acquired by some other distance measurement method, a stereo camera need not be used to capture the image. Other methods of measuring the distance to an object include, for example, using three or more cameras, using a range finder, and using ultrasonic sonar.

[0056] The first embodiment was described in terms of learning and recognizing texture images, but if the image capturing means 1 is also able to capture a distance image of the object, learning and recognition can be performed on the distance image in exactly the same way as on the texture. A distance image represents the variation in relief and depth of an object or scene as variation in color or luminance, and the mask patterns used to analyze the texture can be applied to it unchanged. As an example of a distance image, FIG. 11 shows an image 13 in which the distance to the objects in the image is represented by shading. Here, objects close to the camera are shown lighter (whitish), and objects farther away are shown darker (blackish).
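A distance image of this kind can be produced from per-pixel distance data by a simple linear mapping, sketched below. The 8-bit output range, the clipping limits `near` and `far`, and the function name `distance_image` are assumptions chosen for illustration.

```python
import numpy as np

def distance_image(depth, near, far):
    """Map distances to grey levels: near -> 255 (light), far -> 0 (dark)."""
    d = np.clip(depth.astype(float), near, far)
    return np.round(255.0 * (far - d) / (far - near)).astype(np.uint8)

depth = np.array([[1.0, 2.0, 3.0]])  # hypothetical distances in metres
img = distance_image(depth, near=1.0, far=3.0)
print(img)
```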

[0057] If both discrimination vectors for the texture and discrimination vectors for the distance image are learned, an object can be recognized at recognition time from either kind of image data. For example, if the object to be recognized has aged so that its surface pattern is no longer visible when the image is captured, the pattern's features will not appear in the texture discrimination vector and recognition may fail. Even when the object cannot be recognized from the texture discrimination vector, it can still be recognized from the distance-image discrimination vector, provided that vector was stored during learning. In other words, performing recognition with both kinds of discrimination vector improves the reliability of recognition.

[0058] In the first embodiment, normalization is performed for each of a plurality of reference distances in both the learning process and the recognition process. However, it is not necessary to provide a plurality of reference distances, and even when several are provided, normalization need not always be performed for all of them. For example, during learning, normalization may be performed at a plurality of reference distances and a discrimination space created for each, while during recognition normalization is performed only at the reference distance closest to the actual distance and discrimination is carried out only in the discrimination space for that reference distance.
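Selecting the reference distance closest to the measured distance is a one-line operation. The function name `nearest_reference` and the sample values are hypothetical:

```python
def nearest_reference(measured, references):
    """Pick the reference distance closest to the measured distance."""
    return min(references, key=lambda ref: abs(ref - measured))

chosen = nearest_reference(1.4, [0.5, 1.0, 2.0])
print(chosen)
```

Recognition would then use only the discrimination space that was learned for `chosen`.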

[0059] Conversely, in the learning process a suitable reference distance may be selected from the plurality of reference distances according to the size of the object, with normalization performed and a discrimination space created only for that reference distance, while at recognition time normalization is performed at a plurality of reference distances and the closest discrimination vector is searched for. In short, accurate recognition is possible as long as the distance of the object captured during learning and the position of the object at recognition time are brought into agreement so that the two can be compared.

[0060] In the learning process, when the images of an object to be learned are captured, the object can be placed where its features are easy to see and unneeded surrounding objects can be removed, so that only the object to be learned is captured. If something other than the object to be learned cannot actually be moved, such as a mountain in the background, the operator can process the captured image, for example by erasing the background. In other words, in the learning process an environment free of anything surrounding the object can be created deliberately so that the object is learned accurately.

[0061] In the recognition process, by contrast, the object to be recognized is rarely placed by itself against a completely black background. If an image containing both the object and its background is analyzed as a single whole, the object often cannot be recognized accurately. For example, the same object seen against a completely different background may be recognized as a different object because of the difference in the features of the background. One remedy is to cut the object to be recognized out of the background and to analyze and recognize the image of the cut-out object.

[0062] The second embodiment, described with reference to FIG. 12, adds to the image capturing means 1 of the first embodiment shown in FIG. 1 a function that deletes the background portion from a captured image and isolates the object to be recognized. The rest of the configuration and the operation of each component are the same as in the first embodiment, so FIG. 1 is also used in describing the second embodiment. The learning process of the visual recognition system of the second embodiment is the same as in the first embodiment, and its description is omitted here. The following describes the recognition process, in which an object is recognized after the necessary learning has been completed. It is assumed that, in the learning process, the bag to be learned was placed in front of a plain black wall.

[0063] In the recognition process, suppose that the bag to be recognized is lying on the floor. First, an image of the bag is captured by the image capturing means 1. As shown in FIG. 12, the captured image 15 contains the pattern of the floor 15b in the background of the bag 15a. If the entire image 15 were analyzed as the object to be recognized, what would be analyzed is a single object combining the bag and the floor. Only the bag must therefore be cut out of the image 15.

[0064] The image capturing means 1 therefore cuts out the object to be recognized on the basis of the distance data captured together with the image data. The image capturing means 1 consists of a stereo camera and a data processing unit, and the data processing unit can detect the difference in distance between the bag and the floor from the distance data captured by the stereo camera. The floor 15b is farther from the camera than the bag 15a. Moreover, along the boundary of the bag 15a, the difference from the distance to the floor 15b is almost constant. From this, the floor and the bag can be recognized as separate objects, and the bag can be cut out from the floor as the object to be recognized.
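A depth-based cut-out of this kind can be sketched as follows. The source does not say how the data processing unit locates the boundary, so the largest-gap heuristic in `split_depth`, the function names, and the small sample arrays are assumptions. The sketch also assumes a grayscale texture with one distance value per pixel, and it blacks out the background as the following paragraph describes.

```python
import numpy as np

def split_depth(depth):
    """Threshold placed in the largest gap between the distinct depth values."""
    values = np.unique(depth)        # sorted distinct distances
    gaps = np.diff(values)
    i = int(np.argmax(gaps))
    return (values[i] + values[i + 1]) / 2.0

def cut_out(texture, depth, max_distance):
    """Keep pixels no farther than max_distance; paint the rest black."""
    mask = depth <= max_distance
    result = np.zeros_like(texture)
    result[mask] = texture[mask]
    return result, mask

# Hypothetical scene: a bag about 1 m away in front of a floor 2 m away.
depth = np.array([[2.0, 2.0, 2.0],
                  [2.0, 1.0, 1.1],
                  [2.0, 1.0, 1.0]])
texture = np.full((3, 3), 200, dtype=np.uint8)

threshold = split_depth(depth)
bag_only, mask = cut_out(texture, depth, threshold)
print(threshold, int(mask.sum()))
```

The heuristic relies on a clear step in depth at the object boundary, so it would not work where the depth changes smoothly from one object to the next.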

[0065] In practice, the image of the floor portion forming the background is removed and rendered completely black. Once only the target object has been cut out in this way, the image of FIG. 12 becomes the same as the image 11a of FIG. 4, which has nothing in the background. The steps in which the normalizing means 2 normalizes the background-free image and the normalized image is analyzed and recognized are step 22 onward in FIG. 10 and are the same as in the first embodiment. That is, as in the first embodiment, analysis using the mask patterns m, generation of a feature vector, and generation of a discrimination vector are performed, and the recognition means 7 recognizes the bag.

[0066] If the background can be removed from the captured image, as in the recognition process of the second embodiment, the object can be recognized accurately regardless of the pattern of whatever background it actually appears against, so the normalization by the normalizing means 2 is not strictly necessary. Even without normalization, recognition accuracy is far higher than in the conventional case in which the background and the target object are recognized as a single whole. Naturally, recognition is more accurate still if, as in the second embodiment, a normalized image is created after the background has been removed.

[0067] As described above, the image capturing means 1 can cut the target object out of the background automatically. This allows only the necessary portion to be analyzed, which makes accurate recognition possible. However, where the depth varies smoothly and continuously across the boundary between different objects, it is difficult to cut only the intended object out from among them. In that case, the operator may delete the background manually by image processing.

[0068] If the distance to the object to be recognized is known in advance, the system can be programmed to analyze as objects to be recognized only those objects lying within that range. This makes the cutting out of objects efficient and accurate. The background removal described here can be performed in exactly the same way in the learning process: the object to be learned can be identified automatically in the image captured by the image capturing means 1 from the difference in distance from the background, and the background deleted.
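When the distance range of the object is known beforehand, the cut-out reduces to a range test on the distance data. A minimal sketch, with the hypothetical name `in_range_mask` and made-up values:

```python
import numpy as np

def in_range_mask(depth, nearest, farthest):
    """True for pixels whose distance lies within the known range."""
    return (depth >= nearest) & (depth <= farthest)

depth = np.array([[0.4, 1.0],
                  [1.2, 3.0]])
texture = np.array([[10, 20],
                    [30, 40]], dtype=np.uint8)

mask = in_range_mask(depth, nearest=0.8, farthest=1.5)
kept = np.where(mask, texture, 0)  # everything outside the range becomes black
print(kept)
```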

[0069] The third embodiment, shown in FIGS. 13 and 14, enables correct recognition even when another object lies in front of the object to be recognized. In the third embodiment, the image capturing means 1 has the function of isolating and cutting out the object to be recognized, and the image analysis means 3 has a function for performing image analysis on parts of the image. The rest is the same as in the embodiments above. Because the overall system configuration is again the same as in the first embodiment, the description refers to FIG. 1.

[0070] FIG. 13 shows a normalized image 16 in the learning process. The normalized image 16 is produced by deleting the background from the captured image in the same manner as in the second embodiment and then normalizing the result with the normalizing means 2 of FIG. 1. Once the normalized image 16 has been created, the image analysis means 3 analyzes it; in the third embodiment, the image 16 is divided into a plurality of windows W for processing. Here it is divided into windows W1 to Wn, which are offset slightly from one another in the vertical and horizontal directions. In other words, they are the positions of a single window moved horizontally and vertically across the figure.

[0071] The image analysis means 3 performs the analysis using the mask patterns m for each of the windows W1 to Wn. That is, within each window W1 to Wn, the mask patterns m1 to m25 shown in FIG. 5 are moved across the window and superimposed on it, and each element amount is totaled. The element amounts are totaled per window, and a feature vector is generated for each window. The feature vectors generated for the individual windows are referred to as partial feature vectors a1 to an.
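The per-window analysis can be sketched roughly as follows. This is only an illustration under stated assumptions: the four masks in `MASKS` are hypothetical stand-ins for the mask patterns m1 to m25 of FIG. 5, the product-sum used to total each element amount is an assumed aggregation rule, and the function names, window size, and step are invented for the example.

```python
# Hypothetical 3x3 masks given as (row, col) offsets. The actual mask
# patterns m1 to m25 are defined in FIG. 5 and are not reproduced here.
MASKS = [
    [(1, 1)],                  # centre pixel only
    [(1, 1), (1, 2)],          # centre and right neighbour
    [(1, 1), (0, 1)],          # centre and upper neighbour
    [(1, 0), (1, 1), (1, 2)],  # horizontal triple
]


def window_origins(height, width, size, step):
    """Top-left corners of the windows; step < size makes them overlap."""
    return [(r, c)
            for r in range(0, height - size + 1, step)
            for c in range(0, width - size + 1, step)]


def partial_feature_vector(image, top, left, size):
    """Slide every mask over one window and total one element per mask.

    Assumption: an element amount is the sum, over all mask positions in
    the window, of the product of the pixels the mask covers.
    """
    vector = []
    for mask in MASKS:
        total = 0
        for r in range(top, top + size - 2):
            for c in range(left, left + size - 2):
                product = 1
                for dr, dc in mask:
                    product *= image[r + dr][c + dc]
                total += product
        vector.append(total)
    return vector


def partial_feature_vectors(image, size=4, step=2):
    """One partial feature vector per window, in row-major window order."""
    height, width = len(image), len(image[0])
    return [partial_feature_vector(image, r, c, size)
            for r, c in window_origins(height, width, size, step)]
```

Choosing a step smaller than the window size reproduces the overlapping arrangement of the windows, so the same pixels contribute to more than one partial feature vector.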

[0072] In the third embodiment, the n windows W overlap one another on the screen 17, so the same portion of the image is analyzed more than once when the partial feature vectors are generated.

[0073] The discrimination vector generation means 5 then generates partial discrimination vectors c1 to cn, one for each of the partial feature vectors a1 to an. It also generates a discrimination vector C for the entire image, and all of these are stored in the storage means 6. When they are stored, the concept "bag" is associated with each of the partial discrimination vectors c1 to cn. However, the same concept need not be attached to all of the partial discrimination vectors c1 to cn. Individual partial discrimination vectors may be given separate concepts, such as "part of a bag" or "bag clasp." This enables finer-grained recognition.

[0074] When learning is performed in this way, by generating partial feature vectors and storing the partial discrimination vectors in association with concepts, an object can be recognized in the recognition process even when it is only partially visible. For example, the bag can be recognized even when other things are in front of it, as in the image 17 shown in FIG. 14; the method is described below. The image 17 is an image captured by the image capturing means 1 in the recognition process. The bag 17a in the image 17 is the same bag as in the image 16 of FIG. 13; that is, it is a bag the system has already learned. In the recognition process, however, a tape cutter 17b and a book 17c are placed in front of the bag 17a.

[0075] Here, the data processing section of the image capturing means 1 can separate the bag 17a, which is the recognition target, from the other objects, the tape cutter 17b and the book 17c, using the distance data captured together with the image, because their distances from the camera differ. In this case, the bag 17a is specified as the recognition target, and the other portions are deleted together with the background. One way for the image capturing means 1 to leave only the bag 17a from the captured image 17 is to set in advance that only objects near a specific distance are treated as recognition targets. Alternatively, once the image 17 has been captured, the operator can designate the portion containing the bag 17a, thereby indicating to the data processing section of the image capturing means 1 that it is the target.
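The distance-based cutout described above might look like the following minimal sketch. It assumes the distance data is available as a per-pixel array aligned with the image; the function name `cut_out_by_distance` and the parameters `target`, `tolerance`, and `fill` are hypothetical and do not come from the specification.

```python
def cut_out_by_distance(image, distances, target, tolerance, fill=0):
    """Keep pixels whose distance lies within `tolerance` of `target`.

    `image` and `distances` are equally sized lists of rows. Every other
    pixel (nearer objects, farther objects, background) is set to `fill`.
    """
    return [
        [pixel if abs(dist - target) <= tolerance else fill
         for pixel, dist in zip(image_row, dist_row)]
        for image_row, dist_row in zip(image, distances)
    ]
```

Setting `target` in advance corresponds to the preset-distance option in the text; an operator-designated region could supply the same value by reading the distance at the designated pixels.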

[0076] When the bag 17a is specified as the recognition target in the image 17 in this way and the other portions are deleted, the result is an image 18 containing a target object 18a, which is the bag 17a of FIG. 14 with part of it missing, as shown in FIG. 15. The image 18 is normalized by the normalizing means 2. Next, the image analysis means 3 analyzes the normalized image. For this explanation, the image 18 shown in FIG. 15 is treated as the normalized image. As in the learning process, the image analysis means 3 divides the image 18 into a plurality of windows W1 to Wn. In each window W1 to Wn, the analysis using the mask patterns m1 to m25 is performed, and the feature vector generation means 4 of FIG. 1 generates partial feature vectors d1 to dn. The discrimination vector generation means 5 then generates partial discrimination vectors e1 to en, one for each of the windows W1 to Wn.

[0077] The recognition means 7 compares the partial discrimination vectors e1 to en with the partial discrimination vectors c1 to cn in the storage means 6, and if any of them are identical or close, it recognizes the recognition target as a "bag." In FIG. 15, for example, the windows W2 to W4 fall on the portion where the image is missing. The features of the bag therefore cannot be found in the partial discrimination vectors e2 to e4 of the windows W2 to W4.

[0078] However, the partial discrimination vectors corresponding to the other windows W should include vectors close to one of the bag's partial discrimination vectors c1 to cn. The recognition target can therefore be recognized as a "bag" by comparing some of the partial discrimination vectors. That is, if the image to be analyzed is divided into a plurality of windows and partial discrimination vectors are generated, as in the third embodiment, an object can be recognized from its visible portion even in situations where the whole object cannot be captured in the image. Moreover, even when the whole object is visible, it can be recognized with higher confidence if both the partial vectors and the whole-image vector match.
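One way to picture the comparison of partial discrimination vectors is the sketch below. It is an assumption-laden illustration, not the method of the specification: the threshold `max_distance`, the Euclidean metric, and the vote-counting scheme in `match_partial_vectors` are all choices made for the example.

```python
import math


def match_partial_vectors(observed, learned, max_distance):
    """Count, per concept, the observed partial vectors that match.

    `observed` is a list of partial discrimination vectors (e1 to en).
    `learned` is a list of (vector, concept) pairs (c1 to cn with their
    concepts). An observed vector votes for the concept of its nearest
    learned vector, but only if that vector is within `max_distance`.
    Vectors from occluded windows simply cast no vote.
    """
    votes = {}
    for e in observed:
        vector, concept = min(learned, key=lambda item: math.dist(e, item[0]))
        if math.dist(e, vector) <= max_distance:
            votes[concept] = votes.get(concept, 0) + 1
    return votes
```

A caller could then accept a concept once it has collected some minimum number of votes, which reflects the idea that a few matching windows suffice when the rest of the object is hidden.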

[0079] As described above, the image capturing means 1 of the third embodiment can distinguish the individual objects on the basis of the distance data when a plurality of objects are present in the same image 17, as in FIG. 14. For example, when the tape cutter 17b is specified as the target object, the bag 17a and the book 17c, which are at different distances from the tape cutter 17b, can be deleted together with the background; the same applies when the book 17c is the target object. Thus not only the bag 17a but also the tape cutter 17b and the book 17c can each be specified as a separate recognition target object. If the tape cutter 17b and the book 17c were at the same distance, the bag 17a would be deleted and both the tape cutter 17b and the book 17c would remain. Even in that case, each object can be recognized from the partial feature vectors and discrimination vectors in the windows corresponding to that object's portion of the image.

[0080] FIG. 16 shows a fourth embodiment. In the fourth embodiment, the same learning target is read in repeatedly to generate a plurality of discrimination vectors, and objects are recognized on the basis of these vectors. The overall structure of the system is again the same as in the first embodiment, so FIG. 1 is also used in the following description.

[0081] As explained for the first to third embodiments, when the recognition means 7 of FIG. 1 compares the discrimination vector generated from the recognition target image with the discrimination vectors generated in the learning process, it evaluates the closeness of two vectors by calculating the distance between their end points. The target is recognized as a learned concept when that distance is the smallest and is also within a set value. In practice, however, even the same object may not yield exactly the same discrimination vector, depending on the angle and brightness when the image is captured. In the fourth embodiment, therefore, images of the same object are captured several times during learning as well, and a plurality of discrimination vectors are generated.

[0082] In the learning process of the fourth embodiment, the learning process of the first embodiment described with reference to FIGS. 8 and 9 is repeated for the same object to generate a plurality of discrimination vectors for the same concept. For example, a plurality of discrimination vectors P1, P2, ..., Pn are generated for a first object P, and discrimination vectors Q1, Q2, ..., Qn are generated for a second object Q different from the first object. In FIG. 16, the end points of the discrimination vectors P1, P2, ..., Pn are indicated by □ marks, and the end points of the discrimination vectors Q1, Q2, ..., Qn are indicated by ○ marks. R1 is the discrimination vector of the object R to be recognized.

[0083] In the figure, the point po is the end point of the representative vector Po of the discrimination vectors P1, P2, ..., Pn for the object P, and the radius of the broken-line circle Pc centered on po is the set distance. Similarly, the point qo is the end point of the representative vector Qo of the discrimination vectors Q1, Q2, ..., Qn for the object Q, and the radius of the broken-line circle Qc centered on qo is the set distance. In the recognition process of the fourth embodiment, the discrimination vector of the recognition target object generated by the discrimination vector generation means 5 of FIG. 1 is compared with the representative vectors in the discriminant space; when the distance to the closest representative vector is within the set distance, the concept associated with that representative vector is taken as the concept of the recognition target object.

[0084] For example, in the recognition process for an object R, if the representative vector closest to the generated discrimination vector R1 is the representative vector Po and the discrimination vector R1 also lies within the circle Pc, the recognition means 7 recognizes the object R as P. If the same object is read in repeatedly and a plurality of discrimination vectors are generated, as in the fourth embodiment, the features of the learning target object can be captured more accurately. As a result, discrimination and recognition of objects become less susceptible to the varying conditions under which images are captured.
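The representative-vector comparison can be illustrated with the short sketch below. The specification does not say how a representative vector is obtained, so the component-wise mean used in `representative` is an assumption, as are the function names and the dictionary layout of `learned`.

```python
import math


def representative(vectors):
    """Component-wise mean of a concept's discrimination vectors.

    Assumption: the mean stands in for the representative vector, whose
    construction is not stated in the text.
    """
    count = len(vectors)
    return tuple(sum(v[i] for v in vectors) / count
                 for i in range(len(vectors[0])))


def recognize(query, learned, set_distance):
    """Return the concept of the nearest representative vector, or None.

    `learned` maps each concept to its list of discrimination vectors.
    The nearest representative must also lie within `set_distance`,
    mirroring the circles Pc and Qc around po and qo.
    """
    reps = {concept: representative(vs) for concept, vs in learned.items()}
    concept = min(reps, key=lambda name: math.dist(query, reps[name]))
    if math.dist(query, reps[concept]) <= set_distance:
        return concept
    return None
```

Returning `None` when no representative is close enough models the case where R1 falls outside every circle, so the object is not assigned any learned concept.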

[0085] When the features of the object P and the object Q are very similar, or when the features of both are ambiguous, the discrimination vectors P1, P2, ..., Pn and the discrimination vectors Q1, Q2, ..., Qn may be intermingled. In such a case, the object can be recognized by the following method.

[0086] First, the distances from the discrimination vector R1 of the recognition target object R to each of the discrimination vectors P1, P2, ..., Pn, Q1, Q2, ..., Qn are calculated. Then a preset number of discrimination vectors, for example k, are selected in order of increasing distance. The concept of the object R is determined by whether these k discrimination vectors include more discrimination vectors of the object P or of the object Q. If the k vectors include more discrimination vectors of the object P, it is judged that "the object R is P"; if they include more discrimination vectors of the object Q, it is judged that "the object R is Q."
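The procedure in this paragraph is what is commonly called a k-nearest-neighbor majority vote, and a minimal sketch is given below. The Euclidean metric, the name `knn_vote`, and the (vector, concept) pair layout are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_vote(query, labelled, k):
    """Majority concept among the k stored vectors nearest to `query`.

    `labelled` is a list of (vector, concept) pairs, for example the
    vectors P1..Pn tagged "P" and Q1..Qn tagged "Q".
    """
    nearest = sorted(labelled, key=lambda item: math.dist(query, item[0]))[:k]
    counts = Counter(concept for _, concept in nearest)
    return counts.most_common(1)[0][0]
```

With an even k the two counts can tie; this sketch then returns whichever tied concept appears first among the nearest vectors, which is one reason an odd k or a distance weighting is often preferred.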

[0087] Alternatively, the k discrimination vectors selected as described above may be weighted according to their distances in deciding whether the object is P or Q. For example, even if the k vectors include equal numbers of discrimination vectors for the object P and for the object Q, the object R is judged to be Q when more of the vectors closer to the discrimination vector R1 belong to the object Q. The judgment criterion is not limited to the methods above; various others are conceivable. Partial feature vectors can also be set for each of P1 ... Pn and Q1 ... Qn; in that case the same processing is performed, only with a larger amount of data.
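A distance-weighted variant of the vote could be sketched as follows. The text only says that the k vectors are weighted by distance, so the inverse-distance weight, the small constant `eps`, and the name `weighted_knn_vote` are assumptions chosen for the example.

```python
import math


def weighted_knn_vote(query, labelled, k, eps=1e-9):
    """Concept with the largest inverse-distance score among k neighbors.

    `labelled` is a list of (vector, concept) pairs. Each of the k nearest
    vectors adds 1 / (distance + eps) to its concept's score, so closer
    vectors count for more; `eps` avoids division by zero on exact hits.
    """
    nearest = sorted(labelled, key=lambda item: math.dist(query, item[0]))[:k]
    scores = {}
    for vector, concept in nearest:
        weight = 1.0 / (math.dist(query, vector) + eps)
        scores[concept] = scores.get(concept, 0.0) + weight
    return max(scores, key=scores.get)
```

When the k neighbors contain equal numbers of P and Q vectors, the concept whose vectors lie closer to the query wins, which matches the example given in the paragraph.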

【００８８】[0088]

[Effects of the Invention] According to the first to third inventions, a recognition target object can be recognized more accurately than with the prior art, regardless of its distance when the image is captured. According to the second invention, representing the features of an image as a feature vector makes them more abstract, which increases the robustness of the object's representation. According to the third invention, a space better suited to discrimination is constructed, so the recognition rate can be improved.

[0089] According to the fourth to sixth inventions, the recognition target can be specified within the captured image on the basis of the distance data in the recognition process. For example, the target object can be distinguished from the background, and the background deleted so that only the recognition target remains. A plurality of objects in the same image can also be cut out one after another using the distance information as a cue, and each object recognized in turn.

[0090] According to the fifth invention, the learning target can also be specified automatically within the captured image in the learning process. According to the sixth invention, the recognition target object can be recognized well regardless of its distance when the image was captured. According to the seventh invention, in addition to the effects of the fourth to sixth inventions, representing the features of an image as a feature vector makes them more abstract. This improves the robustness of the object's representation and thus the recognition rate.

[0091] According to the eighth invention, robustness and adaptability to differences in the orientation of the recognition target object relative to the camera at image capture are increased, so that recognition is improved. According to the ninth invention, discrimination can be based on partial vectors, so an object can be recognized even when it is partially hidden. In addition, even when a plurality of recognition targets are present in one image, the objects can be recognized individually from their separate partial feature vectors.

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of the first embodiment.

FIG. 2 is a diagram illustrating normalization by a reference distance.

FIG. 3 is a diagram illustrating normalization by angle, in which (a) shows rotation about the x-axis and (b) shows rotation about the y-axis.

FIG. 4 is a diagram of a screen on which image analysis is performed in the first embodiment.

FIG. 5 is a diagram showing the mask patterns of the first embodiment.

FIG. 6 is a graph showing the element amounts of a feature vector in the first embodiment.

FIG. 7 is a diagram for explaining the principle of the discrimination vector.

FIG. 8 is a flowchart showing the initial learning procedure of the first embodiment.

FIG. 9 is a flowchart showing the additional learning procedure of the first embodiment.

FIG. 10 is a flowchart showing the recognition procedure of the first embodiment.

FIG. 11 is an example of a distance image.

FIG. 12 is a diagram of a screen on which image analysis is performed in the second embodiment.

FIG. 13 is a diagram of a screen in the learning process of the third embodiment.

FIG. 14 is a diagram of an image captured by the image capturing means in the recognition process of the third embodiment.

FIG. 15 is a diagram of a screen in the recognition process of the third embodiment.

FIG. 16 is a diagram showing the discrimination vectors of the fourth embodiment.

[Explanation of symbols]

1 Image capturing means
2 Normalizing means
3 Image analysis means
4 Feature vector generation means
5 Discrimination vector generation means
6 Storage means
7 Recognition means
S1, S2 Reference distances
L Discrimination axis

Continued from the front page. F-terms (reference): 5B057 AA02 AA05 BA02 CA13 CA16 CD03 DA11 DB03 DC08 DC34 DC36 DC40; 5L096 AA09 BA05 CA05 EA03 EA16 EA23 FA02 FA25 FA34 FA59 FA66 GA12 GA19 HA09 JA11 KA04

Claims

[Claims]

1. A visual recognition system comprising image capturing means, normalizing means for generating a normalized image in which an image is adjusted to one or more reference distances, storage means, and recognition means, wherein, in a learning process, the image capturing means captures a learning target together with distance data, the normalizing means generates a normalized image of the captured learning target, and the normalized image is stored in the storage means in association with a concept such as a name or meaning; and, in a recognition process, the image capturing means captures a recognition target together with distance data, the normalizing means generates a normalized image of the captured recognition target, and the recognition means recognizes the concept of the recognition target by comparing the normalized image of the recognition target with the normalized image of the learning target stored in the storage means.

2. The visual recognition system according to claim 1, further comprising image analysis means for locally expanding the features of an image into basis functions, and feature vector generation means for generating a feature vector by summing the coefficients of the local basis functions for each basis function, wherein, in the learning process, the feature vector generation means generates a feature vector from the normalized image of the learning target, and the concept of the learning target is stored in the storage means in association with the feature vector; and, in the recognition process, the feature vector generation means generates a feature vector from the normalized image of the recognition target, and the recognition means recognizes the concept of the recognition target by comparing the feature vector of the recognition target with the feature vectors stored in the storage means.

3. The visual recognition system according to claim 2, further comprising discrimination vector generation means for generating a discrimination vector by projecting the features of a plurality of feature vectors into a discriminant space in which discrimination is easier, wherein, in the learning process, the discrimination vector generation means generates the discrimination vector of the learning target, and the storage means stores the discrimination vector of the learning target in association with the concept of the learning target; and, in the recognition process, the discrimination vector generation means generates a discrimination vector from the feature vector of the recognition target, and the recognition means recognizes the concept of the recognition target by comparing the discrimination vector of the recognition target with the discrimination vectors stored in the storage means.

4. A visual recognition system comprising image capturing means, storage means, and recognition means, wherein, in a learning process, the image capturing means captures a learning target together with distance data, and the concept of the learning target is stored in the storage means in association with the image of the learning target object; and, in a recognition process, the image capturing means captures objects together with distance data and specifies a recognition target within the captured image on the basis of the distance data, and the recognition means recognizes the concept of the recognition target by comparing the image of the specified recognition target with the images stored in the storage means.

5. The visual recognition system according to claim 4, wherein, in the learning process, the image capturing means captures objects together with distance data and specifies a learning target within the captured image on the basis of the distance data, and the concept is stored in the storage means in association with the image of the specified learning target.

6. The visual recognition system according to claim 4, further comprising normalizing means, wherein, in the learning process, the normalizing means generates a normalized image by normalizing the image of the learning target to one or more reference distances, and the concept is stored in the storage means in association with the normalized image; and, in the recognition process, the normalizing means generates a normalized image by normalizing the image of the recognition target to one or more reference distances, and the recognition means recognizes the concept of the recognition target by comparing the normalized image of the recognition target with the normalized images stored in the storage means.

7. The visual recognition system according to any one of the preceding claims, further comprising image analysis means for locally expanding an image into basis functions, and feature vector generation means for generating a feature vector by summing the coefficients of the local basis functions for each basis function, wherein, in the learning process, the feature vector generation means generates a feature vector from the image of the learning target, and the concept of the learning target is stored in the storage means in association with the feature vector; and, in the recognition process, the feature vector generation means generates a feature vector from the image of the recognition target, and the recognition means recognizes the concept of the recognition target by comparing the feature vector of the recognition target with the feature vectors stored in the storage means.

8. The visual recognition system according to claim 1, wherein the normalizing means has a function of adjusting the orientation of the image of the recognition target to a reference angle by image processing.

9. The visual recognition system according to claim 2 or any one of claims 7 and 8, wherein the image analysis means analyzes the image by dividing it into a plurality of parts, and the feature vector generation means creates a plurality of partial feature vectors.