JPH11161788A

JPH11161788A - Recognition model generation method and image recognition method

Info

Publication number: JPH11161788A
Application number: JP32495797A
Authority: JP
Inventors: Etsuro Fujita; 悦郎藤田; Shinji Abe; 伸治安部; Kenji Ogura; 健司小倉
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-11-26
Filing date: 1997-11-26
Publication date: 1999-06-18
Anticipated expiration: 2017-11-26
Also published as: JP3530363B2

Abstract

(57) [Summary]

[Problem] To reduce the work involved in generating a recognition model by eliminating the need for a human to designate the region of an object.

[Solution] A plurality of images that all show the object for which a recognition model is to be generated are input, and each input image is decomposed into partial images. A feature vector is extracted from each partial image, and the set of feature vectors is classified into a group of clusters with a hierarchical structure. From this group, the cluster composed of the feature vectors of partial images contained in the object region is obtained. The hypersphere in feature vector space that is centered on the centroid of the obtained object cluster and contains all feature vectors of that cluster is taken as the region characterizing the object, and a recognition model of the object is generated by associating a linguistic expression with this hypersphere.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method for generating a recognition model for recognizing an object in an image, and to an image recognition method based on the generated recognition model.

[0002]

[Prior Art] Many methods have been proposed in recent years for generating recognition models used to recognize objects in images. One of them generates the recognition model of an object directly from an image showing that object. In this conventional method, a person manually designates where in the image the object appears; the system learns image features relating to the object's color or texture and generates the recognition model by associating the learned image features with a linguistic expression that characterizes the object (Reference 1: R. W. Picard and T. P. Minka, "Vision texture for annotation", J. Multimedia Systems 3, pp. 3-14 (1995)). For example, to generate a recognition model of the object "sky" from an image showing the sky, the region of "sky" in the image is designated spatially by hand, the image features of "sky" are learned, and the recognition model is generated by associating the learned image features with the linguistic expression "sky".

[0003] Reference 1 proposes decomposing a landscape image into partial mesh images and then having a person designate several of the partial mesh images contained in the region of the target object. The image features of the object are learned from the designated partial mesh images, and the recognition model is generated by associating the learned features with a linguistic expression that characterizes the object. For example, to generate a recognition model of "sky" from a landscape image showing the sky, the landscape image is first decomposed into partial mesh images, several partial mesh images inside the "sky" region are designated by hand, the image features of "sky" are learned, and the learned features are associated with the linguistic expression "sky".

[0004]

[Problem to Be Solved by the Invention] In the conventional method, which generates a recognition model by having a person designate where the object appears in each image, the work of designating object regions grows as the number of object types to be modeled increases.

[0005] The problem to be solved by the present invention is to propose a method that, without any manual designation of where the object appears, automatically finds the object region in a plurality of images showing the target object, learns the image features of the object, and generates a recognition model of the object.

[0006] A further object of the present invention is to propose a method that uses the generated recognition model to recognize whether the target object appears in an unknown image.

[0007]

[Means for Solving the Problem] The recognition model generation method according to the present invention comprises the steps of: inputting a plurality of images that all show the object for which a recognition model is to be generated, together with a linguistic expression characterizing the object; dividing each input image, for example with an equally spaced mesh, into partial mesh images; extracting from each partial mesh image a feature vector relating to color, texture, or the like; classifying the set of extracted feature vectors into a group of clusters with a hierarchical structure; obtaining, from that group, the cluster composed of the feature vectors of partial mesh images contained in the object region; generating, as the region characterizing the image features of the object, the hypersphere in feature vector space that is centered on the centroid of the obtained object cluster and contains all feature vectors of that cluster; and generating the recognition model of the object by associating the input linguistic expression with this hypersphere.

[0008] The image recognition method according to the present invention comprises the steps of: dividing an input unknown image, for example with an equally spaced mesh, into partial mesh images; extracting a feature vector from each partial mesh image of the unknown image; and determining whether the object appears in the unknown image by matching each extracted feature vector against the recognition model of the object generated by the above recognition model generation method.

[0009] The plurality of images showing the target object are chosen so that the following two conditions hold.

(Condition 1) The region of the target object has similar image features, in terms of color or texture, across all of the images used for recognition model generation.

(Condition 2) The background region outside the target object, or any part of it, does not have similar image features, in terms of color or texture, across all of the images used for recognition model generation.

[0010] Because of Condition 1, the feature vectors extracted from partial mesh images inside the object region lie close together in feature vector space and form a single cluster. If this cluster can be picked out of the group of clusters produced by the hierarchical clustering step, the image features of the object can be learned, and a recognition model generated, without anyone designating the object region by hand.

[0011] Given Conditions 1 and 2, the desired cluster can be characterized as the cluster that satisfies the following two features.

(Feature 1) The cluster contains a feature vector extracted from at least one partial mesh image of every image used for recognition model generation.

(Feature 2) Among the clusters satisfying Feature 1, the cluster is the lowest in the hierarchy. That is, when the hierarchy is drawn as a dendrogram, it is the cluster nearest the leaves among those satisfying Feature 1.

[0012] The desired cluster can therefore be obtained by searching the group of clusters produced in the hierarchical clustering step for the cluster that satisfies both features. In the characterizing-region generation step, the cluster obtained in this way is generalized to a larger region, a hypersphere containing the cluster, and the recognition model of the object is generated by associating the input linguistic expression with this hypersphere. The generated recognition model is kept as a recognition dictionary and used to recognize the object in unknown images.

[0013]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows the flow of processing that implements the recognition model generation method of the present invention. The processing is carried out on an ordinary computer system comprising a keyboard, an image scanner, a display, a CPU, a memory device, and so on; since this configuration is well known, it is not illustrated.

[0014] First, a plurality of images that all show the target object are input in advance, using an image scanner or the like, and stored in memory (step 101). A linguistic expression characterizing the target object is input using a keyboard or the like and set in a register or the like (step 102). The images are chosen so that Conditions 1 and 2 above are satisfied.

[0015] The following processing is then executed on the CPU to generate the recognition model of the object automatically. First, each input image is divided with an equally spaced mesh into partial mesh images (step 111). Next, for each image, a feature vector relating to color, texture, or the like is extracted from each partial mesh image (step 112), and the set of extracted feature vectors is classified into a group of clusters with a hierarchical structure (step 113). For each cluster in the group, it is checked whether the set of partial mesh images from which the cluster's feature vectors were extracted includes at least one partial mesh image from every image used for recognition model generation; if so, the cluster is registered as a candidate for the cluster characterizing the image features of the object (step 114). Next, the candidate with the fewest feature vectors is selected, and for each feature vector in this smallest cluster it is checked whether the source partial mesh image lies inside the region of the target object. If all of the source partial mesh images lie inside the object region, the cluster is judged to be the cluster characterizing the image features of the object and is acquired (step 115). Next, the hypersphere in feature vector space that is centered on the centroid of the acquired object cluster and contains all of its feature vectors is generated as the region characterizing the image features of the object (step 116). Finally, the recognition model is generated by associating the input linguistic expression with this hypersphere (step 117). The generated recognition model is saved in dictionary memory as a recognition dictionary.

[0016] A concrete example of recognition model generation according to the present invention is now described with reference to FIGS. 2, 3, and 4.

[0017] First, images 1, ..., N, all of which show the target object, are input. Each image must satisfy the following two conditions.

(Condition 1) The region of the target object has similar image features, in terms of color or texture, across all of the images used for recognition model generation.

(Condition 2) The background region outside the target object, or any part of it, does not have similar image features, in terms of color or texture, across all of the images used for recognition model generation.

[0018] Suppose, for example, that a recognition model is to be generated for the object "clear blue sky", and that images 1, ..., N, all showing "clear blue sky", are selected and input so that Conditions 1 and 2 are satisfied. By Condition 1, each of images 1, ..., N contains a rectangular region that lies inside the "clear blue sky" region and has image features similar to those of the corresponding regions in the other images. In FIG. 2 the region of the target object is shown hatched; in this example a(1,1), a(2,1), a(N,K), and so on are rectangular regions with mutually similar image features. The rectangular regions can be given a common size by shrinking them all to the smallest one. By Condition 2, on the other hand, rectangular regions with this property are confined to the "clear blue sky" region of their respective images. Therefore, if rectangular regions of equal size with mutually similar image features can be found automatically in the input images, learning the image features of those regions amounts to learning the image features of "clear blue sky" without any manual designation of regions. At this point, however, the system does not know where in each input image such a rectangular region lies, so each input image is simply divided with an equally spaced mesh of some suitable size. That is, in the image division process (step 111), input images 1, ..., N are decomposed into partial mesh images a(1,1), ..., a(N,K) of suitable size, as in FIG. 2, which shows division into a 4 × 4 mesh. If the series of processes described below fails to learn the image features of "clear blue sky", the mesh size is automatically changed to a smaller one and the processing is restarted from the beginning.
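The image division of step 111 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_into_mesh` is hypothetical, the image is a plain list of pixel rows, and pixels left over when the image size is not a multiple of the mesh are simply dropped, which the source does not specify.

```python
def split_into_mesh(image, rows, cols):
    """Split a 2-D image (list of pixel rows) into rows x cols equal patches.

    Leftover pixels at the right and bottom edges are dropped (an assumption;
    the source does not say how remainders are handled).
    """
    height, width = len(image), len(image[0])
    ph, pw = height // rows, width // cols
    if ph == 0 or pw == 0:
        raise ValueError("mesh is finer than the image")
    patches = []
    for r in range(rows):
        for c in range(cols):
            band = image[r * ph:(r + 1) * ph]
            patches.append([row[c * pw:(c + 1) * pw] for row in band])
    return patches


# 8 x 8 toy image whose pixel value encodes its position, split 4 x 4 as in FIG. 2
image = [[y * 8 + x for x in range(8)] for y in range(8)]
patches = split_into_mesh(image, 4, 4)
```

Restarting with a finer mesh, as the paragraph describes, would then amount to calling the same function again with larger `rows` and `cols`.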

[0019] Next, in the feature vector extraction process (step 112), feature vectors v(1,1), ..., v(N,K) relating to color or texture are extracted from the partial mesh images. This is also shown in FIG. 2, where the feature vector space is drawn as two-dimensional. Then, in the hierarchical clustering process (step 113), the set of feature vectors {v(1,1), ..., v(N,K)} is classified into a group of clusters with a hierarchical structure, as shown in FIG. 3. FIG. 3 shows the clusters formed after three rounds of merging; the number attached to each cluster indicates the round in which it was formed.
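The hierarchical clustering of step 113 can be illustrated with a small agglomerative procedure. The sketch below is not the method of the cited references: it assumes centroid distance as the merge criterion (the source does not name a linkage), and `hierarchical_clusters` is a hypothetical helper. It records every cluster formed together with the merge round, mirroring the numbering in FIG. 3.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def hierarchical_clusters(vectors):
    """Agglomerative clustering using centroid distance (an assumed linkage).

    Returns every cluster formed as (merge_round, sorted member indices),
    ending with the single cluster that holds all vectors.
    """
    active = [[i] for i in range(len(vectors))]
    history = []
    round_no = 0
    while len(active) > 1:
        best = None
        for a in range(len(active)):
            for b in range(a + 1, len(active)):
                ca = centroid([vectors[i] for i in active[a]])
                cb = centroid([vectors[i] for i in active[b]])
                d = math.dist(ca, cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = sorted(active[a] + active[b])
        active = [c for k, c in enumerate(active) if k not in (a, b)] + [merged]
        round_no += 1
        history.append((round_no, merged))
    return history


vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (9.0, 0.0)]
history = hierarchical_clusters(vectors)
```

The last entry of `history` is the cluster containing every vector, which corresponds to the all-inclusive cluster that the later selection step is designed to avoid.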

[0020] Feature vector extraction is described in detail in, for example, Y-I. Ohta, T. Kanade and T. Sakai, "Color information for region segmentation", Comp. and Img. Proc., 13: 222-241 (1980) (Reference 2) and J. Mao and A. K. Jain, "Texture classification and segmentation using multiresolution simultaneous autoregressive models", Patt. Rec., 25(2): 173-188 (1992) (Reference 3). Hierarchical clustering is described in detail in, for example, Mikio Takagi and Haruhisa Shimoda, "Image Analysis Handbook", University of Tokyo Press (1991) (Reference 4).

[0021] Next, in the candidate cluster registration process (step 114), each cluster obtained is checked, as shown in FIG. 4, to see whether the set of partial mesh images from which its feature vectors were extracted includes at least one partial mesh image from every input image. If it does, the cluster is registered as a candidate for the cluster characterizing the image features of the object. That is, if the cluster C under consideration consists of the feature vectors

C = {v(i1, j1), v(i2, j2), ..., v(ip, jp)},

then the set Ĉ of partial mesh images from which these p feature vectors were extracted is

Ĉ = {a(i1, j1), a(i2, j2), ..., a(ip, jp)}.

If Ĉ includes at least one partial mesh image from each of the N input images, cluster C is registered as a candidate. FIG. 4 illustrates a cluster C that contains feature vectors extracted from at least one partial mesh image of every one of the N images.
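The registration test of step 114 reduces to a set comparison. In the sketch below, a cluster is represented by the (image index, mesh index) pairs of its source partial mesh images, with images numbered 1 to N as in the text; this representation and the function names are assumptions made for illustration.

```python
def covers_all_images(cluster, num_images):
    """True if the cluster has at least one source patch from every image 1..N."""
    images_seen = {image_index for image_index, _ in cluster}
    return set(range(1, num_images + 1)) <= images_seen


def register_candidates(clusters, num_images):
    """Keep only the clusters that draw on every input image (step 114)."""
    return [c for c in clusters if covers_all_images(c, num_images)]


# Three toy clusters over N = 3 images, as (image index, mesh index) pairs
clusters = [
    [(1, 1), (2, 1)],                          # image 3 missing, so rejected
    [(1, 1), (2, 1), (3, 4)],                  # covers images 1, 2 and 3
    [(1, 1), (2, 1), (3, 4), (1, 7), (2, 9)],  # also covers all three
]
candidates = register_candidates(clusters, 3)
```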

[0022] Next, in the object cluster acquisition process (step 115), the cluster C_min with the fewest feature vectors is selected from the registered candidates, and it is checked whether the partial mesh images from which the feature vectors of C_min were extracted lie inside the region of the target object. If all of them do, C_min is judged to be the cluster characterizing the image features of the object and is acquired. That is, if C_min is written

C_min = {v(i1_min, j1_min), v(i2_min, j2_min), ..., v(iq_min, jq_min)},

then the set Ĉ_min of source partial mesh images is

Ĉ_min = {a(i1_min, j1_min), a(i2_min, j2_min), ..., a(iq_min, jq_min)}.

Whether each partial mesh image a(ik_min, jk_min) lies inside the "clear blue sky" region is checked by outlining and highlighting it in its original image, as in FIG. 4. If every a(ik_min, jk_min) lies inside the "clear blue sky" region, C_min is judged to be the cluster characterizing the image features of the object and is acquired.
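A minimal sketch of step 115 follows. The source describes the inclusion check as a visual inspection of highlighted partial mesh images; here the outcome of that inspection is modelled as a set `object_patches` of confirmed (image index, mesh index) pairs, which is purely a stand-in. The function names are likewise hypothetical.

```python
def select_smallest_candidate(candidates):
    """Return the candidate cluster with the fewest members (C_min)."""
    return min(candidates, key=len)


def confirmed_as_object(cluster, object_patches):
    """True if every source patch of the cluster lies in the object region."""
    return all(patch in object_patches for patch in cluster)


candidates = [
    [(1, 1), (2, 1), (3, 4)],
    [(1, 1), (2, 1), (3, 4), (1, 7), (2, 9)],
]
# Hypothetical result of the visual check: patches judged to lie in the object
object_patches = {(1, 1), (1, 2), (2, 1), (3, 4)}

c_min = select_smallest_candidate(candidates)
accepted = confirmed_as_object(c_min, object_patches)
```

When `accepted` is false, the text calls for repeating the processing from feature extraction with a different feature rather than falling back to a larger candidate.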

[0023] The hierarchical clustering process (step 113) grows clusters by successively merging similar feature vectors, so all feature vectors extracted from the partial mesh images are eventually merged into a single cluster C_max. The candidates registered in the candidate cluster registration process (step 114) therefore include C_max. However, the input images generally contain regions other than "clear blue sky", so C_max also contains feature vectors extracted from partial mesh images in those other regions. For example, if input image 1 also shows the object "asphalt road", the feature vectors extracted from partial mesh images inside the "asphalt road" region are included in C_max as well. The candidate with the fewest feature vectors is selected precisely to exclude clusters that, like C_max, have grown too far and may contain feature vectors unrelated to the object.

[0024] If even one of the feature vectors in C_min comes from a partial mesh image that is not inside the region of the target object, the series of processes is repeated from the feature vector extraction process (step 112) using a feature different from the one used before.

[0025] Next, in the characterizing-region generation process (step 116), the hypersphere in feature vector space whose center is the centroid of the object cluster and whose radius is the maximum distance between the centroid and the feature vectors of the cluster is generated as the region characterizing the image features of "clear blue sky". The linguistic expression "clear blue sky" is input by hand (step 102). Finally, in the recognition model generation process (step 117), the recognition model of "clear blue sky" is generated by associating the input linguistic expression "clear blue sky" with the hypersphere that characterizes the object.
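Steps 116 and 117 amount to computing a centroid, taking the largest centroid-to-member distance as the radius, and attaching the label. The sketch below assumes Euclidean distance, which the source does not state explicitly, and stores the model as a plain dictionary; the function names and the dictionary layout are illustrative choices.

```python
import math


def build_hypersphere(cluster_vectors):
    """Return (center, radius): the centroid and the maximum distance to any member."""
    dim = len(cluster_vectors[0])
    n = len(cluster_vectors)
    center = tuple(sum(v[d] for v in cluster_vectors) / n for d in range(dim))
    radius = max(math.dist(center, v) for v in cluster_vectors)
    return center, radius


def make_recognition_model(label, cluster_vectors):
    """Associate a linguistic expression with the cluster's hypersphere (step 117)."""
    center, radius = build_hypersphere(cluster_vectors)
    return {"label": label, "center": center, "radius": radius}


model = make_recognition_model(
    "clear blue sky", [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
)
```

Because the radius is the maximum member distance, every feature vector of the object cluster lies inside or on the hypersphere, as the text requires.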

[0026] FIG. 5 shows the flow of processing that implements the image recognition method of the present invention, based on a recognition model generated as described above. This processing, too, is in practice carried out on a computer system.

[0027] First, an unknown image is input from a camera, an image recording medium, or the like (step 201). The unknown image is divided with an equally spaced mesh into a plurality of partial mesh images (step 211), and a feature vector is extracted from each partial mesh image (step 212). These steps are essentially the same as steps 111 and 112 in FIG. 1. Whether the object appears in the unknown image is then determined by matching the extracted feature vectors against the recognition model that was generated and saved by the recognition model generation method described above (step 213).

[0028] A concrete example of the image recognition method based on the recognition model of the present invention is now described with reference to FIG. 6. Here the task is to determine whether the object "clear blue sky" appears in an unknown image X.

【００２９】[0029] The unknown image X is input. First, in the image division process (step 211), the unknown image X is decomposed into partial mesh images b(1), ..., b(K). Next, in the feature vector extraction process (step 212), feature vectors w(1), ..., w(K) are extracted from the partial mesh images b(j) (1 ≤ j ≤ K) of the unknown image X, using the same features that were used to generate the recognition model of the object "blue clear sky". In other words, the features extracted from each partial mesh image b(j) must be identical to those used to generate the recognition model of the object being tested. Finally, in the recognition model matching process (step 213), whether the object "blue clear sky" appears in the unknown image X is determined by a matching calculation against the recognition model of that object. Specifically, as shown in FIG. 6, the extracted feature vectors w(1), ..., w(K) are checked to see whether at least one of them is contained in the hypersphere of the feature vector space that characterizes the image features of the object. If even one feature vector is contained in the hypersphere, the unknown image X is judged to contain the object "blue clear sky", and the linguistic expression "blue clear sky" is assigned to the unknown image X.
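The matching calculation of step 213 can be sketched as a point-in-hypersphere test. The sketch below assumes Euclidean distance and a recognition model stored as a dictionary with `center`, `radius`, and `label` keys; that layout and the names `in_hypersphere` and `recognize` are illustrative choices, not something the text specifies.

```python
import math


def in_hypersphere(w, center, radius):
    """True if feature vector w lies inside or on the hypersphere.

    Euclidean distance is an assumption. The boundary is treated as inside
    because the hypersphere is defined to contain every feature vector of
    the object cluster, including the one farthest from the centroid.
    """
    return math.dist(w, center) <= radius


def recognize(feature_vectors, model):
    """Return the model's linguistic expression if at least one feature
    vector of the unknown image falls in the hypersphere, otherwise None."""
    if any(in_hypersphere(w, model["center"], model["radius"])
           for w in feature_vectors):
        return model["label"]
    return None
```

A single matching partial mesh image is enough for a positive result, so the object does not need to fill the whole image. The trade-off is that one stray feature vector inside the hypersphere also triggers a match.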

【００３０】[0030]

[Effects of the Invention]

As described above, according to the present invention, supplying a plurality of images that all contain the object for which a recognition model is to be generated allows the processing system to find the object region automatically, learn the image features of the object, and generate a recognition model of the object, without anyone manually specifying where the object region lies in each image. This reduces the work involved in generating a recognition model compared with conventional methods in which a human specifies the object region. In addition, the target object can be easily recognized in an unknown image on the basis of the recognition model generated in this way.

[Brief description of the drawings]

FIG. 1 is a diagram showing the flow of processing in the recognition model generation method of the present invention.

FIG. 2 is a diagram showing an example in which an image for recognition model generation is divided by an equally spaced mesh, and an example in which a feature vector is extracted from each partial mesh image.

FIG. 3 is a diagram showing the process of classifying the set of extracted feature vectors into a group of clusters having a hierarchical structure.

FIG. 4 is a diagram showing the process of checking which partial mesh image each feature vector of a cluster was extracted from.

FIG. 5 is a diagram showing the flow of processing in the image recognition method of the present invention.

FIG. 6 is a diagram specifically illustrating the flow of processing for determining whether the object appears in an unknown image X.

[Explanation of symbols]

101 Image input for recognition model generation
102 Linguistic expression input
111 Image division processing for the recognition model generation images
112 Feature vector extraction processing for the recognition model generation images
113 Hierarchical clustering processing
114 Candidate cluster registration processing
115 Object cluster acquisition processing
116 Characterizing region generation processing
117 Recognition model generation processing
201 Unknown image input
211 Image division processing for the unknown image
212 Feature vector extraction processing for the unknown image
213 Recognition model matching processing

Claims

[Claims]

1. A recognition model generation method that takes as input a plurality of images in which an object for recognition model generation commonly appears and a linguistic expression characterizing the object, and generates a recognition model, the method comprising: decomposing each of the input images into partial images; extracting a feature vector from each of the decomposed partial images; classifying the set of extracted feature vectors into a group of clusters having a hierarchical structure; acquiring, from the group of clusters, an object cluster composed of the feature vectors of partial images included in the object region; generating, as a region characterizing the object, a hypersphere in the feature vector space that is centered on the center of gravity of the acquired object cluster and contains all the feature vectors of the object cluster; and generating a recognition model of the object by associating the input linguistic expression with the generated hypersphere in the feature vector space.

2. An image recognition method for recognizing an object in an input unknown image on the basis of a recognition model generated by the recognition model generation method according to claim 1, the method comprising: decomposing the input unknown image into partial images; extracting a feature vector from each partial image of the unknown image; and determining whether the target object appears in the unknown image by performing a matching calculation between each of the extracted feature vectors and the generated recognition model of the object.