JP2007317133A

JP2007317133A - Image classification method, device, and program

Info

Publication number: JP2007317133A
Application number: JP2006148896A
Authority: JP
Inventors: Yongqing Sun; 泳青孫; Satoshi Shimada; 聡嶌田; Masashi Morimoto; 正志森本
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-05-29
Filing date: 2006-05-29
Publication date: 2007-12-06
Anticipated expiration: 2026-05-29
Also published as: JP4703487B2

Abstract

PROBLEM TO BE SOLVED: To precisely classify images on the Internet according to the concepts they correspond to.

SOLUTION: When a keyword serving as a query is given, images corresponding to each of the keyword's multiple concepts are collected by matching the explanatory text around the images. In a feature space composed of feature quantities extracted from the images, a discriminant function for identifying the images corresponding to each concept is obtained by extracting appropriate learning data from the image group collected for that concept, and the images matching each concept are classified based on that discriminant function.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an image classification method, apparatus, and program, and in particular to an image classification method, apparatus, and program that, for multimedia information expressed as text and images, classify the images returned by a keyword search according to their image content.

With advances in computer and Internet technology, it has become easy to generate and store image data all over the world. At the same time, techniques for managing enormous amounts of image data efficiently are in demand. Classification of Internet images is attracting attention as one of these core technologies. Conventional image classification methods fall into the following three approaches.

(1) Classification by text matching:
Keywords are manually assigned to images in advance, and images are classified by judging their similarity through keyword matching.

(2) Classification by image matching:
Feature quantities such as color, texture, and shape are extracted from the images, and the images are classified by matching on these feature quantities (see, for example, Non-Patent Document 1).

(3) Classification by integrated processing of text and images:
When Internet images are searched by text, methods have been studied that group images with similar content when presenting the images related to the query word as search results (see, for example, Non-Patent Document 2). Such a method is realized in two steps. Because web pages containing Internet images usually consist of an image and explanatory text describing it, images related to the query word are first collected by matching the query word against the explanatory text. The collected images are then classified using feature quantities related to image color and texture. This method differs from technique (2) above in that the classification target is a set of related images collected using the explanatory text, so the set to be classified contains many images with similar content.
Non-Patent Document 1: Yixin Chen, James Z. Wang, Robert Krovetz, "Content-based image retrieval by clustering", Multimedia Information Retrieval 2003, pp. 193-200
Non-Patent Document 2: Deng Cai, Xiaofei He, Zhiwei Li, Wei-Ying Ma and Ji-Rong Wen, "Hierarchical Clustering of WWW Image Search Results Using Visual, Textual and Link Analysis", 12th ACM International Conference on Multimedia, New York City, USA, Oct. 2004

However, the text matching of prior art (1) requires that text (an index) representing the meaning and content of each image be assigned manually in advance, and doing this by hand for a large number of images is costly and time-consuming.

The image matching of prior art (2) classifies images using only physical feature quantities extracted from them, such as color, texture, and shape. Because the correspondence between these physical features and the semantic-level classification criteria that express image content cannot be clearly defined, semantic classification of images is difficult.

The classification of prior art (3), which integrates text and images, first acquires images by text matching and then classifies them by image matching. Since the image matching has the same problem as prior art (2), the initial text matching needs to collect images that are as easy to classify as possible. However, image collection by simple text matching gathers images for all of the diverse concepts contained in the query word, so classification based on image content is as difficult as in prior art (2), and classification accuracy is low.

The present invention has been made in view of the above points, and its object is to provide an image classification method, apparatus, and program capable of accurately classifying images corresponding to a concept.

FIG. 1 is a diagram for explaining the principle of the present invention.

The present invention (claim 1) is an image classification method that, for multimedia information expressed as text and images, classifies the images returned by a keyword search according to their image content, the method comprising:
a concept/thesaurus acquisition step (step 1) in which concept/thesaurus acquisition means searches, based on an input query keyword, a word dictionary storing the multiple concepts of each word and a thesaurus for each concept, and acquires N concepts and the thesaurus of each concept;
an image collection step (step 2) in which image collection means performs a web search under the AND condition of the thesaurus of a concept and the query keyword, matches the explanatory text around images, collects images related to the query keyword and the thesaurus, and stores them in storage means;
an image filtering step (step 3) in which image filtering means, for each concept n (n = 1, 2, ..., N), extracts from the collected images in the storage means the M images with the highest similarity in the AND search of the query keyword and the thesaurus of the concept as positive case candidates, and the remaining images as unlabeled images;
a positive case generation step (step 4) in which positive case generation means extracts image feature quantities from the positive case candidates and takes as positive cases the images whose distance from the center of the distribution in the image feature space is equal to or less than a preset threshold;
a negative case generation step (step 5) in which negative case generation means selects, from the image feature quantities extracted from the positive case images, a feature quantity whose variation is smaller than a preset threshold, and obtains negative cases from the distribution of the unlabeled images with respect to that feature quantity;
a discriminant function calculation step (step 6) in which discriminant function calculation means obtains a discriminant function for distinguishing positive cases from negative cases, using the positive cases obtained in the positive case generation step and the negative cases obtained in the negative case generation step as learning data; and
an identification step (step 7) in which identification means obtains the images corresponding to concept n from the positive case candidates and the unlabeled images using the discriminant function,
wherein the processing from the image collection step onward is repeated for all concepts n (step 8).

FIG. 2 is a diagram showing the principle configuration of the present invention.

The present invention (claim 2) is an image classification apparatus that, for multimedia information expressed as text and images, classifies the images returned by a keyword search according to their image content, the apparatus comprising:
a word dictionary 107 storing the multiple concepts of each word and a thesaurus for each concept;
concept/thesaurus acquisition means 100 that searches the word dictionary 107 based on an input query keyword and acquires N concepts and the thesaurus of each concept;
image collection means 101 that performs a web search under the AND condition of the thesaurus of a concept and the query keyword, matches the explanatory text around images, collects images related to the query keyword and the thesaurus, and stores them in storage means;
image filtering means 102 that, for each concept n (n = 1, 2, ..., N), takes from the collected images in the storage means the M images with the highest similarity in the AND search of the query keyword and the thesaurus of the concept as positive case candidates, and the remaining images as unlabeled images;
positive case generation means 103 that extracts image feature quantities from the positive case candidates and takes as positive cases the images whose distance from the center of the distribution in the image feature space is equal to or less than a preset threshold;
negative case generation means 104 that selects, from the image feature quantities extracted from the positive case images, a feature quantity whose variation is smaller than a preset threshold, and obtains negative cases from the distribution of the unlabeled images with respect to that feature quantity;
discriminant function calculation means 105 that obtains a discriminant function for distinguishing positive cases from negative cases, using the positive cases obtained by the positive case generation means 103 and the negative cases obtained by the negative case generation means 104 as learning data;
identification means 106 that obtains the images corresponding to concept n from the positive case candidates and the unlabeled images using the discriminant function; and
means for repeating, for all concepts n, the processing of the image collection means 101, the image filtering means 102, the positive case generation means 103, the negative case generation means 104, the discriminant function calculation means 105, and the identification means 106.

The present invention (claim 3) is an image classification program that causes a computer to execute each means of the image classification apparatus according to claim 2.

The present invention exploits two characteristics of Internet images:
・they are a wide variety of images taken by various people for various purposes; and
・the meaning of an image is expressed by its surrounding text information and its image information.
When a keyword serving as a query is given, images corresponding to each of the multiple concepts of that keyword are first collected by matching the explanatory text around the images. Then, in a feature space composed of feature quantities extracted from the images, a discriminant function for identifying the images corresponding to each concept is obtained by extracting appropriate learning data from the image group collected for that concept, and the images corresponding to each concept are classified based on that discriminant function. As a result, images corresponding to a concept can be classified accurately.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 3 shows a system configuration according to an embodiment of the present invention.

In the system shown in the figure, a client terminal 1, a web image search engine 2, and websites 3 are connected to the Internet 4.

The client terminal 1 accepts a keyword that serves as the query expressing the user's search request, and presents the images retrieved based on the query word.

The web image search engine 2 matches the query word against the explanatory text of the websites 3 and searches for web images related to the query word.

The websites 3 are the many websites published on the Internet, and the information they publish consists of images and their surrounding explanatory text.

Assuming the system shown in FIG. 3, the client terminal 1 is described below as the image classification apparatus.

FIG. 4 shows the configuration of the image classification apparatus according to an embodiment of the present invention.

The image classification apparatus shown in the figure comprises a concept/thesaurus acquisition unit 100, an image collection unit 101, an image filtering unit 102, a positive case generation unit 103, a negative case generation unit 104, a discriminant function calculation unit 105, an identification processing unit 106, and a word dictionary storage unit 107.

Based on the query keyword input by the user, the concept/thesaurus acquisition unit 100 reads each concept of the query keyword and the thesaurus of each concept from the word dictionary stored in the word dictionary storage unit 107, and outputs the query word together with the concepts and thesauri it has read to the image collection unit 101.

The word dictionary storage unit 107 holds an electronic dictionary that compiles the multiple concepts of each word and the thesaurus of each concept. When it receives a word from the concept/thesaurus acquisition unit 100, it outputs the concepts and thesauri of that word to the concept/thesaurus acquisition unit 100. FIG. 5 shows an example of the word dictionary.
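As a rough illustration of the word dictionary described above, the following Python sketch maps a word to its concepts, each carrying its own thesaurus. The class name `Concept`, the function `lookup`, and the sample entry for "jaguar" are hypothetical and are not taken from FIG. 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    label: str        # short name of one sense of the word
    thesaurus: tuple  # related words for that sense


# Hypothetical dictionary contents, for illustration only.
WORD_DICTIONARY = {
    "jaguar": (
        Concept("animal", ("cat", "wildlife")),
        Concept("car", ("automobile", "vehicle")),
    ),
}


def lookup(word):
    """Return the concepts (with thesauri) stored for `word`, or an empty tuple."""
    return WORD_DICTIONARY.get(word.lower(), ())


concepts = lookup("Jaguar")
for concept in concepts:
    print(concept.label, concept.thesaurus)
```

Here N is simply `len(concepts)`; a word missing from the dictionary yields no concepts.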

When the image collection unit 101 receives the query keyword and the thesaurus for each concept from the concept/thesaurus acquisition unit 100, it uses the query keyword and the thesaurus for each concept as a search condition (AND condition), has a web search engine match that condition against the explanatory text around web images, collects the images related to the query word and the thesaurus, and stores them in storage means (not shown) such as a memory. It outputs the collected images, together with the text similarity obtained when the web image search engine 2 matched the explanatory text, to the image filtering unit 102.

When the image filtering unit 102 receives the images and text similarities from the image collection unit 101, it divides the images into two groups: the M images with the highest text similarity become positive case candidate images, and the remaining images become unlabeled images. It outputs the positive case candidate images to the positive case generation unit 103 and the identification processing unit 106, and the unlabeled images to the negative case generation unit 104 and the identification processing unit 106.

When the positive case generation unit 103 receives the positive case candidate images from the image filtering unit 102, it extracts the representative feature quantity that characterizes the positive case images and uses it to detect the positive case images. It outputs the detected positive case images to the discriminant function calculation unit 105 and the extracted positive case feature quantity (representative feature quantity) to the negative case generation unit 104. Details of the extraction of positive case images are described later with reference to the flowchart of FIG. 7.

When the negative case generation unit 104 receives the unlabeled images from the image filtering unit 102 and the representative feature quantity from the positive case generation unit 103, it uses the representative feature quantity to extract negative case images from the unlabeled images and outputs the extracted negative case images to the discriminant function calculation unit 105. Details are described later with reference to the processing flow of FIG. 8.

The discriminant function calculation unit 105 uses the positive case images received from the positive case generation unit 103 and the negative case images received from the negative case generation unit 104 as learning data, obtains a discriminant function for distinguishing negative cases from positive cases, and outputs it to the identification processing unit 106. The discriminant function may be calculated using, for example, the conventional classifier known as BDA (Reference 1: Xiang Sean Zhou, Thomas S. Huang, "Comparing Discriminating transformations and SVM for learning during multimedia retrieval", ACM Multimedia 2001, pp. 137-146).

The identification processing unit 106 classifies the positive case candidate images and the unlabeled images input from the image filtering unit 102 based on the discriminant function input from the discriminant function calculation unit 105. It outputs the images classified as positive cases as the images corresponding to the relevant concept of the query keyword.

Next, the operation of the above configuration is described.

FIG. 6 is a flowchart of the processing of the image classification apparatus according to an embodiment of the present invention.

Step 201) The concept/thesaurus acquisition unit 100 acquires the keyword that serves as the query.

Step 202) The concept/thesaurus acquisition unit 100 accesses the word dictionary storage unit 107 based on the query keyword and reads the N concepts of the query word and the thesaurus corresponding to each concept.

Step 203) The image collection unit 101 searches for images with the web image search engine 2 via the Internet 4, using a search condition (AND condition) in which the thesaurus of the n-th concept is added to the query word, collects the images, and stores the images and their similarities in storage means (not shown) such as a memory. The image search may match the search condition against the explanatory text, title, and other text around each web image, and detect Internet images in descending order of similarity.
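The AND condition of step 203 can be pictured with the short Python sketch below. The function name `build_and_query` and the quoted `AND` syntax are assumptions for illustration; the query syntax of the web image search engine 2 is not specified in the text, and whether every thesaurus term is joined into one condition is likewise an assumption.

```python
def build_and_query(keyword, thesaurus_terms):
    """Join the query keyword and the thesaurus terms of one concept with AND."""
    terms = [keyword, *thesaurus_terms]
    return " AND ".join(f'"{term}"' for term in terms)


# Hypothetical concept "car" of the keyword "jaguar".
query = build_and_query("jaguar", ["automobile", "vehicle"])
print(query)
```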

Step 204) The image filtering unit 102 classifies the top M images with the highest text-matching similarity among the images collected by the image collection unit 101 as positive case candidate images, and the remaining images as unlabeled images.
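Step 204 amounts to a top-M split on the text similarity. A minimal Python sketch follows, assuming each collected image arrives as an `(image_id, text_similarity)` pair; the function name and the sample file names are hypothetical.

```python
def split_candidates(scored_images, m):
    """Split (image_id, text_similarity) pairs into top-M candidates and the rest."""
    ranked = sorted(scored_images, key=lambda pair: pair[1], reverse=True)
    positive_candidates = [image_id for image_id, _ in ranked[:m]]
    unlabeled = [image_id for image_id, _ in ranked[m:]]
    return positive_candidates, unlabeled


# Hypothetical search results with their text similarities.
results = [("a.jpg", 0.91), ("b.jpg", 0.40), ("c.jpg", 0.77), ("d.jpg", 0.12)]
candidates, unlabeled = split_candidates(results, m=2)
print(candidates, unlabeled)
```

The value of M is a tuning parameter; the text does not say how it is chosen.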

Step 205) The positive case generation unit 103 obtains positive case images from the positive case candidate images input from the image filtering unit 102. The processing is described with reference to FIG. 7.

First, the positive case generation unit 103 reads the positive case candidate images (step 301). Next, P feature quantities such as color (f^C), texture (f^T), and shape (f^S) are extracted from the acquired positive case candidate images (step 302). Each of the color, texture, and shape feature quantities is a multidimensional feature vector obtained by conventional techniques.

For each of the P feature quantities, the variation over all positive case candidate images is obtained (step 303). First, for the color feature space, the standard deviation over all positive case candidate images is obtained by the following function, where C is the center of all positive case candidate images, M is the total number of positive case candidate images, and f_m^C is the p-dimensional color feature vector of the m-th candidate image.
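The equation itself is not reproduced here. Assuming the usual root-mean-square deviation around the centroid, with Euclidean distance as in step 306, one form consistent with the definitions of C, M, and f_m^C is shown below. The symbol \sigma^{C} and the 1/M normalization are assumptions rather than something the text states.

```latex
\sigma^{C} = \sqrt{\frac{1}{M}\sum_{m=1}^{M}\left\lVert f_{m}^{C} - C \right\rVert^{2}},
\qquad
C = \frac{1}{M}\sum_{m=1}^{M} f_{m}^{C}
```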

Similarly, the standard deviation over all positive case candidate images is calculated for the other feature spaces, such as texture and shape, so that P standard deviations are obtained.

Next, the feature quantity with the smallest of the P standard deviations is detected and taken as the representative feature quantity (I) (step 304). For example, if the color feature quantity (f^C) has the smallest standard deviation among the P feature quantities such as color (f^C), texture (f^T), and shape (f^S), the representative feature quantity (I) is the color feature quantity.
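Steps 303 and 304 can be sketched in Python as below. The spread is assumed to be the root-mean-square Euclidean distance from the centroid, and the function names and toy vectors are hypothetical.

```python
import math


def spread(vectors):
    """Root-mean-square Euclidean distance of the vectors from their centroid."""
    n = len(vectors)
    dim = len(vectors[0])
    center = [sum(v[d] for v in vectors) / n for d in range(dim)]
    mean_sq = sum(math.dist(v, center) ** 2 for v in vectors) / n
    return math.sqrt(mean_sq)


def select_representative(feature_sets):
    """Return the feature name with the smallest spread, plus all spreads."""
    spreads = {name: spread(vectors) for name, vectors in feature_sets.items()}
    best = min(spreads, key=spreads.get)
    return best, spreads


# Toy feature vectors for two positive case candidate images.
candidate_features = {
    "color": [[0.0, 0.0], [0.0, 2.0]],
    "texture": [[0.0, 0.0], [0.0, 6.0]],
}
best, spreads = select_representative(candidate_features)
print(best, spreads)
```

Feature spaces with different dimensionality or scale would presumably need some normalization before their spreads are compared; the text does not address this.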

The center C of the positive case candidate images in the feature space of the representative feature quantity (I) determined in step 304 is obtained (step 305).

Next, the distance between positive case candidate image n and the center C in the feature space of the representative feature quantity (I) is calculated (step 306). The distance is calculated with the ordinary Euclidean distance formula.

It is determined whether the distance obtained in step 306 is equal to or less than a preset threshold. If it is, the process proceeds to step 308; if it is greater than the threshold, the process proceeds to step 309.

If the distance is equal to or less than the threshold, positive case candidate image n is assigned to the positive cases (step 308).

It is then determined whether the processing of steps 306 to 308 has been performed for all positive case candidate images (step 309). If not, the process proceeds to step 310; otherwise the processing ends.

n is incremented and the process returns to step 306 (step 310).
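The loop of steps 305 to 310 reduces to a distance test against the centroid. The Python sketch below assumes the candidates are given as a mapping from image id to feature vector in the representative feature space; the function names and toy values are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def select_positive_cases(candidates, threshold):
    """Keep candidates whose Euclidean distance to the centroid is <= threshold."""
    center = centroid(list(candidates.values()))
    positives = [
        image_id
        for image_id, vector in candidates.items()
        if math.dist(vector, center) <= threshold
    ]
    return positives, center


# Toy candidates in the representative feature space.
candidates = {"a": [0.0, 0.0], "b": [2.0, 0.0], "c": [10.0, 0.0]}
positives, center = select_positive_cases(candidates, threshold=4.5)
print(positives, center)
```

In this toy case the outlier "c" pulls the centroid toward itself yet is still rejected, which is the filtering effect the step aims for.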

Step 206) The negative case generation unit 104 extracts negative case images. The operation of the negative case generation unit 104 is described in detail below with reference to FIG. 8.

First, the negative case generation unit 104 extracts the representative feature quantity (I) from the unlabeled images. For example, if the representative feature quantity (I) obtained in step 205 is the color feature quantity, the color feature quantity is also extracted from the unlabeled images (step 41).

Among the unlabeled images, those whose distance from the center C of the positive case candidate images in the feature space of the representative feature quantity (I), obtained in step 305, is greater than a threshold are detected as negative cases (step 42).
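Negative case extraction mirrors the positive case test with the inequality reversed. A minimal Python sketch follows, assuming the center C from step 305 is passed in; the text does not say whether the threshold equals the one used for positive cases, so it is a separate parameter here. Names and values are hypothetical.

```python
import math


def select_negative_cases(unlabeled, center, threshold):
    """Return unlabeled images farther than `threshold` from the positive center."""
    return [
        image_id
        for image_id, vector in unlabeled.items()
        if math.dist(vector, center) > threshold
    ]


# Toy unlabeled images in the representative feature space.
unlabeled = {"x": [5.0, 0.0], "y": [20.0, 0.0], "z": [4.0, 9.0]}
negatives = select_negative_cases(unlabeled, center=[4.0, 0.0], threshold=4.5)
print(negatives)
```

Unlabeled images that fall inside the threshold (such as "x") are neither positive nor negative learning data; they are only classified later in step 208.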

Step 207) The discriminant function calculation unit 105 uses the positive case images generated by the positive case generation unit 103 and the negative case images generated by the negative case generation unit 104 as positive and negative learning data, and obtains a discriminant function for distinguishing negative cases from positive cases. One effective way to obtain the discriminant function is to use the conventional BDA (Biased Discriminant Analysis) classifier (see Reference 1). The BDA classifier finds weighting coefficients that minimize the variance of the positive case group while maximizing the variance between the negative case group and the positive case group. FIG. 9 shows an example of the BDA classifier. Let W be the matrix of weighting coefficients for the dimensions of the representative feature quantity (I), f_l a positive case image expressed in the representative feature quantity (I), L the total number of positive case images, nf_j a negative case image expressed in the representative feature quantity (I), J the total number of negative case images, and C the center of the positive case images in the space of the representative feature quantity (I). The optimal weighting coefficients for achieving the above goal are then defined by equation (2).

From equation (1), W is obtained as follows.
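The equations referred to in step 207 are not reproduced in this text. In the BDA formulation of Reference 1, as commonly stated, W maximizes the ratio |W^T S_N W| / |W^T S_P W|, where S_P is the scatter of the positive cases f_l around C and S_N is the scatter of the negative cases nf_j around the same center C; the solution is given by the leading eigenvectors of S_P^{-1} S_N. The NumPy sketch below follows that formulation under stated assumptions: the small regularization term `reg`, the number of components kept, and the function name `fit_bda` are illustrative choices, not the patent's own definitions.

```python
import numpy as np


def fit_bda(positives, negatives, n_components=1, reg=1e-6):
    """Biased discriminant analysis sketch: returns (W, C)."""
    pos = np.asarray(positives, dtype=float)
    neg = np.asarray(negatives, dtype=float)
    center = pos.mean(axis=0)                # C: center of the positive cases
    dp = pos - center
    dn = neg - center                        # negatives measured from C (the "bias")
    s_p = dp.T @ dp / len(pos)               # positive scatter, to be minimized
    s_n = dn.T @ dn / len(neg)               # negative scatter around C, to be maximized
    s_p = s_p + reg * np.eye(s_p.shape[0])   # assumed regularizer keeps S_P invertible
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(s_p, s_n))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs.real[:, order], center


# Toy data: positives are tight along y, negatives lie far away along y.
pos = [[0, 0], [1, 0], [-1, 0], [0, 0.1], [0, -0.1]]
neg = [[0, 5], [0, -5], [1, 5], [-1, -5]]
W, C = fit_bda(pos, neg)
print(W.ravel(), C)
```

On this toy data the learned direction is almost entirely the y axis, the dimension along which positives are compact and negatives are spread out.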

Step 208) The identification processing unit 106 uses the learned classifier to classify the positive case candidate images and the unlabeled images, and outputs the samples judged to be positive cases as the images for the n-th concept of the query keyword. The weighting W obtained by the learned BDA classifier is applied to the image feature quantity f_i, and the classification degree S is obtained by the following equation. An image whose S is equal to or less than a preset threshold is taken as an image for the n-th concept.

S = W^T · f_i
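The thresholding of step 208 can be sketched as follows, assuming for simplicity that W is a single weight vector so that S is a scalar. Whether f_i is first centered on C is not stated in the text and is not done here; the function names and toy values are hypothetical.

```python
def classification_degree(w, f):
    """S = W^T . f for a single weight vector w and a feature vector f."""
    return sum(wi * fi for wi, fi in zip(w, f))


def images_for_concept(w, features, threshold):
    """Return the image ids whose classification degree S is <= threshold."""
    return [
        image_id
        for image_id, f in features.items()
        if classification_degree(w, f) <= threshold
    ]


# Toy weights and feature vectors.
w = [0.5, 2.0]
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [2.0, 0.5]}
selected = images_for_concept(w, features, threshold=1.0)
print(selected)
```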
Step 209) The identification processing unit 106 determines whether the image classification processing of steps 203 to 208 has been performed for all N concepts related to the query keyword. If not, the processing of steps 203 to 208 is repeated. Otherwise, the process proceeds to step 210.

Step 210) The identification processing unit 106 presents, as the result, the images corresponding to the N concepts related to the query keyword.

The functions of the image classification apparatus configuration shown in FIG. 4 can also be built as a program, which can be installed and executed on a computer used as the image classification apparatus, or distributed over a network.

The constructed program can also be stored on a hard disk or on a portable storage medium such as a flexible disk or CD-ROM, and installed on a computer used as the image classification apparatus, or distributed.

The present invention is not limited to the above embodiment, and various modifications and applications are possible within the scope of the claims.

The present invention is applicable to techniques for classifying images that exist on the Internet.

FIG. 1 is a diagram for explaining the principle of the present invention.
FIG. 2 is a diagram of the principle configuration of the present invention.
FIG. 3 is a system configuration diagram according to an embodiment of the present invention.
FIG. 4 is a configuration diagram of the image classification apparatus according to an embodiment of the present invention.
FIG. 5 is an example of the word dictionary according to an embodiment of the present invention.
FIG. 6 is a flowchart of the processing of the image classification apparatus according to an embodiment of the present invention.
FIG. 7 is a flowchart of the detailed processing of positive case image extraction (S205) according to an embodiment of the present invention.
FIG. 8 is a flowchart of the processing of negative case image extraction (S206) according to an embodiment of the present invention.
FIG. 9 is a diagram explaining the BDA classifier according to an embodiment of the present invention.

Explanation of symbols

1 Client terminal
2 Web image search engine
3 Website
4 Internet
100 Concept/thesaurus acquisition means, concept/thesaurus acquisition unit
101 Image collection means, image collection unit
102 Image filtering means, image filtering unit
103 Positive case generation means, positive case generation unit
104 Negative case generation means, negative case generation unit
105 Discriminant function calculation means, discriminant function calculation unit
106 Identification means, identification processing unit
107 Word dictionary, word dictionary storage unit

Claims

An image classification method for classifying, according to image content, the images returned as search results when images are searched for by keyword in multimedia information expressed in text and images, the method comprising:
a concept/thesaurus acquisition step in which concept/thesaurus acquisition means searches, based on an input query keyword, a word dictionary storing a plurality of concepts of each word and a thesaurus for each concept, and acquires n concepts and the thesaurus of each concept;
an image collection step in which image collection means performs a web search using an AND condition of the concept's thesaurus and the query keyword, collates the explanatory text around each image, collects images related to the query keyword and the thesaurus, and stores them in storage means;
an image filtering step in which image filtering means, for each concept n (n = 1, 2, ..., N), extracts from the image group collected in the storage means the M images with the highest similarity under an AND search of the query keyword and the concept's thesaurus as positive case candidates, and treats the other images as unlabeled images;
a positive case generation step in which positive case generation means extracts image feature quantities for the positive case candidates and takes as positive cases the images whose distance from the center of the distribution in the image feature space is at or below a predetermined threshold;
a negative case generation step in which negative case generation means selects, from the image feature quantities extracted from the positive case images, the feature quantities whose variation is smaller than a preset threshold, and obtains negative cases from the distribution of the unlabeled image group with respect to those feature quantities;
a discriminant function calculation step in which discriminant function calculation means obtains a discriminant function for distinguishing positive cases from negative cases, using as training data the positive case candidates obtained in the positive case generation step and the negative cases obtained in the negative case generation step; and
an identification step of identifying, using the discriminant function, the images corresponding to concept n from among the positive case candidates and the unlabeled image group,
wherein the processing from the image collection step onward is repeated for all concepts n.

An image classification apparatus that classifies, according to image content, the images returned as search results when images are searched for by keyword in multimedia information expressed in text and images, the apparatus comprising:
a word dictionary that stores a plurality of concepts of each word and a thesaurus for each concept;
concept/thesaurus acquisition means for searching the word dictionary based on an input query keyword and acquiring n concepts and the thesaurus of each concept;
image collection means for performing a web search using an AND condition of the concept's thesaurus and the query keyword, collating the explanatory text around each image, collecting images related to the query keyword and the thesaurus, and storing them in storage means;
image filtering means for, for each concept n (n = 1, 2, ..., N), taking from the image group collected in the storage means the M images with the highest similarity under an AND search of the query keyword and the concept's thesaurus as positive case candidates, and treating the other images as unlabeled images;
positive case generation means for extracting image feature quantities for the positive case candidates and taking as positive cases the images whose distance from the center of the distribution in the image feature space is at or below a preset threshold;
negative case generation means for selecting, from the image feature quantities extracted from the positive case images, the feature quantities whose variation is smaller than a preset threshold, and obtaining negative cases from the distribution of the unlabeled image group with respect to those feature quantities;
discriminant function calculation means for obtaining a discriminant function for distinguishing positive cases from negative cases, using as training data the positive cases obtained by the positive case generation means and the negative cases obtained by the negative case generation means;
identification means for obtaining, using the discriminant function, the images corresponding to concept n from among the positive case candidates and the unlabeled image group; and
means for repeating the processing of the image collection means, the image filtering means, the positive case generation means, the negative case generation means, the discriminant function calculation means, and the identification means for all concepts n.

An image classification program for causing a computer to execute each means of the image classification apparatus according to claim 2.