JP2007041762A

JP2007041762A - Image recognizing device

Info

Publication number: JP2007041762A
Application number: JP2005223924A
Authority: JP
Inventors: Kenji Fukazawa; 賢二深沢
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2005-08-02
Filing date: 2005-08-02
Publication date: 2007-02-15

Abstract

PROBLEM TO BE SOLVED: To provide an image recognizing device that performs proper image recognition.

SOLUTION: The image recognizing device comprises: a storage means that associates, with each of a plurality of object image data, an image feature value, a keyword, location information, and a location information relevancy indicating how unique the object to which the keyword is attached is to a specific place, and stores them as a database; an image data input means that inputs image data as acquired image data and extracts its location information; an image feature value extracting means that extracts the image feature value of the acquired image data; an exclusion deciding means that decides whether the object image data should be excluded from processing, using the location information relevancy in the database and the distance between photographic locations calculated from the location information of the acquired image data and that of the object image data; and a keyword extracting means that, when the object image data is judged to be a processing object, compares the image feature value of the acquired image data with that of the object image data and extracts a keyword satisfying the conditions from the database.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a technique for recognizing the contents of image data.

Systems that search Web pages from text keywords have long been in wide practical use. In such a system, a desired Web page can be found by entering an appropriate keyword. In contrast to such keyword-based search, there are techniques for retrieving various information from an arbitrary image. For example, Patent Document 1 below discloses a technique in which a database is prepared that stores information on captured images in association with the names of the places where they were shot; when arbitrary image data is input, similar image data is retrieved from the database, and the shooting place name associated with the retrieved similar image data is extracted.

Patent Document 1: JP 2002-373168 A

However, although the technique using such a database can extract the shooting place name of image data, it cannot acquire information other than the shooting place name, such as the name of the subject. It has therefore been difficult to extract an appropriate keyword representing the content of the image data and so perform image recognition.

In view of the problem that appropriate image recognition is difficult, an object of the present invention is to provide an image recognition apparatus that performs appropriate image recognition.

In view of the above problem, the first image recognition apparatus of the present invention adopts the following configuration. It is an image recognition apparatus that recognizes image content by extracting, from a predetermined search target, a keyword representing the image content of one piece of image data that carries shooting position information. The apparatus comprises: storage means that, for each of a plurality of target image data serving as the search target, associates an image feature amount, a keyword representing the image content, position information on where the image having that image feature amount was shot, and a position information relevance indicating the degree to which the object assigned the keyword is unique to a specific place, and stores them as a database; image data input means that inputs the one piece of image data as acquired image data and extracts the position information on where the acquired image data was shot; image feature amount extraction means that extracts the image feature amount of the acquired image data; exclusion determination means that determines whether to exclude the target image data from the keyword extraction process, using the position information relevance of the target image data stored in the database and the gap between shooting positions obtained from the position information of the acquired image data and that of the target image data; and keyword extraction means that, when the target image data is determined to be a processing target, compares the image feature amount of the acquired image data with that of the target image data and extracts from the database the keyword of target image data satisfying a predetermined condition.

The image recognition method corresponding to the first image recognition apparatus of the present invention recognizes image content by extracting, from a predetermined search target, a keyword representing the image content of one piece of image data that carries shooting position information. The method comprises: for each of a plurality of target image data serving as the search target, associating an image feature amount, a keyword representing the image content, position information on where the image having that image feature amount was shot, and a position information relevance indicating the degree to which the object assigned the keyword is unique to a specific place, and storing them as a database; inputting the one piece of image data as acquired image data and extracting the position information on where the acquired image data was shot; extracting the image feature amount of the acquired image data; determining whether to exclude the target image data from the keyword extraction process, using the position information relevance of the target image data stored in the database and the gap between shooting positions obtained from the position information of the acquired image data and that of the target image data; and, when the target image data is determined to be a processing target, comparing the image feature amount of the acquired image data with that of the target image data and extracting from the database the keyword of target image data satisfying a predetermined condition.

According to the first image recognition apparatus and its image recognition method, a database is prepared in which the image feature amount, the keyword representing the image content, the position information, and the position information relevance are stored in association with each target image data. The database is searched, and a keyword indicating the image content of the acquired image data is extracted based on the image feature amount of that acquired image data. An appropriate keyword representing the image content of the acquired image data can therefore be extracted from the database.

In addition, at keyword extraction time, the position information relevance and the gap between shooting positions are used to determine whether to exclude target image data from the keyword extraction process. In other words, the search target of the database is narrowed down. This reduces the number of records to be searched and speeds up processing.

The position information relevance in the image recognition apparatus configured as above may be set by classification into three types: (A) Type A, in which the object assigned the keyword in the target image data is unique to a specific place; (B) Type B, in which the object is highly likely to exist at a specific place and the places where it exists are limited; and (C) Type C, in which the object is not unique to a specific place. The exclusion determination means may then determine that target image data is to be excluded from processing when its position information relevance is Type A and the gap between shooting positions is equal to or greater than a predetermined amount.

According to this image recognition apparatus, when the database is searched, target image data whose position information relevance is Type A and whose shooting position gap is equal to or greater than the predetermined amount is excluded from processing. Because such target image data shows something unique to a specific place, the image content of the acquired image data is extremely unlikely to be the same as or similar to it unless the shooting places are judged to be close. Excluding such target image data speeds up processing and also improves the accuracy with which an appropriate keyword is extracted (the recognition accuracy).
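The Type A exclusion rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `TargetRecord` class, the `is_excluded` function, the 5 km threshold, and the use of the haversine great-circle distance as the measure of the shooting position gap are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class TargetRecord:
    keyword: str
    lat: float           # latitude of the shooting position, degrees
    lon: float           # longitude of the shooting position, degrees
    relevance_type: str  # "A", "B" or "C"


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km; an assumed measure of the gap."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_excluded(record, query_lat, query_lon, max_gap_km=5.0):
    """Exclude only Type A records shot too far from the query position."""
    if record.relevance_type != "A":
        return False
    gap = distance_km(record.lat, record.lon, query_lat, query_lon)
    return gap >= max_gap_km


records = [
    TargetRecord("Tokyo Tower", 35.6586, 139.7454, "A"),
    TargetRecord("cat", 35.6586, 139.7454, "C"),
]
# Query photo taken in Osaka (approximate coordinates).
candidates = [r for r in records if not is_excluded(r, 34.6937, 135.5023)]
print([r.keyword for r in candidates])  # ['cat']
```

Only the Type A record is filtered by position; Type B and Type C records always remain candidates in this sketch.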

In the image recognition apparatus configured as above, the storage means may further associate, with each target image data, an area range indicating the geographical range from which the object assigned the keyword in that target image data can be photographed, and store it in the database. The exclusion determination means may then determine that the gap between shooting positions is equal to or greater than the predetermined amount when the position information of the acquired image data does not fall within the area range of the target image data stored in the database.

According to this image recognition apparatus, the database holds an area range, and whether the shooting positions are far apart is judged against that area range. The determination can therefore be made easily.

The keyword extraction means of the image recognition apparatus configured as above may consist of a reliability coefficient calculation unit that calculates, from the image feature amount of the acquired image data and that of the target image data, a reliability coefficient indicating the degree of similarity or dissimilarity between the acquired image data and the target image data, and an extraction unit that extracts the keyword of the target image data based on the calculated reliability coefficient.

According to this image recognition apparatus, keywords can be extracted easily by comparing the reliability coefficients of the individual target image data.

In the image recognition apparatus configured as above, the keyword extraction means may further include a correction unit that corrects the reliability coefficient calculated from the image feature amounts by taking the position information relevance into account, and the extraction unit may extract the keyword based on the corrected reliability coefficient.

According to this image recognition apparatus, the corrected reliability coefficient reflects both the image feature amount and the position information relevance, and the keyword is extracted based on it. Because the similarity between the acquired image data and the target image data is judged from more than one kind of information, an appropriate keyword can be extracted easily.

The extraction unit of the image recognition apparatus configured as above may extract the keyword of the target image data that has the highest of the corrected reliability coefficients obtained for the individual target image data to be searched.

According to this image recognition apparatus, comparing the corrected reliability coefficients makes it possible to extract, from the search targets in the database, the single keyword judged to be most appropriate.
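The selection step described above amounts to taking the maximum over the corrected coefficients. The sketch below assumes the corrected reliability coefficients have already been computed and that a higher value means a better match; the correction itself is not specified at this point in the text, so it is taken as an input. The function name `best_keyword` and the sample scores are hypothetical.

```python
def best_keyword(scored):
    """Return the keyword with the highest corrected reliability coefficient.

    `scored` is a list of (keyword, corrected_coefficient) pairs, one per
    target image data that survived exclusion. Returns None if the list is
    empty, i.e. every record was excluded.
    """
    if not scored:
        return None
    keyword, _ = max(scored, key=lambda pair: pair[1])
    return keyword


# Hypothetical corrected coefficients for three database records.
scored = [("Tokyo Tower", 0.82), ("cat", 0.40), ("steel tower", 0.65)]
print(best_keyword(scored))  # Tokyo Tower
```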

The second image recognition apparatus of the present invention uses a database that stores, for each of a plurality of target image data to be searched, an image feature amount, a keyword representing the image content, position information on where the image having that image feature amount was shot, and a position information relevance indicating the degree to which the object assigned the keyword is unique to a specific place, all in association with one another. It recognizes image content by extracting from that database a keyword representing the image content of one piece of image data that carries shooting position information. The apparatus comprises: image data input means that inputs the one piece of image data as acquired image data and extracts the position information on where the acquired image data was shot; image feature amount extraction means that extracts the image feature amount of the acquired image data; exclusion determination means that determines whether to exclude the target image data from the keyword extraction process, using the position information relevance of the target image data stored in the database and the gap between shooting positions obtained from the position information of the acquired image data and that of the target image data; and keyword extraction means that, when the target image data is determined to be a processing target, compares the image feature amount of the acquired image data with that of the target image data and extracts from the database the keyword of target image data satisfying a predetermined condition.

According to the second image recognition apparatus of the present invention, when a keyword indicating the image content of the acquired image data is extracted from its image feature amount using a database prepared in advance, the position information relevance and the gap between shooting positions are used to determine whether to exclude target image data from the keyword extraction process. This reduces the number of records to be searched and speeds up processing.

The present invention can also be implemented as a computer program and as a medium on which the computer program is recorded. Various computer-readable media can be used as the recording medium, such as a flexible disk, CD-ROM, DVD-ROM/RAM, magneto-optical disk, memory card, or hard disk.

Hereinafter, embodiments of the present invention will be described based on examples, in the following order.
A. Image recognition system:
B. Database construction:
C. Structure of image recognition device:
D. Image recognition processing:
E. Variation:

A. Image recognition system:
FIG. 1 is an explanatory diagram showing an image recognition system that includes an image recognition apparatus as one embodiment of the present invention. The image recognition system 100 mainly consists of a database 20 holding information on a plurality of captured images, a database creation device 10 that constructs the database 20, and an image recognition device 15 that uses the constructed database 20.

The database creation device 10 includes a keyword setting unit 11, an area range setting unit 12, a position information relevance setting unit 13, an image feature amount extraction device 14, and the like, and is connected to the database 20 via a communication line. The database creation device 10 receives a plurality of image data captured by an imaging device such as a digital still camera and creates, for each image data, various associated information such as a keyword representing the subject of the image data.

The image data handled in this embodiment carries geographical position information on the place where it was shot. This position information is longitude and latitude information (referred to as GPS information) obtained through GPS (Global Positioning System) communication, which is one form of wireless communication. The database creation device 10 extracts the GPS information of the image data and treats it as part of the information on that image data. The position information of the image data is not limited to GPS information and may be other position information, such as information from a mobile phone base station.

The keyword setting unit 11 expresses as text the place name of the location where the image content (subject) was shot, the name of the subject, and the like, and sets this text as a keyword associated with the image data. The area range setting unit 12 sets, as longitude and latitude values, the area from which the subject in the image data can be photographed, and associates it with the image data as the area range R. The position information relevance setting unit 13 evaluates and quantifies the relationship between the keyword-tagged subject in the image data and the GPS information, and associates the result with the image data as the position information relevance PR. The image feature amount extraction device 14 receives each image data and extracts its image feature amount by predetermined image processing.

The various information set in association with each image data in this way, such as the GPS information, keyword, area range R, position information relevance PR, and image feature amount, is output to the database 20. Of this information, the keyword, area range R, and position information relevance PR are set by the user after checking the content of each image data.

The database 20 has a storage area of a predetermined capacity and stores the above information on the image data received from the database creation device 10. In the database 20 of this embodiment, each image data is associated with its information, but the image data itself is not stored. In other words, the database 20 stores only feature data, that is, the various information about the image data. This reduces the amount of data and the required storage capacity. The database 20 configured in this way is connected to the image recognition device 15 via a communication line.
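One way to picture a database entry is as a record that holds only the derived information and no pixel data. The following is a sketch under assumptions: the class name `ImageRecord`, the field names, the in-memory list used as the database, and the sample feature and area range values are all hypothetical. The PR value of 0.95 for "Tokyo Tower" follows the example given later in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ImageRecord:
    """Information stored per image; the image itself is deliberately absent."""
    keyword: str                 # text set by the keyword setting unit
    features: Tuple[float, ...]  # image feature amount
    gps: Tuple[float, float]     # (longitude X, latitude Y) of the shooting place
    area_range: float            # area range R, in degrees
    relevance: float             # position information relevance PR


# A plain list stands in for database 20 in this sketch.
database: List[ImageRecord] = []
database.append(
    ImageRecord(
        keyword="Tokyo Tower",
        features=(0.1, 0.2, 0.3),   # placeholder feature values
        gps=(139.7454, 35.6586),
        area_range=0.05,            # placeholder R
        relevance=0.95,
    )
)
print(database[0].keyword, database[0].relevance)
```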

The image recognition device 15 includes an image feature amount extraction device 16, a discriminator 17, a reliability coefficient corrector 18, and the like. It receives arbitrary image data and extracts from the database 20 a keyword suited to that image data. In other words, the image recognition device 15 uses the database 20 as a dictionary to extract a keyword corresponding to the content of arbitrary image data, and in that sense is a device that recognizes the content of arbitrary image data.

Like the image feature amount extraction device 14 of the database creation device 10, the image feature amount extraction device 16 in the image recognition device 15 extracts the image feature amount of the input image data. The image feature amount extraction device 16 is connected to the discriminator 17, and the extracted image feature amount is output to the discriminator 17.

The discriminator 17 receives the image feature amount from the image feature amount extraction device 16 and also extracts the GPS information carried by the input image data. Based on these, it acquires from the database 20 the image feature amounts and keywords that satisfy a predetermined condition. The discriminator 17 compares the image feature amount of the input image data with each image feature amount acquired from the database 20 and calculates a reliability coefficient TC that expresses the similarity of the two as a coefficient. The discriminator 17 is also connected to the reliability coefficient corrector 18 and outputs the calculated reliability coefficient TC and the keyword acquired from the database 20 to the reliability coefficient corrector 18.

Along with the keyword and reliability coefficient TC from the discriminator 17, the reliability coefficient corrector 18 also receives the position information relevance PR of the image data whose image feature amount and keyword the discriminator 17 acquired. The reliability coefficient corrector 18 corrects the reliability coefficient TC received from the discriminator 17 based on the position information relevance PR, and outputs the keyword that satisfies a predetermined condition, judged on the corrected reliability coefficient TC1, to an external device (for example, a display).

In the image recognition system 100 configured as above, when arbitrary image data is input, the image recognition device 15 uses (searches) the database 20 and automatically extracts a predetermined keyword or the like. As described above, the extracted keyword is one judged to match the content of the image data, such as the name of the subject in the image data or the name of the shooting place.

The extracted keyword can, for example, be stored in association with the input image data and used for managing image data, such as searching for image data at a later date or classifying it by keyword.

Before describing the image recognition device 15 in detail, the construction of the database 20 is described first. In the following description, each process in the database creation device 10 and the image recognition device 15 is implemented as a software program, and the image recognition system 100 is configured by computers on which these processing programs are installed.

B. Database construction:
FIG. 2 is a flowchart showing the flow of the construction process for the database 20. This process is executed by a computer (not shown) that serves as the database creation device 10 and has the processing program installed. The computer is a general-purpose computer that includes a CPU, ROM, RAM, hard disk, and the like, is connected to a keyboard, a display, and the like, and is also connected to a digital still camera holding a plurality of captured images. The construction process for the database 20 is executed in response to a user instruction entered via the keyboard.

When the process starts, the CPU reads predetermined image data from the digital still camera and executes a keyword setting process for that image data (step S200). Specifically, it shows the image data on the display, accepts the keyword representing the image content that the user enters via the keyboard, and stores the keyword on the hard disk in association with the image data.

After the keyword is set, the CPU extracts the image feature amount of the input image data (step S210). In this embodiment, the image feature amount is extracted using the Color Layout method adopted in MPEG-7. Specifically, the feature amount is the luminance and color-difference signal of the image data converted into YCbCr information. The CPU stores the extracted image feature amount on the hard disk in association with the image data. Because the image recognition device 15 executes the same extraction process, it is described in detail later in the description of the image recognition device 15.
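The first stages of a Color Layout style feature can be sketched as below: the image is partitioned into an 8 x 8 grid, each cell is averaged, and the averages are converted to YCbCr. This is a simplified illustration only. The full MPEG-7 descriptor goes on to apply a DCT and quantization to these values, which is omitted here, and the conversion constants are the full-range BT.601 (JFIF) ones, chosen as an assumption because the text does not say which variant is used. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) RGB to YCbCr conversion; an assumed variant."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def color_layout_grid(pixels, grid=8):
    """Average the image over a grid x grid partition; return YCbCr per cell.

    `pixels` is a list of rows, each row a list of (R, G, B) tuples.
    The result lists the cells in row-major order.
    """
    height, width = len(pixels), len(pixels[0])
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            mean = [sum(p[c] for p in block) / n for c in range(3)]
            cells.append(rgb_to_ycbcr(*mean))
    return cells


# A uniform mid-gray 16 x 16 test image.
gray = [[(128, 128, 128)] * 16 for _ in range(16)]
features = color_layout_grid(gray)
print(len(features))  # 64
```

For a uniform gray image every cell comes out at roughly Y = Cb = Cr = 128, which is a quick sanity check on the conversion.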

Next, the CPU extracts the GPS information attached to the input image data (step S220). The CPU stores the extracted GPS information on the hard disk in association with the image data, in the same way as the keyword and the image feature amount. If the image data has no GPS information, the user enters it via the keyboard.

After extracting the GPS information, the CPU sets the area range R (step S230). As described above, the area range R expresses, in longitude and latitude, the area from which the subject in the image data can be photographed. Specifically, as shown in FIG. 3, an approximate area from which the subject can be photographed is set, centered on the longitude and latitude given by the GPS information. For example, when the longitude and latitude from the GPS information are (X, Y) and the subject is "Tokyo Tower", the area expressed as (X±R, Y±R) is set as the range from which "Tokyo Tower" can be photographed. The R in this expression is the area range R. The CPU accepts the area range R that the user enters via the keyboard for each image data and stores it on the hard disk in association with the image data.
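The (X±R, Y±R) area above is a square in longitude and latitude, so testing whether a query position falls inside it takes two comparisons. In this sketch the function name `within_area_range`, the inclusive boundary, and the sample value R = 0.05 degrees are assumptions.

```python
def within_area_range(query_lon, query_lat, x, y, r):
    """True when the query position lies inside (x ± r, y ± r), in degrees."""
    return abs(query_lon - x) <= r and abs(query_lat - y) <= r


# Record for "Tokyo Tower" with a hypothetical area range of 0.05 degrees.
X, Y, R = 139.7454, 35.6586, 0.05
print(within_area_range(139.76, 35.67, X, Y, R))      # True: nearby position
print(within_area_range(135.5023, 34.6937, X, Y, R))  # False: Osaka
```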

Subsequently, the CPU sets the position information relevance PR of the input image data (step S240). As described above, the position information relevance PR evaluates the relationship between the subject in the keyworded image data and the GPS information. In this embodiment, it expresses numerically the degree to which the subject of the image data shown on the display, to which a given keyword has been attached, exists at a specific place. The CPU stores the position information relevance PR entered by the user on the hard disk in association with the image data.

When the position information relevance PR is set, the image data is broadly classified into three types (types A to C) according to the subject in the keyworded image data. Type A is the case where the keyworded subject “is an object unique to that place”, type C is the case where it “is not an object unique to that place”, and type B is the case intermediate between type A and type C.

Image data classified as type A is given a position information relevance PR in the range 0.9 ≤ PR ≤ 1.0. For example, when the subject in the image data is “Tokyo Tower” and “Tokyo Tower” is set as the keyword, a high position information relevance PR (for example, PR = 0.95) is set because “Tokyo Tower” “is an object unique to that place”.

Image data classified as type C is given a position information relevance PR of PR = 0. For example, when the subject in the image data is a “cat” and “cat” is set as the keyword, the lowest position information relevance PR (PR = 0) is set because a “cat” “is not an object unique to a place”.

Image data classified as type B is given a position information relevance PR in the range 0 < PR < 0.9. For example, when the subject in the image data is a “deer” in “Nara Park” and “deer” is set as the keyword, a “deer” is not an object unique to a place, but it is confined to some extent to particular places such as “Nara Park”, so a position information relevance PR of a given value (for example, PR = 0.7) is set. Subjects of this kind include “airplanes” at an “airfield”, “ships” at a “port”, “pandas” at a “zoo”, and “alpine plants” in a “mountain park”. They are not unique to one place, but they are tied to a certain meaningful extent such as a facility or a habitat.
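The three PR ranges described above can be sketched as a small classification function. This is an illustration only: the function name is hypothetical, and the 0.9 boundary is the value used in this embodiment.

```python
def classify_pr(pr):
    """Map a position information relevance PR to type 'A', 'B', or 'C'."""
    if not 0.0 <= pr <= 1.0:
        raise ValueError("PR is expected to lie in the range 0 to 1")
    if pr >= 0.9:   # 0.9 <= PR <= 1.0: object unique to a place
        return "A"
    if pr > 0.0:    # 0 < PR < 0.9: tied to places to some extent
        return "B"
    return "C"      # PR = 0: not tied to any place


for keyword, pr in [("Tokyo Tower", 0.95), ("deer", 0.7), ("cat", 0.0)]:
    print(keyword, classify_pr(pr))
```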

The keyword, image feature amount, GPS information, area range R, and position information relevance PR are set in this way, and the series of construction processes for the database 20 ends. As a result, the various items of information on the image data shown in FIG. 4 are accumulated on the hard disk.

For example, as shown in FIG. 4, when the content (subject) of the image data is Tokyo Tower, the following are set and accumulated on the hard disk: PR = 0.95 (type A) as the position information relevance PR, the longitude and latitude of the shooting location as the GPS information, R = 0.012 as the area range R, the YCbCr information or information obtained by processing it as the image feature amount, and “Tokyo Tower”, “Shiba Park”, and so on as keywords. The hard disk storing this information thus serves as the database 20. The image recognition device 15 uses the database 20 constructed through the above processing as a dictionary to recognize the content of arbitrary image data.
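One information group of the database 20 could be modeled as a record like the one below. This is only a sketch: the class name, the field names, and the coordinate values are hypothetical, while PR = 0.95, R = 0.012, and the keywords follow the Tokyo Tower example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatabaseEntry:
    """One information group associated with one item of image data."""
    keywords: List[str]
    pr: float            # position information relevance PR
    longitude: float     # GPS information of the shooting location
    latitude: float
    area_range: float    # area range R
    features: List[float] = field(default_factory=list)  # image feature amount


# Coordinates are illustrative values; the features are left empty here.
tokyo_tower = DatabaseEntry(
    keywords=["Tokyo Tower", "Shiba Park"],
    pr=0.95,
    longitude=139.7454,
    latitude=35.6586,
    area_range=0.012,
)
print(tokyo_tower.keywords, tokyo_tower.pr, tokyo_tower.area_range)
```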

C. Structure of the image recognition device:
FIG. 5 is a schematic structural diagram of the image recognition device 15, which is provided with the image recognition processing program of the present invention as software. As shown in the figure, the image recognition device 15 comprises a digital still camera 27 that captures images to be processed and stores their image data, a computer 30 (hereinafter, PC 30) that receives the image data stored in the digital still camera 27 and applies predetermined processing to it, a display 40 that displays images processed by the PC 30, and a keyboard 41, a mouse 42, and the like serving as the user interface.

The digital still camera 27 includes a CCD as an image sensor, an image processing circuit that applies predetermined image processing to the image data captured via the CCD, a memory card 24 that stores the image data, and so on. It also has a GPS function that uses GPS to record the geographical position of the digital still camera 27. With this function, GPS information is added to the image data.

In general, image data handled by a digital still camera has the so-called Exif format. Based on JPEG image data, predetermined additional image information, such as the shooting information recorded when the image was taken and a thumbnail image, is embedded in a form conforming to the Exif specification. The additional image information stores various items such as the shooting date and time, exposure time, aperture value, shutter speed, ISO sensitivity, and white balance. In this embodiment, GPS information is included as a further item of additional image information. The GPS information may instead be written in the MakerNote, which the Exif specification sets aside as a user-usable area.

The image data to be handled need not be acquired from the digital still camera 27. As long as position information is attached, it may be acquired from an imaging device or recording medium such as a digital video camera 21, a camera-equipped mobile phone 23, a memory card 24, or a hard disk 25.

The PC 30 contains a CPU 31, a ROM 32, a RAM 33, a hard disk 34, an I/F circuit unit 35, and the like, which are connected to one another by an internal bus. The I/F circuit unit 35 is connected to the digital still camera 27, the display 40, the keyboard 41, and the mouse 42, and functions as the interface between these external devices and the PC 30.

A predetermined operating system (OS) is stored in the ROM 32. When the PC 30 is powered on, the CPU 31 reads the OS from the ROM 32 and starts it.

Various application programs that run on the OS are installed on the hard disk 34. In this embodiment, an image recognition processing program is installed as one of these application programs. On receiving a user instruction via the keyboard 41 or the mouse 42, the CPU 31 reads this program, loads it into the RAM 33, and executes it.

D. Image recognition processing:
FIG. 6 is a flowchart of the image recognition processing. This processing extracts a keyword corresponding to the content of arbitrary image data; in other words, it recognizes the content of the input image data. It is executed in response to a user instruction via the keyboard.

When the processing starts, the CPU 31 acquires one item of image data (step S600). Specifically, when the processing starts, the images in the digital still camera 27 are shown on the display, and the CPU 31 acquires the image data corresponding to the one image that the user designates from among them.

Having acquired the image data, the CPU 31 extracts the position information (GPS information) attached to it (step S605). The extracted GPS information gives the longitude and latitude of the location where the image was taken.

Subsequently, the CPU 31 extracts the image feature amount of the acquired image data (step S610). In this embodiment, as described above, the image feature amount is extracted using the color layout method. Specifically, the CPU 31 divides the image data into 8 × 8 small blocks and uses the averages (Rave, Gave, Bave) of the pixel values (R, G, B) in each small block as that block's representative values (Rrep, Grep, Brep). Using these representative values, the CPU 31 creates reduced image data consisting of 8 × 8 pixels. The CPU 31 converts the reduced image data (RGB data) into YCbCr data and then applies a discrete cosine transform (DCT) to each of the Y, Cb, and Cr components. The DCT converts the image data into frequency components. Because the discrete cosine transform is well known to those skilled in the art, a detailed description is omitted.
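The extraction in step S610 can be sketched in plain Python as follows. Several points are assumptions rather than details given in the specification: the image is a list of rows of (R, G, B) tuples at least 8 pixels wide and high, the RGB to YCbCr conversion uses the JFIF full-range coefficients, and the DCT is the orthonormal DCT-II. All function names are hypothetical.

```python
import math


def block_averages(pixels, grid=8):
    """Average the (R, G, B) values inside each cell of a grid x grid partition."""
    height, width = len(pixels), len(pixels[0])
    reduced = []
    for by in range(grid):
        y0, y1 = by * height // grid, (by + 1) * height // grid
        row = []
        for bx in range(grid):
            x0, x1 = bx * width // grid, (bx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(tuple(sum(p[c] for p in block) / len(block) for c in range(3)))
        reduced.append(row)
    return reduced


def rgb_to_ycbcr(r, g, b):
    """JFIF-style full-range conversion (an assumed choice of matrix)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of a square block."""
    n = len(block)

    def scale(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / n)

    def basis(x, u):
        return math.cos((2 * x + 1) * u * math.pi / (2 * n))

    return [[scale(u) * scale(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def extract_color_layout(pixels):
    """Return (Yc, Cbc, Crc), each an 8 x 8 list of DCT coefficients."""
    reduced = block_averages(pixels)
    converted = [[rgb_to_ycbcr(*rgb) for rgb in row] for row in reduced]
    planes = [[[cell[k] for cell in row] for row in converted] for k in range(3)]
    return tuple(dct_2d(plane) for plane in planes)


# Hypothetical input: a uniform 16 x 16 gray image.
image = [[(128, 128, 128)] * 16 for _ in range(16)]
yc, cbc, crc = extract_color_layout(image)
print(round(yc[0][0], 3), round(abs(yc[0][1]), 3))
```

For a uniform image, only the lowest-frequency coefficient of each component is non-zero, which gives a quick sanity check on the transform.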

For example, as shown in FIG. 7, 8 × 8 coefficient values are calculated for each of the Y, Cb, and Cr components. That is,
Yc[8][8] = {yc00, yc01, yc02, ..., yc76, yc77}
Cbc[8][8] = {cbc00, cbc01, cbc02, ..., cbc76, cbc77}
Crc[8][8] = {crc00, crc01, crc02, ..., crc76, crc77}
are obtained, and their positional relationship to the image data (image) is, for example, as shown in FIG. 7. As FIG. 7 shows, the frequency rises as the coefficient index increases.

Various other well-known methods may be used to obtain the image feature amount, such as a method using histograms of the R, G, and B components or a method using the maximum, minimum, and average luminance values. In that case, the image feature amounts on the database 20 side need only be extracted by the same method.

Subsequently, the CPU 31 initializes the reference value of the reliability coefficient (step S620). As described above, the reliability coefficient TC1 is an index of how similar the image feature amounts of two items of image data are, and it is calculated in a later processing step. Here, the reference value TCmax against which the calculated reliability coefficient TC1 is evaluated is first initialized to zero.

The CPU 31 selects one target from the database 20 (step S630). As described above, the database 20 associates various items of information, such as the position information relevance PR, GPS information, area range R, image feature amount, and keywords, with one item of image data as a single set (information group), and stores a plurality of such information groups. This step selects one information group from the information groups in the database 20, and so indirectly selects the image data associated with that information group. The step is therefore treated as selecting one item of image data. To distinguish it from the image data acquired in step S600, the selected image data is hereinafter called the “database image”.

Having selected a database image, the CPU 31 determines whether all the database images in the database 20 have been selected (step S635). The CPU 31 sets a flag on each selected database image once the processing described later has been executed for it, and checks these flags to determine whether an image has already been selected and processed.

If it is determined in step S635 that all the database images have been selected (Yes), the search of the database 20 is regarded as complete and the series of image recognition processes ends.

If, on the other hand, it is determined in step S635 that not all the database images have been selected (No), the CPU 31 extracts the position information relevance PR of the selected database image and determines which type (A to C) it belongs to (step S645).

If the position information relevance PR corresponds to type B or type C in step S645 (PR is less than 0.9), the processing proceeds to step S660. That is, a database image whose position information relevance PR is not very high (whose subject, unlike a type A subject, is not highly likely to be an object unique to a place) is treated as a search target for extracting the keyword of the acquired image data.

If, on the other hand, the position information relevance PR corresponds to type A in step S645 (PR is 0.9 or more), the CPU 31 extracts the GPS information and the area range R of the selected database image. It then determines whether the longitude and latitude of the shooting location given by the GPS information of the acquired image data, already extracted in step S605, fall within the longitude and latitude range of the database image widened by the area range R (step S655).

If it is determined in step S655 that the shooting location does not fall within the longitude and latitude range of the database image (No), the CPU 31 sets the processed flag on the currently selected database image, returns to step S630, and selects the next database image from the database 20.

In other words, a database image classified as type A shows a subject that is an object unique to that place, so the database image and the arbitrary image data can be judged to have substantially the same content only if their shooting locations are close.

If the longitude and latitude of the shooting location of the acquired image data do not fall within the range centered on the longitude and latitude of the database image and extended by the area range R (see FIG. 3), a keyword extracted from that database image would not appropriately describe the content of the acquired image data. Therefore, a database image that is of type A and whose longitude and latitude do not satisfy this condition is excluded from the search targets.

If, on the other hand, it is determined in step S655 that the shooting location falls within the longitude and latitude range of the database image (Yes), the processing proceeds to step S660. That is, among type A database images, only those whose longitude and latitude satisfy the condition (whose shooting location is close) are treated as search targets.
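The branching in steps S645 and S655 reduces to a single decision on whether a database image remains a search target. The following sketch assumes positions are (longitude, latitude) pairs in decimal degrees and that the range boundary is inclusive. The function name, parameter names, and sample coordinates are hypothetical.

```python
def is_search_target(pr, query_pos, db_pos, area_range, threshold=0.9):
    """Decide whether a database image stays in the search (steps S645, S655)."""
    if pr < threshold:
        # Type B or C: always searched, regardless of shooting location.
        return True
    # Type A: searched only if the query was shot inside (X ± R, Y ± R).
    return (abs(query_pos[0] - db_pos[0]) <= area_range
            and abs(query_pos[1] - db_pos[1]) <= area_range)


tower_pos = (139.7454, 35.6586)   # illustrative position of a stored image
near_tower = (139.7500, 35.6600)
far_away = (135.8400, 34.6850)

print(is_search_target(0.95, near_tower, tower_pos, 0.012))  # type A, nearby
print(is_search_target(0.95, far_away, tower_pos, 0.012))    # type A, far away
print(is_search_target(0.0, far_away, tower_pos, 0.012))     # type C
```

Only the second call, a type A image photographed far from the query, is excluded from the search.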

Having made a database image of type B or C, or a type A database image whose position information satisfies the condition, a search target, the CPU 31 extracts the image feature amount of the database image and calculates the distance D between it and the image feature amount of the acquired image data already extracted in step S610 (step S660).

The distance D between image feature amounts represents the degree of similarity or dissimilarity of the two images. When the Euclidean distance is used, for example, it is calculated by the following formula.

Here, Ycp, Cbcp, and Crcp denote the image feature amounts (coefficients) of the database image, and Ycq, Cbcq, and Crcq denote those of the acquired image data. Because this embodiment uses the image feature amounts adopted in MPEG-7, the variables i and j each take values from 0 to 7 (n = 7).
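The distance formula itself does not appear in this text. As a minimal sketch, assuming a single Euclidean distance taken over all 3 × 64 coefficients of Yc, Cbc, and Crc, the calculation could look like the following. Other forms are possible, such as a separate square root per component or per-coefficient weights, and the function name is hypothetical.

```python
import math


def feature_distance(p, q):
    """Euclidean distance between two feature sets.

    p and q are (Yc, Cbc, Crc) triples; each component is an 8 x 8 list of
    DCT coefficients indexed by i, j = 0..7.
    """
    total = 0.0
    for plane_p, plane_q in zip(p, q):
        for row_p, row_q in zip(plane_p, plane_q):
            for a, b in zip(row_p, row_q):
                total += (a - b) ** 2
    return math.sqrt(total)


def zero_plane():
    return [[0.0] * 8 for _ in range(8)]


# Two hypothetical feature sets differing in one Y and one Cb coefficient.
database_image = (zero_plane(), zero_plane(), zero_plane())
query_image = (zero_plane(), zero_plane(), zero_plane())
query_image[0][0][0] = 3.0
query_image[1][0][0] = 4.0

print(feature_distance(database_image, query_image))
```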

Having calculated the distance D, the CPU 31 calculates the reliability coefficient TC (step S670). The reliability coefficient TC is determined according to the difference between the image feature amount of the database image and that of the acquired image data, that is, according to the distance D. The closer the two image feature amounts are (the smaller the distance D), the higher the reliability and the larger the coefficient. Specifically, it is calculated by the following formula.

Dref is a reference value for the distance D, set in advance as the value at which the two items of image data can be judged substantially the same (highly similar). That is, when the distance D is Dref or less, so that the formula yields a reliability coefficient TC exceeding 1, the two items of image data are judged substantially the same. Accordingly, when the reliability coefficient TC would exceed 1 it is set to TC = 1, so that TC takes values in the range 0 ≤ TC ≤ 1.0.
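The formula for TC is likewise not reproduced in this text. One form consistent with the description (TC grows as D shrinks, reaches 1 around D = Dref, and is clipped to 1) is TC = Dref / D. The sketch below uses that form purely as an assumption, with a hypothetical function name.

```python
def reliability_coefficient(distance, d_ref):
    """Assumed form TC = Dref / D, clipped so that 0 <= TC <= 1.0."""
    if d_ref <= 0 or distance < 0:
        raise ValueError("expects Dref > 0 and D >= 0")
    if distance <= d_ref:
        # The two images are judged substantially the same.
        return 1.0
    return d_ref / distance


print(reliability_coefficient(5.0, 10.0))
print(reliability_coefficient(20.0, 10.0))
```

Handling D ≤ Dref before dividing also avoids a division by zero when the two feature sets are identical.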

Subsequently, the CPU 31 corrects the calculated reliability coefficient TC (step S680). Specifically, the correction is made by the following formula using the position information relevance PR.

When the position information relevance PR is large, this correction revises the reliability coefficient TC calculated in the preceding step to a still larger reliability coefficient TC1, indicating high reliability (that is, that the database image carries a keyword that appropriately describes the content of the acquired image data). In other words, when the positions given by the two sets of GPS information are close and the database image is of type A or type B, the position information relevance PR is relatively large and the reliability coefficient TC1 is also large (close to 1).

From steps S645 and S655 above, the database images for which the reliability coefficient TC1 is calculated are those of type B or C, or of type A with position information satisfying the condition. A type A database image, which has a high position information relevance PR, is therefore not processed if it is judged from the GPS-based position information not to have been taken nearby, so applying the correction with the position information relevance PR as it is causes no problem.

When the database image is of type C, the position information relevance PR is zero and the new reliability coefficient TC1 remains equal to the original reliability coefficient TC, so the correction again causes no problem. With this correction, an appropriate reliability coefficient TC1 can be calculated even when the image feature amounts of the image data themselves are highly similar but the subjects are entirely different and were photographed in different places.
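The correction formula is also absent from this text, so the following is only one candidate that satisfies the properties stated above: TC1 equals TC when PR = 0, TC1 grows with PR, and TC1 stays within 0 to 1. The form TC1 = TC + PR × (1 − TC) and the function name are assumptions for illustration, not the formula of the specification.

```python
def corrected_reliability(tc, pr):
    """Assumed correction: move TC toward 1 in proportion to PR."""
    if not (0.0 <= tc <= 1.0 and 0.0 <= pr <= 1.0):
        raise ValueError("TC and PR are expected to lie in the range 0 to 1")
    return tc + pr * (1.0 - tc)


for label, pr in [("type C", 0.0), ("type B", 0.7), ("type A", 0.95)]:
    print(label, corrected_reliability(0.5, pr))
```

With TC fixed at 0.5, the type C value is unchanged while the type B and type A values move progressively closer to 1.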

Subsequently, the CPU 31 determines whether the corrected reliability coefficient TC1 is larger than the currently set reference value TCmax of the reliability coefficient (step S685).

If the reliability coefficient TC1 is determined in step S685 to be larger than the reference value TCmax (Yes), the CPU 31 replaces the reference value TCmax with the reliability coefficient TC1, temporarily stores the keyword of the database image currently being processed as the extracted keyword (step S690), returns to step S630, and repeats the series of processes for the next database image. Each time the reference value TCmax is replaced, the extracted keyword is overwritten with the keyword of the database image newly being processed.

If, on the other hand, the reliability coefficient TC1 is determined in step S685 to be equal to or less than the reference value TCmax (No), the CPU 31 returns to step S630 without storing the keyword and repeats the series of processes for the next database image. Before returning to step S630 from step S685 or S690, the CPU 31 sets the processed flag on the database image being processed.

The series of processes is repeated in this way. Once all the database images in the database 20 have been processed, the processing exits from step S635 to END and the image recognition processing ends. At this point, the CPU 31 displays the temporarily stored extracted keyword on the display 40.

With the image recognition processing described above, the processing is repeated for all the database images, and the keyword corresponding to the database image with the highest reliability coefficient TC1 is extracted. An appropriate keyword describing the content of any given item of image data can therefore be extracted.
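Steps S620 to S690 can be put together as one loop. The sketch below makes several assumptions: feature amounts are flattened lists, the database is a list of dictionaries with hypothetical keys, TC = Dref / D clipped to 1, and TC1 = TC + PR × (1 − TC). Neither formula is reproduced in this text. A plain for loop stands in for the processed flags described above.

```python
import math


def recognize(query_features, query_pos, database, d_ref, threshold=0.9):
    """Return the keywords of the database image with the highest TC1."""
    tc_max = 0.0                                   # S620
    extracted = None
    for entry in database:                         # S630, S635
        pr = entry["pr"]
        if pr >= threshold:                        # S645: type A
            lon, lat = entry["pos"]
            r = entry["range"]
            if abs(query_pos[0] - lon) > r or abs(query_pos[1] - lat) > r:
                continue                           # S655: excluded
        d = math.sqrt(sum((a - b) ** 2             # S660
                          for a, b in zip(query_features, entry["features"])))
        tc = 1.0 if d <= d_ref else d_ref / d      # S670 (assumed form)
        tc1 = tc + pr * (1.0 - tc)                 # S680 (assumed form)
        if tc1 > tc_max:                           # S685
            tc_max = tc1                           # S690
            extracted = entry["keywords"]
    return extracted


# Hypothetical database with illustrative positions and toy features.
database = [
    {"keywords": ["Tokyo Tower", "Shiba Park"], "pr": 0.95,
     "pos": (139.7454, 35.6586), "range": 0.012, "features": [10.0, 0.0, 0.0]},
    {"keywords": ["deer"], "pr": 0.7,
     "pos": (135.8400, 34.6850), "range": 0.02, "features": [0.0, 10.0, 0.0]},
]

print(recognize([9.0, 1.0, 0.0], (139.7500, 35.6600), database, d_ref=2.0))
print(recognize([9.0, 1.0, 0.0], (135.8400, 34.6850), database, d_ref=2.0))
```

The same query features give different keywords depending on where the image was taken, because the type A entry is skipped when the shooting location lies outside its area range.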

The image recognition processing of this embodiment also narrows the search targets according to the position information relevance PR, for example by excluding type A database images whose position information does not satisfy the predetermined condition. That is, the number of items to process can be reduced by taking the position information relevance PR into account before similarity is judged from the image feature amounts. This improves the keyword search speed and shortens the search time.

Furthermore, narrowing the search targets according to the position information relevance PR excludes database images whose keywords could not appropriately describe the content of the given image data. This raises the probability of extracting an appropriate keyword and improves the accuracy (recognition rate) with which the content of arbitrary image data is recognized.

The image recognition processing of this embodiment obtains a reliability coefficient TC1 that is corrected to take into account the position information relevance PR in addition to the image feature amount of the database image. That is, the correction raises the reliability of database images that are judged close not only in image feature amount but also in shooting location. An appropriate keyword can therefore be extracted simply by evaluating the corrected reliability coefficient TC1, and the recognition rate of image content can be improved.

In this embodiment, a position information relevance of PR = 0.9 is used as one criterion for broadly classifying the image data constituting the database images, but the criterion is not limited to 0.9. For example, a value larger than 0.9 may be used.

The keyword extracted by the image recognition processing may be added to the image data itself (for example, to the Exif MakerNote). This makes it useful for managing and classifying the image data later. It may also be used in constructing the database.

In this embodiment, the database 20 has been described as separate from the image recognition device 15, but the image recognition device 15 may include the database 20. For example, the hard disk 34 in the PC 30, on which the image recognition processing program is installed to form the image recognition device 15, may serve as the database 20 by accumulating on it the various items of information on the image data shown in FIG. 4. An image recognition device 15 configured in this way can likewise extract appropriate keywords quickly.

E. Modifications:
In this embodiment, the image recognition processing has been described as extracting one keyword, but a plurality of keywords may be extracted. For example, instead of successively overwriting the extracted keyword in step S690 of FIG. 6, the extracted keywords with the three highest reliability coefficients TC1 may be stored and extracted as candidates. If the application displays the extracted keywords on the display 40 and lets the user select one, an even more appropriate keyword can be extracted for any given image data.
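Keeping the three best candidates instead of a single keyword can be sketched with the standard library's `heapq.nlargest`. The function name, the (TC1, keyword) pair layout, and the sample scores are hypothetical.

```python
import heapq


def top_keywords(scored, k=3):
    """Return the keywords with the k highest TC1 values, best first.

    scored is an iterable of (tc1, keyword) pairs gathered during the search.
    """
    best = heapq.nlargest(k, scored, key=lambda item: item[0])
    return [keyword for _, keyword in best]


# Illustrative scores only.
scored = [(0.40, "cat"), (0.97, "Tokyo Tower"), (0.75, "deer"), (0.10, "ship")]
print(top_keywords(scored))
```

The resulting list could then be shown on the display 40 so that the user selects one entry.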

In this embodiment, the database 20 has been described as storing feature information such as the position information relevance PR, GPS information, area range R, image feature amount, and keywords, but it may also store the image data itself. This makes it easier to associate the various items of information with the image data.

This embodiment has described the processing of a single item of image data designated by the user in the application, but the image recognition processing may instead be executed in turn for all the image data copied from the digital still camera to the hard disk. This saves the user effort in operating the image recognition processing and improves convenience.

Furthermore, in this embodiment the database 20 is built on the hard disk of the computer, but it may instead be built on, for example, a server on a network. In that case, the I/F circuit 35 of the PC 30 need only be given a network connection function. This allows a large-capacity database 20 to be built.

Embodiments of the present invention have been described above, but the present invention is in no way limited to these embodiments and can of course be implemented in various forms without departing from the gist of the invention. In this embodiment, the image recognition process is executed as a software program, but a hardware circuit including logic circuits that execute each of the above processes (steps) may be used instead. This reduces the load on the CPU 31 and allows each process to be executed even faster.

FIG. 1 is an explanatory diagram showing an image recognition system including the image recognition apparatus of one embodiment.
FIG. 2 is a flowchart showing the flow of the database construction process.
FIG. 3 is an explanatory diagram of the region range R.
FIG. 4 is an explanatory diagram of the various information about image data stored in the database.
FIG. 5 is a schematic structural diagram of the image recognition apparatus.
FIG. 6 is a flowchart of the image recognition process.
FIG. 7 is an explanatory diagram of the feature amounts of an image.

Explanation of symbols

10 ... Database creation device
11 ... Keyword setting unit
12 ... Region range setting unit
13 ... Position information relevance setting unit
14 ... Image feature amount extraction device
15 ... Image recognition apparatus
16 ... Image feature amount extraction device
17 ... Classifier
18 ... Reliability coefficient corrector
20 ... Database
21 ... Digital video camera
23 ... Mobile phone
24 ... Memory card
25 ... Hard disk
27 ... Digital still camera
30 ... Computer
31 ... CPU
32 ... ROM
33 ... RAM
34 ... Hard disk
35 ... I/F circuit unit
40 ... Display
41 ... Keyboard
42 ... Mouse
100 ... Image recognition system
PR ... Position information relevance
R ... Region range
TC ... Reliability coefficient
TC1 ... Reliability coefficient

Claims

An image recognition apparatus that recognizes image content by executing processing for extracting, from a predetermined search target, a keyword representing the image content of one piece of image data having information on its shooting position, the apparatus comprising:
storage means for storing, as a database and in association with one another for each of a plurality of target image data to be searched, an image feature amount, a keyword representing the image content, position information indicating where the image having the image feature amount was captured, and a position information relevance indicating the degree to which the object to which the keyword is assigned is unique to a specific place;
image data input means for inputting the one piece of image data as acquired image data and extracting the position information indicating where the acquired image data was captured;
image feature amount extraction means for extracting an image feature amount of the acquired image data;
exclusion determination means for determining whether to exclude the target image data from the keyword extraction processing, using the position information relevance of the target image data stored in the database and the distance between shooting positions obtained from the position information of the acquired image data and the position information of the target image data; and
keyword extraction means for, when the target image data is determined to be a processing target, comparing the image feature amount of the acquired image data with the image feature amount of the target image data and extracting from the database the keyword of target image data that satisfies a predetermined condition.

The image recognition apparatus according to claim 1, wherein
the position information relevance is set by classification into three types:
A) type A, in which the object to which the keyword in the target image data is assigned is unique to a specific place;
B) type B, in which the object to which the keyword in the target image data is assigned is highly specific to certain places, and the places where the object exists are limited; and
C) type C, in which the object to which the keyword in the target image data is assigned is not unique to a specific place, and
the exclusion determination means determines to exclude the target image data from the processing target when the position information relevance of the target image data is type A and the distance between the shooting positions is a predetermined value or more.

The image recognition apparatus according to claim 2, wherein
the storage means further stores in the database, in association with each target image data, a region range indicating the geographical range in which the object to which the keyword in the target image data is assigned exists, and
the exclusion determination means determines that the distance between the shooting positions is the predetermined value or more when the position information of the acquired image data does not fall within the region range of the target image data stored in the database.

The image recognition apparatus according to any one of claims 1 to 3, wherein
the keyword extraction means comprises:
a reliability coefficient calculation unit that calculates, from the image feature amount of the acquired image data and the image feature amount of the target image data, a reliability coefficient indicating the degree of similarity between the acquired image data and the target image data; and
an extraction unit that extracts the keyword of the target image data based on the calculated reliability coefficient.

The image recognition apparatus according to claim 4, wherein
the keyword extraction means further comprises
a correction unit that corrects the reliability coefficient calculated using the image feature amounts by factoring in the position information relevance, and
the extraction unit extracts the keyword based on the corrected reliability coefficient.

The image recognition apparatus according to claim 5, wherein
the extraction unit extracts the keyword of the target image data having the highest corrected reliability coefficient among the corrected reliability coefficients obtained for each target image data to be searched.

An image recognition apparatus that recognizes image content by executing processing for extracting, from a database, a keyword representing the image content of one piece of image data having position information of its shooting position, the database storing, in association with one another for each of a plurality of target image data to be searched, an image feature amount, a keyword representing the image content, position information indicating where the image having the image feature amount was captured, and a position information relevance indicating the degree to which the object to which the keyword is assigned is unique to a specific place, the apparatus comprising:
image data input means for inputting the one piece of image data as acquired image data and extracting the position information indicating where the acquired image data was captured;
image feature amount extraction means for extracting an image feature amount of the acquired image data;
exclusion determination means for determining whether to exclude the target image data from the keyword extraction processing, using the position information relevance of the target image data stored in the database and the distance between shooting positions obtained from the position information of the acquired image data and the position information of the target image data; and
keyword extraction means for, when the target image data is determined to be a processing target, comparing the image feature amount of the acquired image data with the image feature amount of the target image data and extracting from the database the keyword of target image data that satisfies a predetermined condition.

An image recognition method for recognizing image content by executing processing for extracting, from a predetermined search target, a keyword representing the image content of one piece of image data having information on its shooting position, the method comprising:
storing, as a database and in association with one another for each of a plurality of target image data to be searched, an image feature amount, a keyword representing the image content, position information indicating where the image having the image feature amount was captured, and a position information relevance indicating the degree to which the object to which the keyword is assigned is unique to a specific place;
inputting the one piece of image data as acquired image data and extracting the position information indicating where the acquired image data was captured;
extracting an image feature amount of the acquired image data;
determining whether to exclude the target image data from the keyword extraction processing, using the position information relevance of the target image data stored in the database and the distance between shooting positions obtained from the position information of the acquired image data and the position information of the target image data; and
when the target image data is determined to be a processing target, comparing the image feature amount of the acquired image data with the image feature amount of the target image data and extracting from the database the keyword of target image data that satisfies a predetermined condition.

A computer program for controlling an image recognition apparatus that recognizes image content by executing processing for extracting, from a predetermined search target, a keyword representing the image content of one piece of image data having information on its shooting position, the computer program causing the image recognition apparatus to realize:
a function of storing, as a database and in association with one another for each of a plurality of target image data to be searched, an image feature amount, a keyword representing the image content, position information indicating where the image having the image feature amount was captured, and a position information relevance indicating the degree to which the object to which the keyword is assigned is unique to a specific place;
a function of inputting the one piece of image data as acquired image data and extracting the position information indicating where the acquired image data was captured;
a function of extracting an image feature amount of the acquired image data;
a function of determining whether to exclude the target image data from the keyword extraction processing, using the position information relevance of the target image data stored in the database and the distance between shooting positions obtained from the position information of the acquired image data and the position information of the target image data; and
a function of, when the target image data is determined to be a processing target, comparing the image feature amount of the acquired image data with the image feature amount of the target image data and extracting from the database the keyword of target image data that satisfies a predetermined condition.

A recording medium in which the computer program according to claim 9 is recorded in a computer-readable manner.
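
As an informal illustration only, the exclusion determination and keyword extraction recited in the claims above might be sketched as follows. Several points are assumptions not stated in the claims: the region range is modeled as a radius in kilometers around the stored position, the distance is computed with the haversine formula, the reliability coefficient is 1 / (1 + Euclidean distance between feature vectors), and the "predetermined condition" is taken to be the highest reliability coefficient. The correction by position information relevance is omitted. All names and sample values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Target:
    keyword: str
    relevance: str                  # position information relevance: "A", "B", or "C"
    position: Tuple[float, float]   # (latitude, longitude) in degrees
    region_range_km: float          # assumed: region range as a radius in km
    features: Tuple[float, ...]     # image feature amounts


def distance_km(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def is_excluded(target: Target, acquired_position: Tuple[float, float]) -> bool:
    """Exclude a type A target when the acquired position lies outside its region range."""
    outside = distance_km(target.position, acquired_position) > target.region_range_km
    return target.relevance == "A" and outside


def reliability(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Assumed similarity measure: 1.0 for identical features, smaller otherwise."""
    return 1.0 / (1.0 + math.dist(a, b))


def extract_keyword(
    acquired_position: Tuple[float, float],
    acquired_features: Tuple[float, ...],
    database: List[Target],
) -> Optional[str]:
    """Return the keyword of the non-excluded target with the highest reliability."""
    best_keyword, best_tc = None, -1.0
    for target in database:
        if is_excluded(target, acquired_position):
            continue  # skipped before any feature comparison
        tc = reliability(acquired_features, target.features)
        if tc > best_tc:
            best_keyword, best_tc = target.keyword, tc
    return best_keyword


database = [
    Target("Tokyo Tower", "A", (35.6586, 139.7454), 5.0, (0.9, 0.1)),
    Target("tower", "C", (0.0, 0.0), 0.0, (0.7, 0.3)),
]
near = extract_keyword((35.66, 139.75), (0.9, 0.1), database)
far = extract_keyword((48.8584, 2.2945), (0.9, 0.1), database)
```

With these sample values, an image taken near the stored position can still match the type A entry, whereas an image with the same features taken far away is compared only against the type C entry.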