JP6210856B2

JP6210856B2 - Object position specifying system and object position specifying method

Info

Publication number: JP6210856B2
Application number: JP2013240316A
Authority: JP
Inventors: 三友刈屋
Original assignee: Olympus Corp
Current assignee: Olympus Corp
Priority date: 2013-11-20
Filing date: 2013-11-20
Publication date: 2017-10-11
Anticipated expiration: 2033-11-20
Also published as: JP2015099571A

Description

The present invention relates to an object position specifying system and an object position specifying method.

Conventionally, there is a technique for recognizing an object shown in an image, that is, a subject (target object), or the scene in which the image was captured (see Non-Patent Document 1). In this technique, an input image is processed according to the following procedure to calculate a similarity indicating how closely the object shown in the image resembles each set of data in which a large number of images are classified and grouped by type of object (hereinafter referred to as "teacher data"). The object represented by the teacher data with the highest similarity is recognized as the object shown in the input image. In other words, the input image is recognized as a scene showing the object with the highest similarity.

(Procedure 1-1): The area of the image is finely divided, and a local feature vector is generated for each divided region.
(Procedure 1-2): Based on the generated local feature vectors, a quantization vector is generated for each divided region.
(Procedure 2): A histogram of the entire image is generated from the generated quantization vectors.
(Procedure 3): The generated histogram of the entire image is compared with each of a large amount of teacher data by, for example, a support vector machine (SVM) operation, and a similarity is calculated for each classified set of teacher data.
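The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method of Non-Patent Document 1: the block descriptor (mean and contrast), the three-word `codebook`, and the linear `models` weights are all hypothetical, and a linear decision function stands in for the SVM operation against the teacher data.

```python
import math

def split_into_cells(image, cell):
    """Procedure 1-1 (first half): cut a 2-D grayscale image into cell x cell blocks."""
    return [[[image[r + dr][c + dc] for dr in range(cell) for dc in range(cell)]
             for c in range(0, len(image[0]), cell)]
            for r in range(0, len(image), cell)]

def local_feature(pixels):
    """Procedure 1-1 (second half): toy descriptor (mean, contrast) of one block."""
    return (sum(pixels) / len(pixels), max(pixels) - min(pixels))

def quantize(feature, codebook):
    """Procedure 1-2: index of the nearest codeword, used as the quantized value."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))

def whole_image_histogram(codes, n_words):
    """Procedure 2: normalized codeword counts over every block of the image."""
    counts = [0] * n_words
    for row in codes:
        for code in row:
            counts[code] += 1
    total = sum(counts)
    return [n / total for n in counts]

def similarity(hist, weights, bias):
    """Procedure 3 stand-in: decision value of a linear SVM, w . h + b."""
    return sum(w * h for w, h in zip(weights, hist)) + bias

# Hypothetical 4 x 4 image, codebook, and per-category linear models.
image = [[0, 0, 200, 200],
         [0, 0, 200, 200],
         [0, 255, 180, 180],
         [255, 0, 180, 180]]
codebook = [(0, 0), (200, 0), (128, 255)]
models = {"flat": ([1.0, 1.0, -1.0], 0.0),
          "textured": ([-1.0, -1.0, 3.0], 0.0)}

codes = [[quantize(local_feature(block), codebook) for block in row]
         for row in split_into_cells(image, 2)]
hist = whole_image_histogram(codes, len(codebook))
scores = {name: similarity(hist, w, b) for name, (w, b) in models.items()}
best = max(scores, key=scores.get)  # category with the highest similarity
```

The image dimensions are assumed to be multiples of the block size; a real implementation would need a policy for partial blocks.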

Incidentally, after the technique disclosed in Non-Patent Document 1 has recognized the input image as a scene showing the object with the highest similarity, it is sometimes necessary to identify the position in the image at which that object appears. In this case, after the scene recognition process has been performed on the input image, the scene recognition process could be performed again on each of the regions obtained by dividing the image into predetermined regions, which would make it possible to identify which divided region shows the object, that is, the position at which the object appears.

G. Csurka, C. R. Dance, Lixin Fan, J. Willamowski, C. Bray, "Visual Categorization with Bags of Keypoints", Proc. ECCV Workshop on Statistical Learning in Computer Vision, pp. 59-74, 2004.

However, performing the scene recognition process on the divided regions after performing it on the input image means that the scene recognition process is performed (number of divided regions + 1) times, which lengthens the calculation time required to identify the position of the object in the image.

The present invention has been made in view of the above problem, and an object thereof is to provide an object position specifying system and an object position specifying method that can identify the position at which an object appears in an input image while suppressing an increase in calculation time.

In order to solve the above problem, an object position specifying system according to the present invention includes: a local feature vector generation unit that divides the entire area of an input image into a plurality of first regions of a predetermined first size and generates, for each of the first regions, a local feature vector representing a local feature of the image data included in that first region; a quantization vector generation unit that quantizes the value of the local feature vector of each first region generated by the local feature vector generation unit and generates a quantization vector corresponding to each first region; a quantization vector storage unit that stores, for each first region, the value of each quantization vector generated by the quantization vector generation unit; a histogram generation unit that generates a histogram representing all or part of the image from the values of the quantization vectors of the first regions; an SVM calculation unit that performs a support vector machine (SVM) operation on the histogram generated by the histogram generation unit; and a position specifying control unit that controls the local feature vector generation unit, the quantization vector generation unit, the histogram generation unit, and the SVM calculation unit so as to execute a scene recognition process for recognizing the scene of the image in which an object is shown, and then to execute a position specifying process for identifying in which of a plurality of second regions the object determined in the scene recognition process appears, the second regions being obtained by dividing the entire area of the image into regions of a predetermined second size larger than the first regions. In the scene recognition process, the position specifying control unit causes the histogram generation unit to generate a histogram representing the entire image from the values of the quantization vectors of the first regions, and causes the SVM calculation unit to execute an SVM operation that compares the histogram representing the entire image with each of the histograms of a plurality of sets of teacher data in which histograms of a plurality of images are classified and grouped by type of object. In the position specifying process, the position specifying control unit causes the histogram generation unit to generate a histogram representing the image of each second region from the values of the quantization vectors of the first regions stored in the quantization vector storage unit, and causes the SVM calculation unit to execute an SVM operation on each of the histograms representing the second regions.

The object position specifying system according to the present invention further includes a histogram storage unit that stores the histogram representing the entire image generated by the histogram generation unit, and, in the position specifying process, the position specifying control unit causes the SVM calculation unit to execute an SVM operation that compares each of the histograms representing the second regions with the histogram representing the entire image stored in the histogram storage unit.

In the position specifying process, the position specifying control unit according to the present invention causes the SVM calculation unit to execute an SVM operation that compares each of the histograms representing the second regions with each of the histograms of a part of the teacher data selected from the plurality of sets of teacher data according to a predetermined condition.

The object position specifying system according to the present invention further includes: a histogram storage unit that stores the histogram representing the entire image generated by the histogram generation unit; and a teacher data switching unit that selects and outputs either the histogram representing the entire image stored in the histogram storage unit or the histograms of a part of the teacher data selected from the plurality of sets of teacher data according to a predetermined condition. In the position specifying process, the position specifying control unit controls the teacher data switching unit and causes the SVM calculation unit to execute an SVM operation that compares each of the histograms representing the second regions with the histogram output from the teacher data switching unit.

An object position specifying method according to the present invention is a method for an object position specifying system that includes: a local feature vector generation unit that divides the entire area of an input image into a plurality of first regions of a predetermined first size and generates, for each of the first regions, a local feature vector representing a local feature of the image data included in that first region; a quantization vector generation unit that quantizes the value of the local feature vector of each first region generated by the local feature vector generation unit and generates a quantization vector corresponding to each first region; a quantization vector storage unit that stores, for each first region, the value of each quantization vector generated by the quantization vector generation unit; a histogram generation unit that generates a histogram representing all or part of the image from the values of the quantization vectors of the first regions; an SVM calculation unit that performs a support vector machine (SVM) operation on the histogram generated by the histogram generation unit; and a position specifying control unit that controls the local feature vector generation unit, the quantization vector generation unit, the histogram generation unit, and the SVM calculation unit so as to execute a scene recognition process for recognizing the scene of the image in which an object is shown, and then to execute a position specifying process for identifying in which of a plurality of second regions the object determined in the scene recognition process appears, the second regions being obtained by dividing the entire area of the image into regions of a predetermined second size larger than the first regions. In the scene recognition process, the method includes a step in which the position specifying control unit causes the histogram generation unit to generate a histogram representing the entire image from the values of the quantization vectors of the first regions, and a step in which the position specifying control unit causes the SVM calculation unit to execute an SVM operation that compares the histogram representing the entire image with each of the histograms of a plurality of sets of teacher data in which histograms of a plurality of images are classified and grouped by type of object. In the position specifying process, the method includes a step in which the position specifying control unit causes the histogram generation unit to generate a histogram representing the image of each second region from the values of the quantization vectors of the first regions stored in the quantization vector storage unit, and a step in which the position specifying control unit causes the SVM calculation unit to execute an SVM operation on each of the histograms representing the second regions.

According to the present invention, the position at which an object appears in an input image can be identified while suppressing an increase in calculation time.
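Where the saving comes from can be made concrete with a rough operation count. The sketch below is a back-of-envelope model, not a measurement: it assumes the second regions tile the image without overlap, so that repeating the full scene recognition on every region recomputes the local feature and quantization of every first region once more. The function name and the example sizes (64 first regions, 4 second regions) are hypothetical.

```python
def operation_counts(cells_total, n_regions, reuse_quantized):
    """Count pipeline stages for scene recognition followed by localization.

    cells_total     -- number of first regions in the whole image
    n_regions       -- number of second regions used for localization
    reuse_quantized -- True if stored quantization values are reused
    """
    # Without reuse, every first region is processed once for the whole image
    # and once more while re-running scene recognition on the second regions.
    feature_passes = 1 if reuse_quantized else 2
    return {
        "local_features": feature_passes * cells_total,
        "quantizations": feature_passes * cells_total,
        "histograms": n_regions + 1,   # whole image + each second region
        "svm_runs": n_regions + 1,
    }

repeat = operation_counts(cells_total=64, n_regions=4, reuse_quantized=False)
reuse = operation_counts(cells_total=64, n_regions=4, reuse_quantized=True)
saved = {stage: repeat[stage] - reuse[stage] for stage in repeat}
```

Under these assumptions the histogram and SVM counts are unchanged, and the saving is one full pass of local feature generation and quantization.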

A block diagram showing a schematic configuration of an object position specifying system according to a first embodiment of the present invention.
A flowchart showing a processing procedure in the object position specifying system of the first embodiment.
A diagram schematically showing an example of the overall processing in the object position specifying system of the first embodiment.
A diagram schematically showing an example of the operation of performing the scene recognition process in the object position specifying system of the first embodiment.
A diagram explaining the concept of the process of specifying the position of an object in the object position specifying system of the first embodiment.
A block diagram showing a schematic configuration of an object position specifying system according to a second embodiment of the present invention.
A flowchart showing a processing procedure in the object position specifying system of the second embodiment.
A diagram explaining the concept of the process of specifying the position of an object in a simplified manner in the object position specifying system of the second embodiment.
A block diagram showing a schematic configuration of an object position specifying system according to a third embodiment of the present invention.
A flowchart showing a processing procedure in the object position specifying system of the third embodiment.
A diagram explaining the concept of the process of specifying the position of an object in a simplified manner in the object position specifying system of the third embodiment.
A block diagram showing a schematic configuration of an object position specifying system according to a fourth embodiment of the present invention.

<First Embodiment>
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing a schematic configuration of the object position specifying system according to the first embodiment. In FIG. 1, the object position specifying system 10 includes a local feature vector generation unit 110, a quantization vector generation unit 120, a histogram generation unit 130, an SVM (support vector machine) calculation unit 140, a teacher data group 150, a position specifying control unit 160, and a quantization vector storage unit 170.

The object position specifying system 10 performs, on an input image, a scene recognition process that recognizes the object shown in the image, that is, the subject (target object), or the scene in which the image was captured, and outputs information on the similarity to each set of teacher data classified by type of object as the information determined by the scene recognition process. The object position specifying system 10 also performs a position specifying process that identifies the position at which the determined object appears in the image subjected to the scene recognition process, and outputs information indicating the identified position.

The teacher data group 150 is a database that contains histograms of a large number of images showing the same object, as teacher data classified by type (category) of object. The teacher data is classified by object category, for example, person, dog, cat, and flower, and each category consists of the histograms of, for example, 1500 images. That is, in the teacher data group 150, 1500 histograms exist as teacher data for the category whose object is "person", and likewise 1500 histograms exist as teacher data for each of the categories whose objects are "dog", "cat", and "flower". In other words, the teacher data group 150 contains 1500 histograms for each of the four categories (4 × 1500 = 6000 histograms in total) as teacher data.
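One possible in-memory layout for such a teacher data group is a mapping from category name to a list of histograms. The sketch below is illustrative only: the codebook size `N_WORDS` is a hypothetical value (the text does not state one), and uniform placeholder histograms stand in for histograms computed from real training images.

```python
CATEGORIES = ("person", "dog", "cat", "flower")
HISTOGRAMS_PER_CATEGORY = 1500
N_WORDS = 8  # hypothetical codebook size; not given in the description

def build_placeholder_group():
    """Return {category: [histogram, ...]} filled with uniform placeholders."""
    uniform = [1.0 / N_WORDS] * N_WORDS
    return {category: [list(uniform) for _ in range(HISTOGRAMS_PER_CATEGORY)]
            for category in CATEGORIES}

teacher_data_group = build_placeholder_group()
# 4 categories x 1500 histograms each
total_histograms = sum(len(hists) for hists in teacher_data_group.values())
```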

The local feature vector generation unit 110 generates local feature vectors of the image input to the object position specifying system 10 under the control of the position specifying control unit 160. The local feature vector generation unit 110 finely divides the entire area of the input image into regions of a predetermined size (hereinafter referred to as "scene recognition divided regions") and generates, for each scene recognition divided region, a local feature vector representing a local feature of the image data included in that region. The local feature vector generation unit 110 then outputs the value of the generated local feature vector of each scene recognition divided region to the quantization vector generation unit 120. When the local feature vectors of all scene recognition divided regions have been generated, the local feature vector generation unit 110 notifies the position specifying control unit 160 that generation of the local feature vectors is complete. The method by which the local feature vector generation unit 110 generates the local feature vectors is the same as the method used to generate local feature vectors in the conventional scene recognition process, so a detailed description is omitted.

The quantization vector generation unit 120, under the control of the position specifying control unit 160, quantizes the value of the local feature vector of each scene recognition divided region input from the local feature vector generation unit 110 and generates a quantization vector corresponding to each scene recognition divided region. The quantization vector generation unit 120 then outputs the value of the generated quantization vector of each scene recognition divided region to the histogram generation unit 130 and also stores it in the quantization vector storage unit 170. When the quantization vectors of all scene recognition divided regions have been generated, the quantization vector generation unit 120 notifies the position specifying control unit 160 that generation of the quantization vectors is complete. The method by which the quantization vector generation unit 120 generates the quantization vectors is the same as the method used to generate quantization vectors in the conventional scene recognition process, so a detailed description is omitted.

The quantization vector storage unit 170 is a memory, for example a DRAM (Dynamic Random Access Memory), that temporarily stores the values of the quantization vectors of the scene recognition divided regions generated by the quantization vector generation unit 120, under the control of the quantization vector generation unit 120. The quantization vector storage unit 170 stores the value of the quantization vector corresponding to each scene recognition divided region, input from the quantization vector generation unit 120, separately for each scene recognition divided region. The values of the quantization vectors stored for each scene recognition divided region are output to the histogram generation unit 130 under the control of the histogram generation unit 130.
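As a rough software analogue of this storage unit, the quantized value of each scene recognition divided region can be kept in a grid indexed by region position, written once during quantization and read back later by rectangle. The class and method names below are hypothetical, and each quantization vector is represented by a single codeword index for brevity.

```python
class QuantizationVectorStore:
    """In-memory stand-in for the quantization vector storage unit 170."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self._codes = [[None] * cols for _ in range(rows)]

    def save(self, row, col, code):
        """Store the quantized value of one scene recognition divided region."""
        self._codes[row][col] = code

    def read_block(self, top, left, height, width):
        """Return the stored values for a rectangle of divided regions."""
        return [self._codes[r][left:left + width]
                for r in range(top, top + height)]

# Write side: values arrive in raster order from the quantization stage.
store = QuantizationVectorStore(rows=2, cols=4)
for index, code in enumerate([3, 1, 0, 2, 1, 1, 2, 0]):
    store.save(index // 4, index % 4, code)

# Read side: the histogram stage later fetches only the regions it needs.
right_half = store.read_block(top=0, left=2, height=2, width=2)
```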

In the scene recognition process performed under the control of the position specifying control unit 160, the histogram generation unit 130 generates a histogram representing the entire image input to the object position specifying system 10 from the values of the quantization vectors of the scene recognition divided regions input from the quantization vector generation unit 120. The histogram generation unit 130 then outputs the generated histogram of the entire image to the SVM calculation unit 140. When the histogram of the entire image corresponding to the input image has been generated, the histogram generation unit 130 notifies the position specifying control unit 160 that generation of the histogram of the entire image is complete. The method by which the histogram generation unit 130 generates the histogram of the entire image is the same as the method used to generate the histogram of the entire image in the conventional scene recognition process, so a detailed description is omitted.

In the process of specifying the position of the object performed under the control of the position specifying control unit 160, the histogram generation unit 130 generates, from the values of the quantization vectors of the scene recognition divided regions stored in the quantization vector storage unit 170, a histogram of a region designated by the position specifying control unit 160 that has a predetermined size larger than the scene recognition divided regions (hereinafter referred to as a "position specifying divided region"). The position specifying divided region is the unit in which the position of the object within the entire image input to the object position specifying system 10 is identified. The histogram generation unit 130 then outputs the generated histogram of each position specifying divided region (hereinafter referred to as a "position specifying histogram") to the SVM calculation unit 140. When the position specifying histogram corresponding to the position specifying divided region designated by the position specifying control unit 160 has been generated, the histogram generation unit 130 notifies the position specifying control unit 160 that generation of the designated position specifying histogram is complete. In response to this notification, the position specifying control unit 160 designates the next position specifying divided region, and the histogram generation unit 130 repeats generation of the position specifying histogram for each designated position specifying divided region. The method by which the histogram generation unit 130 generates a position specifying histogram is the same as the method used to generate the histogram of the entire image in the scene recognition process, except that the size of the region for which the histogram is generated differs.
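The point of this step is that the position specifying histograms are built only from the stored quantization values, with no new feature extraction. A minimal sketch, assuming each quantization vector is a single codeword index, the position specifying divided regions are square and tile the image exactly, and histograms are kept as raw counts (all names and sizes are hypothetical):

```python
def region_histogram(codes, top, left, size, n_words):
    """Count codewords inside one position specifying divided region."""
    counts = [0] * n_words
    for r in range(top, top + size):
        for c in range(left, left + size):
            counts[codes[r][c]] += 1
    return counts

def position_histograms(codes, size, n_words):
    """Return {(region_row, region_col): histogram} in raster order."""
    return {(top // size, left // size):
                region_histogram(codes, top, left, size, n_words)
            for top in range(0, len(codes), size)
            for left in range(0, len(codes[0]), size)}

# Stored quantization values for a 4 x 4 grid of scene recognition divided
# regions; each position specifying divided region covers 2 x 2 of them.
stored_codes = [[0, 0, 1, 2],
                [0, 1, 2, 2],
                [1, 1, 0, 0],
                [1, 2, 0, 0]]
hists = position_histograms(stored_codes, size=2, n_words=3)

# Sanity check: the region histograms add up to the whole-image counts.
whole = [sum(h[k] for h in hists.values()) for k in range(3)]
```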

In the scene recognition process performed under the control of the position specifying control unit 160, the SVM calculation unit 140 performs an SVM operation that compares the histogram of the entire image input from the histogram generation unit 130 with the histogram of each set of teacher data included in the teacher data group 150, and calculates a similarity for each category of object classified in the teacher data group 150. When calculation of the similarity between the histogram of the entire input image and each object category is complete, that is, when the SVM operation is complete, the SVM calculation unit 140 outputs information indicating the similarity calculated by the SVM operation for each object category, as the information determined by the scene recognition process of the object position specifying system 10. The SVM calculation unit 140 also notifies the position specifying control unit 160 that the SVM operation for the scene recognition process is complete. The SVM operation method used by the SVM calculation unit 140 is the same as the SVM operation method used in the conventional scene recognition process, so a detailed description is omitted.

In the process of specifying the position of the object performed under the control of the position specifying control unit 160, the SVM calculation unit 140 performs an SVM operation that compares each position specifying histogram input from the histogram generation unit 130 with the histogram of each set of teacher data included in the teacher data group 150, and calculates, for each position specifying divided region, the similarity to the categories of object classified in the teacher data group 150. The SVM calculation unit 140 notifies the position specifying control unit 160 that the SVM operation is complete for each position specifying divided region subjected to the position specifying process. In response to this notification, the position specifying control unit 160 designates the next position specifying divided region, and the SVM calculation unit 140 repeats the SVM operation on the position specifying histogram of each designated position specifying divided region. When the SVM operation on the position specifying histograms of all position specifying divided regions is complete, the SVM calculation unit 140 outputs, as the result of the position specifying process of the object position specifying system 10, information indicating, for each position specifying divided region, the similarity calculated by the SVM operation to the category of the object determined by the scene recognition process. The SVM operation method applied to the position specifying histograms by the SVM calculation unit 140 is the same as the SVM operation method in the scene recognition process, except that the histogram subjected to the SVM operation is a position specifying histogram instead of the histogram of the entire image.

The position specifying control unit 160 controls the entire object position specifying system 10, that is, the operation of each of the local feature vector generation unit 110, the quantization vector generation unit 120, the histogram generation unit 130, and the SVM calculation unit 140 provided in the object position specifying system 10. The position specifying control unit 160 includes a histogram generation divided region specifying unit 161.

In the process of specifying the position of the object, the histogram generation divided region specifying unit 161 divides the entire region of the image input to the object position specifying system 10 into position specifying divided regions of a predetermined size, which are used in the position specifying process that identifies where in the image the object appears. The histogram generation divided region specifying unit 161 then sequentially designates each position specifying divided region as the region for which the histogram generation unit 130 generates a position specifying histogram and for which the SVM calculation unit 140 performs the SVM calculation. Each time it receives from the histogram generation unit 130 a notification that generation of the position specifying histogram for the currently designated region has been completed, the histogram generation divided region specifying unit 161 advances the region designated to the histogram generation unit 130 to the next position specifying divided region. Likewise, each time it receives from the SVM calculation unit 140 a notification that the SVM calculation for the position specifying histogram of the currently designated region has been completed, it advances the region designated to the SVM calculation unit 140 to the next position specifying divided region. In other words, the histogram generation divided region specifying unit 161 sequentially switches the position specifying histogram on which the SVM calculation is performed to that of the next position specifying divided region.

With this configuration, the object position specifying system 10 stores, in the quantization vector storage unit 170, the quantization vector values of the scene recognition divided regions generated by the quantization vector generation unit 120 during the scene recognition process, and uses the stored values to generate a position specifying histogram for each position specifying divided region when performing the position specifying process. As a result, even when the position in the image of the determined object must be specified after the scene recognition process, the object position specifying system 10 can perform the position specifying process with less processing than repeating the scene recognition process for each position specifying divided region.

Next, the operation of the object position specifying system 10 will be described. FIG. 2 is a flowchart showing the processing procedure in the object position specifying system 10 of the first embodiment. FIGS. 3 to 5 are diagrams illustrating examples of the respective processes in the object position specifying system 10 of the first embodiment. The description of the flowchart in FIG. 2 refers, as appropriate, to the examples shown in FIGS. 3 to 5, and uses a case in which the object shown in the image is a "dog".

When an image is input to the object position specifying system 10, the position specifying control unit 160 controls the operation of each component of the object position specifying system 10 so that the scene recognition process is first performed on the input image and the position specifying process is performed afterward. FIG. 3 is a diagram schematically showing an example of the overall processing in the object position specifying system 10 of the first embodiment.

When an image in which a "dog" appears at the position shown in FIG. 3(a) is input to the object position specifying system 10, the position specifying control unit 160 first controls the components of the object position specifying system 10 to perform the scene recognition process on the input image (FIG. 3(a)) and determine that a "dog" appears in it. The position specifying control unit 160 then controls the components to divide the entire region of the input image (FIG. 3(a)) into a plurality of position specifying divided regions, as shown in FIG. 3(b), and to perform the position specifying process on each region in turn, thereby specifying the position specifying divided region in which the "dog" appears. FIG. 3(b) shows a case in which the entire region of the image (FIG. 3(a)) is divided into three in each of the horizontal and vertical directions, giving nine position specifying divided regions A1 to A9, and the position of region A6 is specified as the position at which the "dog" appears.

When an image is input, the object position specifying system 10 starts, from step S100, the scene recognition process for recognizing the scene of the input image. In step S100, the position specifying control unit 160 causes the local feature vector generation unit 110 to divide the entire region of the input image (FIG. 3(a)) into small scene recognition divided regions of a predetermined size and to generate a local feature vector for each scene recognition divided region.

Next, in step S110, the position specifying control unit 160 causes the quantization vector generation unit 120 to generate a quantization vector for each scene recognition divided region on the basis of the local feature vectors generated by the local feature vector generation unit 110. In step S115, the position specifying control unit 160 causes the quantization vector generation unit 120 to store the generated quantization vector values of the scene recognition divided regions in the quantization vector storage unit 170.

Next, in step S120, the position specifying control unit 160 causes the histogram generation unit 130 to generate a histogram representing the entire image (FIG. 3(a)) input to the object position specifying system 10, on the basis of the quantization vector values of the scene recognition divided regions generated by the quantization vector generation unit 120.

Next, in step S130, the position specifying control unit 160 causes the SVM calculation unit 140 to execute an SVM calculation that calculates the similarity between the histogram of the entire image (FIG. 3(a)) generated by the histogram generation unit 130 and the histogram of each piece of teacher data included in the teacher data group 150. The object position specifying system 10 can thereby determine that a "dog" appears in the input image (FIG. 3(a)), and outputs information representing the similarity for each object category.

The processing up to this point is the scene recognition process in the object position specifying system 10. This scene recognition process is the same as the scene recognition process of the conventional technique.

An example of the scene recognition process of steps S100 to S130 performed by the object position specifying system 10 will now be described. FIG. 4 is a diagram schematically showing an example of the operation of the scene recognition process in the object position specifying system 10 of the first embodiment.

In the scene recognition process, first, in step S100, the local feature vector generation unit 110, under the control of the position specifying control unit 160, divides the entire region of the input image (FIG. 3(a)) into small scene recognition divided regions of a predetermined size and generates a local feature vector for each of them. The local feature vector generation unit 110 then outputs the local feature vector values of the scene recognition divided regions to the quantization vector generation unit 120. FIG. 4(a) shows an example of the scene recognition divided regions obtained when the local feature vector generation unit 110 divides the input image into nine in each of the horizontal and vertical directions.

In step S110, the quantization vector generation unit 120, under the control of the position specifying control unit 160, quantizes the local feature vector values generated by the local feature vector generation unit 110 and generates a quantization vector for each scene recognition divided region. The quantization vector generation unit 120 then outputs the quantization vector values of the scene recognition divided regions to the histogram generation unit 130. In step S115, the quantization vector generation unit 120 also stores these quantization vector values in the quantization vector storage unit 170. FIG. 4(b) shows an example of the state in which the quantization vector generation unit 120 has finished generating the quantization vectors.
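Steps S110 and S115 might be sketched as follows. The description does not state how the quantization is carried out, so this sketch assumes a common bag-of-visual-words approach in which each local feature vector is replaced by the index of its nearest entry in a codebook. The names `quantize`, `codebook`, `local_features`, and `store`, and all numeric values, are hypothetical.

```python
def quantize(feature, codebook):
    """Return the index of the codebook entry nearest to `feature`.

    Assumption: nearest-neighbour quantization with squared Euclidean
    distance; the patent text does not specify the quantization method.
    """
    def squared_distance(index):
        return sum((f - c) ** 2 for f, c in zip(feature, codebook[index]))

    return min(range(len(codebook)), key=squared_distance)


# Hypothetical codebook and local feature vectors, keyed by (row, column)
# of the scene recognition divided region.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
local_features = {(0, 0): (0.1, 0.2), (0, 1): (0.9, 0.8), (1, 0): (0.1, 0.9)}

# S110: quantize every region. S115: keep the results for later reuse
# (this dictionary plays the role of the quantization vector storage unit 170).
store = {cell: quantize(vec, codebook) for cell, vec in local_features.items()}
```

Keeping the quantized values per region, rather than only the whole-image histogram, is what allows the later position specifying process to skip feature extraction and quantization.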

Then, in step S120, the histogram generation unit 130, under the control of the position specifying control unit 160, generates a histogram representing the entire image (FIG. 3(a)) input to the object position specifying system 10 from the quantization vector values of the scene recognition divided regions generated by the quantization vector generation unit 120. The histogram generation unit 130 outputs the generated histogram of the entire image (FIG. 3(a)) to the SVM calculation unit 140. FIG. 4(c) shows an example of the histogram of the entire image generated by the histogram generation unit 130.
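The histogram generation of step S120 can be illustrated with a minimal sketch, assuming that each quantization vector value is a single integer bin index and that the image is divided into a 9 x 9 grid as in FIG. 4(a). The function name `build_histogram` and the grid contents are hypothetical.

```python
from collections import Counter


def build_histogram(codes, num_bins):
    """Count how often each quantized value (bin index) occurs."""
    counts = Counter(codes)
    return [counts.get(b, 0) for b in range(num_bins)]


# Hypothetical 9 x 9 grid: one quantized value per scene recognition divided region.
grid = [[(row * 9 + col) % 4 for col in range(9)] for row in range(9)]

# The whole-image histogram uses the values of all 81 regions.
whole_image_codes = [code for row in grid for code in row]
whole_image_hist = build_histogram(whole_image_codes, num_bins=4)
```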

Then, in step S130, the SVM calculation unit 140, under the control of the position specifying control unit 160, performs an SVM calculation that calculates the similarity between the histogram of the entire image (FIG. 3(a)) generated by the histogram generation unit 130 and the histogram of each piece of teacher data included in the teacher data group 150 (for example, 6000 histograms in total, consisting of 1500 histograms in each of the four categories described above). The SVM calculation unit 140 then outputs information representing the similarity for each object category, obtained from the result of the SVM calculation.

In the SVM calculation by the SVM calculation unit 140, the absolute difference between the frequencies of the same class (bin) is calculated for the histogram of the entire image (FIG. 3(a)) generated by the histogram generation unit 130 and the histogram of each piece of teacher data, and the absolute differences of all classes are summed. The smaller the difference between the histogram of the entire image and the teacher data histogram, that is, the higher the similarity between the two histograms, the smaller this sum becomes. Accordingly, the category containing the teacher data for which the sum of absolute differences is smallest can be determined to be the category of the object shown in the input image (FIG. 3(a)). The SVM calculation unit 140 outputs information representing the similarity corresponding to the sum of absolute differences. FIG. 4(d) shows an example in which the output indicates a similarity of 80% to the category whose object is "dog" and a similarity of 20% to the category whose object is "cat".
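The comparison described above, a sum of per-class absolute differences followed by selection of the category with the smallest sum, might look like the following sketch. The teacher histograms and the query are made-up values, and the names `sum_abs_diff` and `best_category` are hypothetical. How the sum is converted into a percentage similarity is not stated in the description, so the sketch returns the raw sum.

```python
def sum_abs_diff(hist_a, hist_b):
    """Sum of absolute differences between the frequencies of the same class."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of classes")
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def best_category(query, teacher_data):
    """Return (category, smallest sum) over all teacher histograms.

    `teacher_data` maps a category name to a list of histograms.
    """
    best = None
    for category, histograms in teacher_data.items():
        for teacher_hist in histograms:
            distance = sum_abs_diff(query, teacher_hist)
            if best is None or distance < best[1]:
                best = (category, distance)
    return best


# Hypothetical teacher data with two categories.
teacher_data = {
    "dog": [[5, 1, 0, 2], [4, 2, 1, 1]],
    "cat": [[0, 3, 4, 1]],
}
query = [5, 2, 0, 1]
category, distance = best_category(query, teacher_data)
```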

If the sizes of the image regions represented by the histograms differ, that is, if the total number of image data items (the number of pixels) differs, the frequencies in the same class differ even between histograms representing the same image, and the absolute differences calculated from them become large. For this reason, in the SVM calculation, the frequencies in each histogram are normalized so that the sizes of the image regions represented by the histograms, that is, the total numbers of image data items, are equivalent, and only then are the absolute differences between the frequencies of the same classes calculated. The same applies to the scene recognition process of the conventional technique.
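The effect of this normalization can be seen in a small worked example. The sketch assumes the simplest form of normalization, dividing each frequency by the histogram total; the description only requires that the totals be made equivalent. The names and values are hypothetical.

```python
def normalize(hist):
    """Scale a histogram so that its frequencies sum to 1 (assumed scheme)."""
    total = sum(hist)
    if total == 0:
        return [0.0] * len(hist)
    return [count / total for count in hist]


def sum_abs_diff(hist_a, hist_b):
    """Sum of absolute differences between the frequencies of the same class."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


# Two histograms with the same shape but different totals (80 versus 20).
large_region = [20, 40, 20]
small_region = [5, 10, 5]

raw_difference = sum_abs_diff(large_region, small_region)
normalized_difference = sum_abs_diff(normalize(large_region), normalize(small_region))
```

Without normalization the two histograms appear to differ by 60 even though their distributions are identical; after normalization the difference is 0.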

When the scene recognition process is completed, the object position specifying system 10 starts, from step S200, the position specifying process, which specifies the position at which the determined object appears in the image subjected to the scene recognition process. In the position specifying process, the quantization vector values of the scene recognition divided regions that the quantization vector generation unit 120 stored in the quantization vector storage unit 170 during the scene recognition process are used to perform, for each position specifying divided region, processing equivalent to the generation of the histogram of the entire image (step S120) and the subsequent steps of the scene recognition process.

First, in step S200, the position specifying control unit 160 designates to the histogram generation unit 130 the first of the nine position specifying divided regions (see FIG. 3(b)) obtained by dividing the entire region of the input image (FIG. 3(a)). The position specifying control unit 160 then causes the histogram generation unit 130 to acquire, from the quantization vector storage unit 170, the quantization vector values corresponding to the designated first position specifying divided region.

Next, in step S210, the position specifying control unit 160 causes the histogram generation unit 130 to generate a position specifying histogram representing the first position specifying divided region on the basis of the acquired quantization vector values corresponding to that region.

Next, in step S220, the position specifying control unit 160 causes the SVM calculation unit 140 to execute an SVM calculation that calculates the similarity between the position specifying histogram representing the first position specifying divided region generated by the histogram generation unit 130 and the histogram of each piece of teacher data, included in the teacher data group 150, that belongs to the object category found to have the highest similarity in the scene recognition process. The object position specifying system 10 thereby outputs information representing a similarity from which it can be determined whether the "dog", which had the highest similarity in the scene recognition process, appears in the first position specifying divided region.

The processing up to this point is the position specifying process for one position specifying divided region. The object position specifying system 10 performs the position specifying process of steps S200 to S220 on all of the position specifying divided regions, that is, on the nine position specifying divided regions (see FIG. 3(b)) obtained by dividing the entire region of the input image (FIG. 3(a)).

More specifically, in step S230, the position specifying control unit 160 determines whether all of the position specifying divided regions obtained by dividing the entire region of the input image (FIG. 3(a)) have been designated. If all of the regions have been designated, the processing in the object position specifying system 10 ends. If not, the processing returns to step S200, the next position specifying divided region is designated, and the position specifying process of steps S200 to S220 is repeated.
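The loop formed by steps S200 to S230 might be sketched as below. The sketch makes several assumptions: quantized values are integer bin indices, histograms are normalized by their totals, and the score of a region is the smallest sum of absolute differences over the teacher histograms of the category chosen by scene recognition (the description does not state how the per-teacher results are combined). All names and data are hypothetical.

```python
from collections import Counter


def histogram(codes, num_bins):
    counts = Counter(codes)
    return [counts.get(b, 0) for b in range(num_bins)]


def normalize(hist):
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)


def sum_abs_diff(hist_a, hist_b):
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def score_regions(stored_codes, category_hists, num_bins):
    """Return {region name: smallest sum of absolute differences}."""
    teachers = [normalize(t) for t in category_hists]
    scores = {}
    # The for loop stands in for S230: it ends once every region was designated.
    for region, codes in stored_codes.items():        # S200: designate, fetch stored values
        hist = normalize(histogram(codes, num_bins))   # S210: position specifying histogram
        scores[region] = min(sum_abs_diff(hist, t) for t in teachers)  # S220: compare
    return scores


# Hypothetical stored quantized values for two regions and one "dog" teacher histogram.
stored_codes = {"A1": [0, 0, 1, 1], "A2": [2, 2, 2, 3]}
dog_hists = [[0, 0, 3, 1]]
scores = score_regions(stored_codes, dog_hists, num_bins=4)
```

Note that no local feature extraction or quantization appears inside the loop; only the stored values are read.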

The position specifying process of steps S200 to S220 performed by the object position specifying system 10 will now be described. FIG. 5 is a diagram explaining the concept of the process for specifying the position of an object in the object position specifying system 10 of the first embodiment.

In the position specifying process, first, in step S200, the position specifying control unit 160 designates the first position specifying divided region A1 among the nine position specifying divided regions (see FIG. 3(b)) obtained by dividing the entire region of the input image (FIG. 3(a)). The histogram generation unit 130 then acquires, from the quantization vector storage unit 170, the quantization vector values corresponding to the designated region A1. FIG. 5(a-1) shows an example of the state in which the histogram generation unit 130 acquires from the quantization vector storage unit 170 the quantization vector values of the scene recognition divided regions corresponding to the first position specifying divided region A1 designated by the position specifying control unit 160.
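The correspondence between a position specifying divided region and its scene recognition divided regions can be illustrated as follows, assuming the 9 x 9 grid of FIG. 4(a) and the 3 x 3 division of FIG. 3(b), so that each region covers a 3 x 3 block of cells. The row-by-row numbering of A1 to A9 from the top left is an assumption, and `region_codes` is a hypothetical helper.

```python
def region_codes(grid, region_index, regions_per_side=3):
    """Return the stored quantized values that fall inside region A<region_index>.

    Assumes regions are numbered 1..9 row by row from the top left.
    """
    cells_per_side = len(grid) // regions_per_side
    row_block, col_block = divmod(region_index - 1, regions_per_side)
    first_row = row_block * cells_per_side
    first_col = col_block * cells_per_side
    return [
        grid[row][col]
        for row in range(first_row, first_row + cells_per_side)
        for col in range(first_col, first_col + cells_per_side)
    ]


# Hypothetical stored grid in which each cell holds its own flat index,
# which makes it easy to see which cells are selected.
grid = [[row * 9 + col for col in range(9)] for row in range(9)]
cells_a1 = region_codes(grid, 1)
cells_a6 = region_codes(grid, 6)
```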

Then, in step S210, the histogram generation unit 130, under the control of the position specifying control unit 160, generates a position specifying histogram representing the first position specifying divided region A1 from the quantization vector values corresponding to region A1 acquired from the quantization vector storage unit 170. The histogram generation unit 130 outputs the generated position specifying histogram to the SVM calculation unit 140. FIG. 5(a-2) shows an example of the position specifying histogram that the histogram generation unit 130 generates from the quantization vector values corresponding to region A1.

Then, in step S220, the SVM calculation unit 140, under the control of the position specifying control unit 160, performs an SVM calculation that calculates the similarity between the position specifying histogram representing the first position specifying divided region A1 generated by the histogram generation unit 130 and the histogram of each piece of teacher data in the "dog" category of the teacher data group 150, the category that had the highest similarity in the scene recognition process. The SVM calculation unit 140 then outputs information representing the similarity to the "dog" category, obtained from the result of the SVM calculation for region A1.

In the position specifying process as well, as in the SVM calculation of the scene recognition process, the absolute difference between the frequencies of the same class is calculated for the position specifying histogram representing the first position specifying divided region A1 and the histogram of each piece of teacher data, and the absolute differences of all classes are summed. The SVM calculation unit 140 thereby determines that the position specifying divided region with the smallest sum of absolute differences is the region in which the "dog" appears, and outputs information specifying the position of that region.

When generation of the position specifying histogram representing the first position specifying divided region A1 designated by the position specifying control unit 160 is completed, the histogram generation unit 130 notifies the position specifying control unit 160 of the completion. In response to this notification, the position specifying control unit 160 performs the determination of step S230, returns to step S200, and designates the second position specifying divided region A2.

The histogram generation unit 130 then acquires, from the quantization vector storage unit 170, the quantization vector values corresponding to the second position specifying divided region A2 designated by the position specifying control unit 160. FIG. 5(b-1) shows an example of the state in which the histogram generation unit 130 acquires from the quantization vector storage unit 170 the quantization vector values of the scene recognition divided regions corresponding to region A2.

Then, in step S210, the histogram generation unit 130, under the control of the position specifying control unit 160, generates a position specifying histogram representing the second position specifying divided region A2 from the quantization vector values corresponding to region A2 acquired from the quantization vector storage unit 170. The histogram generation unit 130 outputs the generated position specifying histogram to the SVM calculation unit 140. FIG. 5(b-2) shows an example of the position specifying histogram that the histogram generation unit 130 generates from the quantization vector values corresponding to region A2.

Then, in step S220, the SVM calculation unit 140, under the control of the position specifying control unit 160, performs an SVM calculation that calculates the similarity between the position specifying histogram representing the second position specifying divided region A2 generated by the histogram generation unit 130 and the histogram of each piece of teacher data in the "dog" category of the teacher data group 150, the category that had the highest similarity in the scene recognition process. The SVM calculation unit 140 then outputs information representing the similarity to the "dog" category, obtained from the result of the SVM calculation for region A2.

Thereafter, in the same manner, the position specifying control unit 160 sequentially designates the nine position specifying divided regions obtained by dividing the entire region of the input image (FIG. 3(a)), and the histogram generation unit 130 sequentially generates a position specifying histogram representing each designated position specifying divided region and outputs it to the SVM calculation unit 140. Likewise, the SVM calculation unit 140 performs an SVM calculation that computes the similarity between each position specifying histogram generated by the histogram generation unit 130 and the histogram of each piece of teacher data in the teacher data group 150 that belongs to the category "dog", which had the highest similarity in the scene recognition process. The SVM calculation unit 140 then outputs information representing the similarity to the category "dog" for each position specifying divided region, obtained from the results of the SVM calculations.

Further, based on the information representing the similarity to the category "dog" in each position specifying divided region, the SVM calculation unit 140 determines that the position specifying divided region with the highest similarity is the region in which the object "dog" appears, and outputs information specifying the position of that region.

In this way, by repeating the position specifying process for all of the position specifying divided regions obtained by dividing the entire region of the input image (FIG. 3(a)), the object position specifying system 10 can identify the position specifying divided region with the highest similarity to the category "dog" as the region in which the "dog" determined by the scene recognition process appears as the object. The object position specifying system 10 can thereby output information representing the position of the identified position specifying divided region.
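The loop over the nine position specifying divided regions can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `region_histogram`, `locate_object`, and `dog_score` are hypothetical, the stored quantization vectors are assumed to be one visual-word index per scene recognition divided region, and the SVM calculation is stood in for by a linear decision function with made-up weights.

```python
def region_histogram(codes, rows, cols, num_words):
    """Count visual-word indices over the given block of small cells."""
    hist = [0] * num_words
    for r in rows:
        for c in cols:
            hist[codes[r][c]] += 1
    return hist


def locate_object(codes, num_words, score_fn, grid=3):
    """Return ((row, col), score) of the grid region with the highest score.

    codes    -- 2-D list of stored visual-word indices, one per small cell
    score_fn -- stand-in for the SVM calculation against one category
    grid     -- regions per side (3 gives the nine regions of the example)
    """
    n_rows, n_cols = len(codes), len(codes[0])
    best_region, best_score = None, float("-inf")
    for i in range(grid):
        rows = range(i * n_rows // grid, (i + 1) * n_rows // grid)
        for j in range(grid):
            cols = range(j * n_cols // grid, (j + 1) * n_cols // grid)
            hist = region_histogram(codes, rows, cols, num_words)
            score = score_fn(hist)
            if score > best_score:
                best_region, best_score = (i, j), score
    return best_region, best_score


# Toy data: a 6x6 grid of cells; word 1 occurs only in the central 2x2 block.
codes = [[0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        codes[r][c] = 1

# Hypothetical linear decision function for the category "dog".
weights, bias = [0.0, 1.0], 0.0


def dog_score(hist):
    return sum(w * h for w, h in zip(weights, hist)) + bias


best_region, best_score = locate_object(codes, 2, dog_score)
print(best_region, best_score)
```

With this toy data the central region, index `(1, 1)`, is selected because it is the only region containing the word that the assumed weights reward.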

According to the present embodiment, an object position specifying system (object position specifying system 10) is configured that includes: a local feature vector generation unit (local feature vector generation unit 110) that divides the entire region of an input image into a plurality of first regions (scene recognition divided regions) of a predetermined first size and generates, for each scene recognition divided region, a local feature vector representing local features of the image data contained in that region; a quantization vector generation unit (quantization vector generation unit 120) that quantizes the local feature vector values of each scene recognition divided region generated by the local feature vector generation unit 110 and generates a quantization vector corresponding to each scene recognition divided region; a quantization vector storage unit (quantization vector storage unit 170) that stores, for each scene recognition divided region, the quantization vector values generated by the quantization vector generation unit 120; a histogram generation unit (histogram generation unit 130) that generates, from the quantization vector values of the scene recognition divided regions, a histogram representing the whole or a part of the image (a histogram of the entire image or a position specifying histogram); an SVM calculation unit (SVM calculation unit 140) that performs a support vector machine (SVM) calculation on the histogram generated by the histogram generation unit 130; and a position specifying control unit (position specifying control unit 160) that controls each component of the object position specifying system 10 and, after causing execution of a scene recognition process that recognizes the scene of the image in which the object appears, causes execution of a position specifying process that specifies in which of a plurality of second regions (position specifying divided regions), obtained by dividing the entire region of the image into a predetermined second size larger than the scene recognition divided regions, the object determined in the scene recognition process (for example, a "dog") appears. In the scene recognition process, the position specifying control unit 160 causes the histogram generation unit 130 to generate a histogram representing the entire image from the quantization vector values of the scene recognition divided regions, and causes the SVM calculation unit 140 to execute an SVM calculation that compares the histogram representing the entire image with each of the histograms of a plurality of pieces of teacher data, in which histograms of a plurality of images are classified and grouped by object type. In the position specifying process, the position specifying control unit 160 causes the histogram generation unit 130 to generate a histogram representing the image of each position specifying divided region from the quantization vector values of the scene recognition divided regions stored in the quantization vector storage unit 170, and causes the SVM calculation unit 140 to execute an SVM calculation on each of the histograms representing the position specifying divided regions.

Further, according to the present embodiment, an object position specifying method is configured for an object position specifying system that includes: a local feature vector generation unit (local feature vector generation unit 110) that divides the entire region of an input image into a plurality of first regions (scene recognition divided regions) of a predetermined first size and generates, for each scene recognition divided region, a local feature vector representing local features of the image data contained in that region; a quantization vector generation unit (quantization vector generation unit 120) that quantizes the local feature vector values of each scene recognition divided region generated by the local feature vector generation unit 110 and generates a quantization vector corresponding to each scene recognition divided region; a quantization vector storage unit (quantization vector storage unit 170) that stores, for each scene recognition divided region, the quantization vector values generated by the quantization vector generation unit 120; a histogram generation unit (histogram generation unit 130) that generates, from the quantization vector values of the scene recognition divided regions, a histogram representing the whole or a part of the image (a histogram of the entire image or a position specifying histogram); an SVM calculation unit (SVM calculation unit 140) that performs a support vector machine (SVM) calculation on the histogram generated by the histogram generation unit 130; and a position specifying control unit (position specifying control unit 160) that controls each component of the object position specifying system 10 and, after causing execution of a scene recognition process that recognizes the scene of the image in which the object appears, causes execution of a position specifying process that specifies in which of a plurality of second regions (position specifying divided regions), obtained by dividing the entire region of the image into a predetermined second size larger than the scene recognition divided regions, the object determined in the scene recognition process (for example, a "dog") appears. In the scene recognition process, the method includes a procedure in which the position specifying control unit 160 causes the histogram generation unit 130 to generate a histogram representing the entire image from the quantization vector values of the scene recognition divided regions, and a procedure in which it causes the SVM calculation unit 140 to execute an SVM calculation that compares the histogram representing the entire image with each of the histograms of a plurality of pieces of teacher data, in which histograms of a plurality of images are classified and grouped by object type. In the position specifying process, the method includes a procedure in which the position specifying control unit 160 causes the histogram generation unit 130 to generate a histogram representing the image of each position specifying divided region from the quantization vector values of the scene recognition divided regions stored in the quantization vector storage unit 170, and a procedure in which it causes the SVM calculation unit 140 to execute an SVM calculation on each of the histograms representing the position specifying divided regions.

As described above, the object position specifying system 10 of the first embodiment stores, in the quantization vector storage unit 170, the quantization vector values of each scene recognition divided region generated by the quantization vector generation unit 120 during the scene recognition process. In the object position specifying process, the object position specifying system 10 of the first embodiment then uses the quantization vector values of the scene recognition divided regions stored in the quantization vector storage unit 170 to generate a position specifying histogram for each position specifying divided region. As a result, to specify the position in the image at which the object appears, the object position specifying system 10 of the first embodiment can perform the position specifying process with less processing than would be needed if, after the scene recognition process on the input image, processing equivalent to the scene recognition process were performed again on each position specifying divided region. That is, the object position specifying process in the object position specifying system 10 of the first embodiment can omit the process of generating local feature vectors (step S100) and the process of generating quantization vectors (step S110) that are performed in the scene recognition process. The object position specifying system 10 of the first embodiment can therefore shorten the calculation time required to specify the position in the image at which the object appears.
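The saving described here, quantizing once and reusing the stored values for every histogram, can be sketched as follows. The class name `QuantizedVectorStore`, the nearest-codeword rule in `quantize`, and the toy codebook are assumptions made for illustration; this passage does not specify the quantization method.

```python
def quantize(feature, codebook):
    """Index of the nearest codeword (squared Euclidean distance, assumed)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(feature, codebook[k])),
    )


class QuantizedVectorStore:
    """Keeps one quantized value per scene recognition divided region."""

    def __init__(self):
        self.codes = {}
        self.quantize_calls = 0  # counts the expensive step for the demo

    def fill(self, features, codebook):
        """Quantize every cell once, during the scene recognition process."""
        for cell, feature in features.items():
            self.codes[cell] = quantize(feature, codebook)
            self.quantize_calls += 1

    def histogram(self, cells, num_words):
        """Build a histogram over any set of cells from the stored values."""
        hist = [0] * num_words
        for cell in cells:
            hist[self.codes[cell]] += 1
        return hist


# Toy data: a 2x2 grid of cells with 2-D local feature vectors.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = {
    (0, 0): (0.1, 0.0),
    (0, 1): (0.9, 1.0),
    (1, 0): (0.2, 0.1),
    (1, 1): (1.0, 0.8),
}

store = QuantizedVectorStore()
store.fill(features, codebook)

whole_hist = store.histogram(features.keys(), 2)       # scene recognition
left_hist = store.histogram([(0, 0), (1, 0)], 2)       # one sub-region
print(whole_hist, left_hist, store.quantize_calls)
```

Both histograms are produced from the same stored values, so the quantization count stays at one call per cell no matter how many sub-region histograms are built afterwards.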

The description of the object position specifying system 10 of the first embodiment has covered a configuration in which the SVM calculation unit 140 outputs, in the scene recognition process, information representing the similarity for each object category obtained from the SVM calculation results, and outputs, in the object position specifying process, information specifying the position of the position specifying divided region in which the determined object appears. However, the component that outputs the information representing the similarity for each object category and the information specifying the position of the position specifying divided region in which the determined object appears is not limited to the SVM calculation unit 140. For example, the position specifying control unit 160 may be configured to output, based on the results of the SVM calculations performed by the SVM calculation unit 140, the information representing the similarity for each object category and the information specifying the position of the position specifying divided region in which the determined object appears.

<Second Embodiment>
Next, a second embodiment of the present invention will be described. FIG. 6 is a block diagram showing a schematic configuration of the object position specifying system according to the second embodiment. In FIG. 6, the object position specifying system 20 includes a local feature vector generation unit 110, a quantization vector generation unit 120, a histogram generation unit 230, an SVM calculation unit 240, a teacher data group 150, a position specifying control unit 260, a quantization vector storage unit 170, and a histogram storage unit 280.

The object position specifying system 20 shown in FIG. 6 differs from the object position specifying system 10 of the first embodiment shown in FIG. 1 in that the histogram generation unit 130 is replaced by the histogram generation unit 230, the SVM calculation unit 140 by the SVM calculation unit 240, and the position specifying control unit 160 by the position specifying control unit 260, and in that a histogram storage unit 280 is added. The other components of the object position specifying system 20 are the same as those of the object position specifying system 10 of the first embodiment shown in FIG. 1. Accordingly, the description of the object position specifying system 20 of the second embodiment covers only the components and operations that differ from those of the object position specifying system 10 of the first embodiment, and detailed description of the components and operations that are the same is omitted.

Like the object position specifying system 10 of the first embodiment, the object position specifying system 20 performs, on an input image, a scene recognition process that recognizes the subject (object) shown in the image and the scene in which the image was captured, and outputs information on the similarity to each piece of teacher data classified by object type as the information determined by the scene recognition process. Also like the object position specifying system 10 of the first embodiment, the object position specifying system 20 performs a position specifying process that specifies the position at which the determined object appears within the image subjected to the scene recognition process, and outputs information representing that position.

Under control of the position specifying control unit 260, the local feature vector generation unit 110 generates, for each scene recognition divided region, a local feature vector of the image input to the object position specifying system 20, and outputs the generated local feature vector values of the scene recognition divided regions to the quantization vector generation unit 120.

Under control of the position specifying control unit 260, the quantization vector generation unit 120 generates a quantization vector for each scene recognition divided region by quantizing the local feature vector values of that region input from the local feature vector generation unit 110. It outputs the generated quantization vector values of the scene recognition divided regions to the histogram generation unit 230 and also stores them in the quantization vector storage unit 170.

The quantization vector storage unit 170 stores, for each scene recognition divided region, the corresponding quantization vector values input from the quantization vector generation unit 120, and outputs the stored quantization vector values of the scene recognition divided regions to the histogram generation unit 230 under control of the histogram generation unit 230.

Like the histogram generation unit 130 of the object position specifying system 10 of the first embodiment, the histogram generation unit 230, in the scene recognition process performed under control of the position specifying control unit 260, generates a histogram representing the entire image input to the object position specifying system 20 from the quantization vector values of the scene recognition divided regions input from the quantization vector generation unit 120. The histogram generation unit 230 then outputs the generated histogram of the entire image to the SVM calculation unit 240. Unlike the histogram generation unit 130 of the first embodiment, the histogram generation unit 230 also stores the generated histogram of the entire image in the histogram storage unit 280. When generation of the histogram of the entire image corresponding to the input image is complete, the histogram generation unit 230 notifies the position specifying control unit 260 of the completion. As with the histogram generation unit 130 of the first embodiment, the method by which the histogram generation unit 230 generates the histogram of the entire image is the same as the method used to generate a histogram of the entire image in conventional scene recognition processing, so detailed description is omitted.

Also like the histogram generation unit 130 of the object position specifying system 10 of the first embodiment, the histogram generation unit 230, in the object position specifying process performed under control of the position specifying control unit 260, generates a position specifying histogram representing the position specifying divided region designated by the position specifying control unit 260 from the quantization vector values of the scene recognition divided regions stored in the quantization vector storage unit 170. The histogram generation unit 230 then outputs the generated position specifying histogram of each position specifying divided region to the SVM calculation unit 240. When generation of the position specifying histogram corresponding to the designated position specifying divided region is complete, the histogram generation unit 230 notifies the position specifying control unit 260 of the completion. In response to this notification, the position specifying control unit 260 designates the next position specifying divided region, and the histogram generation unit 230 repeats generation of the position specifying histogram for each designated region. The method by which the histogram generation unit 230 generates a position specifying histogram is the same as the method of generating the histogram of the entire image in the scene recognition process, except that the size of the region over which the histogram is generated differs.

The histogram storage unit 280 is a memory, for example a DRAM, that temporarily stores the histogram of the entire image generated by the histogram generation unit 230, under control of the histogram generation unit 230. The histogram of the entire image stored in the histogram storage unit 280 is output to the SVM calculation unit 240 under control of the SVM calculation unit 240.

Like the SVM calculation unit 140 of the object position specifying system 10 of the first embodiment, the SVM calculation unit 240, in the scene recognition process performed under control of the position specifying control unit 260, performs an SVM calculation that compares the histogram of the entire image input from the histogram generation unit 230 with the histogram of each piece of teacher data in the teacher data group 150, and calculates a similarity for each object category classified in the teacher data group 150. When the SVM calculation on the histogram of the entire input image is complete, the SVM calculation unit 240 outputs information representing the similarity for each object category calculated by the SVM calculation as the information determined by the scene recognition process of the object position specifying system 20. The SVM calculation unit 240 also notifies the position specifying control unit 260 that the SVM calculation for the scene recognition process is complete. As with the SVM calculation unit 140 of the first embodiment, the SVM calculation method in the SVM calculation unit 240 is the same as the SVM calculation method used in conventional scene recognition processing, so detailed description is omitted.

Unlike the SVM calculation unit 140 of the object position specifying system 10 of the first embodiment, the SVM calculation unit 240, in the object position specifying process performed under control of the position specifying control unit 260, performs an SVM calculation (hereinafter, "simple SVM calculation") that compares each position specifying histogram input from the histogram generation unit 230 with the histogram of the entire image stored in the histogram storage unit 280, and calculates, for each position specifying divided region, the similarity to the histogram of the entire image. For each position specifying divided region subjected to the object position specifying process, the SVM calculation unit 240 notifies the position specifying control unit 260 that the simple SVM calculation is complete. In response to this notification, the position specifying control unit 260 designates the next position specifying divided region, and the SVM calculation unit 240 repeats the simple SVM calculation on the position specifying histogram of each designated region. The simple SVM calculation method for the position specifying histograms in the SVM calculation unit 240 is the same as the SVM calculation method in the scene recognition process, except that the histograms of the teacher data in the teacher data group 150, against which the SVM calculation is performed, are replaced by the histogram of the entire image stored in the histogram storage unit 280.
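A rough sketch of comparing each position specifying histogram against the stored whole-image histogram follows. The text does not define the similarity measure used by the simple SVM calculation, so histogram intersection over normalized histograms is used here purely as an assumed stand-in; the function names and the three-region toy data are also hypothetical.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (all zeros stay all zeros)."""
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)


def intersection(a, b):
    """Histogram intersection of two normalized histograms, in [0, 1]."""
    return sum(min(x, y) for x, y in zip(normalize(a), normalize(b)))


def most_similar_region(region_hists, whole_hist):
    """Return the region whose histogram is closest to the whole-image one."""
    scores = {
        name: intersection(hist, whole_hist)
        for name, hist in region_hists.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: three regions whose histograms sum to the whole-image histogram.
region_hists = {"A1": [4, 1, 1], "A2": [2, 0, 0], "A3": [0, 1, 1]}
whole_hist = [6, 2, 2]

best, scores = most_similar_region(region_hists, whole_hist)
print(best, scores)
```

In this toy case region `A1` is selected, since its word distribution is closest to that of the whole image. No teacher data is consulted at this stage, which is the point of the simplification.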

Further, in the object position specifying process performed under control of the position specifying control unit 260, after the simple SVM calculation on the position specifying histograms of all position specifying divided regions is complete, the SVM calculation unit 240 performs an SVM calculation that compares the position specifying histogram with the highest similarity to the histogram of the entire image against the histogram of each piece of teacher data in the teacher data group 150, and calculates the similarity to the object category classified in the teacher data group 150. When this SVM calculation on the position specifying histogram with the highest similarity to the histogram of the entire image in the simple SVM calculation is complete, the SVM calculation unit 240 outputs information representing the similarity, calculated by the SVM calculation, between the position specifying histogram of that position specifying divided region and the object category determined by the scene recognition process, as the result of the object position specifying process performed by the object position specifying system 20. As with the SVM calculation unit 140 of the first embodiment, the SVM calculation method for the position specifying histogram in the SVM calculation unit 240 is the same as the SVM calculation method in the scene recognition process, except that the histogram subjected to the SVM calculation is replaced by the position specifying histogram.
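The two-stage flow of the second embodiment, a cheap comparison on every region followed by the full SVM calculation on only the winning region, can be sketched as below. `locate_two_stage` is a hypothetical name, and both scoring functions are assumed stand-ins: a normalized histogram intersection for the simple SVM calculation and a linear decision function with made-up weights for the full one.

```python
def locate_two_stage(region_hists, whole_hist, simple_score, full_score):
    """Pick the best region cheaply, then run the full scorer on it alone."""
    # Stage 1: compare every region histogram with the whole-image histogram.
    best = max(
        region_hists,
        key=lambda name: simple_score(region_hists[name], whole_hist),
    )
    # Stage 2: the expensive category comparison runs exactly once.
    return best, full_score(region_hists[best])


def simple_score(hist, whole):
    """Assumed stand-in: intersection of the two normalized histograms."""
    hist_total, whole_total = sum(hist), sum(whole)
    return sum(
        min(h / hist_total, w / whole_total) for h, w in zip(hist, whole)
    )


full_calls = []  # records each full evaluation, to show how often it runs


def full_score(hist):
    """Assumed stand-in for the SVM calculation against the teacher data."""
    full_calls.append(hist)
    weights, bias = [1.0, -0.5, -0.5], 0.0
    return sum(w * h for w, h in zip(weights, hist)) + bias


region_hists = {"A1": [4, 1, 1], "A2": [2, 0, 0], "A3": [0, 1, 1]}
whole_hist = [6, 2, 2]

best, score = locate_two_stage(
    region_hists, whole_hist, simple_score, full_score
)
print(best, score, len(full_calls))
```

Compared with scoring every region against the teacher data, as in the first embodiment, the full evaluation here runs once regardless of the number of regions. The trade-off is that the region choice rests entirely on similarity to the whole image.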

The position specifying control unit 260 controls the whole of the object position specifying system 20, that is, the operation of each of the local feature vector generation unit 110, the quantization vector generation unit 120, the histogram generation unit 230, and the SVM calculation unit 240 provided in the object position specifying system 20. The position specifying control unit 260 includes a histogram generation divided region designation unit 161 and a position specifying SVM calculation determination unit 262.

The position specifying control unit 260 has the configuration of the position specifying control unit 160 of the object position specifying system 10 of the first embodiment shown in FIG. 1, with the position specifying SVM calculation determination unit 262 added. The histogram generation divided region designating unit 161 in the position specifying control unit 260 operates in the same way as the histogram generation divided region designating unit 161 in the position specifying control unit 160 of the object position specifying system 10 of the first embodiment shown in FIG. 1. A detailed description of its operation is therefore omitted.

In the object position specifying process of the object position specifying system 20, the position specifying SVM calculation determination unit 262 switches the histogram that the SVM calculation unit 240 uses as the comparison target when performing an SVM calculation on each position specifying histogram input from the histogram generation unit 230. The comparison target is either the histogram of the entire image stored in the histogram storage unit 280 or the histograms of the teacher data included in the teacher data group 150. More specifically, when the SVM calculation unit 240 performs the simple SVM calculation on each position specifying histogram input from the histogram generation unit 230, the position specifying SVM calculation determination unit 262 switches the comparison target to the histogram of the entire image stored in the histogram storage unit 280. After the simple SVM calculation has been completed for the position specifying histograms of all position specifying divided regions, when the SVM calculation unit 240 goes on to perform the SVM calculation on the position specifying histogram with the highest similarity to the histogram of the entire image, the position specifying SVM calculation determination unit 262 switches the comparison target to the histograms of the teacher data of the object category that the scene recognition process determined to have the highest similarity.

With this configuration, the object position specifying system 20, like the object position specifying system 10 of the first embodiment, uses the quantization vector values of the scene recognition divided regions generated by the quantized vector generation unit 120 in the scene recognition process to generate a position specifying histogram for each position specifying divided region in the object position specifying process. In addition, the object position specifying system 20 stores the histogram of the entire image generated by the histogram generation unit 230 in the histogram storage unit 280 and uses that stored histogram in the object position specifying process. More specifically, in the SVM calculation on the position specifying histogram of each position specifying divided region generated by the histogram generation unit 230, the histogram of the entire image stored in the histogram storage unit 280 is used as the comparison target in place of the histograms of the teacher data of the object category that the scene recognition process determined to have the highest similarity (for example, the 1500 histograms included in one category). In other words, the object position specifying process uses the single histogram representing the entire image, generated in the scene recognition process, instead of the large amount of teacher data used when performing an SVM calculation, and can thereby specify in a simplified manner the position of the position specifying divided region that contains the object determined in the scene recognition process. The single whole-image histogram can stand in for the large amount of teacher data because an object that has once been determined in the scene recognition process can be assumed to appear in one of the position specifying divided regions. As a result, the object position specifying system 20 can shorten the time required for the SVM calculations that the SVM calculation unit 240 performs on the position specifying histograms of the position specifying divided regions in order to identify the position specifying histogram with the highest similarity to the histogram of the entire image.
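The time saving described above can be illustrated by counting histogram comparisons. The following is a minimal sketch, assuming that the cost is dominated by the number of histogram-to-histogram comparisons and using the figures given in this description (nine position specifying divided regions, 1500 teacher histograms in one category). The function name `comparison_counts` is hypothetical and does not appear in the original.

```python
def comparison_counts(num_regions, teacher_histograms):
    """Return (per_region_teacher, two_stage) histogram-comparison counts."""
    # Comparing every region histogram against every teacher histogram
    # of the determined category.
    per_region_teacher = num_regions * teacher_histograms
    # Screening every region against the single whole-image histogram, then
    # comparing only the best region against the teacher histograms.
    two_stage = num_regions * 1 + teacher_histograms
    return per_region_teacher, two_stage


per_region_teacher, two_stage = comparison_counts(num_regions=9, teacher_histograms=1500)
print(per_region_teacher, two_stage)  # 13500 1509
```

Under these assumptions the number of comparisons falls from 13500 to 1509; actual processing time would also depend on factors this sketch ignores, such as histogram length and memory access.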

次に、対象物位置特定システム２０の動作について説明する。図７は、本第２の実施形態の対象物位置特定システム２０における処理手順を示したフローチャートである。また、図８は、本第２の実施形態の対象物位置特定システム２０において対象物の位置を特定する処理の一例を説明する図である。図７に示した対象物位置特定システム２０における処理のフローチャートの説明においては、適宜、図３〜図５に示した第１の実施形態の対象物位置特定システム１０におけるそれぞれの処理の一例、および図８に示した対象物位置特定システム２０において対象物の位置を特定する処理の一例を参照する。そして、対象物位置特定システム２０における処理においても、第１の実施形態の対象物位置特定システム１０と同様に、画像に写っている対象物が「犬」である場合において、画像全体の領域を９つの位置特定分割領域に分割して、対象物である「犬」が写っている位置特定分割領域を特定する場合の例を説明する。 Next, the operation of the object position specifying system 20 will be described. FIG. 7 is a flowchart showing a processing procedure in the object position specifying system 20 according to the second embodiment. Moreover, FIG. 8 is a figure explaining an example of the process which pinpoints the position of a target object in the target object location specifying system 20 of the 2nd embodiment. In the description of the flowchart of the process in the object position specifying system 20 shown in FIG. 7, an example of each process in the object position specifying system 10 of the first embodiment shown in FIGS. An example of processing for specifying the position of the object in the object position specifying system 20 shown in FIG. 8 will be referred to. In the processing in the object position specifying system 20, as in the object position specifying system 10 of the first embodiment, when the object shown in the image is a “dog”, the entire image area is determined. An example will be described in which a position-specific divided area in which the object “dog” is reflected is specified by dividing into nine position-specific divided areas.

The processing of the object position specifying system 20 includes the same processing as that of the object position specifying system 10 of the first embodiment. For this reason, in the flowchart of FIG. 7, which shows the processing procedure of the object position specifying system 20 of the second embodiment, steps that perform the same processing as in the object position specifying system 10 of the first embodiment are given the same step numbers as in the flowchart of FIG. 2, which shows the processing procedure of the object position specifying system 10 of the first embodiment. In the description of the flowchart of FIG. 7, a detailed description of these steps is therefore omitted.

対象物位置特定システム２０に画像が入力されると、位置特定制御部２６０は、第１の実施形態の対象物位置特定システム１０に備えた位置特定制御部１６０と同様に、まず、入力された画像に対するシーン認識の処理を行い、その後、対象物の位置特定の処理を行うように、対象物位置特定システム２０に備えたそれぞれの構成要素の動作を制御する（図３参照）。 When an image is input to the object position specifying system 20, the position specifying control unit 260 is first input in the same manner as the position specifying control unit 160 included in the object position specifying system 10 of the first embodiment. The operation of each component included in the object position specifying system 20 is controlled so that the scene recognition process is performed on the image, and then the object position specifying process is performed (see FIG. 3).

In the scene recognition process of the object position specifying system 20, first, in steps S100 to S115, the position specifying control unit 260 causes the local feature vector generation unit 110 to generate a local feature vector for each scene recognition divided region of the input image (see FIG. 3A), causes the quantized vector generation unit 120 to generate a quantization vector for each scene recognition divided region, and causes the generated quantization vector values to be stored in the quantized vector storage unit 170 for each scene recognition divided region.

続いて、ステップＳ１２０において、位置特定制御部２６０は、ヒストグラム生成部２３０に、量子化ベクトル生成部１２０が生成したそれぞれのシーン認識分割領域毎の量子化ベクトルの値から、対象物位置特定システム２０に入力された画像（図３（ａ）参照）の全体を表すヒストグラムを生成させる。また、ステップＳ１２５において、位置特定制御部２６０は、ヒストグラム生成部２３０に、生成した画像（図３（ａ）参照）の全体を表すヒストグラムを、ヒストグラム保存部２８０に保存させる。 Subsequently, in step S120, the position specifying control unit 260 causes the histogram generating unit 230 to calculate the object position specifying system 20 from the quantization vector value for each scene recognition divided region generated by the quantization vector generating unit 120. A histogram representing the entire image (see FIG. 3A) input to is generated. In step S125, the position specifying control unit 260 causes the histogram generation unit 230 to store a histogram representing the entire generated image (see FIG. 3A) in the histogram storage unit 280.

続いて、ステップＳ１３０において、位置特定制御部２６０は、ＳＶＭ演算部２４０に、ヒストグラム生成部２３０が生成した画像（図３（ａ）参照）全体のヒストグラムと、教師データ群１５０に含まれるそれぞれの教師データのヒストグラムとの類似度を算出するＳＶＭ演算を実行させる。これにより、対象物位置特定システム２０は、入力された画像（図３（ａ）参照）に「犬」が写っていると判別することができ、それぞれの対象物のカテゴリ毎の類似度を表す情報を出力する（図４参照）。 Subsequently, in step S130, the position specifying control unit 260 causes the SVM calculation unit 240 to display the histogram of the entire image (see FIG. 3A) generated by the histogram generation unit 230 and the teacher data group 150. The SVM calculation for calculating the similarity with the histogram of the teacher data is executed. Thereby, the object position specifying system 20 can determine that “dog” is reflected in the input image (see FIG. 3A), and represents the similarity for each object category. Information is output (see FIG. 4).

When the scene recognition process of steps S100 to S130 is complete, the object position specifying system 20 starts, from step S300, the position specifying process that specifies where the determined object appears in the image on which scene recognition was performed. In this process, as in the object position specifying system 10 of the first embodiment, position specifying histograms are first generated using the quantization vector values of the scene recognition divided regions that the quantized vector generation unit 120 stored in the quantized vector storage unit 170 during the scene recognition process. Next, the simple SVM calculation is performed on the generated position specifying histogram of each position specifying divided region, using the histogram representing the entire image that the histogram generation unit 230 stored in the histogram storage unit 280 during the scene recognition process. Finally, the SVM calculation is performed on the position specifying histogram of the position specifying divided region that the simple SVM calculation determined, in a simplified manner, to contain the object determined in the scene recognition process.

First, in step S300, as in step S200 of the object position specifying system 10 of the first embodiment, the position specifying control unit 260 designates to the histogram generation unit 230 the first of the nine position specifying divided regions into which the entire area of the input image is divided, and causes the histogram generation unit 230 to acquire the quantization vector values corresponding to that first position specifying divided region from the quantized vector storage unit 170.

続いて、ステップＳ３１０において、位置特定制御部２６０は、第１の実施形態の対象物位置特定システム１０におけるステップＳ２１０と同様に、ヒストグラム生成部２３０に、取得した１つ目の位置特定分割領域に対応する量子化ベクトルの値に基づいて、１つ目の位置特定分割領域を表す位置特定ヒストグラムを生成させる。 Subsequently, in step S310, the position specifying control unit 260 sends the acquired first position specifying divided region to the histogram generating unit 230 in the same manner as in step S210 in the object position specifying system 10 of the first embodiment. Based on the value of the corresponding quantization vector, a position specifying histogram representing the first position specifying divided region is generated.

続いて、ステップＳ３２０において、位置特定制御部２６０は、ＳＶＭ演算部２４０に、ヒストグラム生成部２３０が生成した１つ目の位置特定分割領域を表す位置特定ヒストグラムと、ヒストグラム保存部２８０に保存した画像の全体を表すヒストグラムとの類似度を算出する簡易ＳＶＭ演算を実行させる。これにより、対象物位置特定システム２０は、１つ目の位置特定分割領域内に、シーン認識の処理において類似度が最も高かった対象物（図３に示した処理の一例では「犬」）が写っているか否かを、簡易的に判別することができる。 Subsequently, in step S320, the position specifying control unit 260 causes the SVM calculating unit 240 to store the position specifying histogram representing the first position specifying divided region generated by the histogram generating unit 230 and the image stored in the histogram storing unit 280. The simple SVM calculation for calculating the degree of similarity with the histogram representing the whole is executed. As a result, the object position specifying system 20 includes an object having the highest similarity in the scene recognition process (“dog” in the example of the process shown in FIG. 3) in the first position specifying divided region. It is possible to easily determine whether or not the image is shown.

続いて、ステップＳ３３０において、位置特定制御部２６０は、入力された画像全体の領域を分割した全ての位置特定分割領域に対する簡易的な判別が終了したか否かを判定する。ステップＳ３３０による判定の結果、分割した全ての位置特定分割領域に対する簡易的な判別が終了していない場合には、ステップＳ３００に戻って、次の位置特定分割領域を指定し、分割した全ての位置特定分割領域に対する簡易的な判別が終了するまで、ステップＳ３００〜ステップＳ３２０までの簡易的な判別の処理を繰り返す。ステップＳ３３０による判定の結果、分割した全ての位置特定分割領域に対する簡易的な判別が終了した場合には、対象物位置特定システム２０における簡易的な判別の処理を終了し、ステップＳ３４０に進む。 Subsequently, in step S330, the position specifying control unit 260 determines whether or not the simple determination for all the position specifying divided areas obtained by dividing the entire area of the input image has been completed. If the result of determination in step S330 is that simple determination for all divided position-specific divided areas has not been completed, the process returns to step S300 to specify the next position-specific divided area and all divided positions. The simple discrimination process from step S300 to step S320 is repeated until the simple discrimination for the specific divided region is completed. If the result of determination in step S330 is that simple determination for all divided position specifying divided areas has been completed, simple determination processing in the object position specifying system 20 ends, and the process proceeds to step S340.

ここで、対象物位置特定システム２０によって行われる、ステップＳ３００〜ステップＳ３２０までの簡易的な判別の処理について説明する。図８は、本第２の実施形態の対象物位置特定システム２０において対象物の位置を簡易的に特定する処理の考え方を説明する図である。 Here, a simple determination process from step S300 to step S320 performed by the object position specifying system 20 will be described. FIG. 8 is a diagram for explaining the concept of processing for simply specifying the position of an object in the object position specifying system 20 according to the second embodiment.

In the object position specifying process of the object position specifying system 20, first, in steps S300 and S310, as in steps S200 and S210 of the object position specifying system 10 of the first embodiment, the position specifying control unit 260 designates the first position specifying divided region A1 among the nine position specifying divided regions (see FIG. 3B) into which the entire area of the input image (see FIG. 3A) is divided. The histogram generation unit 230 then acquires the quantization vector values corresponding to the designated region A1 from the quantized vector storage unit 170 (see FIG. 5(a-1)) and generates from them a position specifying histogram representing region A1 (see FIG. 5(a-2)). The histogram generation unit 230 outputs the generated position specifying histogram to the SVM calculation unit 240.
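Steps S300 and S310 can be sketched as follows. This is an illustrative sketch only: it assumes that each stored quantization vector value is an integer codeword index, that the values are held in a two-dimensional grid of scene recognition divided regions, and that the position specifying divided regions form an equal 3 x 3 partition of that grid. The name `region_histogram` and the sample data are hypothetical.

```python
from typing import List


def region_histogram(codes: List[List[int]], row: int, col: int,
                     grid: int, codebook_size: int) -> List[int]:
    """Histogram of codeword indices inside one cell of a grid x grid partition."""
    height, width = len(codes), len(codes[0])
    # Bounds of the requested position specifying divided region.
    r0, r1 = row * height // grid, (row + 1) * height // grid
    c0, c1 = col * width // grid, (col + 1) * width // grid
    hist = [0] * codebook_size
    # Reuse the stored values; no feature extraction is repeated here.
    for r in range(r0, r1):
        for c in range(c0, c1):
            hist[codes[r][c]] += 1
    return hist


# Hypothetical stored values: 6 x 6 scene recognition regions, 4 codewords.
codes = [[(6 * r + c) % 4 for c in range(6)] for r in range(6)]
a1 = region_histogram(codes, row=0, col=0, grid=3, codebook_size=4)
print(a1)  # [1, 1, 1, 1]
```

Because the histogram is built from values already held in storage, generating it for each of the nine regions requires only counting, which is consistent with the small amount of processing described for this step.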

Then, in step S320, the SVM calculation unit 240, under the control of the position specifying control unit 260, acquires the histogram representing the entire image stored in the histogram storage unit 280. The SVM calculation unit 240 performs the simple SVM calculation, which calculates the similarity between the acquired histogram representing the entire image and the position specifying histogram representing the first position specifying divided region A1 generated by the histogram generation unit 230. The SVM calculation unit 240 then outputs the information representing the similarity to the “dog” category, obtained from the result of the simple SVM calculation for region A1, as information for determining in a simplified manner whether a “dog” appears in position specifying divided region A1.

In the simple SVM calculation as well, as in the SVM calculation that the SVM calculation unit 240 performs in the scene recognition process, the position specifying histogram representing the first position specifying divided region A1 and the acquired histogram representing the entire image are first normalized so that the sizes of the regions they represent are equivalent. The absolute difference between the frequencies of the same class (bin) in the two histograms is then calculated, and the absolute differences of all classes are summed. The SVM calculation unit 240 thereby outputs information representing the similarity between the histogram of the entire image and the position specifying histogram of region A1, from which it can be determined in a simplified manner whether the “dog” that had the highest similarity in the scene recognition process appears in region A1. In the simple SVM calculation as well, the position specifying divided region with the smallest sum of absolute differences can be determined to be the region in which the object “dog” appears, and information specifying the position of that region can be output.
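The comparison described above, normalization followed by a sum of per-class absolute differences, might be sketched as below. The description does not state how the normalization is carried out, so dividing each histogram by its total count is an assumption made here; the function name `normalized_sad` is likewise hypothetical. A smaller value corresponds to a higher similarity.

```python
from typing import Sequence


def normalized_sad(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Sum of absolute differences between two histograms after normalization."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of classes")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("histograms must not be empty")
    # Assumed normalization: scale each histogram to unit total so that a
    # small region and the whole image become comparable.
    return sum(abs(a / total_a - b / total_b) for a, b in zip(hist_a, hist_b))


whole_image = [9, 9, 9, 9]
region_like_whole = [1, 1, 1, 1]
region_unlike_whole = [4, 0, 0, 0]
print(normalized_sad(region_like_whole, whole_image))
print(normalized_sad(region_unlike_whole, whole_image))
```

In this toy data the first region has the same class proportions as the whole image and therefore yields the smaller value, so it would be the one selected.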

また、ＳＶＭ演算部２４０は、位置特定制御部２６０から指定された１つ目の位置特定分割領域Ａ１を表す位置特定ヒストグラムに対する簡易ＳＶＭ演算が完了したとき、１つ目の位置特定分割領域Ａ１を表す位置特定ヒストグラムに対する簡易ＳＶＭ演算が完了したことを位置特定制御部２６０に通知する。この通知に応じて、位置特定制御部２６０は、ステップＳ３３０の判定を行い、ステップＳ３００に戻って、２つ目の位置特定分割領域Ａ２を指定する。 In addition, when the simple SVM calculation for the position specifying histogram representing the first position specifying divided area A1 designated by the position specifying control section 260 is completed, the SVM calculating section 240 determines the first position specifying divided area A1. The position specifying control unit 260 is notified that the simple SVM calculation for the position specifying histogram is completed. In response to this notification, the position specifying control unit 260 performs the determination in step S330, returns to step S300, and designates the second position specifying divided region A2.

The histogram generation unit 230 then acquires the quantization vector values corresponding to the second position specifying divided region A2 designated by the position specifying control unit 260 from the quantized vector storage unit 170 (see FIG. 5(b-1)) and generates from them a position specifying histogram representing region A2 (see FIG. 5(b-2)). The histogram generation unit 230 outputs the generated position specifying histogram to the SVM calculation unit 240.

Then, in step S320, the SVM calculation unit 240, under the control of the position specifying control unit 260, performs the simple SVM calculation, which calculates the similarity between the acquired histogram representing the entire image and the position specifying histogram representing the second position specifying divided region A2 generated by the histogram generation unit 230. The SVM calculation unit 240 then outputs the information representing the similarity to the “dog” category, obtained from the result of the simple SVM calculation for region A2, as information for determining in a simplified manner whether a “dog” appears in position specifying divided region A2.

以降、同様に、位置特定制御部２６０が、入力された画像（図３（ａ）参照）全体の領域を分割した９つの位置特定分割領域（図３（ｂ）参照）を順次指定し、ヒストグラム生成部２３０が、位置特定制御部２６０によって指定されたそれぞれの位置特定分割領域を表す位置特定ヒストグラムを順次生成してＳＶＭ演算部２４０に出力する。また、同様に、ＳＶＭ演算部２４０が、ヒストグラム保存部２８０から取得した画像の全体を表すヒストグラムと、ヒストグラム生成部２３０が生成したそれぞれの位置特定分割領域を表す位置特定ヒストグラムとの類似度を算出する簡易ＳＶＭ演算を行う。そして、ＳＶＭ演算部２４０は、それぞれの位置特定分割領域に対して算出した簡易ＳＶＭ演算の結果に基づいて得られる、それぞれの位置特定分割領域内に「犬」が写っているか否かを簡易的に判別する情報を出力する。図８には、ＳＶＭ演算部２４０が、ヒストグラム保存部２８０に保存されている画像全体のヒストグラムと、ヒストグラム生成部２３０が生成した位置特定ヒストグラムのそれぞれとを比較する簡易ＳＶＭ演算を実行している状態の一例を示している。 Thereafter, similarly, the position specifying control unit 260 sequentially designates nine position specifying divided areas (see FIG. 3B) obtained by dividing the entire area of the input image (see FIG. 3A), and the histogram The generation unit 230 sequentially generates a position specifying histogram representing each position specifying divided region designated by the position specifying control unit 260 and outputs the position specifying histogram to the SVM calculation unit 240. Similarly, the SVM calculation unit 240 calculates the degree of similarity between the histogram representing the entire image acquired from the histogram storage unit 280 and the position identification histogram representing each position identification divided region generated by the histogram generation unit 230. Simple SVM calculation is performed. Then, the SVM calculation unit 240 simply determines whether or not “dog” is reflected in each position-specific divided area obtained based on the result of the simple SVM calculation calculated for each position-specific divided area. Output the information to be determined. In FIG. 8, the SVM calculation unit 240 executes a simple SVM calculation that compares the histogram of the entire image stored in the histogram storage unit 280 with each of the position specifying histograms generated by the histogram generation unit 230. An example of a state is shown.

また、ＳＶＭ演算部２４０は、それぞれの位置特定分割領域における「犬」であるカテゴリとの類似度を表す情報に基づいて、類似度が最も大きい位置特定分割領域を、対象物である「犬」が写っている位置特定分割領域であると判別し、その位置特定分割領域を特定する情報を出力する。 In addition, the SVM calculation unit 240 selects the position-specific divided area having the highest similarity as the object “dog” based on the information indicating the similarity to the category “dog” in each position-specific divided area. Is identified as a position-specific divided area, and information for specifying the position-specific divided area is output.

ここまでの処理が、対象物位置特定システム２０による対象物の位置特定の処理における、シーン認識の処理において判別した対象物が写っている位置特定分割領域の簡易的な判別の処理である。 The process so far is the simple determination process of the position specifying divided region in which the object determined in the scene recognition process is included in the object position specifying process by the object position specifying system 20.

Next, when the simplified determination has been completed for all position specifying divided regions in step S330, the position specifying control unit 260 causes the SVM calculation unit 240 to perform the SVM calculation on the position specifying divided region that was determined, in a simplified manner, to contain the object with the highest similarity in the scene recognition process (“dog” in the example shown in FIG. 3). More specifically, in step S340, the position specifying control unit 260 causes the SVM calculation unit 240 to perform an SVM calculation that calculates the similarity between the position specifying histogram of that region and the histograms of the teacher data, included in the teacher data group 150, of the object category that the scene recognition process determined to have the highest similarity. The object position specifying system 20 then outputs the information representing the similarity to the “dog” category obtained from the result of the SVM calculation, and completes its processing.
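The overall flow of steps S300 to S340 can be sketched as a two-stage procedure. The sketch below is illustrative and rests on several assumptions: histograms are compared by a normalized sum of absolute differences, and the comparison against the teacher histograms is reduced to a single score by taking the smallest distance, which the description does not specify. The names `l1_distance` and `locate_object` and all sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def l1_distance(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Normalized sum of absolute differences (smaller means more similar)."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    return sum(abs(a / total_a - b / total_b) for a, b in zip(hist_a, hist_b))


def locate_object(region_hists: List[Sequence[float]],
                  whole_hist: Sequence[float],
                  teacher_hists: List[Sequence[float]]) -> Tuple[int, float]:
    # Stage 1 (steps S300 to S330): screen every region against the single
    # stored whole-image histogram and keep the closest one.
    screening = [l1_distance(h, whole_hist) for h in region_hists]
    best = min(range(len(region_hists)), key=lambda i: screening[i])
    # Stage 2 (step S340): compare only the selected region against the
    # teacher histograms of the determined category.
    # Assumption: the final score is the smallest distance found.
    score = min(l1_distance(region_hists[best], t) for t in teacher_hists)
    return best, score


regions = [[4, 0, 0, 0], [1, 1, 1, 1], [0, 0, 2, 2]]
whole = [5, 1, 3, 3]                      # class-wise sum of the regions
teachers = [[2, 2, 2, 2], [8, 0, 0, 0]]   # hypothetical category histograms
best_index, best_score = locate_object(regions, whole, teachers)
print(best_index)  # 1
```

Only the region chosen in the first stage is compared against the teacher histograms, which is where the reduction in processing comes from.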

In the SVM calculation that the SVM calculation unit 240 performs in step S340 as well, as in the SVM calculation that the SVM calculation unit 240 performs in the scene recognition process, the absolute difference between the frequencies of the same class in the position specifying histogram of the region identified by the simplified determination and in the histogram of each piece of teacher data is calculated, and the absolute differences of all classes are summed.

このようにして、対象物位置特定システム２０では、入力された画像（図３（ａ））全体の領域を分割した全ての位置特定分割領域に対する簡易的な判別を繰り返すことによって、それぞれの位置特定分割領域の中で、画像全体のヒストグラムとの類似度が最も高い位置特定分割領域を、シーン認識の処理によって判別した「犬」が対象物として写っている位置特定分割領域として簡易的に特定することができる。そして、対象物位置特定システム２０では、簡易的に特定した位置特定分割領域の位置を、シーン認識の処理において判別した対象物が写っている位置特定分割領域の位置として、類似度を表す情報を出力することができる。 In this manner, the object position specifying system 20 repeats simple discrimination for all position specifying divided areas obtained by dividing the entire area of the input image (FIG. 3A), thereby specifying each position. Among the divided areas, the position-specific divided area having the highest similarity to the histogram of the entire image is simply identified as the position-specific divided area in which the “dog” determined by the scene recognition process is reflected as the target object. be able to. Then, the object position specifying system 20 uses the position of the position specifying divided area simply specified as the position of the position specifying divided area in which the object determined in the scene recognition process is reflected, and represents information indicating similarity. Can be output.

According to the present embodiment, an object position specifying system (object position specifying system 20) is configured that further includes a histogram storage unit (histogram storage unit 280) that stores the histogram representing the entire image generated by the histogram generation unit (histogram generation unit 230), wherein, in the position specifying process, the position specifying control unit (position specifying control unit 260) causes the SVM calculation unit (SVM calculation unit 240) to execute an SVM calculation that compares each of the histograms representing the position specifying divided regions with the histogram representing the entire image stored in the histogram storage unit 280.

As described above, the object position specifying system 20 of the second embodiment, like the object position specifying system 10 of the first embodiment, stores in the quantization vector storage unit 170 the quantization vector value of each scene recognition divided region generated by the quantization vector generation unit 120 in the scene recognition process. As a result, in the object position specifying process of the second embodiment, as in the first embodiment, the position specifying histogram for each position specifying divided region can be generated with little processing by using the quantization vector values stored in the quantization vector storage unit 170.
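The reuse of stored quantization vector values can be sketched as below. The data layout is an assumption made for illustration: each scene recognition divided region is represented by a single codeword index held in a 2-D grid, and a position specifying divided region is a rectangular block of such cells. The name `region_histogram` and its parameters are hypothetical.

```python
from collections import Counter


def region_histogram(stored_codes, rows, cols, num_classes):
    """Build a position specifying histogram from already stored codeword indices.

    No local feature vectors are recomputed; only the stored values are counted.
    """
    counts = Counter(stored_codes[r][c] for r in rows for c in cols)
    return [counts.get(k, 0) for k in range(num_classes)]


# Hypothetical 3x3 grid of stored codeword indices (3 classes: 0, 1, 2).
stored = [
    [0, 1, 1],
    [2, 1, 0],
    [0, 0, 2],
]
# Histogram of the block covering rows 0-1 and columns 1-2.
hist = region_histogram(stored, range(0, 2), range(1, 3), num_classes=3)
```

Because the counting step only reads stored values, the cost per region is proportional to the number of cells in that region.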

Further, the object position specifying system 20 of the second embodiment stores, in the histogram storage unit 280, the histogram of the entire image generated by the histogram generation unit 230 in the scene recognition process. In the object position specifying process, the simple SVM calculation is then performed using the histogram of the entire image stored in the histogram storage unit 280 instead of the histograms of the teacher data of the category of the object with the highest similarity determined in the scene recognition process. This allows the SVM calculation on the position specifying histogram of each position specifying divided region generated by the histogram generation unit 230, which is performed to specify the position in the image where the object appears, to be carried out in a simplified manner. In other words, the SVM calculation for one position specifying histogram can be performed using only the single histogram stored in the histogram storage unit 280, rather than the large amount of teacher data included in the category of the object with the highest similarity. As a result, the object position specifying system 20 of the second embodiment can narrow down the position specifying divided regions that require the detailed SVM calculation, and can shorten the calculation time required to specify the position in the image where the object appears even further than the object position specifying system 10 of the first embodiment.

For the object position specifying system 20 of the second embodiment, as for the object position specifying system 10 of the first embodiment, a configuration has been described in which the SVM calculation unit 240 outputs the information representing the similarity for each object category and the information specifying the position of the position specifying divided region in which the determined object appears. As in the first embodiment, however, a component other than the SVM calculation unit 240 may output this information. For example, the position specifying SVM calculation determination unit 262 provided in the position specifying control unit 260 may output the information specifying the position of the position specifying divided region in which the determined object appears, based on the information representing the similarity to the object category obtained from the results of the simple SVM calculation performed by the SVM calculation unit 240 on each position specifying divided region.

For the object position specifying system 20 of the second embodiment, a configuration has been described in which storing the histogram of the entire image, that is, providing the histogram storage unit 280, simplifies the SVM calculation performed by the SVM calculation unit 240 and shortens the calculation time required to specify the position in the image where the object appears. However, the SVM calculation performed by the SVM calculation unit 240 to specify that position can also be simplified without a configuration for storing the histogram of the entire image.

<Third Embodiment>
Next, a third embodiment of the present invention will be described. FIG. 9 is a block diagram showing a schematic configuration of an object position specifying system according to the third embodiment. In FIG. 9, the object position specifying system 30 includes a local feature vector generation unit 110, a quantization vector generation unit 120, a histogram generation unit 130, an SVM calculation unit 140, a teacher data group 150, a position specifying control unit 360, a quantization vector storage unit 170, and a teacher data switching unit 390.

The object position specifying system 30 shown in FIG. 9 has a configuration in which the position specifying control unit 160 of the object position specifying system 10 of the first embodiment shown in FIG. 1 is replaced by the position specifying control unit 360, and in which the teacher data switching unit 390 is additionally provided. The other components of the object position specifying system 30 are the same as those of the object position specifying system 10 of the first embodiment shown in FIG. 1. Accordingly, the description of the third embodiment covers only the components and operations that differ from the first embodiment, and detailed description of the components and operations that are the same is omitted.

Like the object position specifying system 10 of the first embodiment, the object position specifying system 30 performs, on an input image, a scene recognition process that recognizes the subject (object) shown in the image or the scene in which the image was captured, and outputs, as the information determined by the scene recognition process, information on the similarity to each set of teacher data classified by object type. Also like the first embodiment, the object position specifying system 30 performs a position specifying process that specifies the position at which the determined object appears in the image subjected to the scene recognition process, and outputs information representing the specified position.

Under the control of the position specifying control unit 360, the local feature vector generation unit 110 generates a local feature vector of the image input to the object position specifying system 30 for each scene recognition divided region, and outputs the value of the local feature vector of each scene recognition divided region to the quantization vector generation unit 120.

Under the control of the position specifying control unit 360, the quantization vector generation unit 120 generates a quantization vector for each scene recognition divided region by quantizing the local feature vector value of that region input from the local feature vector generation unit 110. It outputs the generated quantization vector value of each scene recognition divided region to the histogram generation unit 130 and also stores it in the quantization vector storage unit 170.

In the scene recognition process under the control of the position specifying control unit 360, the histogram generation unit 130 generates a histogram representing the entire image based on the quantization vector values of the respective scene recognition divided regions input from the quantization vector generation unit 120, and outputs the generated histogram of the entire image to the SVM calculation unit 140. In the object position specifying process under the control of the position specifying control unit 360, the histogram generation unit 130 generates a position specifying histogram for each position specifying divided region based on the quantization vector values of the respective scene recognition divided regions stored in the quantization vector storage unit 170, and outputs each generated position specifying histogram to the SVM calculation unit 140.

In the scene recognition process under the control of the position specifying control unit 360, the SVM calculation unit 140 performs an SVM calculation that compares the histogram of the entire image input from the histogram generation unit 130 with the histogram of each teacher data item included in the teacher data group 150, and calculates the similarity for each object category into which the teacher data group 150 is classified. In the object position specifying system 30, however, each teacher data item that the SVM calculation unit 140 compares with the histogram of the entire image is input via the teacher data switching unit 390.

In the object position specifying process under the control of the position specifying control unit 360, the SVM calculation unit 140 performs an SVM calculation that compares each position specifying histogram input from the histogram generation unit 130 with the histogram of each teacher data item included in the teacher data group 150, and calculates, for each position specifying divided region, the similarity to the object category classified in the teacher data group 150. Here too, each teacher data item that the SVM calculation unit 140 compares with each position specifying histogram is input via the teacher data switching unit 390. As in the object position specifying system 20 of the second embodiment, the object position specifying system 30 first performs a simplified SVM calculation on each position specifying histogram. After the simplified calculation has been completed for all position specifying histograms, it performs a further SVM calculation on the position specifying histogram with the highest similarity to calculate the similarity to the object category classified in the teacher data group 150. In the following description, the simplified SVM calculation in the object position specifying system 30 is also referred to as the "simple SVM calculation".

Under the control of the position specifying control unit 360, the teacher data switching unit 390 switches the teacher data input to the SVM calculation unit 140. More specifically, when the SVM calculation unit 140 performs the simple SVM calculation on each position specifying histogram input from the histogram generation unit 130, the teacher data switching unit 390 inputs to the SVM calculation unit 140 only the histograms of teacher data selected according to a predetermined condition. When the SVM calculation unit 140 performs the further SVM calculation after the simple SVM calculation has been completed for the position specifying histograms of all position specifying divided regions, the teacher data switching unit 390 inputs the histograms of all teacher data of the category of the object with the highest similarity determined in the scene recognition process.

The teacher data histograms input when the SVM calculation unit 140 performs the simple SVM calculation are histograms of teacher data that represent the object category. They are selected according to a predetermined condition under which an object of the same category can be simply determined by the SVM calculation, for example, an image showing the front of the object or an image showing the side of the object. More specifically, if the teacher data group 150 contains 1500 histograms as the teacher data of one category, then, for example, 10 of these 1500 histograms are selected under such a condition. In the following description, the selected teacher data representing the object category is referred to as "extracted teacher data".
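The selection of extracted teacher data can be sketched as follows. The text only says that a predetermined condition is used and gives front and side views as examples, so the `view` tag on each teacher data item, the field names, and the function `select_extracted` are all hypothetical, introduced purely for illustration.

```python
def select_extracted(teacher_data, views=("front", "side"), limit=10):
    """Pick a small representative subset of one category's teacher data.

    Assumption: each item is a dict with a 'view' tag and a 'hist' histogram.
    """
    matching = [item for item in teacher_data if item["view"] in views]
    return matching[:limit]


# 1500 hypothetical teacher data items for one category.
all_views = ("front", "side", "back", "other")
category_data = [
    {"view": all_views[i % 4], "hist": [i % 3, i % 5]} for i in range(1500)
]
extracted = select_extracted(category_data)
```

With the example figures from the text, the simple SVM calculation would then compare each position specifying histogram with 10 histograms instead of 1500.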

The position specifying control unit 360 controls the entire object position specifying system 30, that is, the operation of each of the local feature vector generation unit 110, the quantization vector generation unit 120, the histogram generation unit 130, the SVM calculation unit 140, and the teacher data switching unit 390 provided in the object position specifying system 30. The position specifying control unit 360 includes a histogram generation divided region designation unit 161 and a position specifying SVM calculation determination unit 362.

The position specifying control unit 360 has a configuration in which the position specifying SVM calculation determination unit 262 in the position specifying control unit 260 of the object position specifying system 20 of the second embodiment shown in FIG. 6 is replaced by the position specifying SVM calculation determination unit 362. The histogram generation divided region designation unit 161 provided in the position specifying control unit 360 operates in the same way as the histogram generation divided region designation unit 161 in the position specifying control unit 160 of the first embodiment and in the position specifying control unit 260 of the second embodiment. Detailed description of the operation of the histogram generation divided region designation unit 161 is therefore omitted.

In the object position specifying process of the object position specifying system 30, the position specifying SVM calculation determination unit 362 selects which histograms the SVM calculation unit 140 uses when performing the SVM calculation on each position specifying histogram input from the histogram generation unit 130: either the histograms of the teacher data included in the teacher data group 150, or the histograms of the extracted teacher data selected according to the predetermined condition. Accordingly, in the object position specifying process, the teacher data switching unit 390 outputs to the SVM calculation unit 140 either the histograms of all teacher data of objects of the same category included in the teacher data group 150, or the histograms of the part of the teacher data in the teacher data group 150 selected according to the predetermined condition. More specifically, when the SVM calculation unit 140 performs the simple SVM calculation on each position specifying histogram input from the histogram generation unit 130, the position specifying SVM calculation determination unit 362 controls the teacher data switching unit 390 so that the histograms compared with each position specifying histogram are those of the part of the teacher data representing the category of the object with the highest similarity determined in the scene recognition process. When the SVM calculation unit 140 performs the further SVM calculation after the simple SVM calculation has been completed for the position specifying histograms of all position specifying divided regions, the position specifying SVM calculation determination unit 362 controls the teacher data switching unit 390 so that the position specifying histogram that showed the highest similarity to the teacher data in the simple SVM calculation is compared with the histograms of all teacher data of the category of the object with the highest similarity determined in the scene recognition process.
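The switching behaviour described above can be sketched as a small selector. This is an illustrative sketch only: the class name `TeacherDataSwitch`, the dictionary layout keyed by category, and the boolean `simple` flag are assumptions, not part of the described system.

```python
class TeacherDataSwitch:
    """Returns either the extracted subset or the full set for a category."""

    def __init__(self, all_by_category, extracted_by_category):
        self._all = all_by_category              # e.g. 1500 histograms per category
        self._extracted = extracted_by_category  # e.g. 10 histograms per category

    def select(self, category, simple):
        """simple=True: simple SVM calculation; simple=False: further SVM calculation."""
        source = self._extracted if simple else self._all
        return source[category]


# Hypothetical teacher data for the "dog" category.
switch = TeacherDataSwitch(
    all_by_category={"dog": [[4, 1], [3, 2], [5, 0]]},
    extracted_by_category={"dog": [[4, 1]]},
)
coarse_set = switch.select("dog", simple=True)
full_set = switch.select("dog", simple=False)
```

In this sketch the control decision (the role of the determination unit) is reduced to the `simple` flag passed by the caller.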

With this configuration, the object position specifying system 30, like the object position specifying system 10 of the first embodiment, generates the position specifying histogram for each position specifying divided region used in the object position specifying process from the quantization vector values of the respective scene recognition divided regions generated by the quantization vector generation unit 120 in the scene recognition process. In addition, in the object position specifying process, the object position specifying system 30 selects either the extracted teacher data or all the teacher data as the teacher data histograms used for the SVM calculation, and performs the process with the selected teacher data. More specifically, in the SVM calculation on the position specifying histogram of each position specifying divided region generated by the histogram generation unit 130, the histograms of the extracted teacher data selected according to the predetermined condition are used instead of the histograms of all teacher data of the category of the object with the highest similarity determined in the scene recognition process (for example, the 1500 histograms included in one category). In other words, instead of the large amount of teacher data normally used for the SVM calculation, the histograms of the extracted teacher data representing that category are used to simply specify the position of the position specifying divided region in which the object determined in the scene recognition process appears. This shortens the time required for the SVM calculation that the SVM calculation unit 140 performs on the position specifying histograms generated by the histogram generation unit 130 in order to identify the position specifying histogram with the highest similarity to the histograms of the extracted teacher data.

Next, the operation of the object position specifying system 30 will be described. FIG. 10 is a flowchart showing the processing procedure in the object position specifying system 30 of the third embodiment. FIG. 11 is a diagram explaining an example of the process of specifying the position of an object in the object position specifying system 30 of the third embodiment. In describing the flowchart of FIG. 10, reference is made as appropriate to the examples of the respective processes in the object position specifying system 10 of the first embodiment shown in FIGS. 3 to 5 and to the example shown in FIG. 11. As in the first embodiment, the description uses an example in which the object shown in the image is a "dog" and the entire image region is divided into nine position specifying divided regions in order to identify the region in which the "dog" appears.

The processing of the object position specifying system 30 includes the same processing as that of the object position specifying system 10 of the first embodiment. For this reason, in the flowchart of FIG. 10, the steps that perform the same processing as in the first embodiment are given the same step numbers as in the flowchart of FIG. 2, which shows the processing procedure of the object position specifying system 10 of the first embodiment. Detailed description of those steps is therefore omitted in the description of the flowchart of FIG. 10.

When an image is input to the object position specifying system 30, the position specifying control unit 360, like the position specifying control unit 160 of the object position specifying system 10 of the first embodiment, controls the operation of each component of the object position specifying system 30 so that the scene recognition process is first performed on the input image and the object position specifying process is performed afterwards (see FIG. 3).

In the scene recognition process of the object position specifying system 30, first, in steps S100 to S115, the position specifying control unit 360 causes the local feature vector generation unit 110 to generate a local feature vector for each scene recognition divided region of the input image (see FIG. 3(a)), causes the quantization vector generation unit 120 to generate a quantization vector for each scene recognition divided region, and has the generated quantization vector values stored in the quantization vector storage unit 170 for each scene recognition divided region.

Subsequently, in steps S120 to S130, the position specifying control unit 360 causes the histogram generation unit 130 to generate a histogram representing the entire input image (see FIG. 3(a)) from the quantization vector values of the respective scene recognition divided regions, and causes the SVM calculation unit 140 to execute the SVM calculation on the generated histogram of the entire image. The object position specifying system 30 can thereby determine that a "dog" appears in the input image (see FIG. 3(a)), and outputs information representing the similarity for each object category (see FIG. 4).

When the scene recognition process of steps S100 to S130 is completed, the object position specifying system 30 starts, from step S400, the position specifying process that specifies the position at which the determined object appears in the image subjected to the scene recognition process. In this process, as in the object position specifying system 10 of the first embodiment and the object position specifying system 20 of the second embodiment, a position specifying histogram is first generated using the quantization vector values of the respective scene recognition divided regions that the quantization vector generation unit 120 stored in the quantization vector storage unit 170 during the scene recognition process. The simple SVM calculation is then performed on the generated position specifying histogram for each position specifying divided region, using the histograms of the extracted teacher data. Finally, as in the object position specifying system 20 of the second embodiment, the SVM calculation is performed on the position specifying histogram representing the position specifying divided region that the simple SVM calculation simply determined to show the object determined in the scene recognition process.

First, in steps S400 to S410, as in steps S200 to S210 of the object position specifying system 10 of the first embodiment, the position specifying control unit 360 designates the first position specifying divided region to the histogram generation unit 130, causes it to acquire the quantization vector values corresponding to that region from the quantization vector storage unit 170, and causes it to generate a position specifying histogram representing the first position specifying divided region.

Subsequently, in step S420, the position specifying control unit 360 causes the SVM calculation unit 140 to execute the simple SVM calculation, which calculates the similarity between the position specifying histogram of the first position specifying divided region generated by the histogram generation unit 130 and the histograms of the extracted teacher data. The object position specifying system 30 can thereby make a simple determination of whether the object with the highest similarity in scene recognition (the "dog" in the example of FIG. 3) appears in the first position specifying divided region.

Subsequently, in step S430, the position specifying control unit 360 determines whether the simple determination has been completed for all position specifying divided regions into which the entire input image was divided. If it has not, the processing returns to step S400, designates the next position specifying divided region, and repeats the simple determination of steps S400 to S420 until every position specifying divided region has been processed. If it has, the object position specifying system 30 ends the simple determination and proceeds to step S440.

The simple determination processing of steps S400 to S420 performed by the object position specifying system 30 is now described. FIG. 11 is a diagram explaining the concept of the processing by which the object position specifying system 30 of the third embodiment specifies the position of an object in a simple manner.

In the position specifying processing of the object position specifying system 30, the position specifying control unit 360 first, in steps S400 and S410, designates the first position specifying divided region A1 among the nine position specifying divided regions (see FIG. 3(b)) into which the entire input image (see FIG. 3(a)) is divided, as in steps S200 and S210 of the object position specifying system 10 of the first embodiment. The histogram generation unit 130 then acquires the quantization vector values corresponding to the designated region A1 from the quantization vector storage unit 170 (see FIG. 5(a-1)), generates a position specifying histogram representing region A1 (see FIG. 5(a-2)), and outputs that histogram to the SVM calculation unit 140.
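Building a position specifying histogram from stored quantization vector values, without recomputing local features, might look like the following. It is a sketch under stated assumptions: the stored values are integer codeword indices on a 6x6 grid of scene recognition divided regions, and each of the nine position specifying divided regions covers a 2x2 block of that grid. The grid sizes, the data, and the name `region_histogram` are hypothetical.

```python
def region_histogram(stored, row0, col0, height, width, num_codewords):
    """Histogram of the stored codeword indices inside one rectangular region."""
    hist = [0] * num_codewords
    for r in range(row0, row0 + height):
        for c in range(col0, col0 + width):
            hist[stored[r][c]] += 1
    return hist


# Hypothetical contents of the quantization vector storage: a 6x6 grid of
# codeword indices saved during scene recognition.
stored = [
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 1, 2, 3],
    [3, 3, 0, 0, 1, 1],
    [3, 2, 0, 1, 1, 1],
    [2, 2, 3, 3, 0, 0],
    [2, 2, 3, 0, 0, 1],
]

# Region A1 is assumed to be the top-left 2x2 block, A2 the block to its right.
hist_a1 = region_histogram(stored, 0, 0, 2, 2, num_codewords=4)
hist_a2 = region_histogram(stored, 0, 2, 2, 2, num_codewords=4)
```

Only counting is needed at this stage, which is consistent with the text's point that reusing the stored values keeps the per-region work small.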

Then, in step S420, under the control of the position specifying control unit 360, the SVM calculation unit 140 performs the simple SVM calculation, which calculates the similarity between the position specifying histogram of the first position specifying divided region A1 generated by the histogram generation unit 130 and each extracted teacher data histogram input via the teacher data switching unit 390. The SVM calculation unit 140 outputs the information representing the similarity to the "dog" category, obtained from the result of the simple SVM calculation for region A1, as information for making a simple determination of whether a "dog" appears in region A1.

In the simple SVM calculation as well, as in the SVM calculation of the scene recognition processing, the SVM calculation unit 140 normalizes the position specifying histogram of the first position specifying divided region A1 and each extracted teacher data histogram so that the sizes of the regions they represent are equivalent, and then sums the absolute differences between the frequencies of corresponding bins of the two histograms. The SVM calculation unit 140 thereby outputs information representing the similarity between each extracted teacher data histogram and the position specifying histogram of region A1, from which it can be simply determined whether the "dog" that had the highest similarity in scene recognition appears in region A1. In the simple SVM calculation as well, the position specifying divided region with the smallest sum of absolute differences is determined to be the region in which the target "dog" appears, and information specifying the position of that region is output.
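The comparison described above, normalization followed by a sum of absolute bin differences, can be sketched as follows. The text calls this an SVM calculation, but the sketch implements only the arithmetic the paragraph describes, not a trained support vector machine. Normalizing each histogram to unit sum is an assumption about what making the region sizes equivalent means, and the function names and data are hypothetical.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (assumed normalization)."""
    total = sum(hist)
    if total == 0:
        return [0.0] * len(hist)
    return [v / total for v in hist]


def sum_abs_diff(hist_a, hist_b):
    """Sum of absolute differences between corresponding normalized bins.

    A smaller value means the two histograms are more similar.
    """
    a, b = normalize(hist_a), normalize(hist_b)
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical data: a 4-sample region histogram and a teacher histogram
# built from a region ten times larger.
region_hist = [3, 1, 0, 0]
teacher_hist = [30, 10, 0, 0]
other_region_hist = [0, 4, 0, 0]

score_same = sum_abs_diff(region_hist, teacher_hist)
score_other = sum_abs_diff(other_region_hist, teacher_hist)
```

Because both histograms are rescaled before the subtraction, a small region and a large teacher image with the same bin proportions score as identical.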

When the simple SVM calculation on the position specifying histogram of the first position specifying divided region A1 designated by the position specifying control unit 360 is complete, the SVM calculation unit 140 notifies the position specifying control unit 360 of that completion. In response, the position specifying control unit 360 performs the determination of step S430, returns to step S400, and designates the second position specifying divided region A2.

The histogram generation unit 130 then acquires the quantization vector values corresponding to the designated second position specifying divided region A2 from the quantization vector storage unit 170 (see FIG. 5(b-1)), generates a position specifying histogram representing region A2 (see FIG. 5(b-2)), and outputs that histogram to the SVM calculation unit 140.

Then, in step S420, under the control of the position specifying control unit 360, the SVM calculation unit 140 performs the simple SVM calculation, which calculates the similarity between the position specifying histogram of the second position specifying divided region A2 generated by the histogram generation unit 130 and each extracted teacher data histogram input via the teacher data switching unit 390. The SVM calculation unit 140 outputs the information representing the similarity to the "dog" category, obtained from the result of the simple SVM calculation for region A2, as information for making a simple determination of whether a "dog" appears in region A2.

Thereafter, in the same way, the position specifying control unit 360 sequentially designates the nine position specifying divided regions (see FIG. 3(b)) into which the entire input image (see FIG. 3(a)) is divided, and the histogram generation unit 130 sequentially generates the position specifying histogram of each designated region and outputs it to the SVM calculation unit 140. Likewise, the SVM calculation unit 140 performs the simple SVM calculation, which calculates the similarity between each position specifying histogram generated by the histogram generation unit 130 and each extracted teacher data histogram input via the teacher data switching unit 390. The SVM calculation unit 140 then outputs, for each position specifying divided region, the information for simply determining whether a "dog" appears in it, obtained from the result of the simple SVM calculation for that region. FIG. 11 shows an example of the state in which the SVM calculation unit 140 executes the simple SVM calculation, comparing each extracted teacher data histogram (10 histograms selected from the 1500 histograms in the "dog" category) with each position specifying histogram generated by the histogram generation unit 130.

Furthermore, based on the information representing the similarity to the "dog" category for each position specifying divided region, the SVM calculation unit 140 determines that the region with the highest similarity is the one in which the target "dog" appears, and outputs information specifying that region.
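Selecting the region with the highest similarity amounts to taking the minimum sum of absolute differences over the nine regions. The sketch below is illustrative only. The patent does not say how the scores against several extracted teacher histograms are combined into one per-region score, so taking the minimum over them is an assumption; the names `best_region` and `sum_abs_diff` and all histogram values are hypothetical.

```python
def sum_abs_diff(hist_a, hist_b):
    """Sum of absolute differences between unit-sum normalized histograms.

    Assumes both histograms are non-empty (non-zero total).
    """
    total_a, total_b = sum(hist_a), sum(hist_b)
    return sum(abs(a / total_a - b / total_b) for a, b in zip(hist_a, hist_b))


def best_region(region_hists, extracted_teacher_hists):
    """Return the region whose histogram is closest to any extracted teacher
    histogram, together with the per-region scores (smaller is more similar)."""
    scores = {
        name: min(sum_abs_diff(hist, t) for t in extracted_teacher_hists)
        for name, hist in region_hists.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical position specifying histograms for regions A1..A9.
region_hists = {
    "A1": [4, 0, 0, 0], "A2": [3, 1, 0, 0], "A3": [2, 2, 0, 0],
    "A4": [2, 0, 0, 2], "A5": [0, 1, 3, 0], "A6": [0, 2, 2, 0],
    "A7": [1, 1, 1, 1], "A8": [0, 0, 2, 2], "A9": [0, 0, 0, 4],
}
# Hypothetical extracted teacher histograms for the "dog" category.
extracted = [[0, 2, 6, 0], [0, 1, 3, 0]]

winner, scores = best_region(region_hists, extracted)
```

With this data the centre region A5 matches the teacher proportions exactly, and A6 is the next closest.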

The processing up to this point is the simple determination, within the position specifying processing of the object position specifying system 30, of the position specifying divided region in which the object identified in scene recognition appears.

Subsequently, when the simple determination has been completed for all position specifying divided regions in step S430, the position specifying control unit 360 causes the SVM calculation unit 140 to execute the SVM calculation on the position specifying divided region that was simply determined to contain the object with the highest similarity in scene recognition (the "dog" in the example of FIG. 3). More specifically, in step S440, the position specifying control unit 360 causes the SVM calculation unit 140 to execute the SVM calculation that calculates the similarity between the position specifying histogram of that region and the histogram of each teacher data item, contained in the teacher data group 150, in the category of the object that had the highest similarity in scene recognition. The object position specifying system 30 then outputs the information representing the similarity to the "dog" category obtained from the result of the SVM calculation, and completes its processing.
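The two-stage flow of steps S400 to S440 (a coarse pass over every region with the extracted teacher data, then one detailed pass on the selected region with all teacher data of the category) can be sketched as below. This is an illustration under assumptions, not the patented implementation: the score is a normalized sum of absolute differences, per-region scores are combined with a minimum, and `locate`, `comparison_counts`, and the data are hypothetical. The figures 9, 1500, and 10 in the last line are the region and histogram counts quoted in the text.

```python
def sum_abs_diff(hist_a, hist_b):
    """Sum of absolute differences between unit-sum normalized histograms."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    return sum(abs(a / total_a - b / total_b) for a, b in zip(hist_a, hist_b))


def locate(region_hists, category_teacher_hists, extracted_indices):
    """Coarse pass with the extracted subset, then a detailed pass on the winner."""
    extracted = [category_teacher_hists[i] for i in extracted_indices]
    comparisons = 0
    coarse = {}
    for name, hist in region_hists.items():  # corresponds to S400..S430
        coarse[name] = min(sum_abs_diff(hist, t) for t in extracted)
        comparisons += len(extracted)
    candidate = min(coarse, key=coarse.get)
    # Corresponds to S440: compare only the candidate with every teacher histogram.
    fine = min(sum_abs_diff(region_hists[candidate], t)
               for t in category_teacher_hists)
    comparisons += len(category_teacher_hists)
    return candidate, fine, comparisons


def comparison_counts(num_regions, num_teacher, num_extracted):
    """Histogram comparisons for an exhaustive pass versus the two-stage pass."""
    return num_regions * num_teacher, num_regions * num_extracted + num_teacher


regions = {"A1": [4, 0, 0, 0], "A2": [0, 1, 3, 0], "A3": [0, 0, 0, 4]}
teachers = [[0, 1, 3, 0], [0, 2, 6, 0], [0, 2, 2, 0], [0, 3, 1, 0]]
candidate, fine_score, used = locate(regions, teachers, extracted_indices=[0])
exhaustive, two_stage = comparison_counts(9, 1500, 10)
```

Under these assumptions, nine regions compared against 10 extracted histograms plus one detailed pass over 1500 histograms cost 1590 comparisons, against 13500 if every region were compared with every teacher histogram of the category.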

In the SVM calculation of step S440 as well, as in the SVM calculation of the scene recognition processing, the SVM calculation unit 140 sums the absolute differences between the frequencies of corresponding bins of the position specifying histogram of the simply determined position specifying divided region and each teacher data histogram.

In this way, the object position specifying system 30 also repeats the simple determination for all position specifying divided regions into which the entire input image (FIG. 3(a)) is divided, and can thereby simply specify, as the region in which the "dog" identified by scene recognition appears, the position specifying divided region whose similarity to the extracted teacher data histograms is highest. The object position specifying system 30 can then output the information representing the similarity, treating the position of the simply specified region as the position of the position specifying divided region in which the object identified in scene recognition appears.

According to the present embodiment, an object position specifying system (object position specifying system 30) is configured in which the position specifying control unit (position specifying control unit 360), in the position specifying processing, causes the SVM calculation unit 140 to execute an SVM calculation that compares each histogram representing a position specifying divided region with each histogram of a subset of the plurality of teacher data selected according to a predetermined condition (the extracted teacher data).

As described above, in the object position specifying system 30 of the third embodiment as well, as in the object position specifying system 10 of the first embodiment, the quantization vector values of the scene recognition divided regions generated by the quantization vector generation unit 120 during scene recognition are stored in the quantization vector storage unit 170. Consequently, the position specifying processing of the object position specifying system 30 can also generate the position specifying histogram of each position specifying divided region with little processing, using the stored quantization vector values.

In addition, in the position specifying processing, the object position specifying system 30 of the third embodiment performs the simple SVM calculation using the histograms of the extracted teacher data representative of the category of the object that had the highest similarity in scene recognition, instead of the histograms of all teacher data in that category. The SVM calculation on the position specifying histogram of each position specifying divided region generated by the histogram generation unit 130, performed to specify where in the image the object appears, can therefore be carried out in a simplified manner. That is, the SVM calculation on one position specifying histogram is performed not with the large amount of teacher data in the category of the object with the highest similarity, but with the histograms of a subset of the teacher data in the same category selected according to a predetermined condition. This narrows down the position specifying divided regions that require the detailed SVM calculation, so the calculation time needed to specify the position of the object in the image can be shortened further than in the object position specifying system 10 of the first embodiment.

For the object position specifying system 30 of the third embodiment as well, as for the object position specifying system 10 of the first embodiment, a configuration has been described in which the SVM calculation unit 140 outputs the information representing the similarity for each object category and the information specifying the position of the position specifying divided region in which the determined object appears. However, as in the first embodiment, a component other than the SVM calculation unit 140 may output this information. For example, the position specifying SVM calculation determination unit 362 provided in the position specifying control unit 360 may output the information specifying the position of the position specifying divided region in which the determined object appears, based on the information representing the similarity to the object category obtained from the results of the simple SVM calculation that the SVM calculation unit 140 performed on each position specifying divided region.

For the object position specifying system 30 of the third embodiment, a configuration has been described that includes the teacher data switching unit 390, in which the position specifying control unit 360 controls the teacher data switching unit 390 to switch the teacher data histograms used in the simple SVM calculation that the SVM calculation unit 140 performs on each position specifying histogram between the extracted teacher data and all teacher data. However, the method of switching the teacher data histograms used in the simple SVM calculation is not limited to the teacher data switching unit 390. For example, each teacher data item may include information, such as a flag, indicating whether it has been selected as teacher data representative of the object category, that is, as extracted teacher data. The position specifying control unit 360 then specifies whether the SVM calculation unit 140 is to use, for the simple SVM calculation, the teacher data whose flag indicates selection as extracted teacher data or the teacher data whose flag indicates non-selection. This configuration also makes it possible to switch the teacher data histograms used in the simple SVM calculation that the SVM calculation unit 140 performs on each position specifying histogram. With this configuration, the position specifying control unit 360 can directly instruct the SVM calculation unit 140 which teacher data to use for the simple SVM calculation, and the object position specifying system 30 need not include the teacher data switching unit 390.
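The flag-based alternative described above can be sketched with a small record type. This is a minimal sketch with hypothetical names: the `TeacherHistogram` class, its `is_extracted` field, and the function `select_teacher_data` are illustrative and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TeacherHistogram:
    category: str
    bins: Tuple[int, ...]
    is_extracted: bool  # hypothetical flag: selected as extracted teacher data


def select_teacher_data(
    teacher_data: List[TeacherHistogram], category: str, want_extracted: bool
) -> List[TeacherHistogram]:
    """Return the teacher data of one category whose flag matches the request."""
    return [
        t for t in teacher_data
        if t.category == category and t.is_extracted == want_extracted
    ]


teacher_data = [
    TeacherHistogram("dog", (0, 1, 3, 0), True),
    TeacherHistogram("dog", (0, 2, 2, 0), False),
    TeacherHistogram("dog", (1, 1, 2, 0), False),
    TeacherHistogram("cat", (3, 1, 0, 0), True),
]

for_simple_pass = select_teacher_data(teacher_data, "dog", want_extracted=True)
not_extracted = select_teacher_data(teacher_data, "dog", want_extracted=False)
```

Because the selection is driven by a field of each record, the controller can pass the request directly to the calculation step and no separate switching component is needed, which is the point the paragraph makes.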

For the object position specifying system 20 of the second embodiment, an operation was described that includes the histogram storage unit 280, which stores the whole-image histogram generated by the histogram generation unit 230 during scene recognition, and that uses the stored whole-image histogram to simply specify where in the image the object identified by scene recognition appears. For the object position specifying system 30 of the third embodiment, an operation was described that includes the teacher data switching unit 390, which switches the teacher data used in the simple SVM calculation, and that uses the histograms of the extracted teacher data representative of the category of the object with the highest similarity in scene recognition for the same purpose. However, the configuration for simply specifying where in the image the object identified by scene recognition appears is not limited to those of the second and third embodiments. For example, the configuration of the object position specifying system 20 shown in the second embodiment and that of the object position specifying system 30 shown in the third embodiment may be provided together. In that case, for example, the method of simply specifying the position of the object in the image can be switched between the operation of the object position specifying system 20 of the second embodiment and that of the object position specifying system 30 of the third embodiment, depending on the magnitude of the similarity between the teacher data and the object determined in scene recognition.

<Fourth Embodiment>
A fourth embodiment of the present invention, which combines the configuration of the object position specifying system 20 of the second embodiment with that of the object position specifying system 30 of the third embodiment, is now described. FIG. 12 is a block diagram showing the schematic configuration of the object position specifying system according to the fourth embodiment. In FIG. 12, the object position specifying system 40 includes a local feature vector generation unit 110, a quantization vector generation unit 120, a histogram generation unit 230, an SVM calculation unit 140, a teacher data group 150, a position specifying control unit 460, a quantization vector storage unit 170, a histogram storage unit 280, and a teacher data switching unit 490.

The object position specifying system 40 shown in FIG. 12 combines the components of the object position specifying system 20 of the second embodiment shown in FIG. 6 with those of the object position specifying system 30 of the third embodiment shown in FIG. 9. Accordingly, the description of the object position specifying system 40 of the fourth embodiment covers only the components and operations that differ from those of the object position specifying systems 20 and 30.

The teacher data switching unit 490 switches the histogram input to the SVM calculation unit 140 under the control of the position specifying control unit 460. More specifically, the histogram input when the SVM calculation unit 140 performs the simple SVM calculation on each position specifying histogram input from the histogram generation unit 230 is set to one of the following: the histograms of the teacher data in the teacher data group 150, the whole-image histogram stored in the histogram storage unit 280, or the histograms of the extracted teacher data selected according to a predetermined condition.
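The three-way selection performed by the teacher data switching unit 490 can be modelled as a simple dispatch on an enumerated source. The sketch below is hypothetical throughout: the `HistogramSource` enum, the function `switch_histograms`, and the sample histograms are illustrative names and data, not part of the patent.

```python
from enum import Enum, auto


class HistogramSource(Enum):
    ALL_TEACHER_DATA = auto()        # every histogram in the teacher data group
    WHOLE_IMAGE = auto()             # the stored whole-image histogram
    EXTRACTED_TEACHER_DATA = auto()  # the subset chosen by a predetermined condition


def switch_histograms(source, teacher_hists, whole_image_hist, extracted_hists):
    """Return the list of reference histograms to feed to the comparison step."""
    if source is HistogramSource.ALL_TEACHER_DATA:
        return list(teacher_hists)
    if source is HistogramSource.WHOLE_IMAGE:
        return [whole_image_hist]
    if source is HistogramSource.EXTRACTED_TEACHER_DATA:
        return list(extracted_hists)
    raise ValueError(f"unknown histogram source: {source!r}")


teacher_hists = [[0, 1, 3, 0], [0, 2, 2, 0], [1, 1, 2, 0]]
whole_image_hist = [4, 4, 4, 4]
extracted_hists = teacher_hists[:1]

reference = switch_histograms(
    HistogramSource.WHOLE_IMAGE, teacher_hists, whole_image_hist, extracted_hists
)
```

Returning a list in every case lets the comparison step iterate over the reference histograms the same way regardless of which source was selected.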

The position specifying control unit 460 controls the object position specifying system 40 as a whole, that is, the operation of each of the local feature vector generation unit 110, the quantization vector generation unit 120, the histogram generation unit 230, the SVM calculation unit 140, and the teacher data switching unit 490 provided in the object position specifying system 40. The position specifying control unit 460 includes a histogram generation divided region designation unit 161 and a position specifying SVM calculation determination unit 462.

In the position specifying processing of the object position specifying system 40, the position specifying SVM calculation determination unit 462 controls the teacher data switching unit 490 to switch the histogram used when the SVM calculation unit 140 performs the SVM calculation on each position specifying histogram input from the histogram generation unit 230. This switching operation can readily be understood as the combined operation of the position specifying SVM calculation determination unit 262 in the position specifying control unit 260 of the object position specifying system 20 of the second embodiment and the position specifying SVM calculation determination unit 362 in the position specifying control unit 360 of the object position specifying system 30 of the third embodiment, so a detailed description is omitted.

With this configuration, the object position specifying system 40 can, under the control of the position specifying control unit 460, operate in the same way as either the object position specifying system 20 of the second embodiment or the object position specifying system 30 of the third embodiment. The operation of the object position specifying system 40 is the same as that of the object position specifying system 20 of the second embodiment shown in FIG. 7, or that of the object position specifying system 30 of the third embodiment shown in FIG. 10, so a detailed description is omitted.

According to the present embodiment, an object position specifying system (the object position specifying system 40) is configured that further includes a histogram storage unit 280, which stores the histogram representing the entire image generated by the histogram generation unit 230, and a teacher data switching unit (the teacher data switching unit 490), which selects and outputs either the histogram representing the entire image stored in the histogram storage unit 280 or the histograms of extracted teacher data selected from the plurality of teacher data according to a predetermined condition. In the position specifying process, the position specifying control unit (the position specifying control unit 460) controls the teacher data switching unit 490 and causes the SVM calculation unit 140 to execute an SVM calculation that compares each of the histograms representing the position specifying divided regions with the histogram output from the teacher data switching unit 490 (the histogram representing the entire image stored in the histogram storage unit 280, or the histograms of the extracted teacher data).
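The role of the teacher data switching unit 490 can be illustrated with a minimal sketch. The class name `TeacherDataSwitch`, the helper functions, and the toy histograms are hypothetical. The comparison uses a normalized histogram intersection as a stand-in, not the SVM calculation of the SVM calculation unit 140, whose details are not restated here.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (left unchanged if empty)."""
    total = sum(hist)
    return [count / total for count in hist] if total else list(hist)


def histogram_intersection(a, b):
    """Stand-in similarity between two histograms (not an actual SVM)."""
    return sum(min(x, y) for x, y in zip(normalize(a), normalize(b)))


class TeacherDataSwitch:
    """Hypothetical model of a unit that outputs one of two reference sets."""

    def __init__(self, whole_image_hist, extracted_teacher_hists):
        self.whole_image_hist = whole_image_hist
        self.extracted_teacher_hists = extracted_teacher_hists

    def output(self, use_whole_image):
        # Either the stored whole-image histogram or the extracted teacher data.
        if use_whole_image:
            return [self.whole_image_hist]
        return list(self.extracted_teacher_hists)


def score_regions(region_hists, reference_hists):
    """Score each position specifying region against the selected references."""
    return {
        region: max(histogram_intersection(hist, ref) for ref in reference_hists)
        for region, hist in region_hists.items()
    }


switch = TeacherDataSwitch(
    whole_image_hist=[6, 1, 1],
    extracted_teacher_hists=[[0, 2, 2], [1, 3, 4]],
)
region_hists = {"left": [4, 0, 0], "right": [0, 1, 1]}
scores_whole = score_regions(region_hists, switch.output(use_whole_image=True))
scores_teacher = score_regions(region_hists, switch.output(use_whole_image=False))
```

With these toy values, the region ranked highest depends on which reference the switch outputs, which is the behavior the control unit selects between.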

As described above, the object position specifying system 40 of the fourth embodiment, like the object position specifying system 10 of the first embodiment, can generate the position specifying histogram for each position specifying divided region with little processing by using the quantization vector values of the scene recognition divided regions stored in the quantization vector storage unit 170.

In addition, the object position specifying system 40 of the fourth embodiment can perform, in a simplified manner, the SVM calculation on the position specifying histogram of each position specifying divided region generated by the histogram generation unit 230. This calculation is carried out to specify the position in the image where the object appears, and it uses the same method as either the object position specifying system 20 of the second embodiment or the object position specifying system 30 of the third embodiment. As a result, like those two systems, the object position specifying system 40 of the fourth embodiment can narrow down the position specifying divided regions that require a detailed SVM calculation, and can shorten the calculation time required to specify the position of the object in the image even further than the object position specifying system 10 of the first embodiment.
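The narrowing-down idea can be sketched as a two-stage search: a cheap score ranks all position specifying divided regions, and the detailed calculation runs only on the survivors. The function name `narrow_then_refine`, the top-`keep` selection rule, and both scoring functions are assumptions made for illustration. The text does not state how candidates are chosen.

```python
def narrow_then_refine(region_hists, simplified_score, detailed_score, keep=2):
    """Rank regions with a cheap score, then run the detailed score on the top few."""
    ranked = sorted(
        region_hists,
        key=lambda region: simplified_score(region_hists[region]),
        reverse=True,
    )
    candidates = ranked[:keep]  # assumed selection rule: keep the top `keep` regions
    best = max(candidates, key=lambda region: detailed_score(region_hists[region]))
    return best, candidates


# Toy position specifying histograms (hypothetical values).
region_hists = {
    "top-left": [5, 0, 0],
    "top-right": [4, 1, 0],
    "center": [1, 3, 1],
    "bottom-right": [0, 1, 4],
}


def simplified_score(hist):
    # Stand-in for the simplified SVM calculation.
    return hist[1] + hist[2]


def detailed_score(hist):
    # Stand-in for the detailed SVM calculation.
    return hist[1] + 2 * hist[2]


best, candidates = narrow_then_refine(region_hists, simplified_score, detailed_score)
```

Here the detailed score is evaluated for two of the four regions only, which is where the time saving would come from.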

Whether the simplified SVM calculation used to specify the position of the object in the image follows the operation of the object position specifying system 20 of the second embodiment or that of the object position specifying system 30 of the third embodiment may be switched, for example, according to the degree of similarity between the teacher data and the object determined in the scene recognition process. More specifically, when the similarity to the category of the object determined in the scene recognition process is 80 percent or more, the simplified SVM calculation can be performed by the operation of the object position specifying system 20 of the second embodiment. When that similarity is 60 percent or more and less than 80 percent, the simplified SVM calculation can be performed by the operation of the object position specifying system 30 of the third embodiment. When that similarity is less than 60 percent, the simplified SVM calculation is not performed, and the normal SVM calculation can be performed by the operation of the object position specifying system 10 of the first embodiment.
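The example thresholds above amount to a three-way selection, sketched below. The enum `LocalizationMode` and the function `select_mode` are hypothetical names. The 80 and 60 percent boundaries are the example values given in the text, not fixed requirements.

```python
from enum import Enum


class LocalizationMode(Enum):
    WHOLE_IMAGE_HISTOGRAM = "second embodiment (system 20), simplified SVM"
    EXTRACTED_TEACHER_DATA = "third embodiment (system 30), simplified SVM"
    NORMAL_SVM = "first embodiment (system 10), normal SVM"


def select_mode(similarity_percent):
    """Pick the position specifying operation from the scene recognition similarity."""
    if similarity_percent >= 80.0:
        return LocalizationMode.WHOLE_IMAGE_HISTOGRAM
    if similarity_percent >= 60.0:
        return LocalizationMode.EXTRACTED_TEACHER_DATA
    return LocalizationMode.NORMAL_SVM


mode_high = select_mode(92.5)
mode_mid = select_mode(60.0)
mode_low = select_mode(41.0)
```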

As described above, the embodiments for carrying out the present invention include a quantization vector storage unit that stores the quantization vector value of each scene recognition divided region generated by the quantization vector generation unit in the scene recognition process. The stored quantization vector values are then used to generate a position specifying histogram for each position specifying divided region of a predetermined size, and these histograms are used to specify the position in the image of the object determined by the scene recognition process. Consequently, after the scene recognition process has been performed on the input image, the object position specifying process for each position specifying divided region can be performed with less processing than repeating processing equivalent to the scene recognition process. This shortens the calculation time required to specify the position in the image where the object appears.
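The reuse of stored quantization vectors can be sketched as follows, using the 81 scene recognition divided regions and nine position specifying divided regions of the example in this embodiment. Several points are assumptions made for illustration: the regions form a 9 x 9 grid, each position specifying region covers a non-overlapping 3 x 3 block of them, each stored quantization vector is reduced to a single codeword index, and the codebook has 16 entries. The function names are hypothetical.

```python
GRID = 9            # 9 x 9 = 81 scene recognition divided regions (assumed layout)
BLOCK = 3           # each position specifying region spans 3 x 3 of them (assumed)
CODEBOOK_SIZE = 16  # hypothetical number of quantization codewords


def histogram(codes):
    """Count how often each codeword index occurs."""
    hist = [0] * CODEBOOK_SIZE
    for code in codes:
        hist[code] += 1
    return hist


def position_histograms(stored_codes):
    """Build one histogram per position specifying region from stored codes only."""
    hists = {}
    for block_row in range(GRID // BLOCK):
        for block_col in range(GRID // BLOCK):
            codes = [
                stored_codes[row][col]
                for row in range(block_row * BLOCK, (block_row + 1) * BLOCK)
                for col in range(block_col * BLOCK, (block_col + 1) * BLOCK)
            ]
            hists[(block_row, block_col)] = histogram(codes)
    return hists


# Deterministic toy data standing in for the contents of the storage unit.
stored_codes = [
    [(7 * row + 3 * col) % CODEBOOK_SIZE for col in range(GRID)]
    for row in range(GRID)
]
whole_hist = histogram([code for row in stored_codes for code in row])
region_hists = position_histograms(stored_codes)
```

No local feature extraction or quantization is repeated here. Under these assumptions the nine region histograms also sum to the whole-image histogram, because the blocks tile the image without overlap.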

Also, according to the embodiments for carrying out the present invention, the SVM calculation is performed using either the histogram of the entire image generated by the histogram generation unit in the scene recognition process, or the histograms of a subset of teacher data that represents the object category with the highest similarity in the scene recognition process. This allows the SVM calculation on the position specifying histogram of each position specifying divided region to be performed in a simplified manner, so the position in the image of the object determined by the scene recognition process can be specified simply. In other words, the SVM calculation performed on each position specifying divided region does not use the histograms of the large amount of teacher data included in the object category with the highest similarity in the scene recognition process, but a small number of histograms instead. This shortens the calculation time required to specify the position in the image where the object appears.

The present embodiment has been described for the case in which, after a series of processes for one position specifying divided region (that is, generation of the position specifying histogram and the SVM calculation) is completed, the next series of processes is executed in response to a notification indicating that completion. However, the series of processes for each position specifying divided region is not limited to the operation described in the embodiments for carrying out the present invention. For example, once generation of the position specifying histogram for the first position specifying divided region is completed, generation of the position specifying histogram for the second position specifying divided region may be controlled to take place during the same period as the SVM calculation for the first position specifying divided region. In other words, the SVM calculation for the first position specifying divided region and the histogram generation for the second position specifying divided region may be executed in parallel.
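The overlapped schedule described above can be sketched with a single worker thread that prepares the next histogram while the current one is being scored. The functions `generate_histogram` and `svm_score` are hypothetical stand-ins for the histogram generation unit and the SVM calculation unit, and the weights are arbitrary. Python threads only illustrate the ordering here. Dedicated hardware units would overlap in real time.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_histogram(region_codes):
    """Stand-in for histogram generation over one position specifying region."""
    hist = [0] * 4
    for code in region_codes:
        hist[code] += 1
    return hist


def svm_score(hist):
    """Stand-in for the SVM calculation: a fixed linear score (arbitrary weights)."""
    weights = [0.5, -0.25, 1.0, 0.0]
    return sum(w * h for w, h in zip(weights, hist))


def pipelined_scores(regions):
    """Score region i while the histogram of region i + 1 is being generated."""
    scores = []
    if not regions:
        return scores
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(generate_histogram, regions[0])
        for next_region in regions[1:]:
            hist = pending.result()
            # Start the next histogram before scoring the current one.
            pending = pool.submit(generate_histogram, next_region)
            scores.append(svm_score(hist))
        scores.append(svm_score(pending.result()))
    return scores


regions = [[0, 0, 2], [1, 1, 1], [2, 3, 3]]
scores = pipelined_scores(regions)
sequential = [svm_score(generate_histogram(region)) for region in regions]
```

The pipelined and sequential schedules produce the same scores. Only the timing of the two stages differs.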

The present embodiment has also been described using an example in which the input image is divided into 81 scene recognition divided regions and into nine position specifying divided regions, but the numbers of scene recognition divided regions and position specifying divided regions into which the input image is divided are not limited to the numbers described in the embodiments for carrying out the present invention.

The embodiments of the present invention have been described above with reference to the drawings, but the specific configuration is not limited to these embodiments and includes various modifications that do not depart from the spirit of the present invention.

10, 20, 30, 40 ... Object position specifying system
110 ... Local feature vector generation unit
120 ... Quantization vector generation unit
130, 230 ... Histogram generation unit
140, 240 ... SVM calculation unit
150 ... Teacher data group
160, 260, 360, 460 ... Position specifying control unit
161 ... Histogram generation divided region specifying unit (position specifying control unit)
170 ... Quantization vector storage unit
262, 362, 462 ... Position specifying SVM calculation determining unit (position specifying control unit)
280 ... Histogram storage unit
390, 490 ... Teacher data switching unit

Claims

An object position specifying system comprising:
a local feature vector generation unit that divides the entire area of an input image into a plurality of first regions of a predetermined first size and generates, for each of the first regions, a local feature vector representing a local feature of the image data contained in that first region;
a quantization vector generation unit that quantizes the value of the local feature vector of each first region generated by the local feature vector generation unit and generates a quantization vector corresponding to each first region;
a quantization vector storage unit that stores, for each first region, the value of the quantization vector generated by the quantization vector generation unit;
a histogram generation unit that generates a histogram representing the whole or a partial region of the image from the values of the quantization vectors of the first regions;
an SVM calculation unit that performs a support vector machine (SVM) calculation on the histogram generated by the histogram generation unit; and
a position specifying control unit that controls each of the local feature vector generation unit, the quantization vector generation unit, the histogram generation unit, and the SVM calculation unit to execute a scene recognition process that recognizes the scene of the image in which an object appears, and then to execute a position specifying process that specifies in which of a plurality of second regions the object determined in the scene recognition process is located, the second regions being obtained by dividing the entire area of the image into regions of a predetermined second size larger than the first regions,
wherein the position specifying control unit,
in the scene recognition process,
causes the histogram generation unit to generate a histogram representing the entire image from the values of the quantization vectors of the first regions, and causes the SVM calculation unit to execute an SVM calculation that compares the histogram representing the entire image with each of the histograms of a plurality of teacher data in which histograms of a plurality of images are classified and grouped by type of object, and,
in the position specifying process,
causes the histogram generation unit to generate a histogram representing the image of each second region from the values of the quantization vectors of the first regions stored in the quantization vector storage unit, and causes the SVM calculation unit to execute an SVM calculation on each of the histograms representing the second regions.

The object position specifying system according to claim 1, further comprising:
a histogram storage unit that stores the histogram representing the entire image generated by the histogram generation unit,
wherein the position specifying control unit,
in the position specifying process,
causes the SVM calculation unit to execute an SVM calculation that compares each of the histograms representing the second regions with the histogram representing the entire image stored in the histogram storage unit.

The object position specifying system according to claim 1 or 2,
wherein the position specifying control unit,
in the position specifying process,
causes the SVM calculation unit to execute an SVM calculation that compares each of the histograms representing the second regions with each of the histograms of a part of the teacher data selected from the plurality of teacher data according to a predetermined condition.

The object position specifying system according to any one of claims 1 to 3, further comprising:
a histogram storage unit that stores the histogram representing the entire image generated by the histogram generation unit; and
a teacher data switching unit that selects and outputs either the histogram representing the entire image stored in the histogram storage unit or the histograms of a part of the teacher data selected from the plurality of teacher data according to a predetermined condition,
wherein the position specifying control unit,
in the position specifying process,
controls the teacher data switching unit and causes the SVM calculation unit to execute an SVM calculation that compares each of the histograms representing the second regions with the histogram output from the teacher data switching unit.

An object position specifying method for an object position specifying system that comprises:
a local feature vector generation unit that divides the entire area of an input image into a plurality of first regions of a predetermined first size and generates, for each of the first regions, a local feature vector representing a local feature of the image data contained in that first region;
a quantization vector generation unit that quantizes the value of the local feature vector of each first region generated by the local feature vector generation unit and generates a quantization vector corresponding to each first region;
a quantization vector storage unit that stores, for each first region, the value of the quantization vector generated by the quantization vector generation unit;
a histogram generation unit that generates a histogram representing the whole or a partial region of the image from the values of the quantization vectors of the first regions;
an SVM calculation unit that performs a support vector machine (SVM) calculation on the histogram generated by the histogram generation unit; and
a position specifying control unit that controls each of the local feature vector generation unit, the quantization vector generation unit, the histogram generation unit, and the SVM calculation unit to execute a scene recognition process that recognizes the scene of the image in which an object appears, and then to execute a position specifying process that specifies in which of a plurality of second regions the object determined in the scene recognition process is located, the second regions being obtained by dividing the entire area of the image into regions of a predetermined second size larger than the first regions,
the method comprising, performed by the position specifying control unit:
in the scene recognition process,
a step of causing the histogram generation unit to generate a histogram representing the entire image from the values of the quantization vectors of the first regions, and a step of causing the SVM calculation unit to execute an SVM calculation that compares the histogram representing the entire image with each of the histograms of a plurality of teacher data in which histograms of a plurality of images are classified and grouped by type of object; and
in the position specifying process,
a step of causing the histogram generation unit to generate a histogram representing the image of each second region from the values of the quantization vectors of the first regions stored in the quantization vector storage unit, and a step of causing the SVM calculation unit to execute an SVM calculation on each of the histograms representing the second regions.