JP5347798B2

JP5347798B2 - Object detection apparatus, object detection method, and object detection program

Info

Publication number: JP5347798B2
Application number: JP2009171839A
Authority: JP
Inventors: 昇中島
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-07-23
Filing date: 2009-07-23
Publication date: 2013-11-20
Anticipated expiration: 2029-07-23
Also published as: JP2011028419A

Abstract

PROBLEM TO BE SOLVED: To provide a device that detects a specific object without using a marker.

SOLUTION: An object detection device includes: a feature storage part 22 that stores feature points of a background image that does not include a target object; an invariant feature storage part 32 that stores invariant features representing the feature points of the target object in an invariant feature space; and a comparison means 40 that compares invariant features, which represent in the invariant feature space the feature points obtained by subtracting from the feature points of a detection target image the corresponding feature points of the background image that does not include the target object, with the invariant features representing the feature points of the target object in the invariant feature space, and determines from this comparison whether the target object has been detected.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to an object detection device, an object detection method, and an object detection program for detecting a specific object appearing in a predetermined space.

As a technique for identifying whether a desired object is present in a given space, a marker detection method is generally known in which a marker of a predetermined shape is attached to the target object in advance and the presence or absence of the target object is determined according to whether the marker appears in the background video of that space.
For example, Patent Document 1 discloses a marker detection method that extracts feature points from a background video that does not include a marker and uses a marker generated from image features that, as derived from those feature points, do not appear in the background video.
According to this marker detection method, in the preparation stage the same pattern as a marker generated in this way is attached to the target object, and in the detection stage, when this pattern is detected in a video of the given space, it can be determined that the target object is present in that space.

International Publication No. 2008/090908 pamphlet

However, such a marker detection method requires the effort of manufacturing the markers and of affixing them to the target objects, and the effort and cost increase further when there are many target objects. A method that allows simpler detection has therefore been desired.

The present invention has been made in view of the above circumstances, and its object is to provide an object detection device, an object detection method, and an object detection program capable of detecting a specific object without relying on a marker.

To achieve this object, the object detection device of the present invention comprises: a feature storage unit that stores feature points of a background image that does not include a target object; an invariant feature storage unit that stores invariant features representing the feature points of the target object in an invariant feature space; and comparison means that compares invariant features, which represent in the invariant feature space the feature points obtained by subtracting the corresponding feature points of the background image from the feature points of a detection target image, with the invariant features representing the feature points of the target object in the invariant feature space, and thereby determines whether the target object has been detected.

The object detection method of the present invention comprises: a step of storing feature points of a background image that does not include a target object; a step of storing invariant features representing the feature points of the target object in an invariant feature space; and a step of comparing invariant features, which represent in the invariant feature space the feature points obtained by subtracting the corresponding feature points of the background image from the feature points of a detection target image, with the invariant features representing the feature points of the target object in the invariant feature space, and determining from this comparison whether the target object has been detected.

The object detection program of the present invention causes an object detection device to function as: means for storing feature points of a background image that does not include a target object; means for storing invariant features representing the feature points of the target object in an invariant feature space; and means for comparing invariant features, which represent in the invariant feature space the feature points obtained by subtracting the corresponding feature points of the background image from the feature points of a detection target image, with the invariant features representing the feature points of the target object in the invariant feature space, and determining from this comparison whether the target object has been detected.

According to the object detection device, object detection method, and object detection program of the present invention, a specific object can be detected without relying on a marker.

A block diagram showing the configuration of the object detection apparatus in the first embodiment of the present invention.
A block diagram showing the detailed configuration of the object detection apparatus in the first embodiment of the present invention.
An example diagram showing images input by the video input unit.
A diagram schematically showing feature points extracted from an input image.
A diagram showing feature points arranged in the feature space.
A table showing the structure of the feature point information.
A diagram showing the difference between the feature points of a detection target image and the feature points of the background image.
An explanatory diagram showing the feature point mapping method when the basis number is 1.
An explanatory diagram showing the feature point mapping method when the basis number is 2.
An explanatory diagram showing the feature point mapping method when the basis number is 3.
A table showing the structure of the invariant feature information.
A first explanatory diagram showing a specific example of object detection according to the first embodiment of the present invention.
A second explanatory diagram showing a specific example of object detection according to the first embodiment of the present invention.
A third explanatory diagram showing a specific example of object detection according to the first embodiment of the present invention.
A fourth explanatory diagram showing a specific example of object detection according to the first embodiment of the present invention.
A flowchart showing the processing procedure of the preparation stage in the first embodiment of the present invention.
A flowchart showing the processing procedure of the detection stage in the first embodiment of the present invention.
A block diagram showing the detailed configuration of the object detection apparatus in the second embodiment of the present invention.
A diagram showing the frequency distribution of invariant features represented in the invariant feature space.
A first explanatory diagram showing a specific example of object detection according to the second embodiment of the present invention.
A second explanatory diagram showing a specific example of object detection according to the second embodiment of the present invention.
A third explanatory diagram showing a specific example of object detection according to the second embodiment of the present invention.
A fourth explanatory diagram showing a specific example of object detection according to the second embodiment of the present invention.
A flowchart showing the processing procedure of the preparation stage in the second embodiment of the present invention.
A flowchart showing the processing procedure of the detection stage in the second embodiment of the present invention.
A block diagram showing the configuration of the object detection apparatus in the third embodiment of the present invention.
A block diagram showing the detailed configuration of the object detection apparatus in the third embodiment of the present invention.
An explanatory diagram for describing the process of selecting unique features.
A flowchart showing the processing procedure of the preparation stage in the third embodiment of the present invention.
A flowchart showing the processing procedure of the detection stage in the third embodiment of the present invention.
A block diagram showing the detailed configuration of the object detection apparatus in the fourth embodiment of the present invention.
An explanatory diagram showing a specific example of object detection according to the fourth embodiment of the present invention.
A flowchart showing the processing procedure of the detection stage in the fourth embodiment of the present invention.

Hereinafter, embodiments of the object detection apparatus, object detection method, and object detection program according to the present invention will be described with reference to the drawings.

[First Embodiment of Object Detection Apparatus and Object Detection Method]
First, a first embodiment of the object detection apparatus and object detection method of the present invention will be described with reference to FIGS. 1 and 2.
FIG. 1 is a block diagram showing the configuration of the object detection apparatus of the present embodiment.
FIG. 2 is a block diagram showing the detailed configuration of the object detection apparatus of the present embodiment.

As shown in FIG. 1, the object detection apparatus 1a includes video input means 10, feature extraction means 20, invariant feature conversion means 30, and comparison means 40.

As shown in FIG. 2, the video input means 10 includes a video input unit 11 and a video storage unit 12.
In a preparation stage before the detection process, the video input unit 11 inputs video of the detection area that does not include the target object (hereinafter referred to as the “background image”).
For example, as shown in FIG. 3A, this is video capturing a background in which the target object (hereinafter, a paper airplane is taken as the target object) does not appear.
The video input unit 11 also inputs an image of the target object in the preparation stage.
As shown in FIG. 3B, the “image of the target object” is an image showing only the target object. For example, it corresponds to an image obtained by capturing the target object against a blue screen and removing the blue component from the captured image.

In the detection stage, the video input unit 11 inputs video of the area in which the target object is to be detected. The video input in the detection stage is stored in the video storage unit 12 as digitized frame images. A frame image is an individual still frame.
Hereinafter, an image of the detection area input in the detection stage is referred to as a “detection target image”.
FIG. 3C shows an example of a detection target image that includes the target object.

The video storage unit 12 stores the images input in the preparation stage (the background image and the image of the target object) and the images input in the detection stage (detection target images).
The video storage unit 12 can also store a number (for example, a serial number) assigned to each input frame image. This number uniquely identifies a single frame image.

As shown in FIG. 2, the feature extraction means 20 includes a feature extraction unit 21 and a feature storage unit 22.
The feature extraction unit 21 retrieves a frame image from the video storage unit 12 and extracts image features, including characteristic patterns, from the retrieved frame image.
Specifically, in the preparation stage, the feature extraction unit 21 retrieves the background image and the image of the target object from the video storage unit 12, extracts their feature points, and stores the feature point information in the feature storage unit 22.

For example, as shown in FIG. 4A, a plurality of feature points of the objects contained in the background image are extracted from that image, and their feature point information is stored in the background storage area 221.
As shown in FIG. 4B, the feature extraction unit 21 also extracts feature points from the image of the target object and stores the feature point information in the object storage area 222 of the feature storage unit 22.

Furthermore, as shown in FIG. 4C, in the detection stage the feature extraction unit 21 retrieves a detection target image from the video storage unit 12, extracts its feature points, and stores the feature point information in the input storage area 223 of the feature storage unit 22.

As an image feature, for example, a numerical representation of graphically distinctive properties can be used.
For this purpose, for example, the method published in the proceedings of the 1998 IEEE Conference on Computer Vision and Pattern Recognition can be used. This method can extract vertices of object shapes, intersections of linear objects, end points, and the like from an image. The sequence of position coordinates of these points on the image can then be used as a graphical feature. The space in which the feature points are arranged is called the feature space.

As another method, there is, for example, the method described in “On the optimal detection of curves in noisy pictures” by Montanari, published in Communications of the ACM, Vol. 14, 1971.
With this method, the contents of an R-table, which stores the distance and relative angle from a reference point, can be used as features. By setting reference points for all feature positions and extracting features exhaustively, marker detection becomes robust against partial loss of features.
As yet another feature extraction method, there is, for example, a method that uses the luminance value or color difference value of each pixel in the image as a feature.

Next, the feature extraction unit 21 assigns a serial number to each feature point.
FIG. 5 shows an example in which serial numbers are assigned to the feature points of the frame image shown in FIG. 4C.
As shown in the figure, serial numbers can be assigned, for example, as 1, 2, 3, 4, ... in order starting from the topmost feature point.
Subsequently, the feature extraction unit 21 obtains the coordinates of each feature point. For the coordinates, an X axis and a Y axis are set in the feature space; the distance from the Y axis can be taken as the X coordinate and the distance from the X axis as the Y coordinate.

The feature extraction unit 21 then stores the serial numbers and coordinates of these feature points in the respective storage areas of the feature storage unit 22. The feature storage unit 22 can store these serial numbers, coordinates, and so on as “feature point information”, as shown in FIG. 6.
As shown in the figure, the feature point information can consist of the items “serial number of the frame image” (a), “serial number of the feature point” (b), “x coordinate of the feature point” (c), and “y coordinate of the feature point” (d).
The “serial number of the frame image” is the number assigned to the frame image from which the feature point was extracted.
The “serial number of the feature point” is the number assigned to each of the feature points extracted from a single frame image.
The “x coordinate of the feature point” is the x coordinate of that feature point in the feature space.
The “y coordinate of the feature point” is the y coordinate of that feature point in the feature space.
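The four items of the feature point information can be pictured as a simple record. The following is a minimal sketch only; the names `FeaturePointInfo` and `build_feature_point_info` are hypothetical, and the ordering rule (smaller y means higher in the frame, ties broken by x) is an assumption, since the text only says that numbering starts from the topmost point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeaturePointInfo:
    frame_serial: int  # (a) serial number of the frame image
    point_serial: int  # (b) serial number of the feature point within that frame
    x: float           # (c) x coordinate in the feature space
    y: float           # (d) y coordinate in the feature space


def build_feature_point_info(frame_serial, points):
    """Number the points from the topmost downward and return one record per point.

    Assumption: image-style coordinates, where a smaller y is higher in the frame.
    """
    ordered = sorted(points, key=lambda p: (p[1], p[0]))
    return [
        FeaturePointInfo(frame_serial, serial, x, y)
        for serial, (x, y) in enumerate(ordered, start=1)
    ]


# Three illustrative feature points extracted from frame image number 7.
records = build_feature_point_info(7, [(40.0, 30.0), (12.0, 5.0), (25.0, 18.0)])
```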

In the detection stage, the difference extraction unit 23 retrieves the feature point information of the detection target image from the input storage area 223 of the feature storage unit 22 and the feature point information of the background image from the background storage area 221, and deletes the information of feature points located at the same coordinates.
FIG. 7 schematically shows the feature space after the corresponding feature points of the background image have been deleted from the feature points of the detection target image.
As shown in the figure, the difference extraction unit 23 can extract only the feature points of objects (not limited to the target object) that are not originally present in the background image.
The feature point information of the feature points extracted by the difference extraction unit 23 is output to the invariant feature conversion means 30.
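The deletion of feature points that share coordinates with the background can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function name `extract_difference` is hypothetical, and the `tol` parameter is an addition not found in the text (the default of 0 reproduces the exact "same coordinates" rule).

```python
def extract_difference(input_points, background_points, tol=0.0):
    """Return the input feature points with no background point at the same coordinates.

    tol is a hypothetical tolerance; tol=0.0 means exact coordinate equality.
    """
    def same_position(p, q):
        return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol

    return [
        p for p in input_points
        if not any(same_position(p, q) for q in background_points)
    ]


# Illustrative data: the last two points belong to an object absent from the background.
background = [(10, 10), (50, 20), (30, 60)]
detection = [(10, 10), (50, 20), (30, 60), (22, 35), (28, 41)]
difference = extract_difference(detection, background)
```

Only the points in `difference` would then be passed on for mapping to the invariant feature space.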

As shown in FIG. 2, the invariant feature conversion means 30 includes an invariant feature conversion unit 31 and an invariant feature storage unit 32.
The invariant feature conversion unit 31 obtains invariant features by mapping feature points to the invariant feature space.
First, in the preparation stage, the invariant feature conversion unit 31 retrieves the feature points of the target object from the object storage area 222 of the feature storage unit 22 and maps them to the invariant feature space. The invariant feature information of the invariant features obtained by this mapping is stored in the object invariant storage area 321 of the invariant feature storage unit 32.
In the detection stage, the invariant feature conversion unit 31 receives the feature point information output from the difference extraction unit 23 and maps those feature points to the invariant feature space. The invariant feature information of the invariant features obtained by this mapping is stored in the difference invariant storage area 322 of the invariant feature storage unit 32.

Here, the process of mapping feature points to the invariant feature space is described in detail.
For example, when points (feature points) are extracted as characteristic parts of an image and the sequence of their position coordinates in that space (the feature space) is used as a graphical feature, a reference point is selected from the feature points in the feature space and placed at the reference coordinates of the invariant feature space, and all other feature points are placed in the invariant feature space by the same transformation rule. In this way each feature point is mapped to the invariant feature space. Hereinafter, this reference point is referred to as a “basis”. The number of bases can be determined according to the pose changes of the objects in the background video. This number is called the “basis number”.

When the target object undergoes a geometric change, that is, when the camera and the scene containing the target object rotate or translate relative to each other, or the object undergoes a pose change such as shear deformation, mapping the feature points of the target object to the invariant feature space in advance using the basis number corresponding to that pose change makes it possible to obtain, from the positional relationships within the feature point group, feature quantities that are invariant to the change in relative position.
Specifically, the basis number can be set to 1 when the pose change is translation only, to 2 when any of enlargement, reduction, or rotation appears, and to 3 when shear deformation is additionally involved.

In the present embodiment, for simplicity, geometric invariant features for the case where the background is far away are described. Extending this to invariant features with more degrees of freedom, such as when the background is not far away, is straightforward.

Below, taking a paper airplane as the target object, the method of mapping its feature points to the invariant feature space is described with reference to FIGS. 8 to 10.
First, the method of mapping feature points when the basis number is 1 is described with reference to FIG. 8.
As shown in FIG. 8A, a frame image capturing only the target object is prepared. That is, the video input unit 11 inputs the image of the target object in advance, and the video storage unit 12 stores it. Each feature point is assumed to have been assigned a serial number, as shown in FIG. 8B.

As shown in the figure, the feature extraction unit 21 extracts the feature points of the target object.
The invariant feature conversion unit 31 then selects one of the extracted feature points as the basis, determines the displacement that moves this basis to the coordinates (0, 0) of the invariant feature space, and moves all other feature points into the invariant feature space by the same displacement.
For example, with the feature point of serial number 1 as the basis, all feature points are translated so that feature point 1 lies at the coordinates (0, 0) in the invariant feature space. The feature points are thereby mapped to the invariant feature space, yielding invariant features such as those shown in FIG. 8C.

This process, in which one feature point is taken as the basis and all feature points are translated as the basis is moved to the origin of the invariant feature space, is repeated with each feature point taken as the basis in turn, so that every feature point is mapped to the invariant feature space.
Consequently, by mapping the feature points of the detection target image to the invariant feature space with a basis number of 1 in the detection stage and comparing the results, the target object can be detected even if it appears translated in the input image.
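The basis-number-1 mapping can be sketched in a few lines. This is a minimal illustration, assuming integer pixel coordinates; the function name `map_basis1` and the choice to collect the mapped non-basis points into a single set are assumptions made for the example.

```python
def map_basis1(points):
    """Map feature points to the invariant feature space with a basis number of 1.

    Each point is taken as the basis in turn; every other point is shifted by
    the displacement that moves the basis to (0, 0).
    """
    invariant = set()
    for bx, by in points:
        for x, y in points:
            if (x, y) != (bx, by):
                invariant.add((x - bx, y - by))
    return invariant


# Illustrative shape and the same shape after a translation.
shape = [(2, 1), (5, 1), (3, 4)]
shifted = [(x + 10, y - 3) for x, y in shape]
```

Because only differences between points survive the mapping, `map_basis1(shape)` and `map_basis1(shifted)` contain the same invariant features, which is the property the detection stage relies on when the object has merely been translated.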

Next, the method of mapping feature points when the basis number is 2 is described with reference to FIG. 9.
In this case, the invariant feature conversion unit 31 selects two feature points as bases, moves them so that each basis lies at its corresponding one of two reference coordinates in the invariant feature space, and moves all other feature points into the invariant feature space accordingly while preserving their relative positional relationships. This movement is performed for every basis formed by a combination of two points that can be selected from all the feature points.
First, as shown in FIGS. 9A and 9B, an image of the target object is input in advance and its feature points are extracted.
Here, for example, when the feature point of serial number 1 is the first basis and the feature point of serial number 2 is the second basis, the first basis is moved to the coordinates (0, 0) of the invariant feature space and the second basis to the coordinates (1, 0), and all other feature points are moved according to the same transformation rule. Invariant features such as those shown in FIG. 9C are thereby obtained.

This process, in which two feature points are taken as bases and all feature points are translated, rotated, or enlarged or reduced as the bases are moved to the two reference points of the invariant feature space, is repeated with each combination of feature points taken as the bases in turn, so that every feature point is mapped to the invariant feature space.
Consequently, by mapping the feature points of the detection target image to the invariant feature space with a basis number of 2 in the detection stage and comparing the results, the target object can be detected even if it appears translated, rotated, enlarged, or reduced in the input image.
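One way to realize the basis-number-2 mapping is to treat each point as a complex number: dividing by the vector from the first basis to the second sends the first basis to (0, 0) and the second to (1, 0) while applying the same rotation and scaling to every other point. The sketch below is an illustration only; the name `map_basis2`, the use of ordered pairs of bases, and the rounding to `ndigits` decimal places (so that floating-point results can be compared) are assumptions, not details from the text.

```python
from itertools import permutations


def map_basis2(points, ndigits=6):
    """Map feature points to the invariant feature space with a basis number of 2."""
    zs = [complex(x, y) for x, y in points]
    invariant = set()
    # Every ordered pair (first basis, second basis) of distinct points.
    for i, j in permutations(range(len(zs)), 2):
        origin = zs[i]
        unit = zs[j] - zs[i]
        if unit == 0:
            continue  # coincident points cannot serve as a basis pair
        for k, z in enumerate(zs):
            if k in (i, j):
                continue
            w = (z - origin) / unit  # first basis -> 0, second basis -> 1
            invariant.add((round(w.real, ndigits), round(w.imag, ndigits)))
    return invariant


# Illustrative shape, and the same shape rotated by 90 degrees, doubled in size, and shifted.
shape = [(0, 0), (2, 0), (2, 1), (0, 3)]
moved = []
for x, y in shape:
    z = complex(x, y) * 2j + complex(5, -1)
    moved.append((z.real, z.imag))
```

Under these assumptions `map_basis2(shape)` and `map_basis2(moved)` yield the same set, since translation, rotation, and uniform scaling cancel out in the ratio.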

Next, the method of mapping feature points when the basis number is 3 is described with reference to FIG. 10.
In this case, the invariant feature conversion unit 31 selects three feature points as bases, moves them so that each basis lies at its corresponding one of three reference coordinates in the invariant feature space, and moves all other feature points into the invariant feature space accordingly while preserving their relative positional relationships. This movement is performed for every basis formed by a combination of three points that can be selected from all the feature points.
First, as shown in FIGS. 10A and 10B, an image of the target object is prepared and its feature points are extracted.
Here, for example, when the feature point of serial number 2 is the first basis, the feature point of serial number 3 is the second basis, and the feature point of serial number 1 is the third basis, the first basis is moved to the coordinates (0, 0) of the invariant feature space, the second basis to the coordinates (1, 0), and the third basis to the coordinates (0, 1), and all other feature points are moved according to the same transformation rule. Invariant features such as those shown in FIG. 10C are thereby obtained.

In this way, three feature points are chosen as bases, and as these bases are moved to the three reference points of the invariant feature space, all feature points are translated, rotated, scaled, or sheared accordingly. Repeating this process for each combination of feature points chosen in turn as the bases maps every feature point into the invariant feature space.
Therefore, at the detection stage, mapping the feature points of the detection target image into the invariant feature space with three bases and comparing them allows the target object to be detected even when it appears translated, rotated, scaled, or sheared in the input image.
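The three-basis mapping can be sketched in the same spirit. The code below is an illustrative assumption rather than the patented implementation: the function name is hypothetical, and the mapping is computed by expressing each point in the coordinate frame spanned by the three bases, which sends the bases to (0, 0), (1, 0), and (0, 1).

```python
def map_with_three_bases(points, i, j, k):
    """Map points so that points[i], points[j], points[k] land on
    (0, 0), (1, 0), and (0, 1); return the other points' coordinates."""
    (x1, y1), (x2, y2), (x3, y3) = points[i], points[j], points[k]
    ax, ay = x2 - x1, y2 - y1   # first axis: first basis -> second basis
    bx, by = x3 - x1, y3 - y1   # second axis: first basis -> third basis
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("the three bases must not be collinear")
    mapped = []
    for n, (x, y) in enumerate(points):
        if n in (i, j, k):
            continue
        dx, dy = x - x1, y - y1
        # Solve dx = u*ax + v*bx, dy = u*ay + v*by by Cramer's rule.
        u = (dx * by - dy * bx) / det
        v = (ax * dy - ay * dx) / det
        mapped.append((u, v))
    return mapped
```

Applying any invertible affine transformation (including shear) to all input points leaves the returned (u, v) values unchanged for the same choice of bases.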

In this way, a one-to-one linear mapping from the original image space to the invariant feature space can be defined as an affine transformation. When all feature points other than the bases are mapped into the invariant feature space using the same affine transformation determined by the bases, these feature points become invariant regardless of the relative positional relationship between the camera and the scene. In practice, however, the same bases cannot always be selected from the scene, so bases must be selected from every ordered combination (permutation) of three points in the feature point group, and the non-basis feature points must be mapped into the invariant feature space for each choice of bases.
These feature point groups are invariant to geometric deformation of the object (for example, translation, rotation, scaling, and shear) because, even in an image containing other objects, the invariant features obtained from bases selected from the feature points always coincide.

The object invariant storage area 321 of the invariant feature storage unit 32 stores the invariant feature information of the target object obtained through this invariant feature conversion.
The difference invariant storage area 322 of the invariant feature storage unit 32 stores invariant feature information, obtained by the same invariant feature conversion, based on the difference between the detection target image and the background image.
As shown in FIG. 11, the “invariant feature information” can consist of the items (a) “serial number of the invariant feature space”, (b) “serial number of the invariant feature”, (c) “x-coordinate of the invariant feature”, and (d) “y-coordinate of the invariant feature”.

“Serial number of the invariant feature space” is the number assigned to the invariant feature space.
“Serial number of the invariant feature” is the number assigned to each of the invariant features.
“x-coordinate of the invariant feature” is the x-coordinate of that invariant feature in the invariant feature space.
“y-coordinate of the invariant feature” is the y-coordinate of that invariant feature in the invariant feature space.
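As a rough illustration of the four items above, one record of invariant feature information could be represented as follows. The class and field names are hypothetical, and numbering the features from 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvariantFeatureRecord:
    space_serial: int    # (a) serial number of the invariant feature space
    feature_serial: int  # (b) serial number of the invariant feature
    x: float             # (c) x-coordinate in the invariant feature space
    y: float             # (d) y-coordinate in the invariant feature space


def to_records(space_serial, coords):
    """Turn the mapped coordinates of one invariant feature space into records."""
    return [InvariantFeatureRecord(space_serial, n, x, y)
            for n, (x, y) in enumerate(coords, start=1)]
```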

In this embodiment, feature points are mapped into the invariant feature space by the methods shown in FIGS. 8 to 10, but the mapping method is not limited to these; the mapping may also be performed by selecting four or more points as bases.
In addition to the geometric invariants described above, a mapping that uses object-color invariants is also possible.
Even for the same object, the captured color differs depending on the color of the light source in the shooting environment. If the effect of light source color variation can be separated and removed from the image, the actual object color is obtained, and this actual object color may be used as an object-color invariant. At specular highlights the light source color dominates and the luminance value tends to saturate in the light source color components, so such locations may be regarded as showing the light source color, and the color components at saturated locations may be excluded from selection as invariant features.
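The exclusion of saturated locations can be sketched very simply. The function name and the 8-bit threshold of 250 are assumptions chosen for illustration; the text does not specify how saturation is decided.

```python
def unsaturated_colors(pixels, saturation_level=250):
    """Keep only RGB triples in which no channel reaches the saturation level.

    Pixels with a saturated channel are treated as specular highlights
    dominated by the light source color and are dropped.
    """
    return [rgb for rgb in pixels if max(rgb) < saturation_level]
```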

Other methods of estimating object color from an image may also be used, such as “Separating Reflection Components of Textured Surfaces Using a Single Image” by Robby T. Tan and Katsushi Ikeuchi, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 2, February 2005, pp. 178-193, and “On the Removal of Shadows from Images” by Graham D. Finlayson, Steven D. Hordley, Cheng Lu, and Mark S. Drew, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 1, January 2006, pp. 59-68.

Texture can also be used as an invariant.
A numerical value or vector obtained by applying a numerical operation to the luminance distribution of a partial region of the image is used as the feature quantity. Like geometric features, texture features are easily affected by the relative positional relationship between the camera and the subject, so a feature quantity that is insensitive to this relationship is computed and used as the texture invariant. For example, a feature quantity that is invariant to the camera-to-subject distance and to zoom can be obtained by converting the partial image of interest into polar coordinates and taking the power spectrum along the radial direction. Taking the power spectrum again along the azimuthal direction of that power spectrum gives a feature quantity that is also invariant to rotation about the camera's optical axis. Other methods may also be used, such as “Log-Polar Wavelet Energy Signatures for Rotation and Scale Invariant Texture Classification” by Chi-Man Pun and Moon-Chuen Lee, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, May 2003.

As for geometric invariants, other geometric invariants may also be used, such as those described in “Multiple View Geometry in Computer Vision” by Richard Hartley and Andrew Zisserman. When the same scene is observed with multiple cameras, the methods described in that reference make it possible to obtain information on relative position in the distance or depth direction. In that case, a three-dimensional geometric invariant can be constructed by selecting four non-coplanar points as bases and making the invariant feature space three-dimensional. To do so, a transformation is found that maps one of the four basis points selected from the feature point group to the origin of the invariant feature space and the other three basis points to the position coordinates (1, 0, 0), (0, 1, 0), and (0, 0, 1), and the remaining feature points are mapped into the invariant feature space using that transformation.
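A minimal sketch of the four-basis, three-dimensional mapping follows. It assumes that 3D coordinates of the feature points are already available; the function names are hypothetical, and the transformation is obtained by solving a 3x3 linear system with Cramer's rule.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def map_with_four_bases(points, i0, i1, i2, i3):
    """Map points so that the four bases land on the origin, (1, 0, 0),
    (0, 1, 0), and (0, 0, 1); return the other points' coordinates."""
    origin = points[i0]
    a, b, c = (_sub(points[i], origin) for i in (i1, i2, i3))
    det = _dot(a, _cross(b, c))
    if abs(det) < 1e-12:
        raise ValueError("the four bases must not be coplanar")
    mapped = []
    for n, p in enumerate(points):
        if n in (i0, i1, i2, i3):
            continue
        d = _sub(p, origin)
        # Solve d = u*a + v*b + w*c by Cramer's rule.
        mapped.append((_dot(d, _cross(b, c)) / det,
                       _dot(a, _cross(d, c)) / det,
                       _dot(a, _cross(b, d)) / det))
    return mapped
```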

As shown in FIG. 2, the comparison means 40 includes a collation unit 41 and a determination unit 42.
The collation unit 41 retrieves invariant feature information from the object invariant storage area 321 and from the difference invariant storage area 322 of the invariant feature storage unit 32 and collates the arrangements of the invariant features.
That is, it collates the coordinates of the “invariant features based on the difference between the detection target image and the background image” with the coordinates of the “invariant features of the target object”.
If the collation shows that the arrangements (coordinates) of corresponding invariant features match, the determination unit 42 determines that the target object has been detected.
A “match of arrangement (coordinates)” may be recognized not only when the arrangements (coordinates) of all corresponding invariant features match, but also when at least a predetermined number of them are confirmed to match.
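The collation and determination just described can be sketched as below. The function names and the coordinate tolerance `tol` are assumptions for illustration; the text itself specifies only a coordinate match and an optional predetermined number of matches.

```python
def count_matches(candidate, reference, tol=1e-6):
    """Count reference invariant features that have a candidate feature
    at the same coordinates (within tol); each reference is used once."""
    remaining = list(reference)
    matched = 0
    for cx, cy in candidate:
        for idx, (rx, ry) in enumerate(remaining):
            if abs(cx - rx) <= tol and abs(cy - ry) <= tol:
                matched += 1
                del remaining[idx]
                break
    return matched


def is_detected(candidate, reference, min_matches=None, tol=1e-6):
    """Decide detection: all reference features must match by default,
    or at least min_matches of them when a threshold is given."""
    required = len(reference) if min_matches is None else min_matches
    return count_matches(candidate, reference, tol) >= required
```

With the default setting, extra candidate features that belong to other objects do not prevent detection as long as every reference feature is matched.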

A concrete example of object detection according to this embodiment is described below.
First, the case in which the detection target image contains the target object is described with reference to FIG. 12.
The video input unit 11 inputs the detection target image shown in FIG. 12(a).
Next, the feature extraction unit 21 extracts feature points from the detection target image, and the difference extraction unit 23 extracts the difference between the feature points of the detection target image and those of the background image (FIG. 12(b)).
The invariant feature conversion unit 31 then maps the feature points of this difference into the invariant feature space, obtaining the invariant feature group shown in FIG. 12(c).
The collation unit 41 collates the coordinates A of these invariant features in the invariant feature space with the coordinates B of the target object's invariant features in the invariant feature space.
As shown in the figure, the arrangements of the two sets of invariant features match, so the determination unit 42 can determine that the target object is present in the detection target image.

The target object can be detected in the same way even if its orientation or size changes.
This is because, as described above, invariant feature conversion is performed by selecting multiple bases from the feature points (two bases in this case), so that patterns invariant not only to translation but also to geometric changes such as scaling and rotation have been acquired in the invariant feature space in advance.
Therefore, as shown in FIG. 13, the target object can be reliably detected even if its orientation or size changes.
The invariant feature space A and the invariant feature space B shown in FIG. 13(c) schematically show the arrangement of invariant features when the feature points of the respective original images are mapped with two bases. For simplicity, however, not all mapped feature points are drawn in these figures.

As shown in FIG. 14, an object other than the target object (a star-shaped object in the figure) may appear in the input detection target image.
In this case as well, the feature extraction unit 21 extracts feature points from the detection target image (FIG. 14(a)), and the difference extraction unit 23 extracts the difference from the feature points of the background image (FIG. 14(b)).
The invariant features A based on that difference are then collated with the invariant features B of the target object.
In this case, however, the arrangements of the two sets of invariant features do not match, as shown in FIG. 14(c).
As shown in FIG. 15, there are also cases in which no object at all appears in the input detection target image.
In this case too, the feature extraction unit 21 extracts feature points from the detection target image (FIG. 15(a)), and the difference extraction unit 23 extracts the difference from the feature points of the background image (FIG. 15(b)).
The invariant features A based on that difference are collated with the invariant features B of the target object, but they do not match, as shown in FIG. 15(c).
In such cases, therefore, the determination unit 42 does not determine that the target object has been detected.

The target object (a paper airplane) and an object other than the target object (a star-shaped object) may also appear together in the input detection target image. In this case too, the collation succeeds for the invariant features of the target object, so the target object can be detected.

Next, the operation of the object detection device of this embodiment (the object detection method) is described with reference to FIGS. 16 and 17.
FIG. 16 is a flowchart showing the procedure in the preparation stage of the object detection method.
FIG. 17 is a flowchart showing the procedure in the detection stage of the object detection method.

<Procedure in the preparation stage>
As shown in FIG. 16, in the object detection device 1a of this embodiment, the video input unit 11 of the video input means 10 inputs a background image and an image of the target object (S101). The input images are stored in the video storage unit 12.
Next, the feature extraction unit 21 of the feature extraction means 20 retrieves the images from the video storage unit 12 and extracts feature points (S102). That is, it extracts the feature points of the background image and the feature points of the target object.
The feature point information of the background image is stored in the background storage area 221 of the feature storage unit 22 (S103).
The feature point information of the target object is stored in the object storage area 222 of the feature storage unit 22.

Subsequently, the invariant feature conversion unit 31 of the invariant feature conversion means 30 retrieves the feature point information of the target object from the object storage area 222 and maps the feature points into the invariant feature space (S104). The feature points of the target object are thereby converted into invariant features.
The invariant feature information of the target object is then stored in the object invariant storage area 321 of the invariant feature storage unit 32 (S105).

<Procedure in the detection stage>
As shown in FIG. 17, in the object detection device 1a of this embodiment, the video input unit 11 of the video input means 10 inputs a detection target image (S201). The input detection target image is stored in the video storage unit 12.
Next, the feature extraction unit 21 of the feature extraction means 20 retrieves the detection target image from the video storage unit 12 and extracts feature points (S202). The feature point information extracted from the detection target image is stored in the input storage area 223 of the feature storage unit 22.
Subsequently, the difference extraction unit 23 retrieves the feature points of the detection target image from the input storage area 223 and the feature points of the background image that were stored in the background storage area 221 in the preparation stage.
The difference extraction unit 23 then removes from the feature points of the detection target image those that correspond to feature points of the background image, obtaining the difference of the feature points (S203).
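Step S203 can be sketched as a simple filter. How a feature point is judged to "correspond" to a background feature point is not specified here, so the pixel tolerance `tol` and the function name below are assumptions made for illustration.

```python
def subtract_background(input_points, background_points, tol=1.0):
    """Return the input feature points that have no background feature
    point within tol pixels in both x and y (assumed correspondence rule)."""
    def near_background(p):
        return any(abs(p[0] - b[0]) <= tol and abs(p[1] - b[1]) <= tol
                   for b in background_points)

    return [p for p in input_points if not near_background(p)]
```

If nothing has entered the scene, the result is empty or contains only stray points, and the later collation does not find a match.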

Next, the invariant feature conversion unit 31 of the invariant feature conversion means 30 maps the feature points extracted as the difference into the invariant feature space to obtain invariant features (S204). The invariant feature information of the obtained invariant features is stored in the difference invariant storage area 322 of the invariant feature storage unit 32.
These invariant features are then collated with the invariant features of the target object (S205).
Specifically, the collation unit 41 of the comparison means 40 retrieves invariant feature information from the difference invariant storage area 322 and from the object invariant storage area 321. Based on the retrieved invariant feature information, it collates the coordinates of these invariant features in the invariant feature space.

If the collation shows that the coordinates of the two sets of invariant features match (S205: match), the determination unit 42 determines that the target object has been detected (S206). If the coordinates do not match (S205: mismatch), the target object is not detected; in this case, processing proceeds to S207 without any special handling.
If object detection is to continue (S207: YES), processing returns to S201 and the same steps are repeated; if it is not to continue (S207: NO), object detection ends.
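The detection-stage flow from S201 to S207 can be outlined as a loop. This is only a skeleton under stated assumptions: the function name is hypothetical, and the individual steps are passed in as callables because their concrete implementations are not fixed by this description.

```python
def run_detection(frames, background_points, object_invariants,
                  extract, subtract, to_invariants, matches):
    """Yield one detection result (True or False) per input frame."""
    for frame in frames:                              # S201: input image
        points = extract(frame)                       # S202: feature points
        diff = subtract(points, background_points)    # S203: difference
        invariants = to_invariants(diff)              # S204: invariant features
        # S205 and S206: collate and decide; a mismatch simply yields False.
        yield matches(invariants, object_invariants)
        # S207: the loop continues while further frames are supplied.
```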

As described above, the object detection device and object detection method of this embodiment exclude the background features from the features of the detection target image input at detection time and, if any feature points remain, treat this as an indication that the target object may have appeared. Those feature points are then mapped into the invariant feature space, and the resulting invariant features are collated with the invariant features obtained by mapping the feature points of the target object into the invariant feature space.
If this collation confirms that the coordinates of the two sets of invariant features match, the target object is recognized as detected.
The target object can therefore be detected without relying on a marker, and detection remains accurate under geometric deformation of the target object or the background image.
Consequently, an object detection device with excellent convenience and reliability can be realized at low cost.

[Second Embodiment of the Object Detection Device and Object Detection Method]
Next, a second embodiment of the object detection device and object detection method of the present invention is described with reference to FIG. 18.
FIG. 18 is a block diagram showing the detailed configuration of the object detection device of this embodiment.
This embodiment differs from the first embodiment in the invariant feature conversion means 30.
Specifically, in the object detection device 1b of this embodiment, the invariant feature storage unit 32 of the invariant feature conversion means 30 has an object frequency storage area 323 and a difference frequency storage area 324.

That is, whereas the first embodiment decides whether the target object has been detected according to whether the arrangements (coordinates) of the invariant features match, this embodiment decides according to whether the frequencies of the invariant features match.
The other components are the same as in the first embodiment.
Accordingly, components in FIG. 18 that are the same as those in FIG. 2 are given the same reference numerals, and their detailed description is omitted.

As shown in FIG. 18, the invariant feature storage unit 32 of the invariant feature conversion means 30 has an object frequency storage area 323 and a difference frequency storage area 324.
The object frequency storage area 323 stores the frequency distribution derived from the invariant features of the target object obtained in the preparation stage.
Specifically, the invariant feature conversion unit 31 retrieves the invariant features (invariant feature information) of the target object that were stored in the object invariant storage area 321 in the preparation stage.
FIG. 19(a) is a schematic diagram showing, in the invariant feature space, the invariant features of the target object retrieved from the object invariant storage area 321.

Next, the invariant feature conversion unit 31 overlays a grid mesh on the invariant feature space in which the invariant features lie, dividing it into multiple sections.
As shown in FIG. 19(b), the sections may cover only part of the invariant feature space or all of it. Restricting the sections to part of the space reduces the computational load, whereas covering the whole space improves accuracy.
Subsequently, as shown in FIG. 19(c), the invariant feature conversion unit 31 counts the number of invariant features in each section (the frequency distribution).
The invariant feature conversion unit 31 then stores the resulting frequency distribution of the target object's invariant features in the object frequency storage area 323.

The difference frequency storage area 324 stores the frequency distribution of the invariant features based on the difference between the feature points of the detection target image and those of the background image obtained in the detection stage.
Specifically, the invariant feature conversion unit 31 retrieves the invariant features stored in the difference invariant storage area 322 and obtains their frequency distribution. That is, the feature points that remain after the corresponding feature points of the background image are subtracted from the feature points of the detection target image are mapped into the invariant feature space, and the frequency distribution is obtained from the resulting invariant features.
The invariant feature conversion unit 31 then stores the frequency distribution information of these invariant features in the difference frequency storage area 324.
The “frequency distribution information” consists of, for example, the position coordinates of each section in the invariant feature space and the number of invariant features located in each section.
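The per-section counting can be sketched as a two-dimensional histogram. The function name, the cell size, and the optional `region` rectangle are assumptions made for illustration; sections are identified here by integer grid indices rather than by stored position coordinates.

```python
import math
from collections import Counter


def frequency_distribution(invariants, cell=0.5, region=None):
    """Count invariant features per grid section.

    region, if given, is (xmin, ymin, xmax, ymax) and limits counting to
    part of the invariant feature space; otherwise the whole space is used.
    Returns a Counter keyed by (column index, row index).
    """
    counts = Counter()
    for x, y in invariants:
        if region is not None:
            xmin, ymin, xmax, ymax = region
            if not (xmin <= x < xmax and ymin <= y < ymax):
                continue
        counts[(math.floor(x / cell), math.floor(y / cell))] += 1
    return counts
```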

The frequency distributions of the invariant features stored in the object frequency storage area 323 and in the difference frequency storage area 324 are then collated with each other, and if they match, the target object is determined to have been detected.
Specifically, if the numbers of invariant features located in corresponding sections of the two frequency distributions match (that is, if the frequency distributions match), the target object is determined to have been detected.
A “match of frequency distributions” may be recognized not only when the frequencies match in all corresponding sections, but also when the frequencies are confirmed to match in at least a predetermined number of sections.
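A minimal sketch of this frequency collation follows. The function names are hypothetical, each distribution is assumed to be a mapping from section to count, and comparing over the sections occupied in either distribution is an assumption about which sections count as "corresponding".

```python
def matching_sections(freq_a, freq_b):
    """Number of sections whose counts agree in both distributions."""
    sections = set(freq_a) | set(freq_b)
    return sum(1 for s in sections if freq_a.get(s, 0) == freq_b.get(s, 0))


def frequencies_match(freq_a, freq_b, min_sections=None):
    """Match if all occupied sections agree, or at least min_sections
    of them when a threshold is given. Two empty distributions do not match."""
    sections = set(freq_a) | set(freq_b)
    if not sections:
        return False
    required = len(sections) if min_sections is None else min_sections
    return matching_sections(freq_a, freq_b) >= required
```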

A concrete example of object detection according to this embodiment is described below.
First, the case in which the detection target image contains the target object is described with reference to FIG. 20.
The video input unit 11 inputs the detection target image shown in FIG. 20(a).
Next, the feature extraction unit 21 extracts feature points from the detection target image, and the difference extraction unit 23 extracts the difference between the feature points of the detection target image and those of the background image (FIG. 20(b)).
The invariant feature conversion unit 31 then maps the feature points of this difference into the invariant feature space, obtaining the invariant feature group shown in FIG. 20(c).

Next, the invariant feature conversion unit 31 divides the invariant feature space containing those invariant features into sections and counts the invariant features, obtaining the frequency distribution shown in FIG. 20(d).
The collation unit 41 then collates this frequency distribution A with the frequency distribution B of the target object's invariant features.
As shown in the figure, the numbers of invariant features in each section agree between the two distributions, so the determination unit 42 can determine that the target object is present in the detection target image.

In this embodiment too, the target object can be reliably detected even if its orientation or size changes.
The reason is the same as that explained with reference to FIG. 13 for the first embodiment.
In the first embodiment, however, an exact match of coordinate values is required in principle, whereas in this embodiment the comparison is made per section derived from the coordinates, so the tolerance of the match decision can be set wider than in the first embodiment.

また、図２２に示すように、入力した検出対象画像に対象物体以外の物体（同図では、星状の物体を示す）が現れる場合がある。
この場合も同様に、特徴抽出部２１が検出対象画像（図２２（ａ））から特徴点を抽出し、差分抽出部２３が背景画像との特徴点の差分を抽出する（図２２（ｂ））。
次いで、不変特徴変換部３１が、その差分の特徴点を不変量特徴空間に写像して不変特徴を求める（図２２（ｃ））。
続いて、不変特徴変換部３１が、その不変特徴が存在する不変量特徴空間を区分けして不変特徴の数を求めることで図２２（ｄ）に示す頻度分布を得る。
そして、その差分にもとづく不変特徴の頻度分布Ａと、対象物体の不変特徴の頻度分布Ｂとを照合する。
ただし、この場合、同図に示すように、双方の不変特徴に頻度分布は一致しない。 In addition, as shown in FIG. 22, an object other than the target object (a star-shaped object in the figure) may appear in the input detection target image.
In this case as well, the feature extraction unit 21 extracts feature points from the detection target image (FIG. 22A), and the difference extraction unit 23 extracts the difference in feature points from the background image (FIG. 22B).
Next, the invariant feature conversion unit 31 maps the difference feature points to the invariant feature space to obtain invariant features (FIG. 22C).
Subsequently, the invariant feature conversion unit 31 partitions the invariant feature space containing these invariant features into sections and counts the invariant features in each section, obtaining the frequency distribution shown in FIG. 22D.
Then, the frequency distribution A of the invariant features based on the difference is collated with the frequency distribution B of the invariant features of the target object.
In this case, however, as shown in the figure, the two frequency distributions do not match.

また、図２３に示すように、入力した検出対象画像に物体が一切現れない場合がある。
この場合も、特徴抽出部２１が検出対象画像（図２３（ａ））から特徴点を抽出し、差分抽出部２３が背景画像との特徴点の差分を抽出する（図２３（ｂ））。
次いで、不変特徴変換部３１が、その差分の特徴点を不変量特徴空間に写像して不変特徴を求める（図２３（ｃ））。
続いて、不変特徴変換部３１が、その不変特徴が存在する不変量特徴空間を区分けして不変特徴の数を求めることで図２３（ｄ）に示す頻度分布を得る。
そして、その差分にもとづく不変特徴の頻度分布Ａと、対象物体の不変特徴の頻度分布Ｂとを照合するが、同図に示すように、これらの分布は一致しない。
したがって、このような場合、判定部４２は、対象物体の検出を肯定する判定は行わない。 As shown in FIG. 23, the input detection target image may also contain no object at all.
In this case as well, the feature extraction unit 21 extracts feature points from the detection target image (FIG. 23A), and the difference extraction unit 23 extracts the difference in feature points from the background image (FIG. 23B).
Next, the invariant feature conversion unit 31 maps the difference feature points to the invariant feature space to obtain invariant features (FIG. 23C).
Subsequently, the invariant feature conversion unit 31 partitions the invariant feature space containing these invariant features into sections and counts the invariant features in each section, obtaining the frequency distribution shown in FIG. 23D.
Then, the frequency distribution A of the invariant features based on the difference is collated with the frequency distribution B of the invariant features of the target object, but, as shown in the figure, these distributions do not match.
Therefore, in such a case, the determination unit 42 does not affirm detection of the target object.

なお、入力した検出対象画像に対象物体（紙飛行機）と対象物体以外の物体（星形の物体）とが混在して現れる場合があるが、この場合も、対象物体の不変特徴の頻度分布についての照合によってこれを検出することができる。
この場合、頻度の完全一致でなくても不変特徴の頻度分布の所定数以上の区画において一致が確認できた場合や、対応する頻度数が近似する場合に対象物体の検出を肯定的に認めることができる。
また、不変量特徴空間における頻度分布の区画幅を変更することによって、物体検出の精度を調整することができる。 Note that the target object (a paper airplane) and an object other than the target object (a star-shaped object) may appear together in the input detection target image. In this case as well, the target object can be detected by collating against the frequency distribution of its invariant features.
In this case, even if the frequencies do not match exactly, detection of the target object can be affirmed when a match is confirmed in at least a predetermined number of sections of the frequency distribution, or when the corresponding frequencies are approximately equal.
Moreover, the accuracy of object detection can be adjusted by changing the section width of the frequency distribution in the invariant feature space.
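
The relaxed collation described above might look like the following sketch. The parameter names `count_tolerance` and `min_matching_sections`, and the choice to examine only the sections occupied by the target object, are assumptions made for illustration; the text does not fix how the predetermined number or the degree of approximation is chosen.

```python
def tolerant_match(observed, target, count_tolerance=0, min_matching_sections=None):
    """Collate only the sections occupied by the target object.

    A section matches when the two counts differ by at most count_tolerance.
    Detection is affirmed when at least min_matching_sections sections match
    (by default, every section of the target distribution).
    """
    if not target:
        return False
    if min_matching_sections is None:
        min_matching_sections = len(target)
    matching = sum(
        1
        for section, count in target.items()
        if abs(observed.get(section, 0) - count) <= count_tolerance
    )
    return matching >= min_matching_sections

# Hypothetical distributions: the observed image also contains another object,
# which adds features in section (5, 5) and one extra feature in section (1, 0).
target = {(0, 0): 2, (1, 0): 1, (1, 1): 3}
observed = {(0, 0): 2, (1, 0): 2, (1, 1): 3, (5, 5): 4}

strict = tolerant_match(observed, target)
partial = tolerant_match(observed, target, min_matching_sections=2)
approximate = tolerant_match(observed, target, count_tolerance=1)
```

With these sample values the strict collation fails, while either relaxation (two matching sections suffice, or counts may differ by one) affirms detection.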

次に、本実施形態の物体検出装置の動作（物体検出方法）について、図２４及び図２５を参照して説明する。
図２４は、物体検出方法の準備段階における手順を示すフローチャートである。
図２５は、物体検出方法の検出段階における手順を示すフローチャートである。 Next, the operation (object detection method) of the object detection apparatus of the present embodiment will be described with reference to FIGS. 24 and 25.
FIG. 24 is a flowchart illustrating a procedure in the preparation stage of the object detection method.
FIG. 25 is a flowchart illustrating a procedure in the detection stage of the object detection method.

＜準備段階における手順＞
図２４に示すように、本実施形態の物体検出装置１ｂは、映像入力手段１０の映像入力部１１が、背景画像と対象物体の画像を入力する（Ｓ３０１）。入力画像は映像記憶部１２に記憶される。
次に、特徴抽出手段２０の特徴抽出部２１は、映像記憶部１２から画像を取り出し特徴点を抽出する（Ｓ３０２）。すなわち、背景画像の特徴点と、対象物体の特徴点とを抽出する。
背景画像の特徴点情報は、特徴記憶部２２の背景記憶領域２２１に記憶される（Ｓ３０３）。
また、対象物体の特徴点情報は、特徴記憶部２２の物体記憶領域２２２に記憶される。 <Procedure in preparation stage>
As shown in FIG. 24, in the object detection device 1b of the present embodiment, the video input unit 11 of the video input means 10 inputs a background image and an image of the target object (S301). The input images are stored in the video storage unit 12.
Next, the feature extraction unit 21 of the feature extraction means 20 retrieves the images from the video storage unit 12 and extracts feature points (S302). That is, it extracts the feature points of the background image and the feature points of the target object.
The feature point information of the background image is stored in the background storage area 221 of the feature storage unit 22 (S303).
The feature point information of the target object is stored in the object storage area 222 of the feature storage unit 22.

続いて、不変特徴変換手段３０の不変特徴変換部３１は、物体記憶領域２２２から対象物体の特徴点を取り出し、不変量特徴空間に写像する（Ｓ３０４）。
次いで、対象物体の不変特徴情報は、不変特徴記憶部３２の物体不変記憶領域３２１に記憶される（Ｓ３０５）。
そして、不変特徴変換部３１は、物体不変記憶領域３２１から対象物体の不変特徴を取り出しその頻度分布を求め、その頻度分布情報を不変特徴記憶部３２の物体頻度記憶領域３２３に記憶する（Ｓ３０６）。 Subsequently, the invariant feature conversion unit 31 of the invariant feature conversion means 30 retrieves the feature points of the target object from the object storage area 222 and maps them to the invariant feature space (S304).
Next, the invariant feature information of the target object is stored in the object invariant storage area 321 of the invariant feature storage unit 32 (S305).
Then, the invariant feature conversion unit 31 retrieves the invariant features of the target object from the object invariant storage area 321, obtains their frequency distribution, and stores the frequency distribution information in the object frequency storage area 323 of the invariant feature storage unit 32 (S306).

＜検出段階における手順＞
図２５に示すように、本実施形態の物体検出装置１ｂは、映像入力手段１０の映像入力部１１が、検出対象画像を入力する（Ｓ４０１）。入力した検出対象画像は映像記憶部１２に記憶される。
次に、特徴抽出手段２０の特徴抽出部２１は、映像記憶部１２から検出対象画像を取り出し特徴点を抽出する（Ｓ４０２）。抽出された検出対象画像の特徴点情報は、特徴記憶部２２の入力記憶領域２２３に記憶される。
続いて、差分抽出部２３は、入力記憶領域２２３から検出対象画像の特徴点を取り出すとともに、準備段階において背景記憶領域２２１に記憶した背景画像の特徴点を取り出す。
そして、差分抽出部２３は、検出対象画像の特徴点から背景画像の対応する特徴点を削除して、特徴点の差分を求める（Ｓ４０３）。 <Procedure in the detection stage>
As shown in FIG. 25, in the object detection device 1b of the present embodiment, the video input unit 11 of the video input means 10 inputs a detection target image (S401). The input detection target image is stored in the video storage unit 12.
Next, the feature extraction unit 21 of the feature extraction means 20 retrieves the detection target image from the video storage unit 12 and extracts feature points (S402). The feature point information extracted from the detection target image is stored in the input storage area 223 of the feature storage unit 22.
Subsequently, the difference extraction unit 23 retrieves the feature points of the detection target image from the input storage area 223, together with the feature points of the background image stored in the background storage area 221 in the preparation stage.
Then, the difference extraction unit 23 deletes the corresponding feature point of the background image from the feature point of the detection target image, and obtains the difference of the feature point (S403).
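
The difference extraction of S403 can be sketched as below. The text only says that the corresponding background feature points are deleted; treating two points as corresponding when they lie within a small distance `tolerance` of each other is an assumption of this sketch, as are the function name and the sample coordinates.

```python
import math

def extract_difference(input_points, background_points, tolerance=1.0):
    """Keep the input feature points that have no background point nearby."""
    return [
        point
        for point in input_points
        if all(math.dist(point, bg) > tolerance for bg in background_points)
    ]

# Hypothetical feature points (pixel coordinates).
background_points = [(10.0, 10.0), (50.0, 20.0)]
input_points = [(10.2, 9.9), (50.0, 20.0), (30.0, 30.0), (31.0, 35.0)]

difference_points = extract_difference(input_points, background_points)
```

Here the first two input points coincide with background points and are removed, leaving only the points that may belong to a newly appeared object.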

次いで、不変特徴変換手段３０の不変特徴変換部３１は、差分として抽出された特徴点を不変量特徴空間に写像して不変特徴を取得する（Ｓ４０４）。取得した不変特徴の不変特徴情報は、不変特徴記憶部３２の差分不変記憶領域３２２に記憶される。
続いて、不変特徴変換部３１は、差分不変記憶領域３２２から不変特徴を取り出して頻度分布を求め、その頻度分布情報を差分頻度記憶領域３２６に記憶する（Ｓ４０５）。
そして、対象物体の不変特徴の頻度分布との照合を行う（Ｓ４０６）。
具体的には、比較手段４０の照合部４１が、差分頻度記憶領域３２６から不変特徴の頻度分布情報を取り出すとともに、物体頻度記憶領域３２３から不変特徴の頻度分布情報を取り出す。そして、取り出した不変特徴の頻度分布情報にもとづき、これらの頻度分布が一致するか否かを照合する。 Next, the invariant feature conversion unit 31 of the invariant feature conversion means 30 maps the feature points extracted as the difference to the invariant feature space to acquire invariant features (S404). The invariant feature information thus acquired is stored in the difference invariant storage area 322 of the invariant feature storage unit 32.
Subsequently, the invariant feature conversion unit 31 retrieves the invariant features from the difference invariant storage area 322, obtains their frequency distribution, and stores the frequency distribution information in the difference frequency storage area 326 (S405).
The result is then collated with the frequency distribution of the invariant features of the target object (S406).
Specifically, the collation unit 41 of the comparison means 40 retrieves the frequency distribution information of the invariant features from the difference frequency storage area 326 and from the object frequency storage area 323, and then checks whether the two frequency distributions match.

照合の結果、双方の頻度分布が一致した場合（Ｓ４０６：一致）、判定部４２が対象物体を検出したものと判定する（Ｓ４０７）。一方、双方の頻度分布が一致しなかった場合（Ｓ４０６：不一致）、対象物体の検出は行わない。つまり、この場合、特段の処理は行わずにＳ４０８に進む。
そして、さらに物体検出を継続する場合（Ｓ４０８：ＹＥＳ）、Ｓ４０１に戻って同様の処理を繰り返し、物体検出を継続しない場合（Ｓ４０８：ＮＯ）、物体検出を終了する。 If the collation shows that the two frequency distributions match (S406: match), the determination unit 42 determines that the target object has been detected (S407). If the two frequency distributions do not match (S406: mismatch), the target object is not regarded as detected; in this case, the process proceeds to S408 without any special processing.
If the object detection is further continued (S408: YES), the process returns to S401 and the same processing is repeated. If the object detection is not continued (S408: NO), the object detection is terminated.

以上、説明したように、本実施形態の物体検出装置及び物体検出方法によれば、第一実施形態と同様、マーカによらずに対象物体を検出することができる。
特に、本実施形態においては、対象物体の検出判定を、不変量特徴空間に現れる不変特徴の区画ごとの数が一致するか否かによって行うようにしている。
このため、座標の一致を対象物体の検出条件として求める第一実施形態に比べ、一致度に許容をもたせることができる。
このため、例えば、画像の乱れやノイズ等に起因する不変特徴座標の微細なズレを無効化することも可能である。
また、一致度の許容範囲については、不変量特徴空間における区画幅を変更することで、物体検出の精度を自在に調整することが可能である。 As described above, the object detection device and the object detection method of the present embodiment can detect a target object without using a marker, as in the first embodiment.
In particular, in the present embodiment, detection of the target object is determined by whether the number of invariant features appearing in each section of the invariant feature space matches.
Therefore, compared with the first embodiment, which requires coordinates to coincide as the condition for detecting the target object, a tolerance can be given to the degree of matching.
This makes it possible, for example, to cancel out minute shifts in the invariant feature coordinates caused by image disturbance, noise, and the like.
In addition, the allowable range of matching, and hence the accuracy of object detection, can be adjusted freely by changing the section width in the invariant feature space.

［物体検出装置及び物体検出方法の第三実施形態］
次に、本発明の物体検出装置及び物体検出方法の第三実施形態について、図２６及び図２７を参照して説明する。
図２６は、本実施形態の物体検出装置の構成を示すブロック図である。
また、図２７は、本実施形態の物体検出装置の詳細な構成を示すブロック図である。
本実施形態の物体検出装置１ｃは、図２６に示すように、特異特徴選択手段５０を備える点で、前述の第一実施形態及び第二実施形態と異なる。 [Third Embodiment of Object Detection Device and Object Detection Method]
Next, a third embodiment of the object detection device and the object detection method of the present invention will be described with reference to FIGS. 26 and 27.
FIG. 26 is a block diagram illustrating a configuration of the object detection device of the present embodiment.
FIG. 27 is a block diagram showing a detailed configuration of the object detection apparatus of the present embodiment.
As shown in FIG. 26, the object detection apparatus 1c of this embodiment differs from the first and second embodiments described above in that it includes unique feature selection means 50.

具体的には、背景の特徴点から導き出される不変量特徴空間における不変特徴の中から不変特徴が現れなかった部分（又は不変特徴数が所定数以下の部分）を特異特徴として選択しておき、検出段階において特異特徴の部分に不変特徴が現れた場合には対象物体が現れた可能性が高いものとして物体検出をより肯定的に判定するようにしている。
つまり、本実施形態では、前述した実施形態の物体検出装置の物体検出における判定精度をより高めることを目的としている。他の構成要素は第一実施形態と同様である。
したがって、図２６及び図２７において、図１及び図２と同様の構成部分については同一の符号を付して、その詳細な説明を省略する。 Specifically, in the preparation stage, the portions of the invariant feature space derived from the feature points of the background in which no invariant feature appears (or in which the number of invariant features is at most a predetermined number) are selected as unique features. If an invariant feature appears in a unique feature portion in the detection stage, the target object is regarded as likely to have appeared, and the object detection is judged more affirmatively.
That is, the present embodiment aims to further increase the determination accuracy in object detection of the object detection device of the above-described embodiment. Other components are the same as those in the first embodiment.
Accordingly, in FIGS. 26 and 27, the same components as those in FIGS. 1 and 2 are denoted by the same reference numerals, and detailed description thereof is omitted.

本実施形態の物体検出装置１ｃは、図２７に示すように、不変特徴変換手段３０の不変特徴記憶部３２が、背景不変記憶領域３２５と背景頻度記憶領域３２６と入力頻度記憶領域３２７とを有する。
背景不変記憶領域３２５は、準備段階において、背景画像の特徴点を不変量特徴空間に写像して得た不変特徴の情報を記憶する。
背景頻度記憶領域３２６は、準備段階において、背景不変記憶領域３２５に記憶した背景画像の不変特徴にもとづく頻度分布情報を記憶する。
入力頻度記憶領域３２７は、検出段階において、検出対象画像の特徴点を不変量特徴空間に写像して得た不変特徴の頻度分布を記憶する。
そして、この写像によって取得した不変特徴の頻度分布情報を不変特徴記憶部３２の背景頻度記憶領域３２５に記憶する。 In the object detection device 1c of this embodiment, as shown in FIG. 27, the invariant feature storage unit 32 of the invariant feature conversion means 30 includes a background invariant storage area 325, a background frequency storage area 326, and an input frequency storage area 327.
The background invariant storage area 325 stores invariant feature information obtained by mapping the feature points of the background image to the invariant feature space in the preparation stage.
The background frequency storage area 326 stores frequency distribution information based on the invariant features of the background image stored in the background invariant storage area 325 in the preparation stage.
The input frequency storage area 327 stores the invariant feature frequency distribution obtained by mapping the feature points of the detection target image to the invariant feature space in the detection stage.
Then, the frequency distribution information of the invariant feature acquired by this mapping is stored in the background frequency storage area 325 of the invariant feature storage unit 32.

特異特徴選択手段５０は、図２７に示すように、特異特徴選択部５１と特異特徴記憶部５２とを有する。
特異特徴選択部５１は、準備段階において、不変特徴記憶部３２の背景頻度記憶領域３２５から背景画像に関する不変特徴の頻度分布情報を取り出す。
そして、特異特徴選択部５１は、取り出した不変特徴の頻度分布情報を分析し、不変特徴数が所定数以下である区画を特異特徴として選択する。
図２８は、背景画像から導き出された特異特徴を示す模式図である。
同図に示すように、不変特徴数が「０」の区画を特異特徴として選択することができる。
このように、特異特徴選択部５１は、所定の画像から抽出した特徴点群が現れていない特徴空間の部分にもとづく不変特徴から特異特徴を選択することができる。 As illustrated in FIG. 27, the unique feature selection means 50 includes a unique feature selection unit 51 and a unique feature storage unit 52.
In the preparation stage, the unique feature selection unit 51 retrieves the frequency distribution information of the invariant features of the background image from the background frequency storage area 325 of the invariant feature storage unit 32.
The unique feature selection unit 51 then analyzes the retrieved frequency distribution information and selects, as unique features, the sections in which the number of invariant features is at most a predetermined number.
FIG. 28 is a schematic diagram showing unique features derived from a background image.
As shown in the figure, sections in which the number of invariant features is "0" can be selected as unique features.
In this way, the unique feature selection unit 51 can select unique features from the portions of the feature space in which the feature point group extracted from a given image does not appear.
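
Selecting the sections whose count is at most a predetermined number can be sketched as follows. The grid size (`columns`, `rows`), the parameter name `max_count`, and the sample distribution are hypothetical; the sketch assumes that the background frequency distribution is held as a mapping from section index to count.

```python
def select_unique_sections(background_distribution, columns, rows, max_count=0):
    """Return the grid sections whose invariant feature count is at most max_count."""
    return [
        (col, row)
        for col in range(columns)
        for row in range(rows)
        if background_distribution.get((col, row), 0) <= max_count
    ]

# Hypothetical background frequency distribution on a 2 x 2 grid of sections.
background_distribution = {(0, 0): 3, (1, 0): 1, (0, 1): 2}

empty_sections = select_unique_sections(background_distribution, columns=2, rows=2)
sparse_sections = select_unique_sections(
    background_distribution, columns=2, rows=2, max_count=1
)
```

With `max_count=0` only the section that the background never occupies is selected; raising the threshold also admits sparsely occupied sections.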

この特異特徴の選択は、不変量特徴空間の不変特徴の分布から大きな空白を見つける問題と同一視できるから、例えば、２００３年文書解析認識国際会議予稿集に掲載されている「An algorithm for Finding Maximal Whitespace Rectangles at Arbitrary Orientations for Document Layout Analysis」などのアルゴリズムを使用して、大きな空白領域を抽出しても良いし、得られた不変特徴を含まない矩形領域の中心を特異特徴としても良い。 Selecting the unique features can be regarded as the problem of finding large blank regions in the distribution of invariant features in the invariant feature space. Therefore, for example, an algorithm such as that of "An algorithm for Finding Maximal Whitespace Rectangles at Arbitrary Orientations for Document Layout Analysis", published in the proceedings of the 2003 International Conference on Document Analysis and Recognition, may be used to extract large blank regions, or the center of a resulting rectangular region that contains no invariant features may be used as a unique feature.

その他の方法としては、不変量特徴空間を特定の大きさのメッシュ（区画）で量子化し、１次元もしくは多次元のヒストグラムを生成し、不変特徴の発生頻度が０となる区画の中心を特異特徴とするなどしても良い。頻度が０となる区画が存在しない場合、区画幅を小さくして、ヒストグラムをとり、頻度が０となる区画が現れた場合、このときの区画から特異特徴を選択するようにしてもよい。頻度が０となるメッシュが見つからない場合は、ヒストグラムを既定値で閾値処理し、規定値以下のメッシュから特異特徴を選択しても良い。 As another method, the invariant feature space may be quantized with a mesh (sections) of a specific size to generate a one-dimensional or multidimensional histogram, and the center of a section in which the occurrence frequency of invariant features is 0 may be used as a unique feature. If no section with a frequency of 0 exists, the section width may be reduced and the histogram taken again; when a section with a frequency of 0 appears, a unique feature may be selected from the sections at that width. If no mesh cell with a frequency of 0 is found, the histogram may be thresholded with a predetermined value, and unique features may be selected from the cells whose frequency is at or below that value.
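
The mesh refinement described above can be sketched as follows. Several details are assumptions of this sketch and not stated in the text: the invariant feature space is taken to be the square [0, extent) x [0, extent), the section width is halved at each step, and refinement stops at `min_width`. The thresholding fallback mentioned in the text is left out for brevity.

```python
import math
from collections import Counter

def empty_section_centers(features, extent, width):
    """Centers of the sections of [0, extent) x [0, extent) that hold no feature."""
    sections_per_axis = math.ceil(extent / width)
    histogram = Counter((int(x // width), int(y // width)) for x, y in features)
    return [
        ((col + 0.5) * width, (row + 0.5) * width)
        for col in range(sections_per_axis)
        for row in range(sections_per_axis)
        if histogram[(col, row)] == 0
    ]

def select_unique_features(features, extent, width, min_width):
    """Shrink the section width until at least one empty section appears."""
    while width >= min_width:
        centers = empty_section_centers(features, extent, width)
        if centers:
            return width, centers
        width /= 2  # assumed refinement rule: halve the section width
    return None  # no empty section found down to min_width

# Hypothetical background invariant features occupying all four unit sections.
features = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
result = select_unique_features(features, extent=2.0, width=1.0, min_width=0.25)
```

In this example no section is empty at width 1.0, so the width is reduced to 0.5, at which point the unoccupied sections supply the unique features.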

特異特徴記憶部５２は、特異特徴選択部５１で選択された背景画像に関する特異特徴の情報を記憶する。特異特徴の情報としては、各特異特徴の位置を示す座標が相当する。
そして、検出段階において、照合部４１は、不変特徴記憶部３２の入力頻度記憶領域３２７から検出対象画像の不変特徴の頻度分布情報を取り出すとともに、特異特徴記憶部５２に予め記憶した背景画像の特異特徴情報を取り出し、これらを照合する。 The unique feature storage unit 52 stores information on the unique features of the background image selected by the unique feature selection unit 51. The unique feature information consists of the coordinates indicating the position of each unique feature.
Then, in the detection stage, the collation unit 41 retrieves the frequency distribution information of the invariant features of the detection target image from the input frequency storage area 327 of the invariant feature storage unit 32, together with the unique feature information of the background image stored in advance in the unique feature storage unit 52, and collates them.

この結果、判定部４２は、背景画像の特異特徴部分に不変特徴が１以上現れると、何らかの物体が検出されたことを認識することができる。
これにより、他の判定方法により対象物体の検出がある程度認められている場合には、その判定をより肯定的に認めることができる。
このため、前述の実施形態に係る物体検出方法と組み合わせることで対象物体をより正確に検出することができる。
つまり、前述の実施形態の物体検出装置１ａ、１ｂに、本実施形態特有の構成を加えることで、極めて検出精度の優れた物体検出装置を実現することができる。 As a result, the determination unit 42 can recognize that some object has been detected when one or more invariant features appear in the unique feature portion of the background image.
Thereby, when the detection of the target object is recognized to some extent by another determination method, the determination can be recognized more positively.
For this reason, a target object can be detected more correctly by combining with the object detection method according to the above-described embodiment.
That is, an object detection device with extremely excellent detection accuracy can be realized by adding a configuration unique to the present embodiment to the object detection devices 1a and 1b of the above-described embodiment.

次に、本実施形態の物体検出装置の動作（物体検出方法）について、図２９及び図３０を参照して説明する。
図２９は、物体検出方法の準備段階における手順を示すフローチャートである。
図３０は、物体検出方法の検出段階における手順を示すフローチャートである。 Next, the operation (object detection method) of the object detection apparatus of this embodiment will be described with reference to FIGS. 29 and 30.
FIG. 29 is a flowchart illustrating a procedure in a preparation stage of the object detection method.
FIG. 30 is a flowchart illustrating a procedure in the detection stage of the object detection method.

＜準備段階における手順＞
図２９に示すように、本実施形態の物体検出装置１ｃは、映像入力手段１０の映像入力部１１が、背景画像と対象物体の画像を入力する（Ｓ５０１）。入力画像は映像記憶部１２に記憶される。
特徴抽出手段２０の特徴抽出部２１は、映像記憶部１２から画像を取り出し特徴点を抽出する（Ｓ５０２）。すなわち、背景画像の特徴点と、対象物体の特徴点とを抽出する。
背景画像の特徴点情報は、特徴記憶部２２の背景記憶領域２２１に記憶される。
また、対象物体の特徴点情報は、特徴記憶部２２の物体記憶領域２２２に記憶される（Ｓ５０３）。 <Procedure in preparation stage>
As shown in FIG. 29, in the object detection device 1c of the present embodiment, the video input unit 11 of the video input means 10 inputs a background image and an image of the target object (S501). The input images are stored in the video storage unit 12.
The feature extraction unit 21 of the feature extraction means 20 retrieves the images from the video storage unit 12 and extracts feature points (S502). That is, it extracts the feature points of the background image and the feature points of the target object.
The feature point information of the background image is stored in the background storage area 221 of the feature storage unit 22.
Also, the feature point information of the target object is stored in the object storage area 222 of the feature storage unit 22 (S503).

次に、不変特徴変換手段３０の不変特徴変換部３１は、物体記憶領域２２２から対象物体の特徴点を取り出し、特徴点を不変量特徴空間に写像する（Ｓ５０４）。これにより、対象物体の特徴点は不変特徴に変換される。
対象物体の不変特徴情報は、不変特徴記憶部３２の物体不変記憶領域３２１に記憶される（Ｓ５０５）。 Next, the invariant feature conversion unit 31 of the invariant feature conversion means 30 takes out the feature points of the target object from the object storage area 222 and maps the feature points to the invariant feature space (S504). Thereby, the feature points of the target object are converted into invariant features.
The invariant feature information of the target object is stored in the object invariant storage area 321 of the invariant feature storage unit 32 (S505).

不変特徴変換部３１は、物体不変記憶領域３２１から対象物体の不変特徴を取り出して頻度分布を求め、その情報を不変特徴記憶部３２の物体頻度記憶領域３２３に記憶する（Ｓ５０６）。
また、不変特徴変換部３１は、特徴記憶部２２の背景記憶領域２２１から背景画像の特徴点を取り出し、不変量特徴空間に写像する（Ｓ５０７）。これにより、背景画像の特徴点は不変特徴に変換され、この不変特徴情報は不変特徴記憶部３２の背景不変記憶領域３２５に記憶される。
次いで、不変特徴変換部３１は、背景画像の不変特徴の頻度分布を求め、その情報を背景頻度記憶領域３２５に記憶する（Ｓ５０８）。 The invariant feature conversion unit 31 retrieves the invariant features of the target object from the object invariant storage area 321, obtains their frequency distribution, and stores the information in the object frequency storage area 323 of the invariant feature storage unit 32 (S506).
Further, the invariant feature conversion unit 31 extracts the feature points of the background image from the background storage area 221 of the feature storage unit 22 and maps them to the invariant feature space (S507). Thereby, the feature points of the background image are converted into invariant features, and the invariant feature information is stored in the background invariant storage area 325 of the invariant feature storage unit 32.
Next, the invariant feature conversion unit 31 obtains the frequency distribution of the invariant features of the background image and stores the information in the background frequency storage area 325 (S508).

そして、特異特徴選択部５１は、背景頻度記憶領域３２５から背景画像の不変特徴の頻度分布を取り出して特異特徴を選択し、この特異特徴情報を特異特徴記憶部５２に記憶する（Ｓ５０９）。 Then, the unique feature selection unit 51 extracts the frequency distribution of the invariant features of the background image from the background frequency storage area 325, selects the unique features, and stores the unique feature information in the unique feature storage unit 52 (S509).

＜検出段階における手順＞
図３０に示すように、本実施形態の物体検出装置１ｃは、映像入力手段１０の映像入力部１１が、検出対象画像を入力する（Ｓ６０１）。入力した画像は映像記憶部１２に記憶される。
次に、特徴抽出手段２０の特徴抽出部２１は、映像記憶部１２から検出対象画像を取り出し特徴点を抽出する（Ｓ６０２）。抽出された特徴点の特徴点情報は、特徴記憶部２２の入力記憶領域２２３に記憶される。
続いて、差分抽出部２３は、入力記憶領域２２３から検出対象画像の特徴点を取り出すとともに、準備段階において背景記憶領域２２１に記憶した背景画像の特徴点を取り出す。
そして、差分抽出部２３は、検出対象画像の特徴点から背景画像の対応する特徴点を削除して、特徴点の差分を求める（Ｓ６０３）。 <Procedure in the detection stage>
As shown in FIG. 30, in the object detection device 1c of this embodiment, the video input unit 11 of the video input means 10 inputs a detection target image (S601). The input image is stored in the video storage unit 12.
Next, the feature extraction unit 21 of the feature extraction means 20 retrieves the detection target image from the video storage unit 12 and extracts feature points (S602). The feature point information of the extracted feature points is stored in the input storage area 223 of the feature storage unit 22.
Subsequently, the difference extraction unit 23 extracts the feature points of the detection target image from the input storage area 223 and also extracts the feature points of the background image stored in the background storage area 221 in the preparation stage.
Then, the difference extraction unit 23 deletes the corresponding feature point of the background image from the feature point of the detection target image, and obtains the difference of the feature point (S603).

不変特徴変換手段３０の不変特徴変換部３１は、差分として抽出された特徴点を不変量特徴空間に写像して不変特徴を取得する（Ｓ６０４）。取得した不変特徴の不変特徴情報は、不変特徴記憶部３２の差分不変記憶領域３２２に記憶される。
そして、対象物体の不変特徴の頻度分布との照合を行う（Ｓ６０５）。
具体的には、比較手段４０の照合部４１が、差分頻度記憶領域３２６から不変特徴の頻度分布情報を取り出すとともに、物体頻度記憶領域３２３から不変特徴の頻度分布情報を取り出す。そして、取り出した不変特徴の頻度分布情報にもとづき、これらの頻度分布が一致するか否かを照合する。 The invariant feature conversion unit 31 of the invariant feature conversion means 30 maps the feature points extracted as the difference to the invariant feature space to acquire invariant features (S604). The invariant feature information thus acquired is stored in the difference invariant storage area 322 of the invariant feature storage unit 32.
The result is then collated with the frequency distribution of the invariant features of the target object (S605).
Specifically, the collation unit 41 of the comparison means 40 retrieves the frequency distribution information of the invariant features from the difference frequency storage area 326 and from the object frequency storage area 323, and then checks whether the two frequency distributions match.

照合の結果、双方の頻度分布が一致した場合（Ｓ６０５：一致）、さらに、特異特徴の照合を行う（Ｓ６０６）。
具体的には、照合部４１が、特異特徴記憶部５２に記憶した背景画像の特異特徴情報を取り出すとともに、入力頻度記憶領域３２７に記憶した検出対象画像に関する不変特徴の頻度情報を取り出す。
そして、背景画像の特異特徴の部分に、検出対象画像の不変特徴が１以上検出されれば、判定部４２が対象物体を検出したものと判定する（Ｓ６０７）。
一方、Ｓ６０５やＳ６０６において照合が一致しない場合、対象物体の検出は行わない、つまり、この場合、特段の処理は行わずにＳ６０８に進む。
そして、さらに物体検出を継続する場合（Ｓ６０８：ＹＥＳ）、Ｓ６０１に戻って同様の処理を繰り返し、物体検出を継続しない場合（Ｓ６０８：ＮＯ）、物体検出を終了する。 As a result of the matching, if both frequency distributions match (S605: match), a unique feature is further checked (S606).
Specifically, the collation unit 41 extracts the unique feature information of the background image stored in the unique feature storage unit 52 and also extracts the frequency information of the invariant feature related to the detection target image stored in the input frequency storage region 327.
If one or more invariant features of the detection target image are found in a unique feature portion of the background image, the determination unit 42 determines that the target object has been detected (S607).
On the other hand, if the collation does not match in S605 or S606, the target object is not detected, that is, in this case, the process proceeds to S608 without performing any special processing.
If further object detection is to be continued (S608: YES), the process returns to S601 and the same processing is repeated. If object detection is not continued (S608: NO), the object detection is terminated.
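
The two-step decision of S605 to S607 can be sketched as follows. The function and argument names are hypothetical, the distributions are assumed to be mappings from section index to count, and an exact match is used for S605 although a relaxed collation could be substituted.

```python
def detect_target(difference_distribution, target_distribution,
                  input_distribution, unique_sections):
    """Return True only if both collations succeed."""
    # S605: the difference-based distribution must match the target's.
    if difference_distribution != target_distribution:
        return False
    # S606: at least one invariant feature of the detection target image
    # must appear in a section that the background left unoccupied.
    return any(input_distribution.get(section, 0) >= 1
               for section in unique_sections)

# Hypothetical data for one detection target image.
target_distribution = {(0, 0): 2, (1, 1): 1}
difference_distribution = {(0, 0): 2, (1, 1): 1}
input_distribution = {(0, 0): 5, (1, 1): 1, (2, 2): 3}
unique_sections = [(1, 1), (3, 3)]

detected = detect_target(difference_distribution, target_distribution,
                         input_distribution, unique_sections)
```

The second check only confirms a detection that the first check has already accepted, which mirrors the flow in which S606 is reached only after a match in S605.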

以上、説明したように、本実施形態の物体検出装置及び物体検出方法によれば、背景の特徴点から導き出される不変量特徴空間における不変特徴の中から不変特徴が現れなかった部分（又は不変特徴数が所定数以下の部分）を特異特徴として選択しておき、検出段階において特異特徴の部分に不変特徴が現れた場合には対象物体が現れた可能性が高いものとして物体検出をより肯定的に判定するようにしている。
したがって、物体検出における判定精度をより高めることができる。 As described above, according to the object detection device and the object detection method of the present embodiment, the portions of the invariant feature space derived from the feature points of the background in which no invariant feature appears (or in which the number of invariant features is at most a predetermined number) are selected as unique features, and if an invariant feature appears in a unique feature portion in the detection stage, the target object is regarded as likely to have appeared and the object detection is judged more affirmatively.
Therefore, the determination accuracy in object detection can be further increased.

［物体検出装置及び物体検出方法の第四実施形態］
次に、本発明の物体検出装置及び物体検出方法の第四実施形態について、図３１を参照して説明する。
図３１は、本実施形態の物体検出装置の詳細な構成を示すブロック図である。
本実施形態の物体検出装置１ｄは、図３１に示すように、比較手段４０が数判定部４３を備えるところに特徴を有する。
すなわち、物体そのものの検出に加え、その物体の数を検出することを目的としている。他の構成要素は第一実施形態と同様である。
したがって、図３１において、図２と同様の構成部分については同一の符号を付して、その詳細な説明を省略する。 [Fourth Embodiment of Object Detection Apparatus and Object Detection Method]
Next, a fourth embodiment of the object detection device and the object detection method of the present invention will be described with reference to FIG. 31.
FIG. 31 is a block diagram showing a detailed configuration of the object detection apparatus of the present embodiment.
As shown in FIG. 31, the object detection device 1d of this embodiment is characterized in that the comparison means 40 includes a number determination unit 43.
That is, the aim is to detect the number of objects in addition to detecting the object itself. The other components are the same as those in the first embodiment.
Therefore, in FIG. 31, the same components as those in FIG. 2 are denoted by the same reference numerals, and detailed description thereof is omitted.

本実施形態の物体検出装置１ｄは、図３１に示すように、比較手段４０が数判定部４３を有する。
以下、本実施形態に係る物体の数判定の具体的な方法について図３２を参照しながら説明を行う。
ここで、映像入力部１１は、対象物体を複数含むフレーム画像（図３２（ａ））を検出対象画像として入力し、映像記憶部１２に記憶する。
特徴抽出部２１は、検出対象画像から特徴点を抽出し、差分抽出部２３が検出対象画像と背景画像の特徴点の差分を抽出する（図３２（ｂ））。
続いて、不変特徴変換部３１が、この差分の特徴点を不変量特徴空間に写像することで図３２（ｃ）に示す不変特徴群を得る。なお、本実施形態の場合、基底は１点として不変特徴変換を行う。 In the object detection device 1d of the present embodiment, the comparison means 40 includes a number determination unit 43, as shown in FIG. 31.
Hereinafter, a specific method for determining the number of objects according to the present embodiment will be described with reference to FIG. 32.
Here, the video input unit 11 inputs a frame image including a plurality of target objects (FIG. 32A) as a detection target image and stores it in the video storage unit 12.
The feature extraction unit 21 extracts feature points from the detection target image, and the difference extraction unit 23 extracts the difference between the feature points of the detection target image and the background image (FIG. 32B).
Subsequently, the invariant feature conversion unit 31 maps the difference feature points to the invariant feature space to obtain the invariant feature group shown in FIG. 32C. In this embodiment, the invariant feature conversion is performed using a single point as the basis.

Next, as shown in FIG. 32D, the invariant feature conversion unit 31 divides the invariant feature space in which these invariant features exist into a mesh and obtains the frequency distribution over the sections.
Then, the number determination unit 43 of the comparison means 40 retrieves the frequency distribution of the invariant features of the target object stored in the object frequency storage area 323 of the invariant feature storage unit 32 and performs collation.
However, this collation is performed only within a fixed region centered on the origin of the invariant feature space. Specifically, the two feature points that are farthest apart among the feature points extracted from a single target object are taken, and the maximum distance in the x-axis direction and the maximum distance in the y-axis direction between these two points are obtained. These distances are used as the collation range in the ± direction of the x-axis and in the ± direction of the y-axis, respectively, relative to the origin of the invariant feature space.
As a result, as shown in the figure, the number determination unit 43 can recognize that the number of invariant features in each section for the detection target image is twice the number of invariant features in the corresponding section for the target object.
In such a case, the number determination unit 43 can determine that two target objects have been detected.
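The collation range and the doubling of the per-section frequencies can be illustrated with a minimal Python sketch. The function names, the section size `cell`, and in particular the form of the one-point-basis conversion in `to_invariant` (each point expressed relative to every other point) are assumptions made for illustration; the conversion actually used is the one defined for the embodiments, which this sketch only approximates.

```python
from collections import Counter
from math import floor

def to_invariant(points):
    """Assumed one-point basis: express every point relative to every other point."""
    return [(px - bx, py - by)
            for i, (bx, by) in enumerate(points)
            for j, (px, py) in enumerate(points) if i != j]

def collation_range(object_points):
    """x and y extents between the two feature points that are farthest apart."""
    pairs = [(p, q) for i, p in enumerate(object_points) for q in object_points[i + 1:]]
    p, q = max(pairs, key=lambda pq: (pq[0][0] - pq[1][0]) ** 2 + (pq[0][1] - pq[1][1]) ** 2)
    return abs(p[0] - q[0]), abs(p[1] - q[1])

def section_histogram(features, rx, ry, cell=1.0):
    """Frequency per mesh section, restricted to |x| <= rx and |y| <= ry."""
    return Counter((floor(x / cell), floor(y / cell))
                   for x, y in features if abs(x) <= rx and abs(y) <= ry)

# Hypothetical data: one object, and the same object appearing twice, well separated.
obj = [(0, 0), (1, 0), (0, 2)]
two = obj + [(x + 10, y) for x, y in obj]

rx, ry = collation_range(obj)
single_hist = section_histogram(to_invariant(obj), rx, ry)
double_hist = section_histogram(to_invariant(two), rx, ry)
```

With these inputs every section of `double_hist` holds twice the count of the same section of `single_hist`, because the features that relate one copy of the object to the other fall outside the collation range.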

In this embodiment, an example with two target objects has been described. However, the number is not limited to two, and any number of one or more can be detected.
When a plurality of target objects appear in the detection target image, the target objects must be separated from one another by a certain distance. Specifically, the target objects must be separated by the distance between the two feature points that are farthest apart among the feature points extracted from a single target object.
This eliminates counting errors caused by invariant features overlapping in the invariant feature space, so that the number of target objects can be detected accurately.

Next, the operation of the object detection device of this embodiment (the object detection method) will be described with reference to FIG. 33.
FIG. 33 is a flowchart showing the procedure of the object detection method of this embodiment.
The preparation-stage processing in this embodiment is the same as in the second or third embodiment described above, so a detailed description is omitted.

As shown in FIG. 33, in the object detection device 1d of this embodiment, the video input unit 11 of the video input means 10 inputs a detection target image (S701). The input detection target image is stored in the video storage unit 12.
Next, the feature extraction unit 21 of the feature extraction means 20 retrieves the detection target image from the video storage unit 12 and extracts feature points (S702). The extracted feature point information of the detection target image is stored in the input storage area 223 of the feature storage unit 22.
Subsequently, the difference extraction unit 23 retrieves the feature points of the detection target image from the input storage area 223, together with the feature points of the background image that were stored in the background storage area 221 in the preparation stage.
The difference extraction unit 23 then deletes the feature points corresponding to the background image from the feature points of the detection target image to obtain the feature point difference (S703).
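The difference step S703 can be sketched as follows. This is a minimal illustration only: the text does not state how a background feature point is judged to correspond to an input feature point, so the distance tolerance `tol` and the function name are assumptions.

```python
def subtract_background(input_points, background_points, tol=1.0):
    """Drop every input feature point lying within `tol` of some background feature point.

    `tol` is an assumed correspondence criterion, not taken from the specification.
    """
    tol2 = tol * tol

    def near_background(p):
        return any((p[0] - b[0]) ** 2 + (p[1] - b[1]) ** 2 <= tol2
                   for b in background_points)

    return [p for p in input_points if not near_background(p)]

# Hypothetical feature points: two of the three input points coincide with the background.
background = [(0.5, 0.0), (10.0, 10.0)]
detected = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
difference = subtract_background(detected, background)
```

Only the points that remain in `difference` are passed on to the invariant feature conversion.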

Next, the invariant feature conversion unit 31 of the invariant feature conversion means 30 maps the feature points extracted as the difference into the invariant feature space to obtain invariant features (S704). The invariant feature information thus obtained is stored in the difference invariant storage area 322 of the invariant feature storage unit 32.
The result is then collated with the frequency distribution of the invariant features of the target object (S705).
Specifically, the matching unit 41 of the comparison means 40 retrieves the frequency distribution information of the invariant features from the difference frequency storage area 326 and from the object frequency storage area 323, and checks whether the two frequency distributions match.

If the two frequency distributions match as a result of the collation (S705: YES), the determination unit 42 determines that the target object has been detected (S706). In this case, it is determined that one target object has been detected.
If they do not match (S705: NO), collation is performed with the frequency distribution of the invariant features of the target object multiplied by an integer (S707).
Specifically, the number determination unit 43 retrieves the frequency distribution of the invariant features of the target object stored in the object frequency storage area 323 and repeats the collation while multiplying the frequency of every section by the same integer, varied over a predetermined range. For example, the integer can range from a minimum of 2 (or 1) up to a maximum given by the number of feature points that appear as the difference.

If the frequency distributions match as a result (S707: match), the number determination unit 43 determines that as many target objects as the integer used at that time have been detected (S708).
If the collation does not match in S707 either (S707: mismatch), no target object is detected. That is, the process proceeds to S709 without any special processing.
If object detection is to be continued (S709: YES), the process returns to S701 and the same processing is repeated. If object detection is not to be continued (S709: NO), object detection ends.
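Steps S705 to S708 amount to searching for an integer multiplier that makes the stored object distribution equal to the observed one. The following Python sketch assumes that both distributions are dictionaries mapping a section index to a frequency and that a match means exact equality; the function name and the data are hypothetical.

```python
def count_objects(diff_hist, object_hist, max_count):
    """Return k (1 <= k <= max_count) such that k * object_hist equals diff_hist, else 0.

    k = 1 corresponds to the plain collation of S705; k >= 2 corresponds to S707.
    """
    observed = {c: n for c, n in diff_hist.items() if n}
    for k in range(1, max_count + 1):
        scaled = {c: k * n for c, n in object_hist.items() if n}
        if scaled == observed:
            return k
    return 0  # no match: no target object is reported

# Hypothetical per-section frequencies.
object_hist = {(1, 0): 1, (0, 2): 1, (-1, 2): 2}
diff_hist = {(1, 0): 3, (0, 2): 3, (-1, 2): 6}
detected_count = count_objects(diff_hist, object_hist, max_count=5)
```

Here `max_count` plays the role of the upper bound on the integer, for which the text suggests the number of feature points that appear as the difference.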

As described above, the object detection device and the object detection method of this embodiment can detect not only the presence of the target object but also the number of target objects.
Thus, in addition to providing the same operations and effects as the embodiments described above, this embodiment realizes an even more convenient object detection device.

[Object detection program]
Next, the object detection program will be described.
The object detection function of the computer (object detection device) in each of the above embodiments is realized by an object detection program stored in a storage means (for example, a ROM (Read Only Memory) or a hard disk).

When read by the control means of the computer (a CPU (Central Processing Unit) or the like), the object detection program sends commands to each component of the computer and causes it to perform predetermined processing, for example the video input processing, feature extraction processing, invariant feature conversion processing, singular feature selection processing, and comparison processing of the object detection device.
The object detection function is thus realized through cooperation between the object detection program, which is software, and the components of the computer (object detection device), which are hardware resources.

The object detection program for realizing the object detection function can be stored in the ROM or hard disk of the computer, or on a computer-readable recording medium such as an external storage device or a portable recording medium.
An external storage device is a memory expansion device that incorporates a recording medium such as a CD-ROM (Compact Disc-Read Only Memory) and is externally connected to the object detection device. A portable recording medium is a recording medium that can be mounted in a recording medium drive device and carried around, for example a flexible disk, a memory card, or a magneto-optical disk.

The program recorded on the recording medium is loaded into the RAM (Random Access Memory) or the like of the computer and executed by the CPU (control means). This execution realizes the functions of the object detection device of each embodiment described above.
Furthermore, when the computer loads the object detection program, an object detection program held by another computer can also be downloaded over a communication line into the computer's own RAM or external storage device. The downloaded object detection program is likewise executed by the CPU and realizes the object detection function of the object detection device of each of the above embodiments.

As described above, the object detection device, the object detection method, and the object detection program of this embodiment can detect that a desired object is present in the space without relying on a marker.
Object detection that copes with changes in the posture of the target object and with distortion of the input image is also possible.
Furthermore, a configuration that further improves detection of the target object can be added.
A plurality of target objects can also be detected.

Embodiments of the object detection device, the object detection method, and the object detection program of the present invention have been described above. However, the object detection device, the object detection method, and the object detection program according to the present invention are not limited to the embodiments described above, and it goes without saying that various modifications are possible within the scope of the present invention.
For example, in each of the embodiments described above, the feature extraction means 20 has the feature storage unit 22 and the invariant feature conversion means 30 has the invariant feature storage unit 32. However, these storage units may be included in other components, or may be independent components. These storage units may also be realized by an external storage device.

Since the present invention relates to object detection, it can be used in apparatuses and devices that detect objects, for example in article management, video monitoring including physical security, robot vision, mixed reality UI, and content generation applications.

DESCRIPTION OF SYMBOLS
1 Object detection device
10 Video input means
20 Feature extraction means
22 Feature storage unit
23 Difference extraction unit
30 Invariant feature conversion means
32 Invariant feature storage unit
40 Comparison means
43 Number determination unit
50 Singular feature selection means

Claims

An object detection apparatus comprising:
A video input unit for inputting a detection target image;
A feature extraction unit for extracting feature points from the input detection target image;
An invariant feature conversion unit for representing the extracted feature points in an invariant feature space;
A feature storage unit for storing feature points of a background image that does not include a target object;
An invariant feature storage unit for storing an arrangement of invariant features in which feature points of the target object are represented in the invariant feature space; and
Comparison means for determining that the target object has been detected when all, or a predetermined number or more, of an arrangement of invariant features, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, match the arrangement of invariant features in which the feature points of the target object are represented in the invariant feature space,
Wherein, in a case where the apparatus comprises singular feature selection means for selecting, as a singular feature, a portion having a frequency equal to or lower than a predetermined value in the frequency distribution of invariant features in which the feature points of the background image are represented in the invariant feature space,
The comparison means
Determines that the target object has been detected when an invariant feature, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, appears in the portion of the singular feature.

The object detection apparatus according to claim 1, wherein
The invariant feature storage unit
Stores a frequency distribution of invariant features in which the feature points of the target object are represented in the invariant feature space, and
The comparison means
Determines that the target object has been detected when all, or a predetermined number or more, of a frequency distribution of invariant features, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, match the frequency distribution of invariant features in which the feature points of the target object are represented in the invariant feature space.

The object detection apparatus according to claim 2, wherein
The comparison means
Determines that the integer number of target objects has been detected when all or part of the frequency distribution, within a predetermined region, of invariant features obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image matches a frequency distribution obtained by multiplying by an integer the frequencies of the invariant features in which the feature points of the target object are represented in the invariant feature space.

An object detection method comprising:
A step of inputting a detection target image;
A step of extracting feature points from the input detection target image;
A step of representing the extracted feature points in an invariant feature space;
A step of storing feature points of a background image that does not include a target object;
A step of storing an arrangement of invariant features in which feature points of the target object are represented in the invariant feature space; and
A comparison step of determining that the target object has been detected when all, or a predetermined number or more, of an arrangement of invariant features, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, match the arrangement of invariant features in which the feature points of the target object are represented in the invariant feature space,
Wherein, in a case where the method comprises a step of selecting, as a singular feature, a portion having a frequency equal to or lower than a predetermined value in the frequency distribution of invariant features in which the feature points of the background image are represented in the invariant feature space,
The comparison step
Determines that the target object has been detected when an invariant feature, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, appears in the portion of the singular feature.

An object detection program causing an object detection device to function as:
Means for inputting a detection target image;
Means for extracting feature points from the input detection target image;
Means for representing the extracted feature points in an invariant feature space;
Means for storing feature points of a background image that does not include a target object;
Means for storing an arrangement of invariant features in which feature points of the target object are represented in the invariant feature space; and
Comparison means for determining that the target object has been detected when all, or a predetermined number or more, of an arrangement of invariant features, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, match the arrangement of invariant features in which the feature points of the target object are represented in the invariant feature space,
Wherein, when the object detection device is caused to function as singular feature selection means for selecting, as a singular feature, a portion having a frequency equal to or lower than a predetermined value in the frequency distribution of invariant features in which the feature points of the background image are represented in the invariant feature space,
The program causes the comparison means
To determine that the target object has been detected when an invariant feature, obtained by representing in the invariant feature space the feature points that remain after subtracting the corresponding feature points of the background image from the feature points of the detection target image, appears in the portion of the singular feature.