JP5220482B2

JP5220482B2 - Object detection apparatus and program

Info

Publication number: JP5220482B2
Application number: JP2008143452A
Authority: JP
Inventors: 美也子馬場; 伸征白木; 芳樹二宮; 孝昌白井; 聡小山内
Original assignee: Denso Corp; Toyota Central R&D Labs Inc
Current assignee: Denso Corp; Toyota Central R&D Labs Inc
Priority date: 2008-05-30
Filing date: 2008-05-30
Publication date: 2013-06-26
Anticipated expiration: 2028-05-30
Also published as: JP2009289189A

Description

The present invention relates to an object detection apparatus and a program.

Conventionally, an object detection device has been disclosed that detects a person in a video frame directly and accurately, without processing such as background removal (see Patent Document 1).

The object detection device of Patent Document 1 detects object candidates from a specified image while referring to the object features defined in an object feature dictionary. It then creates a discrimination feature dictionary for distinguishing, among the object candidates, the object to be detected from pseudo objects that resemble the detection target but are not the object to be detected, and uses this discrimination feature dictionary to pick out only the detection target from the object candidates.
[Patent Document 1] JP-A-10-222678 (Japanese Unexamined Patent Application Publication No. H10-222678)

Normally, the size of the object or object candidate to be detected (hereinafter "object or the like"), that is, the resolution of the object or the like, depends on the distance to the object and therefore varies from one input image to another. In Patent Document 1, however, the object features are defined in the object feature dictionary, and the discrimination feature dictionary is created, without taking resolution into account. As a result, a well-performing dictionary cannot be created. Moreover, because the object detection device of Patent Document 1 applies the same dictionary, created from images of a given resolution, to images of different resolutions, its discrimination performance is low.

The present invention has been proposed to solve the problems described above, and its purpose is to provide an object detection apparatus and a program that can detect an object with high accuracy regardless of the size of the object in the image.

The object detection apparatus according to the invention of claim 1 comprises: preprocessing means for generating, as preprocessing of an input image, a plurality of images that have undergone a plurality of blurring processes with different degrees of blurring; search area setting means for setting a search area for an object in the image; and object detection means that operates each time a search area is set by the search area setting means. When the size of the set search area is larger than a predetermined size, the object detection means uses, from among the plurality of images preprocessed by the preprocessing means, an image whose degree of blurring increases as the size of the search area increases, and detects whether an object is present in the search area based on that image and a learning model indicating the features of the object. When the size of the set search area is equal to or smaller than the predetermined size, the object detection means detects whether an object is present in the search area based on the input image and the learning model indicating the features of the object.

The invention of claim 2 is the object detection apparatus according to claim 1, wherein the preprocessing means generates, as the preprocessing of the input image, the plurality of images by performing the plurality of blurring processes and a sharpness process that enhances contour portions. When the size of the search area set by the search area setting means is equal to or smaller than a second size that is smaller than the predetermined size, the object detection means uses, from among the plurality of images preprocessed by the preprocessing means, the image whose contour portions have been enhanced, and detects whether an object is present in the search area based on that image and the learning model indicating the features of the object.

The object detection program according to the invention of claim 5 causes a computer to function as: preprocessing means for generating, as preprocessing of an input image, a plurality of images that have undergone a plurality of blurring processes with different degrees of blurring; search area setting means for setting a search area for an object in the image; and object detection means that operates each time a search area is set by the search area setting means. When the size of the set search area is larger than a predetermined size, the object detection means uses, from among the plurality of images preprocessed by the preprocessing means, an image whose degree of blurring increases as the size of the search area increases, and detects whether an object is present in the search area based on that image and a learning model indicating the features of the object. When the size of the set search area is equal to or smaller than the predetermined size, the object detection means detects whether an object is present in the search area based on the input image and the learning model indicating the features of the object.

Each time a search area is set, the object detection apparatus and program according to the present invention detect the object using, from among a plurality of images preprocessed so that the gradation difference at contour portions differs, the image whose gradation difference corresponds to the size of the search area. The object can therefore be detected with high accuracy regardless of its size in the image.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the drawings.

[Principle]
FIG. 1 is a diagram showing images inside search windows of various sizes (hereinafter "search window images") and the corresponding object detection results. As the figure shows, the farther away the object (a pedestrian in this embodiment) is, the smaller its image becomes, and so the smaller the search window image becomes. Conventionally, the object was not detected from the search window image when it was far away or close, and was detected only at intermediate distances. Preprocessing is therefore needed in order to detect the object when it is far away or close.

FIG. 2 is a diagram showing search window images and their two-dimensional FFT (fast Fourier transform) spectrum patterns for an object that is far away, close, and at an intermediate distance. FIG. 3 is a diagram showing the high-frequency portion of a two-dimensional FFT spectrum pattern. FIG. 4 is a diagram showing the two-dimensional FFT spectrum patterns obtained without and with 7×7 Gaussian filter processing, together with the two-dimensional FFT spectrum pattern of a search window image of a size at which detection performance is good. The blurring process is not limited to Gaussian filter processing; other processing, such as median filter processing, may be used.

As shown in FIG. 4, the high-frequency components obtained without 7×7 Gaussian filter processing do not resemble those of a search window image of a size at which detection performance is good, whereas the high-frequency components obtained with 7×7 Gaussian filter processing do resemble them. In other words, when the object is close, applying a blurring process to the search window image makes it possible to detect the object from that image. The reason is as follows.

The resolution, that is, the sharpness of the image, differs depending on the size of the search window (whether the object is near to or far from the camera). For example, when the object is a person who is close to the camera, so that the search window is large, the image is sharp, and not only the rough shape of the person but also clothing patterns and background clutter are captured clearly. It is therefore difficult to detect the person using this image as it is.

Conversely, when the person is far from the camera, so that the search window is small, the image is blurred. Information that interferes with object detection, such as clothing patterns and the background, is reduced, but even the rough shape of the person becomes hard to detect. When the size of the search window lies between these two cases, the search window image is moderately blurred and detection is easiest.

Therefore, when the search window is large or small, object detection performance can be improved by applying image processing to the search window image so that its degree of blur matches that of the intermediate case. Performing the same processing not only at detection time but also when the learning model is generated further improves object detection performance.

FIG. 5 is a diagram explaining how to find the optimum preprocessing. First, all sample images are divided into groups by size. Detection is performed on these images by a conventional object detection apparatus without preprocessing, and the group of samples with the highest detection rate is taken as the reference images. Next, as shown in FIG. 5, for every sample image, an unprocessed image and images subjected to processes 1, 2, ..., N (processes 1 to N: blurring processes that differ in blurring method or degree of blurring) are prepared. Then the similarity between the average of the FFT high-frequency components of the reference images and the FFT high-frequency components of each of these images is calculated, for example using the Euclidean distance.
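The similarity measure described above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: it uses a naive two-dimensional DFT in place of an FFT so that it runs with the standard library alone, and the function names and the `cutoff` value that defines "high frequency" are assumptions that do not appear in the patent. A smaller Euclidean distance is read as a higher similarity.

```python
import cmath
import math

def dft2_magnitude(img):
    """Naive 2-D DFT magnitude of a small grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            mag[u][v] = abs(acc)
    return mag

def high_freq_vector(img, cutoff=0.25):
    """Magnitudes of the bins whose normalized frequency exceeds `cutoff`.

    The cutoff is an illustrative assumption, not a value from the patent.
    """
    h, w = len(img), len(img[0])
    mag = dft2_magnitude(img)
    vec = []
    for u in range(h):
        fu = min(u, h - u) / h  # fold frequency index into [0, 0.5]
        for v in range(w):
            fv = min(v, w - v) / w
            if max(fu, fv) > cutoff:
                vec.append(mag[u][v])
    return vec

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
```

In practice the reference vector would be the element-wise average of `high_freq_vector` over the reference group, and each candidate image would first be normalized to the same size so that the vectors have equal length.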

For each size group, the number of samples whose similarity after a given process is higher than their similarity without processing is counted. A process is regarded as effective if this number exceeds the number of samples whose similarity after the process is lower than their similarity without processing. From the effective processes, the preprocessing to be used for each size group is selected. The selection can be made either so as to minimize the processing applied to the sample images, or by choosing the process with the largest number of samples whose similarity after processing is higher than without processing. The object detection apparatus according to the embodiment of the present invention performs the preprocessing found in this way as follows.
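The counting rule for one size group might look like the sketch below. It assumes that similarity is expressed as a Euclidean distance (smaller means more similar) and that the distances have already been computed per sample; the function names and data layout are hypothetical, and only the second selection criterion (most improved samples) is shown.

```python
def effective_processes(dist_none, dist_by_process):
    """Find the processes that help more samples than they hurt.

    dist_none: per-sample distance to the reference without processing.
    dist_by_process: {process name: per-sample distances after that process}.
    Returns {process name: (improved, worsened)} for effective processes only.
    """
    result = {}
    for name, dists in dist_by_process.items():
        # A smaller distance after processing means a higher similarity.
        improved = sum(1 for d0, d1 in zip(dist_none, dists) if d1 < d0)
        worsened = sum(1 for d0, d1 in zip(dist_none, dists) if d1 > d0)
        if improved > worsened:
            result[name] = (improved, worsened)
    return result

def pick_process(dist_none, dist_by_process):
    """Pick the effective process that improves the most samples, or None."""
    eff = effective_processes(dist_none, dist_by_process)
    if not eff:
        return None
    return max(eff, key=lambda name: eff[name][0])
```

Returning `None` corresponds to leaving the size group unprocessed when no candidate process is effective.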

[Configuration of the object detection apparatus]
FIG. 6 is a block diagram showing the configuration of the object detection apparatus according to the embodiment of the present invention. The object detection apparatus includes a learning preprocessing unit 11 that performs preprocessing for generating a learning model, a learning unit 12 that generates the learning model, and a learning model storage unit 13 that stores the generated learning model.

The object detection apparatus further includes an imaging unit 21 that captures an image of an object, a detection preprocessing unit 22 that performs preprocessing for detecting the object, a window extraction unit 23 that extracts a window image, that is, the image inside the search window, a detection unit 24 that performs the processing for detecting the object, and a result output unit 25 that outputs the detection result.

To detect, for example, a pedestrian as the object, the object detection apparatus configured as above first executes the following learning processing routine to generate a learning model, and then executes the object detection routine using this learning model.

[Learning of the object]
FIG. 7 is a flowchart showing the learning processing routine.

In step S1, the learning preprocessing unit 11 receives learning images of the object (positive images) and learning images of things that are not the object (negative images), and proceeds to step S2. In this embodiment, images of pedestrians are used as positive images, and images of utility poles, signboards, and the like are used as negative images.

In step S2, the learning preprocessing unit 11 preprocesses each learning image based on the size of the object it contains. In this embodiment, the learning preprocessing unit 11 applies sharpness processing to the learning image if the size of the object (the number of pixels in the vertical direction) is 50 pixels or less, and applies no processing if the size is more than 50 pixels and no more than 90 pixels. It applies 3×3 Gaussian filter processing if the size is more than 90 pixels and no more than 140 pixels, and 7×7 Gaussian filter processing if the size exceeds 140 pixels. In other words, the learning preprocessing unit 11 applies image processing to the learning image as necessary so that the degree of blurring increases as the size of the object increases.
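The size thresholds of this embodiment can be written as a small lookup, sketched below. The function name and the string labels are hypothetical; the pixel thresholds are the example values given in the text and, as the description notes elsewhere, are only one possible choice.

```python
def select_preprocess(height_px):
    """Map an object (or search window) height in pixels to a preprocessing label.

    Thresholds follow the example values of the embodiment; the labels are
    illustrative names, not identifiers from the patent.
    """
    if height_px <= 50:
        return "sharpen"   # small object: enhance contours
    if height_px <= 90:
        return "none"      # intermediate size: use the image as is
    if height_px <= 140:
        return "gauss3x3"  # larger object: mild blur
    return "gauss7x7"      # largest objects: strong blur
```

The same mapping applies to the search window size at detection time, so a single function of this kind could serve both the learning and the detection routines.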

In step S3, the learning preprocessing unit 11 cuts out, at a predetermined aspect ratio, the necessary region from the learning image that has passed through step S2, specifically the region in which the object is present (hereinafter "object region image"), and proceeds to step S4. For example, when the object is a pedestrian, the learning preprocessing unit 11 cuts out the object region image at a width-to-height ratio of 1:2.
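One way to obtain a crop with a fixed aspect ratio is to grow the object's bounding box to the smallest enclosing box of that ratio, as in the sketch below. The patent does not say how the crop box is derived, so the centering strategy and the function name are assumptions; clamping to the image borders is omitted for brevity.

```python
def fit_aspect_box(x, y, w, h, ratio_w=1, ratio_h=2):
    """Smallest box with width:height = ratio_w:ratio_h that contains the
    box (x, y, w, h) and shares its center. Returns (x, y, w, h)."""
    if w * ratio_h > h * ratio_w:
        # Box is too wide for the target ratio: keep width, grow height.
        new_w = w
        new_h = w * ratio_h / ratio_w
    else:
        # Box is too tall (or exact): keep height, grow width.
        new_h = h
        new_w = h * ratio_w / ratio_h
    cx, cy = x + w / 2, y + h / 2
    return (cx - new_w / 2, cy - new_h / 2, new_w, new_h)
```

Coordinates may come out negative or extend past the image when the object is near a border, so a real implementation would need to pad or clamp.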

In step S4, the learning unit 12 performs learning using the normalized object region images and generates a learning model. For example, the learning unit 12 normalizes the luminance and size of each object region image and extracts feature quantities. Examples of feature quantities include Haar-like features, four-directional plane features, and two-dimensional fast Fourier transform spectrum patterns. The learning unit 12 may perform learning using, for example, AdaBoost or a support vector machine (SVM).

In this way, the object detection apparatus applies to each learning image a blurring process whose degree of blurring increases as the distance to the object decreases, that is, as the size of the object in the learning image increases, and uses the blurred learning images to generate a learning model for detecting the object. The object detection apparatus can thereby generate a learning model that detects the object accurately without being affected by the distance to the object.

[Object detection]
FIG. 8 is a flowchart showing the object detection routine. The object detection apparatus executes the following processing from step S11 to step S20.

In step S11, the imaging unit 21 captures an image of a subject and generates an image. The detection preprocessing unit 22 receives the image generated by the imaging unit 21 and proceeds to step S12. In this embodiment, the detection preprocessing unit 22 receives images generated in real time by the imaging unit 21, but it may instead receive images stored in advance in a storage device (not shown).

In step S12, the detection preprocessing unit 22 creates preprocessed images using each of the preprocessing operations to be used, and proceeds to step S13. Here, the detection preprocessing unit 22 applies three preprocessing operations to the image generated by the imaging unit 21: sharpness processing (for example, producing an image by subtracting a Laplacian-processed image from the original image), 3×3 Gaussian filter processing, and 7×7 Gaussian filter processing. It then sets a search window of a predetermined size.
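A minimal sketch of building the set of preprocessed images is given below, using only the standard library. The binomial kernel weights, the 4-neighbor Laplacian, the edge-replication border handling, and all function names are assumptions made for illustration; the patent specifies only the filter sizes and that sharpening may subtract a Laplacian-processed image from the original.

```python
def binomial_kernel(size):
    """Separable Gaussian-like kernel built from binomial coefficients
    (an assumption; the patent does not specify kernel weights)."""
    row = [1]
    for _ in range(size - 1):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    total = sum(row)
    return [[a * b / (total * total) for b in row] for a in row]

LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

def convolve(img, kernel):
    """2-D filtering with edge replication; img is a list of rows of numbers."""
    h, w, k = len(img), len(img[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j, krow in enumerate(kernel):
                yy = min(max(y + j - k, 0), h - 1)
                for i, kv in enumerate(krow):
                    xx = min(max(x + i - k, 0), w - 1)
                    acc += kv * img[yy][xx]
            out[y][x] = acc
    return out

def sharpen(img):
    """Original minus Laplacian, clipped to the 0..255 range."""
    lap = convolve(img, LAPLACIAN)
    return [[min(255.0, max(0.0, p - l)) for p, l in zip(row, lrow)]
            for row, lrow in zip(img, lap)]

def build_image_bank(img):
    """Preprocess the input once; windows are later cut from the chosen image."""
    return {
        "none": img,
        "sharpen": sharpen(img),
        "gauss3x3": convolve(img, binomial_kernel(3)),
        "gauss7x7": convolve(img, binomial_kernel(7)),
    }
```

Preprocessing the whole frame once, rather than each window separately, means the cost of filtering does not grow with the number of window positions scanned.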

In step S13, the window extraction unit 23 selects an image from among the preprocessed images and the image that has not been preprocessed, according to the size of the search window (the number of pixels in the vertical direction), for example as follows.

(Search window size) → (Selected image)
50 pixels or less → Sharpness-processed image
More than 50 pixels and up to 90 pixels → Image without preprocessing
More than 90 pixels and up to 140 pixels → 3×3 Gaussian-filtered image
More than 140 pixels → 7×7 Gaussian-filtered image

Thus, as the size of the search window increases, the gradation difference at the contour portions of the selected image becomes smaller (the degree of blurring becomes larger). The relationship between search window size and preprocessed image given here is only an example, and the present invention is not limited to it. The window extraction unit 23 then sets the search window on the selected image, extracts the search window image, that is, the image inside the search window, and proceeds to step S14.

In step S14, the window extraction unit 23 normalizes the size and the luminance values of the search window image, and proceeds to step S15.
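The normalization step could be sketched as follows. The patent does not state which resizing or luminance normalization method is used, so nearest-neighbor resampling and a linear stretch to 0..255 are assumptions chosen for simplicity, as are the function names.

```python
def resize_nearest(img, out_h, out_w):
    """Resize a grayscale image (list of rows) by nearest-neighbor sampling."""
    h, w = len(img), len(img[0])
    return [[img[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]

def normalize_luminance(img):
    """Linearly stretch pixel values so that min maps to 0 and max to 255."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Flat window: no contrast to stretch.
        return [[0.0] * len(row) for row in img]
    scale = 255.0 / (hi - lo)
    return [[(p - lo) * scale for p in row] for row in img]
```

Normalizing every window to one fixed size is what allows a single learning model to be applied to search windows of all sizes.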

In step S15, the detection unit 24 calculates feature quantities from the search window image normalized in step S14, and proceeds to step S16. Examples of feature quantities include Haar-like features, four-directional plane features, and two-dimensional fast Fourier transform (FFT) spectrum patterns.

In step S16, the detection unit 24 compares the feature quantities of the normalized search window image with the learning model (feature quantities) stored in the learning model storage unit 13, and proceeds to step S17.

In step S17, the detection unit 24 calculates, from the comparison in step S16, an evaluation value expressing the likelihood that the window contains the object, and determines on the basis of this evaluation value whether the search window image is the object. Known techniques can be used for detecting the object from the search window image and the learning model. If the search window image is the object, the process proceeds to step S18; if it is not, the process proceeds to step S19.

In step S18, the detection unit 24 saves, in an object detection list, the size and position of the search window at the time the object was detected in step S15, and proceeds to step S19.

In step S19, the detection unit 24 determines whether the search window has been scanned over the entire image, that is, whether the search of the entire image is complete. If the search is complete, the process proceeds to step S20; if not, the position of the search window is shifted by one step and the process returns to step S14. The processing from step S13 to step S19 is then executed repeatedly. Once the search window has covered the entire image, the process proceeds to step S20.

In step S20, the detection unit 24 determines whether search windows of all sizes have been used. If so, the process proceeds to step S21; if not, the size of the search window is changed and the process returns to step S13. The processing from step S13 to step S20 is then executed repeatedly, and when the search has been completed with search windows of all sizes, the process proceeds to step S21.
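The nested loops of steps S13 to S20 can be summarized in the sketch below. It is a simplified illustration under several assumptions: `bank` is a dictionary of preprocessed images that contains a `"none"` entry, `select` maps a window height to a key of `bank`, and `classify` stands in for normalization, feature extraction, and comparison with the learning model (steps S14 to S17). None of these names come from the patent.

```python
def scan(bank, select, classify, sizes, step=4, aspect=0.5):
    """Multi-size sliding-window search over preprocessed images.

    bank: {label: image}, each image a list of rows of equal dimensions.
    select: function mapping window height in pixels to a label in bank.
    classify: function mapping a window image to True (object) or False.
    Returns a detection list of (x, y, width, height) tuples.
    """
    h, w = len(bank["none"]), len(bank["none"][0])
    detections = []
    for win_h in sizes:                      # loop closed by step S20
        win_w = max(1, int(win_h * aspect))
        img = bank[select(win_h)]            # step S13: choose image by size
        for y in range(0, h - win_h + 1, step):      # loop closed by step S19
            for x in range(0, w - win_w + 1, step):
                window = [row[x:x + win_w] for row in img[y:y + win_h]]
                if classify(window):         # stands in for steps S14 to S17
                    detections.append((x, y, win_w, win_h))  # step S18
    return detections
```

Because the image is chosen once per window size, the selection logic stays outside the inner position loop.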

In step S21, the result output unit 25 displays, as the detection result, each detected object enclosed in a window, based on the sizes and positions of the search windows in the object detection list. The result output unit 25 is not limited to this example; it may, for instance, output at least one of information on the presence or absence of an object, an object image, and an object position, by sound or by image.

As described above, the object detection apparatus detects the object using an image with a smaller gradation difference at the contour portions, that is, a more strongly blurred image, as the distance to the object decreases, that is, as the size of the search window increases. The object detection apparatus can thereby detect the object accurately without being affected by the distance to the object and without preparing a separate learning model for each distance.

The present invention is not limited to the embodiment described above and is, of course, also applicable to designs modified within the scope of the claims.

In this embodiment, Gaussian filter processing is used as the blurring process, but the blurring process is not limited to this; median filter processing or reduction processing may also be used.

The degree of blurring is likewise not limited to the example described above; Gaussian filter processing with other filter sizes may be used, and median filter processing with a plurality of filter sizes may also be used. Gaussian filter processing or median filter processing is preferable when the object in the image is large, and sharpness processing is preferable when the object in the image is small.

When the object detection apparatus detects the object using a learning model generated through the learning preprocessing unit 11 and the learning unit 12, it may use images preprocessed by the detection preprocessing unit 22, as in the embodiment above, or it may detect the object using images that have not been preprocessed.
Conversely, when the object detection apparatus detects the object using images that have passed through the detection preprocessing unit 22, it may use a learning model generated through the learning preprocessing unit 11 and the learning unit 12, as in the embodiment above, or it may detect the object using learning images that have not been preprocessed.

FIG. 1 is a diagram showing search window images of various sizes and object detection results.
FIG. 2 is a diagram showing search window images and two-dimensional FFT spectrum patterns for an object that is far away, close, and at an intermediate distance.
FIG. 3 is a diagram showing the high-frequency portion of a two-dimensional FFT spectrum pattern.
FIG. 4 is a diagram showing the two-dimensional FFT spectrum patterns without and with 7×7 Gaussian filter processing, and the two-dimensional FFT spectrum pattern of a search window image of a size at which detection performance is good.
FIG. 5 is a diagram explaining how to find the optimum preprocessing.
FIG. 6 is a block diagram showing the configuration of the object detection apparatus according to the embodiment of the present invention.
FIG. 7 is a flowchart showing the learning processing routine.
FIG. 8 is a flowchart showing the object detection routine.

Explanation of symbols

11 Learning preprocessing unit
12 Learning unit
13 Learning model storage unit
21 Imaging unit
22 Detection preprocessing unit
23 Window extraction unit
24 Detection unit
25 Result output unit

Claims

An object detection apparatus comprising:
preprocessing means for generating, as preprocessing of an input image, a plurality of images that have undergone a plurality of blurring processes with different degrees of blurring;
search area setting means for setting a search area for an object in the image; and
object detection means that, each time a search area is set by the search area setting means, when the size of the set search area is larger than a predetermined size, uses, from among the plurality of images preprocessed by the preprocessing means, an image whose degree of blurring increases as the size of the search area increases, and detects whether an object is present in the search area based on that image and a learning model indicating the features of the object, and, when the size of the set search area is equal to or smaller than the predetermined size, detects whether an object is present in the search area based on the input image and the learning model indicating the features of the object.

The object detection apparatus according to claim 1, wherein
the preprocessing means generates, as the preprocessing of the input image, the plurality of images by performing the plurality of blurring processes and a sharpness process that enhances contour portions, and
when the size of the search area set by the search area setting means is equal to or smaller than a second size that is smaller than the predetermined size, the object detection means uses, from among the plurality of images preprocessed by the preprocessing means, the image whose contour portions have been enhanced, and detects whether an object is present in the search area based on that image and the learning model indicating the features of the object.

The object detection apparatus according to claim 1, further comprising:
learning preprocessing means for performing, as preprocessing of each of a plurality of learning images containing the object, when the size of the object in the learning image is larger than a predetermined size, a blurring process whose degree of blurring increases as the size of the object increases; and
learning model generation means for generating the learning model indicating the features of the object based on each learning image preprocessed by the learning preprocessing means.

The object detection apparatus according to claim 3, wherein the learning preprocessing means performs, as the preprocessing, a sharpness process that enhances contour portions when the size of the object in the learning image is equal to or smaller than a second size that is smaller than the predetermined size.

An object detection program for causing a computer to function as:
preprocessing means for generating, as preprocessing of an input image, a plurality of images that have undergone a plurality of blurring processes with different degrees of blurring;
search area setting means for setting a search area for an object in the image; and
object detection means that, each time a search area is set by the search area setting means, when the size of the set search area is larger than a predetermined size, uses, from among the plurality of images preprocessed by the preprocessing means, an image whose degree of blurring increases as the size of the search area increases, and detects whether an object is present in the search area based on that image and a learning model indicating the features of the object, and, when the size of the set search area is equal to or smaller than the predetermined size, detects whether an object is present in the search area based on the input image and the learning model indicating the features of the object.

The object detection program according to claim 5, wherein
the preprocessing means generates, as the preprocessing of the input image, the plurality of images by performing the plurality of blurring processes and a sharpness process that enhances contour portions, and
when the size of the search area set by the search area setting means is equal to or smaller than a second size that is smaller than the predetermined size, the object detection means uses, from among the plurality of images preprocessed by the preprocessing means, the image whose contour portions have been enhanced, and detects whether an object is present in the search area based on that image and the learning model indicating the features of the object.