JP2012221162A

JP2012221162A - Object detection device and program

Info

Publication number: JP2012221162A
Application number: JP2011085503A
Authority: JP
Inventors: Kunihiro Goto; 邦博後藤; Yoshikatsu Kimura; 好克木村; Arata Takahashi; 新高橋; Masayoshi Hiratsuka; 誠良平塚; Masakazu Nishijima; 征和西嶋
Original assignee: Toyota Motor Corp; Toyota Central R&D Labs Inc
Current assignee: Toyota Motor Corp; Toyota Central R&D Labs Inc
Priority date: 2011-04-07
Filing date: 2011-04-07
Publication date: 2012-11-12

Abstract

PROBLEM TO BE SOLVED: To enable an object to be detected accurately even when the object is stationary. SOLUTION: A window image extraction unit 20 extracts a window image from a captured image captured by an imaging device 12. A score calculation unit 22 calculates, for each pedestrian direction class, a score indicating pedestrian likelihood, based on the window image and a discrimination model for each direction class. A pedestrian discrimination unit 26 determines whether the window image is a pedestrian image based on the distribution of the calculated scores across the direction classes.

Description

The present invention relates to an object detection device and a program, and more particularly to an object detection device and a program for detecting an object from a captured image.

Conventionally, a vehicle periphery monitoring device has been known that uses a visible-light camera to capture visible images and detects a predetermined object such as a pedestrian even during night driving (Patent Document 1). This device determines whether an object is a pedestrian or a non-pedestrian by detecting temporal changes in the luminance values of the left and right feet illuminated by lights at night.

An obstacle detection device is also known that identifies obstacles with high accuracy from ordinary visible images while reducing the image processing load (Patent Document 2). This device learns the time-series motion patterns of pedestrians.

A vehicle periphery monitoring device is also known that can quickly determine, from images of the vehicle's surroundings, objects such as pedestrians with which contact should be avoided, and can present information to the driver or control vehicle behavior (Patent Document 3). This device estimates posture using asymmetry, the degree of body inclination, and changes in relative position.

An image processing device is also known that can accurately extract and identify specific attributes of an object from images captured by a relatively inexpensive vehicle-mounted imaging means (Patent Document 4). This device determines pedestrian attributes using a self-organizing map.

Patent Document 1: JP 2010-205087 A
Patent Document 2: JP 2004-145660 A
Patent Document 3: JP 2007-293487 A
Patent Document 4: JP 2007-323177 A

However, the technique described in Patent Document 1 is limited to nighttime use and detects only pedestrians moving in the front-rear direction. In addition, because it focuses on motion, it cannot distinguish stationary pedestrians from non-pedestrians.

The technique described in Patent Document 2 also focuses on motion and therefore cannot distinguish stationary pedestrians from non-pedestrians.

The feature quantities used in the technique described in Patent Document 3 appear prominently in moving pedestrians and cannot be applied to stationary pedestrians. Furthermore, the technique uses posture to determine whether a pedestrian poses a high collision risk, and it cannot be applied as a method for removing false detections.

The technique described in Patent Document 4 only determines attributes and does not determine whether an object is a pedestrian or a non-pedestrian.

The present invention has been made to solve the above problems, and its object is to provide an object detection device and a program that can detect an object accurately even when the object is stationary.

To achieve the above object, the object detection device of the first invention includes: extraction means for extracting a window image from a captured image of the surroundings of the device; score calculation means for calculating, for each classification of an attribute of the object, a score indicating object likelihood, based on the window image extracted by the extraction means and on identification models for identifying the object for each classification of the attribute; and identification means for identifying whether the window image is an image representing the object, based on the scores for each classification of the attribute calculated by the score calculation means.

The program of the second invention causes a computer to function as: extraction means for extracting a window image from a captured image of the surroundings of the device; score calculation means for calculating, for each classification of an attribute of the object, a score indicating object likelihood, based on the window image extracted by the extraction means and on identification models for identifying the object for each classification of the attribute; and identification means for identifying whether the window image is an image representing the object, based on the scores for each classification of the attribute calculated by the score calculation means.

According to the first and second inventions, the extraction means extracts a window image from a captured image of the surroundings of the device. The score calculation means calculates, for each classification of the attribute, a score indicating object likelihood, based on the window image extracted by the extraction means and on the identification models for identifying the object for each classification of the attribute of the object.

The identification means then identifies whether the window image is an image representing the object, based on the scores for each classification of the attribute calculated by the score calculation means.

In this way, a score is calculated for each classification of the attribute based on the identification model for each classification, and whether the window image is an image representing the object is identified based on those per-classification scores, so the object can be detected accurately even when it is stationary.

The object detection device of the third invention includes: extraction means for extracting a window image from a captured image of the surroundings of the device; score calculation means for calculating, for each of a plurality of peripheral images obtained by shifting the region of the extracted window image so as to partially overlap that window image, and for each classification of an attribute of the object, a score indicating object likelihood, based on the peripheral image and on identification models for identifying the object for each classification of the attribute; and identification means for identifying whether the window image is an image representing the object, based on either the attribute classification whose score satisfies a predetermined condition or the cumulative score for each attribute classification, obtained from the per-classification scores calculated by the score calculation means for each of the plurality of peripheral images.

The program of the fourth invention causes a computer to function as: extraction means for extracting a window image from a captured image of the surroundings of the device; score calculation means for calculating, for each of a plurality of peripheral images obtained by shifting the region of the extracted window image so as to partially overlap that window image, and for each classification of an attribute of the object, a score indicating object likelihood, based on the peripheral image and on identification models for identifying the object for each classification of the attribute; and identification means for identifying whether the window image is an image representing the object, based on either the attribute classification whose score satisfies a predetermined condition or the cumulative score for each attribute classification, obtained from the per-classification scores calculated by the score calculation means for each of the plurality of peripheral images.

According to the third and fourth inventions, the extraction means extracts a window image from a captured image of the surroundings of the device. For each of a plurality of peripheral images obtained by shifting the region of the extracted window image so as to partially overlap that window image, the score calculation means calculates, for each classification of the attribute, a score indicating object likelihood, based on the peripheral image and on the identification models for identifying the object for each classification of the attribute of the object.

The identification means then identifies whether the window image is an image representing the object, based on either the attribute classification whose score satisfies a predetermined condition or the cumulative score for each attribute classification, obtained from the per-classification scores calculated by the score calculation means for each of the plurality of peripheral images.

In this way, for each of a plurality of peripheral images obtained by shifting the window image, a score is calculated for each classification of the attribute based on the identification model for each classification, and whether the window image is an image representing the object is identified based on the per-classification scores for each of the peripheral images, so the object can be detected accurately even when it is stationary.
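The aggregation over shifted peripheral images can be sketched as follows. This is a minimal illustration, not the patented implementation: the eight-neighbour shift pattern, the function names, and the use of "maximum score" as the predetermined condition are all assumptions, and `score_fn` stands in for whatever per-classification scorer is used. The sketch only computes the two aggregates named in the text (a vote count of the top-scoring classification, and the cumulative score per classification); the final decision rule is left out because it is not specified in this passage.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height
Scores = Dict[str, float]        # classification name -> score


def shifted_boxes(box: Box, shift: int) -> List[Box]:
    """Return the 8 neighbours of `box`, each displaced by `shift` pixels.

    The neighbours partially overlap `box` as long as `shift` is smaller
    than both the width and the height (assumed, not checked here).
    """
    x, y, w, h = box
    return [
        (x + dx, y + dy, w, h)
        for dy in (-shift, 0, shift)
        for dx in (-shift, 0, shift)
        if (dx, dy) != (0, 0)
    ]


def aggregate(
    boxes: List[Box], score_fn: Callable[[Box], Scores]
) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Collect top-classification votes and cumulative scores over boxes."""
    votes: Dict[str, int] = {}
    cumulative: Dict[str, float] = {}
    for box in boxes:
        scores = score_fn(box)
        # Assumed condition: the classification with the maximum score wins.
        top = max(scores, key=lambda c: scores[c])
        votes[top] = votes.get(top, 0) + 1
        for cls, value in scores.items():
            cumulative[cls] = cumulative.get(cls, 0.0) + value
    return votes, cumulative
```

A caller would pass the peripheral boxes of one window image and then compare `votes` or `cumulative` against whatever criterion the identification step uses.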

The object identification device of the fifth invention includes: extraction means for extracting a plurality of window images from each captured image in a time series of captured images of the surroundings of the device; score calculation means for calculating, for each captured image in the time series and for each classification of an attribute of the object, a score indicating object likelihood for each window image, based on the plurality of window images extracted by the extraction means and on identification models for identifying the object for each classification of the attribute; candidate image determination means for determining, for each captured image in the time series, whether each of the plurality of window images is a candidate image representing the object, based on the scores for each classification of the attribute calculated by the score calculation means; tracking means for tracking, through the time series of captured images, the window images determined to be candidate images; and identification means for identifying whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, the cumulative score for each attribute classification, or the change in the attribute classification with the maximum score, obtained from the per-classification scores calculated for the time series of window images tracked through the time series of captured images.

The program of the sixth invention causes a computer to function as: extraction means for extracting a plurality of window images from each captured image in a time series of captured images of the surroundings of the device; score calculation means for calculating, for each captured image in the time series and for each classification of an attribute of the object, a score indicating object likelihood for each window image, based on the plurality of window images extracted by the extraction means and on identification models for identifying the object for each classification of the attribute; candidate image determination means for determining, for each captured image in the time series, whether each of the plurality of window images is a candidate image representing the object, based on the scores for each classification of the attribute calculated by the score calculation means; tracking means for tracking, through the time series of captured images, the window images determined to be candidate images; and identification means for identifying whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, the cumulative score for each attribute classification, or the change in the attribute classification with the maximum score, obtained from the per-classification scores calculated for the time series of window images tracked through the time series of captured images.

According to the fifth and sixth inventions, the extraction means extracts a plurality of window images from each captured image in a time series of captured images of the surroundings of the device. For each captured image in the time series, the score calculation means calculates, for each classification of the attribute, a score indicating object likelihood for each window image, based on the plurality of window images extracted by the extraction means and on the identification models for identifying the object for each classification of the attribute of the object.

The candidate image determination means then determines, for each captured image in the time series, whether each of the plurality of window images is a candidate image representing the object, based on the scores for each classification of the attribute calculated by the score calculation means. The tracking means tracks, through the time series of captured images, the window images determined to be candidate images.

The identification means then identifies whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, the cumulative score for each attribute classification, or the change in the attribute classification with the maximum score, obtained from the per-classification scores calculated for the time series of window images tracked through the time series of captured images.

In this way, for the time series of window images tracked through the time series of captured images, a score is calculated for each classification of the attribute based on the identification model for each classification, and whether the window image is an image representing the object is identified based on the per-classification scores for each window image in the tracked time series, so the object can be detected accurately even when it is stationary.
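The quantities derived from a tracked window's time series can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, each frame's scores are assumed to be available as a mapping from classification name to score, and tracking itself is taken as already done. The sketch computes the per-frame top classification, the number of times it changes, and the cumulative score per classification; how these feed the final identification is not specified in this passage and is not modeled.

```python
from typing import Dict, List, Sequence

Scores = Dict[str, float]  # classification name -> score for one frame


def top_classification_track(frames: Sequence[Scores]) -> List[str]:
    """Return the maximum-score classification for each frame in order."""
    return [max(s, key=lambda c: s[c]) for s in frames]


def count_changes(track: Sequence[str]) -> int:
    """Count how often the top classification differs between adjacent frames."""
    return sum(1 for prev, cur in zip(track, track[1:]) if prev != cur)


def cumulative_scores(frames: Sequence[Scores]) -> Scores:
    """Sum the score of each classification over all frames."""
    total: Scores = {}
    for scores in frames:
        for cls, value in scores.items():
            total[cls] = total.get(cls, 0.0) + value
    return total
```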

The object may be a pedestrian or a two-wheeled vehicle.

When the object is a pedestrian, the attribute may be the pedestrian's direction, age, or gender.

The storage medium for storing the program of the present invention is not particularly limited and may be a hard disk or a ROM. It may also be a CD-ROM, a DVD, a magneto-optical disk, or an IC card. The program may also be downloaded from a server or the like connected to a network.

As described above, according to the present invention, a score is calculated for each classification of the attribute based on the identification model for each classification of the attribute of the object, and whether the window image is an image representing the object is identified based on those per-classification scores, so the object can be detected accurately even when it is stationary.

A block diagram showing the configuration of the object detection device according to the first embodiment.
A diagram showing images of pedestrians facing each direction.
A diagram for explaining a method of obtaining the direction of a pedestrian.
A diagram showing images of non-pedestrians.
A diagram for explaining examples of the scores for each direction classification calculated for a pedestrian image and a non-pedestrian image.
A flowchart showing the object detection processing routine in the object detection device of the first embodiment.
A block diagram showing the configuration of the object detection device according to the second embodiment.
A diagram for explaining peripheral partial images.
A diagram for explaining examples of the scores for each direction classification calculated for a pedestrian image and a non-pedestrian image.
(a) A histogram of the frequency of the direction classification with the maximum score, obtained from peripheral partial images extracted around a pedestrian, and (b) a histogram of the frequency of the direction classification with the maximum score, obtained from peripheral partial images extracted around a non-pedestrian.
A flowchart showing the object detection processing routine in the object detection device of the second embodiment.
A flowchart showing the object detection processing routine in the object detection device of the second embodiment.
A block diagram showing the configuration of the object detection device according to the third embodiment.
A diagram for explaining changes in the direction classification with the maximum score in a time series of pedestrian images and a time series of non-pedestrian images.
A diagram showing examples of weights according to the pattern of direction change.
A flowchart showing the object detection processing routine in the object detection device of the third embodiment.
A flowchart showing the object detection processing routine in the object detection device of the third embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. The embodiments describe a case in which the present invention is applied to an object detection device that is mounted on a vehicle and detects pedestrians as the object.

As shown in FIG. 1, the object detection device 10 according to the first embodiment includes an imaging device 12 that images a range including the identification target region in front of the host vehicle, a computer 16 that executes an object detection processing routine for detecting pedestrians based on the captured image output from the imaging device 12, and a display device 18 for displaying the processing results of the computer 16.

The imaging device 12 includes an imaging unit (not shown) that images a range including the identification target region in front of the host vehicle and generates an image signal, an A/D conversion unit (not shown) that converts the analog image signal generated by the imaging unit into a digital signal, and an image memory (not shown) for temporarily storing the A/D-converted image signal. The images used may be color or monochrome, and may be visible-light or near-infrared images.

The computer 16 includes a CPU that controls the entire object detection device 10, a ROM serving as a storage medium that stores programs such as the object detection processing routine described later, a RAM that temporarily stores data as a work area, and a bus connecting them. In this configuration, the programs for implementing the functions of the components are stored in a storage medium such as the ROM or an HDD, and each function is realized by the CPU executing them.

Described in terms of functional blocks, divided by the function implementation means determined by hardware and software, the computer 16 can be represented, as shown in FIG. 1, as including: an image acquisition unit 19 that acquires the captured image captured by the imaging device 12 and input to the computer 16; a window image extraction unit 20 that extracts window images of predetermined regions from the acquired captured image; a feature extraction unit 21 that extracts image features from each window image extracted by the window image extraction unit 20; a score calculation unit 22 that compares the image features extracted from the window image with the identification models and calculates a score indicating pedestrian likelihood; an identification model storage unit 24 in which the identification models are stored; and a pedestrian identification unit 26 that identifies whether the window image is an image representing a pedestrian to be detected.

The window image extraction unit 20 crops images from the captured image while moving a window of a predetermined size (called a search window) by a predetermined amount of movement (called a search step) at each step. The cropped image is called a window image, and the size of the window image (that is, the size of the search window) is called the window size. Multiple window sizes are set in order to detect pedestrians of various sizes, and the window image extraction unit 20 extracts window images using search windows of all the set window sizes. The window image extraction unit 20 also converts each extracted window image into an image with a preset number of pixels.
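The search-window procedure described above can be sketched as follows. This is a minimal illustration, not the device's actual implementation: the image is assumed to be a grayscale grid stored as nested lists, the function names are hypothetical, and nearest-neighbour resampling is used only as a stand-in for the unspecified conversion to a preset number of pixels.

```python
from typing import Iterator, List, Sequence, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def resize_nearest(img: Image, out_w: int, out_h: int) -> Image:
    """Resize by nearest-neighbour sampling (a stand-in for any resampler)."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def extract_windows(
    img: Image,
    window_sizes: Sequence[Tuple[int, int]],
    step: int,
    out_size: Tuple[int, int],
) -> Iterator[Tuple[int, int, int, int, Image]]:
    """Yield (x, y, w, h, normalized_window) for every search position.

    Every configured window size is scanned over the whole image with the
    same search step, and each crop is converted to `out_size` pixels.
    """
    img_h, img_w = len(img), len(img[0])
    for w, h in window_sizes:
        for y in range(0, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                crop = [row[x:x + w] for row in img[y:y + h]]
                yield x, y, w, h, resize_nearest(crop, *out_size)
```

Normalizing every crop to one size means a single feature extractor and a single set of models can serve all window sizes.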

The feature extraction unit 21 extracts image features from each window image. Haar-like features, HOG (Histograms of Oriented Gradients), FIND (Feature Interaction Descriptor), and the like can be used as the image features. For FIND, the descriptor described in the non-patent literature (Hui CAO, Koichiro YAMAGUCHI, Mitsuhiko OHTA, Takashi NAITO and Yoshiki NINOMIYA: "Feature Interaction Descriptor for Pedestrian Detection", IEICE TRANSACTIONS on Information and Systems, Volume E93-D No.9, pp.2651-2655, 2010) may be used, so a detailed description is omitted.

The identification model storage unit 24 stores identification models that are generated in advance by learning and are referred to when the score calculation unit 22 calculates scores. Although the case in which the identification model storage unit 24 is provided in the computer 16 is described here, the identification models may instead be stored in the storage means of an external device, and the computer may connect to that external device via a network or communication means and read the identification models stored there.

The identification models stored in the identification model storage unit 24 are described next.

First, in pedestrian detection, detection performance is commonly improved by using models learned separately for each class of an attribute such as orientation. As shown in FIG. 2, a pedestrian's posture differs depending on orientation, so identification models for each orientation class allow pedestrians to be detected stably. On the other hand, non-pedestrians such as plants and road surfaces may be erroneously detected as pedestrians. If such false detections can be reduced, more stable pedestrian detection can be expected. In the following, the attribute is taken to be orientation for the purposes of explanation.

A method that does not use identification models for each orientation class can detect only the position and size of a pedestrian. In that case motion information is used to detect orientation, which has the problem that stationary pedestrians cannot be handled. When identification models for each orientation class are used, appearance (pattern information) is exploited, so the orientation of a stationary pedestrian can be detected. FIG. 3 shows a method for determining a pedestrian's orientation using identification models for each orientation class.

For each identification model, a score is calculated that indicates the confidence that a pedestrian belonging to that attribute class is present, and the orientation class with the maximum score is taken as the pedestrian's orientation.
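As a minimal sketch of this maximum-score rule, assuming the per-class scores are held in a dictionary keyed by hypothetical class names:

```python
def estimate_orientation(scores):
    """Return the orientation class whose identification model scored highest.

    `scores` maps an orientation class name to its score; the names used
    by callers are illustrative, not taken from the patent.
    """
    return max(scores, key=scores.get)
```

For example, `estimate_orientation({"front_back": 0.2, "left": 0.9, "right": 0.1})` selects `"left"`.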

Accordingly, in the present embodiment, the identification model storage unit 24 stores orientation-specific identification models, each learned for one pedestrian orientation class (for example, front/back-facing, left-facing, and right-facing).

In the learning process for the orientation-specific identification models, a predetermined number of target learning images, in which pedestrians are photographed, and non-target learning images, in which things other than pedestrians are photographed, are prepared in advance. The target learning images are sorted by pedestrian orientation class. Then, for each orientation class, learning is performed using the target learning images sorted into that class together with the non-target learning images, according to the image feature amount and teacher label of each learning image, and an identification model for that orientation class is generated.

The score calculation unit 22 calculates, for each window image, a score for each orientation class with a classifier, based on the extracted image feature amount and the identification model for that orientation class. Boosting, an SVM, or the like can be used as the classifier. The score output by the classifier may also be converted into a probability value using the method described in the non-patent literature (HT Lin, CJ Lin and RC Weng: "A note on Platt's probabilistic outputs for support vector machines", Machine Learning, Springer, 2007), and the converted value used as the score.
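The probability conversion mentioned above maps a raw classifier output f to 1 / (1 + exp(A·f + B)). The sketch below shows only that mapping, written in the numerically stable form the cited note recommends. The parameters `a` and `b` are assumed to have been fitted beforehand on labelled data, and the fitting step is not shown.

```python
import math


def platt_probability(decision_value, a, b):
    """Map a raw classifier output to a probability with a fitted sigmoid.

    Computes 1 / (1 + exp(a * f + b)) without overflow for large |a * f + b|.
    `a` and `b` are assumed to be fitted elsewhere; with the usual sign
    convention `a` is negative, so larger outputs give larger probabilities.
    """
    z = a * decision_value + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))
```

Putting every orientation class on the same 0-to-1 scale makes scores from different identification models easier to compare against a shared threshold.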

Next, the principle of the present embodiment is described.

For example, in a non-pedestrian image such as the one in FIG. 4, the shape of the upper body matches a pedestrian comparatively well, but the lower body has no edges and differs greatly from a pedestrian's shape. The contour of a pedestrian's upper body (the shoulders and torso in particular) varies less with orientation than the legs do, so such an image tends to match the pedestrian model for every orientation to some degree. In that case nearly equal scores are calculated for every orientation, and little difference in score appears between attribute classes. For a pedestrian image, by contrast, the scores differ greatly between orientation classes, as shown in FIG. 5. When the identification model for each orientation class is applied to a pedestrian image and scores are calculated, the correct orientation class yields a high value and the other orientation classes yield low values. Pedestrians and non-pedestrians can therefore be distinguished by evaluating the relationship among the scores of the orientation classes, so as to determine whether only one orientation class has a high score.

In the present embodiment, the pedestrian identification unit 26 performs pedestrian identification processing on each window image, as described below, using the scores calculated for each orientation class (N scores).

First, non-pedestrian window images are removed based on the magnitude of the scores. Specifically, a window image is judged to be a non-pedestrian when all of its scores are below a preset threshold, and is treated as a pedestrian when at least one score is not below the threshold. The threshold may be the same for all orientation classes, or a different value may be set for each orientation class.
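A minimal sketch of this screening step, assuming scores and thresholds are dictionaries keyed by the same hypothetical class names:

```python
def passes_score_threshold(scores, thresholds):
    """Keep a window unless every orientation score is below its threshold.

    `scores` and `thresholds` map each orientation class to a number.
    Using one threshold for all classes is the special case where every
    entry of `thresholds` holds the same value.
    """
    return any(scores[cls] >= thresholds[cls] for cls in scores)
```

A window rejected here is treated as a non-pedestrian without any further evaluation of its score distribution.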

In addition to the threshold processing above, pedestrian identification processing is performed based on the distribution of the scores across the orientation classes.

For example, the difference between the maximum score and the minimum score is calculated from the distribution of scores across the orientation classes. If the resulting value is less than a threshold, the window image is identified as a non-pedestrian image; if it is equal to or greater than the threshold, the window image is identified as a pedestrian image.

Alternatively, the differences between all pairs of scores, or the variance of the scores, may be calculated from the distribution of scores across the orientation classes, and the same kind of threshold test may be applied to identify whether the image is a pedestrian image.
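The spread tests described above can be sketched as below. The function names and the choice of population variance are assumptions for illustration. The pairwise-difference variant is omitted because the text does not say how the differences are combined.

```python
from statistics import pvariance


def score_range(scores):
    """Difference between the largest and smallest orientation score."""
    return max(scores) - min(scores)


def is_pedestrian_by_spread(scores, min_spread, measure=score_range):
    """Identify a window as a pedestrian image when its scores are spread out.

    A non-pedestrian tends to score about equally for every orientation,
    so a spread below `min_spread` is treated as a non-pedestrian.
    `measure` can be swapped for `pvariance` to test variance instead.
    """
    return measure(scores) >= min_spread
```

For instance, scores of `[0.9, 0.2, 0.1]` have a range of about 0.8 and pass a `min_spread` of 0.5, while `[0.6, 0.55, 0.58]` do not.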

The pedestrian identification unit 26 also controls the display device 18 so that the identification result is displayed superimposed on the captured image.

Next, the object detection processing routine executed by the computer 16 of the object detection device 10 of the first embodiment is described with reference to FIG. 6.

In step 100, the captured image taken by the imaging device 12 is acquired. Then, in step 102, a search window is set on the captured image, and a window image x is extracted from the captured image using that search window.

Next, in step 104, an image feature amount is extracted from the window image x extracted in step 102. In step 106, the variable K that identifies the identification model is set to its initial value of 1, and in step 108 a score is calculated based on the K-th orientation-specific identification model and the image feature amount extracted in step 104. In step 110 the variable K is incremented by 1, and in step 112 it is determined whether the variable K is less than or equal to the constant N, which indicates the number of identification models. If the variable K is less than or equal to the constant N, the process returns to step 108. If the variable K is greater than the constant N, it is determined that scores have been calculated for all orientation-specific identification models, and the process proceeds to step 114.
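The loop over steps 106 to 112 can be sketched as follows, assuming each orientation-specific identification model is available as a callable that takes a feature vector and returns a score. That interface is an assumption made for the example.

```python
def score_all_orientations(features, models):
    """Apply each orientation-specific model in turn and collect N scores.

    `models` is a list of callables, one per orientation class.
    """
    scores = []
    k = 1                                         # step 106: K starts at 1
    while k <= len(models):                       # step 112: compare K with N
        scores.append(models[k - 1](features))    # step 108: K-th model score
        k += 1                                    # step 110: increment K
    return scores
```

The explicit counter mirrors the flowchart; in idiomatic Python the same result is `[model(features) for model in models]`.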

In step 114, whether the window image x is a pedestrian image is identified based on the scores for each orientation class calculated in step 108.

In the next step 116, it is determined whether the window image was identified as a pedestrian image in step 114. If it was identified as not being a pedestrian image, the process proceeds to step 120, described later. If it was identified as a pedestrian image, the window image x is recorded as a pedestrian region in step 118, and the process proceeds to step 120.

In step 120, it is determined whether the search window has been scanned over the entire captured image acquired in step 100 and the search is complete. If it is not complete, the process returns to step 102, a window image is extracted from the position reached by moving the search window by the predetermined search step, and steps 102 to 118 are repeated. When the search of the entire image with the current search window size is complete, the process likewise returns to step 102, the size of the search window is changed, and steps 102 to 118 are repeated. When the search of the entire captured image with search windows of all sizes is complete, the process proceeds to step 122.

In step 122, as the output of the detection result, the display device 18 is controlled based on the pedestrian regions recorded in step 118 so that each detected pedestrian is displayed enclosed in a window on the captured image, and the process ends.

As described above, the object detection device 10 of the first embodiment calculates a score for each orientation class based on the identification model for each class of pedestrian orientation, and identifies whether a window image represents a pedestrian based on the distribution of the scores across the orientation classes. A pedestrian can therefore be detected accurately even when stationary.

In addition, evaluating the distribution of the scores across the orientation classes makes it possible to distinguish pedestrians from non-pedestrians, and to eliminate false detections of plants and the like that occur in pedestrian detection.

Furthermore, because the score for each orientation class is calculated from the pedestrian's pattern information, even a stationary pedestrian can be detected accurately.

Next, a second embodiment is described. Parts with the same configuration as in the first embodiment are given the same reference numerals, and their description is omitted.

The second embodiment differs from the first embodiment in that whether a window image is a pedestrian image is identified based on scores calculated for a plurality of peripheral partial images, whose regions are shifted around the window image.

As shown in FIG. 7, the object detection device 210 of the second embodiment includes the imaging device 12, a computer 216, and the display device 18. Described in terms of functional blocks divided by function-realizing means determined by hardware and software, the computer 216 can be represented, as shown in FIG. 7, as a configuration including: the image acquisition unit 19; the window image extraction unit 20; a peripheral partial image extraction unit 220 that extracts, for a window image, a plurality of peripheral partial images (described later); a feature amount extraction unit 221 that extracts image feature amounts for each window image and for the peripheral partial images of each window image; a score calculation unit 222 that calculates scores for each window image and for the peripheral partial images of each window image; the identification model storage unit 24; and a pedestrian identification unit 226 that identifies whether a window image represents a pedestrian to be detected.

As shown in FIG. 8, the peripheral partial image extraction unit 220 extracts, for a window image, a plurality of peripheral partial images from a plurality of regions obtained by shifting the region of the window image so that each still partially overlaps the window image.
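One way to generate such shifted regions is sketched below. The eight-neighbour layout, the single `shift` distance, and the rule of dropping regions that leave the image are assumptions for illustration; the text only requires that each region partially overlap the window image.

```python
def peripheral_regions(x, y, w, h, shift, image_w, image_h):
    """Return regions of the same size shifted around window (x, y, w, h).

    Each region is offset by `shift` pixels in one of eight directions.
    A shift smaller than both `w` and `h` keeps every region partially
    overlapping the original window. Regions that would extend outside
    the image are dropped.
    """
    regions = []
    for dy in (-shift, 0, shift):
        for dx in (-shift, 0, shift):
            if dx == 0 and dy == 0:
                continue  # skip the window image itself
            nx, ny = x + dx, y + dy
            if 0 <= nx and nx + w <= image_w and 0 <= ny and ny + h <= image_h:
                regions.append((nx, ny, w, h))
    return regions
```

A window well inside the image yields eight regions, while a window in the top-left corner yields only the three that stay inside the image.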

For each window image, the feature amount extraction unit 221 extracts the image feature amount of the window image and the image feature amount of each of the peripheral partial images for that window image.

For each window image, the score calculation unit 222 calculates a score for each orientation class with a classifier, based on the extracted image feature amount and the identification model for each orientation class. It likewise calculates a score for each orientation class for each peripheral partial image of each window image, based on the image feature amount extracted from that peripheral partial image and the identification model for each orientation class.

Next, the principle of the present embodiment is described.

As shown in FIG. 9, differences between images caused by slight changes in the position or size of the cropped image affect the score for each orientation class. As described in the first embodiment, the scores for the orientation classes can take very similar values for a non-pedestrian image. In that case, a small change in score caused by a difference in the position or size of the cropped image can reverse the ordering of the scores across the orientation classes, so the attribute class with the maximum score changes from one peripheral partial image to another. For a pedestrian image, on the other hand, as shown in FIG. 9, the scores do fluctuate with the shift, but the score for the correct orientation differs greatly from the others, so the attribute class with the maximum score does not change.

Accordingly, in the present embodiment, the pedestrian identification unit 226 performs pedestrian identification processing on each window image, as described below, using the scores calculated for each orientation class (N scores) for the window image and for its plurality of peripheral partial images.

First, non-pedestrian window images are removed based on the magnitude of the scores of the window image of interest. Specifically, a window image is judged to be a non-pedestrian when all of its scores are below a preset threshold, and is treated as a pedestrian when at least one score is not below the threshold. The threshold may be the same for all attribute classes, or a different value may be set for each orientation class.

In addition to the threshold processing above, pedestrian identification processing is performed on the window image using the scores of its peripheral partial images.

FIGS. 10(a) and 10(b) are histograms showing how often each orientation class has the maximum score among the peripheral partial images. As in FIG. 10(a), the histogram obtained from peripheral partial images around a pedestrian shows a high peak at the correct orientation class. In contrast, the histogram obtained from peripheral partial images around a non-pedestrian, shown in FIG. 10(b), has a scattered distribution. Pedestrians and non-pedestrians can thus be distinguished using the distribution, over the peripheral partial images of the window image of interest, of the orientation class that attains the maximum score.

For example, the difference between the maximum score and the minimum score is calculated from the frequency distribution of the orientation class that attains the maximum score for the window image of interest. If the resulting value is less than a threshold, the window image is identified as a non-pedestrian image; if it is equal to or greater than the threshold, the window image is identified as a pedestrian image. The frequency distribution of the orientation class that attains the maximum score is an example of a distribution of attribute classes whose scores satisfy a predetermined condition.

The histogram created here may count, for each peripheral partial image, the orientation class with the maximum score, or it may be a histogram in which the counts are weighted by the scores. An entropy value can be used to evaluate the histogram.
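The unweighted histogram and its entropy can be sketched as follows. The dictionary representation and function names are assumptions for the example, and Shannon entropy in bits is one reasonable reading of "entropy value".

```python
import math
from collections import Counter


def top_orientation_histogram(score_sets):
    """Count how often each orientation class has the maximum score.

    `score_sets` holds one dict per peripheral partial image, mapping an
    orientation class name to its score.
    """
    return Counter(max(scores, key=scores.get) for scores in score_sets)


def histogram_entropy(hist):
    """Shannon entropy in bits of the normalised histogram."""
    total = sum(hist.values())
    return -sum((n / total) * math.log2(n / total)
                for n in hist.values() if n > 0)
```

A single sharp peak gives an entropy of zero and an even split across classes gives the maximum, so a low entropy suggests a pedestrian and a high entropy a non-pedestrian. The cutoff between the two would have to be chosen empirically.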

Next, the object detection processing routine executed by the computer 216 of the object detection device 210 of the second embodiment is described with reference to FIGS. 11 and 12. Processing that is the same as in the first embodiment is given the same reference numerals, and its description is omitted.

In step 100, the captured image taken by the imaging device 12 is acquired. Then, in step 102, a search window is set on the captured image, and a window image x is extracted from the captured image using that search window. In step 104, an image feature amount is extracted from the window image x extracted in step 102.

In step 106, the variable K that identifies the identification model is set to its initial value of 1, and in step 108 a score is calculated based on the K-th orientation-specific identification model and the image feature amount extracted in step 104. In step 110 the variable K is incremented by 1, and in step 112 it is determined whether the variable K is less than or equal to the constant N, which indicates the number of identification models. If the variable K is less than or equal to the constant N, the process returns to step 108. If the variable K is greater than the constant N, it is determined that scores have been calculated for all orientation-specific identification models, and the process proceeds to step 250.

Next, in step 250, a plurality of peripheral partial images are extracted for the window image x extracted in step 102. In step 252, the variable S that identifies the peripheral partial image is set to its initial value of 1, and in step 254 an image feature amount is extracted from the S-th peripheral partial image extracted in step 250. In step 106, the variable K that identifies the identification model is set to its initial value of 1, and in step 255 a score is calculated based on the K-th orientation-specific identification model and the image feature amount extracted for the S-th peripheral partial image in step 254. In step 110 the variable K is incremented by 1, and in step 112 it is determined whether the variable K is less than or equal to the constant N, which indicates the number of identification models. If the variable K is less than or equal to the constant N, the process returns to step 255. If the variable K is greater than the constant N, it is determined that scores have been calculated for all orientation-specific identification models, and the process proceeds to step 256.

In step 256 the variable S is incremented by 1, and in step 258 it is determined whether the variable S is less than or equal to the constant M, which indicates the number of peripheral partial images. If the variable S is less than or equal to the constant M, the process returns to step 254. If the variable S is greater than the constant M, it is determined that scores have been calculated for all peripheral partial images, and the process proceeds to step 260.

In step 260, whether the window image x is a pedestrian image is identified based on the scores for each orientation class calculated for the window image x in step 108 and the scores for each orientation class calculated for each peripheral partial image in step 255.

In the next step 116, it is determined whether the window image was identified as a pedestrian image in step 260. If it was identified as not being a pedestrian image, the process proceeds to step 120. If it was identified as a pedestrian image, the window image x is recorded as a pedestrian region in step 118, and the process proceeds to step 120.

In step 120, it is determined whether the search window has been scanned over the entire captured image acquired in step 100 and the search is complete. If it is not complete, the process returns to step 102. When the search of the entire captured image with search windows of all sizes is complete, the process proceeds to step 122.

As described above, the object detection device 210 of the second embodiment calculates, for each of a plurality of peripheral partial images obtained by shifting the window image, a score for each orientation class based on the identification model for each class of pedestrian orientation, and identifies whether the window image represents a pedestrian based on the scores for each orientation class of each peripheral partial image. A pedestrian can therefore be detected accurately even when stationary.

The embodiment above has been described using, as an example, the case where whether an image is a pedestrian image is identified based on the frequency distribution of the orientation class with the maximum score, obtained over a plurality of peripheral partial images. However, the invention is not limited to this. For example, the difference between the maximum and minimum scores, the differences between all pairs of scores, or the variance of the scores may be calculated from the distribution of cumulative scores for each orientation class, obtained over the plurality of peripheral partial images, and a threshold test may be applied to the calculated value to identify whether the image is a pedestrian image.
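The cumulative-score variant can be sketched as below, here with the maximum-minus-minimum measure. The dictionary representation is an assumption for illustration, and the threshold applied to the result is not specified in the text.

```python
def cumulative_scores(score_sets):
    """Sum the score of each orientation class over all peripheral images.

    `score_sets` holds one dict per peripheral partial image, mapping an
    orientation class name to its score.
    """
    totals = {}
    for scores in score_sets:
        for cls, score in scores.items():
            totals[cls] = totals.get(cls, 0.0) + score
    return totals


def cumulative_range(score_sets):
    """Difference between the largest and smallest cumulative score."""
    totals = cumulative_scores(score_sets)
    return max(totals.values()) - min(totals.values())
```

A large range means that one orientation class dominates across the peripheral partial images, which is the pattern the text associates with a pedestrian.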

Next, a third embodiment is described. Components that are the same as in the object detection device 10 of the first embodiment are given the same reference numerals, and their description is omitted.

The third embodiment differs from the first embodiment in that pedestrian candidate regions are tracked across continuously captured images, and whether an image is a pedestrian image is identified based on the scores calculated for the tracked pedestrian candidate regions.

As shown in FIG. 13, the object detection device 310 of the third embodiment includes the imaging device 12, a computer 316, and the display device 18. Described in terms of functional blocks divided by function-realizing means determined by hardware and software, the computer 316 can be represented, as shown in FIG. 13, as a configuration including: an image acquisition unit 319 that acquires a time series of continuously captured images; the window image extraction unit 20; the feature amount extraction unit 21; the score calculation unit 22; the identification model storage unit 24; a pedestrian candidate determination unit 324 that determines, for each captured image in the time series, whether a window image is a pedestrian candidate region; a tracking processing unit 326 that performs tracking processing on each pedestrian candidate region; and a pedestrian identification unit 328 that identifies whether a window image that is a successfully tracked pedestrian candidate region is a pedestrian image.

The window image extraction unit 20 extracts window images of predetermined regions from each captured image in the time series. The feature amount extraction unit 21 extracts an image feature amount from each window image extracted from each captured image in the time series. For each window image of each captured image in the time series, the score calculation unit 22 compares the extracted image feature amount with the orientation-specific identification models and calculates scores indicating pedestrian likelihood.

For each captured image in the time series, the pedestrian candidate determination unit 324, like the pedestrian identification unit 26 described in the first embodiment, applies threshold processing to each window image to remove non-pedestrian images, performs pedestrian identification processing based on the distribution of the scores calculated for each orientation class, and takes each window image identified as a pedestrian image to be a pedestrian candidate region.

As described in the second embodiment, image differences caused by slight changes in position or size affect the score for each orientation classification. In images captured by an in-vehicle camera, the size and viewing angle of the same object change gradually as the host vehicle moves. Consequently, as shown in FIG. 14 and as in the second embodiment, the attribute classification with the maximum score does not change across the time series for a pedestrian image. For a non-pedestrian image, by contrast, image differences across the time series tend to reverse the relative order of the scores for the orientation classifications, so the attribute classification with the maximum score changes between images in the time series.

Therefore, in the present embodiment, the tracking processing unit 326 tracks the window images that the pedestrian candidate determination unit 324 has determined to be pedestrian candidate regions. The tracking processing unit 326 associates the region of each window image determined to be a pedestrian candidate region at the current time t with the pedestrian tracking models tracked up to time t−1. An αβ tracker can be used to update a pedestrian tracking model. Each pedestrian tracking model records the scores for each orientation classification calculated for the tracked pedestrian candidate region at each time. If no pedestrian tracking model corresponds to a window region determined to be a pedestrian candidate region at the current time t, a new pedestrian tracking model is generated. If, for a pedestrian tracking model tracked up to time t−1, no corresponding window image determined to be a pedestrian candidate region is found at the current time t, the scores of that model for each orientation classification at the current time t are set to NULL.
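The association and update described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `Track` class, the one-dimensional position, the greedy nearest-neighbour matching, the gate distance, and the gain values `alpha` and `beta` are all assumptions introduced for the example. Only the use of an αβ tracker, the per-time score record, the creation of new models, and the NULL entry come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    """Hypothetical pedestrian tracking model for one candidate region."""
    x: float                      # filtered position (e.g. window centre, in pixels)
    v: float = 0.0                # filtered velocity (pixels per frame)
    # One entry per time step: per-orientation scores, or None (NULL) if unmatched.
    scores: List[Optional[List[float]]] = field(default_factory=list)


def alpha_beta_update(track, measured_x, alpha=0.85, beta=0.005, dt=1.0):
    """Standard alpha-beta filter step: predict, then correct with the residual."""
    predicted = track.x + track.v * dt
    residual = measured_x - predicted
    track.x = predicted + alpha * residual
    track.v = track.v + (beta / dt) * residual


def step(tracks, detections, gate=20.0):
    """Associate candidates (x, scores) at time t with tracks from time t-1."""
    unmatched = list(detections)
    for track in tracks:
        predicted = track.x + track.v
        # Greedy nearest-neighbour matching (an assumption of this sketch).
        best = min(unmatched, key=lambda d: abs(d[0] - predicted), default=None)
        if best is not None and abs(best[0] - predicted) <= gate:
            alpha_beta_update(track, best[0])
            track.scores.append(list(best[1]))
            unmatched.remove(best)
        else:
            track.scores.append(None)   # no matching candidate: scores are NULL
    # Candidates without a matching track start new tracking models.
    new_tracks = [Track(x=x, scores=[list(s)]) for x, s in unmatched]
    return tracks + new_tracks
```

Calling `step` once per captured image keeps, for every track, a score history whose length equals the number of frames since the track was created, with `None` marking frames in which no candidate was matched.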

In this tracking process, the maximum of the scores over the orientation classifications at each time may also be used to determine whether the tracked object is a pedestrian, by the method described in Japanese Patent Application Laid-Open No. 2005-311691.

The pedestrian identification unit 328 uses the pedestrian tracking models tracked by the tracking processing unit 326 and, for each model, performs pedestrian identification from the scores calculated for each orientation classification at each time.

For example, the degree of change in the orientation classification with the maximum score is evaluated based on the pedestrian tracking model. If the degree of change is greater than or equal to a threshold, the window image corresponding to that pedestrian tracking model is identified as a non-pedestrian image; if it is less than the threshold, the window image is identified as a pedestrian image.

The degree of change in orientation can be evaluated by counting, from the scores calculated for each orientation classification at each time, the number of times the orientation classification with the maximum score changes. As shown in FIG. 15, the changes may also be counted with weights that depend on the pattern of change.
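The change-count test can be illustrated with a short sketch. The function names, the default threshold, and in particular `circular_weight` are hypothetical: the weights of FIG. 15 are not reproduced in the text, so the circular distance between class indices is used here only as one plausible example of a pattern-dependent weight.

```python
def argmax_class(scores):
    """Index of the orientation classification with the maximum score."""
    return max(range(len(scores)), key=lambda k: scores[k])


def change_degree(score_history, weight=None):
    """Sum of (optionally weighted) changes in the top-scoring classification.

    score_history holds one list of per-orientation scores per time step;
    entries that are None (NULL) are skipped.
    """
    tops = [argmax_class(s) for s in score_history if s is not None]
    total = 0.0
    for prev, cur in zip(tops, tops[1:]):
        if prev != cur:
            total += 1.0 if weight is None else weight(prev, cur)
    return total


def circular_weight(prev, cur, n_classes=8):
    """Assumed weight: larger jumps around the orientation circle count more."""
    d = abs(prev - cur)
    return float(min(d, n_classes - d))


def is_pedestrian(score_history, threshold=2.0, weight=None):
    """Pedestrian if the degree of change is below the threshold."""
    return change_degree(score_history, weight) < threshold
```

With the unweighted count, a track whose top-scoring classification stays fixed gives a degree of change of 0 and is kept as a pedestrian, while a track whose top classification flips on most frames exceeds the threshold and is rejected.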

Next, the object detection processing routine executed by the computer 316 of the object detection device 310 according to the third embodiment will be described with reference to FIGS. 16 and 17. Processing identical to that of the first embodiment is given the same reference numerals, and its detailed description is omitted.

Next, in step 104, an image feature amount is extracted from the window image x extracted in step 102. In step 106, a variable K that identifies the identification model is set to its initial value of 1, and in step 108 a score is calculated based on the K-th orientation-specific identification model and the image feature amount extracted in step 104. In step 110, the variable K is incremented by 1, and in step 112 it is determined whether the variable K is less than or equal to a constant N indicating the number of orientation-specific identification models. If the variable K is less than or equal to the constant N, the routine returns to step 108. If the variable K is greater than the constant N, it is determined that scores have been calculated for all orientation-specific identification models, and the routine proceeds to step 114.
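The loop over steps 106 to 112 can be written out as below. The form of the identification model is not specified in this passage, so a linear model given as a `(weights, bias)` pair is assumed purely for illustration, and the function name is hypothetical.

```python
def score_all_orientations(features, models):
    """Compute one score per orientation-specific identification model.

    models is a non-empty list of (weights, bias) pairs; the linear form
    is an assumption of this sketch, not taken from the description.
    """
    n = len(models)          # constant N: number of orientation-specific models
    scores = []
    k = 1                    # step 106: initialise K to 1
    while True:
        weights, bias = models[k - 1]
        # step 108: score from the K-th model and the image feature amount
        scores.append(sum(w * f for w, f in zip(weights, features)) + bias)
        k += 1               # step 110: increment K
        if k > n:            # step 112: leave the loop once K exceeds N
            break
    return scores
```

The loop tests the counter after the score calculation, mirroring the order of steps 108, 110, and 112, so it assumes at least one model is present.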

In the next step 116, it is determined whether the window image was identified as a pedestrian image in step 114. If it was identified as not being a pedestrian image, the routine proceeds to step 120, described later. If it was identified as a pedestrian image, the window image x is recorded as a pedestrian candidate region in step 350, and the routine proceeds to step 120.

In step 120, it is determined whether the search window has been scanned over the entire captured image acquired in step 100 and the search is complete. If not, the routine returns to step 102, extracts a window image from the position obtained by moving the search window by a predetermined search step, and repeats steps 102 to 116 and 350. When the search of the entire image with the search window of the current size is complete, the routine likewise returns to step 102, changes the size of the search window, and repeats steps 102 to 116 and 350. When the search of the entire captured image is complete for search windows of all sizes, the routine proceeds to step 352.
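The scan over positions and window sizes amounts to a nested loop, sketched here as a generator. The function name, the raster scan order, and the use of a single stride for both axes are assumptions; the description only states that the window is moved by a predetermined search step and that its size is changed once the image has been covered.

```python
def scan_windows(image_w, image_h, sizes, stride):
    """Yield (x, y, w, h) for every search-window position and size.

    sizes is a list of (w, h) window sizes; stride is the search step in pixels.
    """
    for w, h in sizes:                                   # change the window size
        for y in range(0, image_h - h + 1, stride):      # move down by the step
            for x in range(0, image_w - w + 1, stride):  # move right by the step
                yield (x, y, w, h)
```

Each yielded rectangle corresponds to one pass through steps 102 to 116 and 350; exhausting the generator corresponds to the transition to step 352.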

In step 352, one pedestrian candidate region is selected from the pedestrian candidate regions recorded in step 350. In step 354, tracking is performed for the pedestrian candidate region selected in step 352, using the pedestrian tracking models obtained by tracking up to the previous time step.

In the next step 356, it is determined whether the pedestrian candidate region selected in step 352 could be tracked from a pedestrian tracking model obtained up to the previous time step. If it could not, a new pedestrian tracking model is created, and the routine proceeds to step 364, described later.

If, on the other hand, the region could be tracked from a pedestrian tracking model obtained up to the previous time step, then in step 358, for the window images that are the pedestrian candidate regions tracked at each time, whether the window image is a pedestrian image is identified based on the pedestrian tracking model, using the time-series data of the scores for each orientation classification calculated in step 108.

In the next step 360, it is determined whether the window image was identified as a pedestrian image in step 358. If it was identified as not being a pedestrian image, the routine proceeds to step 364, described later. If it was identified as a pedestrian image, the window image at the current time is recorded as a pedestrian region in step 362, and the routine proceeds to step 364.

In step 364, it is determined whether steps 352 to 362 have been completed for all pedestrian candidate regions. If not, the routine returns to step 352, selects a pedestrian candidate region that has not yet been selected, and repeats steps 354 to 362. When steps 352 to 362 have been completed for all pedestrian candidate regions, the routine proceeds to step 122. There, as the output of the detection result, the display device 18 is controlled, based on the pedestrian regions recorded in step 362, so that each detected pedestrian is displayed enclosed in a window on the captured image of the current time. The routine then returns to step 100, acquires the image captured by the imaging device 12 at the next time, and executes the processing described above.

As described above, the object detection processing routine repeats the above processing for each captured image in the time series captured by the imaging device 12.

As described above, the object detection device 310 according to the third embodiment calculates, for the time series of window images tracked through the time series of captured images, a score for each orientation classification based on the identification model for each classification of pedestrian orientation, and identifies whether the window image represents a pedestrian based on the scores for each orientation classification over the tracked time series. This allows a pedestrian to be detected accurately even when the pedestrian is stationary.

In addition, evaluating how the orientation changes over the time series makes it possible to distinguish pedestrians from non-pedestrians and thus to eliminate false detections.

Although the above embodiment describes identifying whether a window image is a pedestrian image based on the number of changes in the orientation classification with the maximum score, obtained for the tracked pedestrian candidate region (window image), the invention is not limited to this. For example, the relationship among the scores for the orientation classifications may be evaluated over the time series. The difference between the maximum and minimum scores over the orientation classifications at each time, the differences among all orientation-specific scores, or the variance of the scores may be accumulated over the time series, and if the accumulated value is less than or equal to a preset threshold, the window image corresponding to the pedestrian tracking model may be identified as a non-pedestrian image.
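Two of the accumulated measures mentioned above, the max-min difference and the variance, can be sketched as follows. The function names and the choice of population variance are assumptions of this example; the threshold value must be chosen by the user and is not given in the text.

```python
from statistics import pvariance


def accumulated_spread(score_history, measure="range"):
    """Accumulate a per-time-step spread of the orientation scores over a track.

    measure="range" sums (max - min); measure="variance" sums the population
    variance. Time steps whose scores are None (NULL) are skipped.
    """
    total = 0.0
    for scores in score_history:
        if scores is None:
            continue
        if measure == "range":
            total += max(scores) - min(scores)
        elif measure == "variance":
            total += pvariance(scores)
        else:
            raise ValueError(f"unknown measure: {measure}")
    return total


def is_non_pedestrian(score_history, threshold, measure="range"):
    """Non-pedestrian if the accumulated spread is at or below the threshold."""
    return accumulated_spread(score_history, measure) <= threshold
```

The intuition is that a pedestrian produces one clearly dominant orientation score at each time, so the spread accumulates quickly, whereas a non-pedestrian produces nearly flat scores and the accumulated value stays small.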

Alternatively, as in the second embodiment, whether the window image is a pedestrian image may be identified based on the frequency distribution of the orientation classification with the maximum score, obtained for the tracked pedestrian candidate region (window image).
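A frequency distribution of the top-scoring classification over a track can be built as in the sketch below. How the distribution is then tested is defined in the second embodiment and is not restated here, so `dominant_fraction` is only a hypothetical summary statistic introduced for illustration.

```python
from collections import Counter


def top_class_histogram(score_history):
    """Count how often each orientation classification has the maximum score."""
    tops = [
        max(range(len(s)), key=s.__getitem__)
        for s in score_history
        if s is not None          # skip NULL entries
    ]
    return Counter(tops)


def dominant_fraction(score_history):
    """Assumed statistic: share of time steps won by the most frequent class."""
    hist = top_class_histogram(score_history)
    n = sum(hist.values())
    return max(hist.values()) / n if n else 0.0
```

A value of `dominant_fraction` close to 1 indicates a distribution concentrated on one orientation, which is the pattern the description associates with pedestrian images.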

Alternatively, from the distribution of cumulative scores over the orientation classifications obtained for the tracked pedestrian candidate region (window image), the difference between the maximum and minimum scores, the differences among all scores, or the variance of the scores may be calculated, and a threshold test applied to the calculated value to identify whether the image is a pedestrian image. When scores are added over the time series, they may be weighted so that the weight decreases with distance from the current time, and the image may be judged a non-pedestrian image if the total is less than or equal to a preset threshold.
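One way to read the time-weighted accumulation is sketched below. The exponential decay, the parameter `decay`, and the use of the max-min difference of the cumulative scores as the tested total are assumptions of this example; the text only requires that weights shrink with distance from the current time.

```python
def decayed_cumulative_scores(score_history, decay=0.8):
    """Per-classification cumulative score with weights that shrink with age.

    The most recent time step has weight 1, the one before it `decay`,
    then `decay ** 2`, and so on. NULL (None) entries contribute nothing.
    """
    valid = [s for s in score_history if s is not None]
    if not valid:
        return []
    totals = [0.0] * len(valid[0])
    for age, scores in enumerate(reversed(score_history)):
        if scores is None:
            continue
        w = decay ** age
        for k, s in enumerate(scores):
            totals[k] += w * s
    return totals


def cumulative_spread(score_history, decay=0.8):
    """Difference between the largest and smallest cumulative score."""
    totals = decayed_cumulative_scores(score_history, decay)
    return max(totals) - min(totals) if totals else 0.0


def is_non_pedestrian(score_history, threshold, decay=0.8):
    """Non-pedestrian if the weighted cumulative spread is at or below threshold."""
    return cumulative_spread(score_history, decay) <= threshold
```

Weighting recent frames more heavily lets the decision follow the current appearance of the tracked region while still drawing on earlier frames.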

Although the above describes detecting the pedestrian candidate regions at the current time and then tracking them against the tracking results up to the previous time step, the invention is not limited to this. A corresponding region may instead be searched for from the tracking results up to the previous time step, and the pedestrian candidate region at the current time detected based on the search result.

The technique using a plurality of peripheral partial images described in the second embodiment may also be applied to the third embodiment.

Although the first to third embodiments describe the case in which orientation is used as the pedestrian attribute, the invention is not limited to this. An attribute relating to age, with classifications such as adult, child, and elderly person, or an attribute relating to gender, with classifications such as male and female, may be used. In that case, age-specific or gender-specific identification models are learned and prepared in advance. The classification of the pedestrian attribute may also be subdivided, for example as orientation plus degree of leg spread (for example, large / medium / small / none).

Although the first to third embodiments describe the case in which non-pedestrian window images are removed by threshold processing of the scores, this removal by threshold processing may be omitted.

Although the case in which the detection target is a pedestrian has been described as an example, the invention is not limited to this. For example, a two-wheeled vehicle such as a bicycle may be the detection target.

10, 210, 310 Object detection device
12 Imaging device
16, 216, 316 Computer
20 Window image extraction unit
21, 221 Feature amount extraction unit
22, 222 Score calculation unit
24 Identification model storage unit
26, 226, 328 Pedestrian identification unit
220 Peripheral partial image extraction unit
324 Pedestrian candidate determination unit
326 Tracking processing unit

Claims

An object detection device comprising:
extraction means for extracting a window image from a captured image obtained by imaging the surroundings of the device;
score calculation means for calculating, for each classification relating to an attribute of an object, a score indicating object likelihood, based on the window image extracted by the extraction means and an identification model for identifying the object for each classification relating to the attribute; and
identification means for identifying whether the window image is an image representing the object, based on the score for each classification relating to the attribute calculated by the score calculation means.

2. The object detection device according to claim 1, wherein the identification means identifies whether the window image is an image representing the object based on a distribution of the scores for each classification relating to the attribute calculated by the score calculation means.

An object detection device comprising:
extraction means for extracting a window image from a captured image obtained by imaging the surroundings of the device;
score calculation means for calculating, for each of a plurality of peripheral images whose regions are shifted from, and partially overlap, the window image extracted by the extraction means, a score indicating object likelihood for each classification relating to an attribute of an object, based on the peripheral image and an identification model for identifying the object for each classification relating to the attribute; and
identification means for identifying whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, or on a cumulative score for each attribute classification, obtained from the scores for each classification relating to the attribute calculated by the score calculation means for each of the plurality of peripheral images.

The object detection device according to claim 3, wherein the identification means identifies whether the window image is an image representing the object based on a distribution of the attribute classifications whose scores satisfy a predetermined condition, or on a distribution of the cumulative scores for each attribute classification, obtained from the scores for each classification relating to the attribute calculated by the score calculation means for each of the plurality of peripheral images.

An object detection device comprising:
extraction means for extracting a plurality of window images from each captured image in a time series of captured images obtained by imaging the surroundings of the device;
score calculation means for calculating, for each captured image in the time series and for each of the window images, a score indicating object likelihood for each classification relating to an attribute of an object, based on the plurality of window images extracted by the extraction means and an identification model for identifying the object for each classification relating to the attribute;
candidate image determination means for determining, for each captured image in the time series, whether each of the plurality of window images is a candidate image representing the object, based on the scores for each classification relating to the attribute calculated by the score calculation means;
tracking means for tracking, through the time series of captured images, each window image determined to be a candidate image; and
identification means for identifying whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, a cumulative score for each attribute classification, or a change in the attribute classification whose score is the maximum, obtained from the scores for each classification relating to the attribute calculated for the time series of window images tracked through the time series of captured images.

The object detection device according to claim 5, wherein the identification means identifies whether the window image is an image representing the object based on a distribution of the attribute classifications whose scores satisfy a predetermined condition, a distribution of the cumulative scores for each attribute classification, or a change in the attribute classification whose score is the maximum, obtained from the scores for each classification relating to the attribute calculated for the time series of window images tracked through the time series of captured images.

The object detection device according to claim 1, wherein the object is a pedestrian or a two-wheeled vehicle.

The object detection device according to claim 7, wherein, when the object is a pedestrian, the attribute is a pedestrian orientation, age, or gender.

The object detection device according to claim 1, wherein the identification model is learned in advance for each classification related to an attribute of the object.

A program for causing a computer to function as:
extraction means for extracting a window image from a captured image obtained by imaging the surroundings of the device itself;
score calculation means for calculating, for each classification relating to an attribute of an object, a score indicating object likelihood, based on the window image extracted by the extraction means and an identification model for identifying the object for each classification relating to the attribute; and
identification means for identifying whether the window image is an image representing the object, based on the score for each classification relating to the attribute calculated by the score calculation means.

A program for causing a computer to function as:
extraction means for extracting a window image from a captured image obtained by imaging the surroundings of the device itself;
score calculation means for calculating, for each of a plurality of peripheral images whose regions are shifted from, and partially overlap, the window image extracted by the extraction means, a score indicating object likelihood for each classification relating to an attribute of an object, based on the peripheral image and an identification model for identifying the object for each classification relating to the attribute; and
identification means for identifying whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, or on a cumulative score for each attribute classification, obtained from the scores for each classification relating to the attribute calculated by the score calculation means for each of the plurality of peripheral images.

A program for causing a computer to function as:
extraction means for extracting a plurality of window images from each captured image in a time series of captured images obtained by imaging the surroundings of the device itself;
score calculation means for calculating, for each captured image in the time series and for each of the window images, a score indicating object likelihood for each classification relating to an attribute of an object, based on the plurality of window images extracted by the extraction means and an identification model for identifying the object for each classification relating to the attribute;
candidate image determination means for determining, for each captured image in the time series, whether each of the plurality of window images is a candidate image representing the object, based on the scores for each classification relating to the attribute calculated by the score calculation means;
tracking means for tracking, through the time series of captured images, each window image determined to be a candidate image; and
identification means for identifying whether the window image is an image representing the object, based on the attribute classification whose score satisfies a predetermined condition, a cumulative score for each attribute classification, or a change in the attribute classification whose score is the maximum, obtained from the scores for each classification relating to the attribute calculated for the time series of window images tracked through the time series of captured images.