JP2014106685A

JP2014106685A - Vehicle periphery monitoring device

Info

Publication number: JP2014106685A
Application number: JP2012258488A
Authority: JP
Inventors: Yasushi Yagi; 康史八木; Yasushi Makihara; 靖槇原; Chunsheng Hua; 春生華; Keisuke Miyagawa; 恵介宮川; Shun Iwasaki; 瞬岩崎
Original assignee: Honda Motor Co Ltd; Osaka University NUC
Current assignee: Honda Motor Co Ltd; Osaka University NUC
Priority date: 2012-11-27
Filing date: 2012-11-27
Publication date: 2014-06-09
Also published as: WO2014084218A1

Abstract

PROBLEM TO BE SOLVED: To provide a vehicle periphery monitoring device capable of improving learning accuracy and identification accuracy regardless of the type of object. SOLUTION: An object identification unit 44 is a discriminator 50 generated using machine learning, which receives a feature data group as an image feature amount and outputs presence/absence information on an object (crossing pedestrians 70, 72). The discriminator 50 creates the feature data group from an image of at least one sub-region 82 (non-mask regions 102, 108) selected, according to the type of the object, from among a plurality of sub-regions 82 constituting each of the learning sample images 74, 76 used for the machine learning, and takes the created feature data group as its input.

Description

The present invention relates to a vehicle periphery monitoring device that detects objects present around a vehicle based on captured images acquired from vehicle-mounted imaging means.

Various methods have conventionally been proposed for capturing images with imaging means such as a camera and detecting objects based on the captured images. For example, a technique has been described that detects a pedestrian appearing in a captured image using a feature amount relating to the intensity and gradient of luminance in local regions of the image (the so-called HOG feature amount) (see Non-Patent Document 1).

Mounting such imaging means on a vehicle together with a discriminator capable of identifying objects from captured images yields a vehicle periphery monitoring device that detects objects outside the vehicle or monitors, for example, the relative positional relationship between an object and the host vehicle. Introducing such a device assists the vehicle occupants, particularly the driver.

Non-Patent Document 1: "Histograms of Oriented Gradients for Human Detection", IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 886-893 (2005)

Through various kinds of machine learning, the discriminator described above can identify not only pedestrians (human bodies) but also animals, artificial structures, and the like. However, when an image region to be identified that at least contains such an object (hereinafter, identification target region) is extracted, the identification accuracy has in some cases differed depending on the type of object. This is presumably because the shape of the projected image differs with the type of object, so the proportion of background image information that is captured changes. In other words, the background image information acts as a disturbance factor (noise information) that lowers the learning and identification accuracy for the object during learning or identification processing.

In particular, because this type of vehicle periphery monitoring device captures images successively while the vehicle is traveling, the scene (background, weather, and so on) of the captured images obtained from the camera can change from moment to moment. In addition, images of identification targets (pedestrians and the like) acquired while the vehicle travels on a road often contain road surface patterns such as pedestrian crossings and guardrails. As a result, when the discriminator is trained by machine learning, there is a high possibility that it erroneously learns such a road surface pattern as a pedestrian. That is, the image features of the background are difficult to predict, and their influence as a disturbance factor cannot be ignored.

The present invention has been made to solve the problem described above, and its object is to provide a vehicle periphery monitoring device that can improve learning and identification accuracy regardless of the type of object.

A vehicle periphery monitoring device according to the present invention comprises: imaging means that is mounted on a vehicle and acquires captured images of the periphery of the vehicle by imaging while the vehicle is traveling; identification target region extraction means that extracts an identification target region from a captured image acquired by the imaging means; and object identification means that identifies, for each type of object, whether an object is present in the identification target region from an image feature amount in the identification target region extracted by the identification target region extraction means. The object identification means is a discriminator generated using machine learning that takes a feature data group as the image feature amount as input and outputs presence/absence information on the object. The discriminator creates and inputs the feature data group from an image of at least one sub-region selected, according to the type of the object, from among a plurality of sub-regions constituting each learning sample image used for the machine learning.

Because the discriminator serving as the object identification means thus creates and inputs the feature data group from an image of at least one sub-region selected, according to the type of object, from among the plurality of sub-regions constituting each learning sample image used for machine learning, image information from the sub-regions suited to the shape of the object's projected image can be used selectively in the learning process, and learning accuracy can be improved regardless of the type of object. Excluding the remaining sub-regions from the learning process also prevents over-learning of image information other than the projected image of the object, which acts as a disturbance factor during identification processing, and thereby improves the identification accuracy for the object. This is particularly effective because the scene of captured images obtained from vehicle-mounted imaging means changes from moment to moment.

The object identification means preferably identifies whether the object is present in the identification target region for each moving direction of the object. Because the shape of the image on the captured image tends to change with the moving direction, identifying separately for each moving direction further improves the learning and identification accuracy for the object.

The identification target region extraction means preferably extracts an identification target region whose size is proportional to the distance from the imaging means to the object. The relative size relationship between the object and the identification target region then remains constant regardless of distance, so the influence of disturbance factors (image information other than the projected image of the object) is suppressed uniformly and the learning and identification accuracy for the object improves further.

The image feature amount preferably includes a histogram of luminance gradient directions in space. Because the luminance gradient direction varies little with the imaging exposure, the features of the object can be captured accurately, and stable identification accuracy is obtained even in outdoor environments where the intensity of ambient light changes from moment to moment.
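The exposure argument above can be illustrated with a minimal sketch. It assumes a single grayscale cell, unsigned gradient directions, nine bins, and L2 normalization; none of these choices is specified in this passage, and the function name gradient_orientation_histogram is hypothetical.

```python
import numpy as np

def gradient_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned luminance gradient directions."""
    cell = np.asarray(cell, dtype=float)
    gy, gx = np.gradient(cell)                  # finite-difference gradients (rows, columns)
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)   # fold directions into [0, pi)
    bins = np.minimum((angle / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist    # L2 normalization (assumed here)

# A horizontal luminance ramp, and the same ramp with a different gain and offset.
ramp = np.tile(np.arange(8.0), (8, 1))
print(np.allclose(gradient_orientation_histogram(ramp),
                  gradient_orientation_histogram(2.0 * ramp + 10.0)))
```

Scaling or offsetting the luminance changes the gradient magnitudes but not their directions, so the normalized histogram of the ramp is unchanged in this example.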

The image feature amount preferably also includes a histogram of luminance gradient directions in space-time. Considering the luminance gradient direction in time as well as in space makes it easier to detect and track an object across a plurality of captured images acquired in time series.

According to the vehicle periphery monitoring device of the present invention, the discriminator serving as the object identification means creates and inputs the feature data group from an image of at least one sub-region selected, according to the type of object, from among the plurality of sub-regions constituting each learning sample image used for machine learning. Image information from the sub-regions suited to the shape of the object's projected image can therefore be used selectively in the learning process, and learning accuracy can be improved regardless of the type of object. Excluding the remaining sub-regions from the learning process also prevents over-learning of image information other than the projected image of the object, which acts as a disturbance factor during identification processing, and thereby improves the identification accuracy for the object. This is particularly effective because the scene of captured images obtained from vehicle-mounted imaging means changes from moment to moment.

FIG. 1 is a block diagram showing the configuration of the vehicle periphery monitoring device according to the present embodiment.
FIG. 2 is a schematic perspective view of a vehicle on which the vehicle periphery monitoring device shown in FIG. 1 is mounted.
FIGS. 3A and 3B are image diagrams showing examples of captured images acquired by imaging with the camera.
FIG. 4 is a flowchart used to explain the learning process performed by the discriminator shown in FIG. 1.
FIG. 5 is an image diagram showing an example of learning sample images containing crossing pedestrians.
FIG. 6 is a schematic explanatory diagram of the method of defining the sub-regions.
FIG. 7 is a detailed block diagram of the discriminator shown in FIG. 1.
FIG. 8A is a schematic diagram of a typical contour image obtained from a large number of learning sample images containing crossing pedestrians. FIG. 8B is a schematic explanatory diagram showing the non-mask region and the mask region common to the learning sample images containing crossing pedestrians.
FIG. 9A is a schematic diagram of a typical contour image obtained from a large number of learning sample images containing facing pedestrians. FIG. 9B is a schematic explanatory diagram showing the non-mask region and the mask region common to the learning sample images containing facing pedestrians.
FIG. 10 is a flowchart used to explain the operation of the ECU shown in FIG. 1.
FIG. 11 is a schematic explanatory diagram showing the positional relationship among the vehicle, the camera, and a human body.
FIG. 12 is a schematic explanatory diagram of the method of determining the identification target region.
FIG. 13 is a detailed block diagram of the object identification unit shown in FIG. 1.
FIGS. 14A to 14C are schematic explanatory diagrams of the method of calculating the HOG feature amount.
FIG. 15A is a schematic explanatory diagram of the method of calculating the HOG feature amount. FIG. 15B is a schematic explanatory diagram of the method of calculating the STHOG feature amount.

A preferred embodiment of the vehicle periphery monitoring device according to the present invention is described below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of a vehicle periphery monitoring device 10 according to the present embodiment. FIG. 2 is a schematic perspective view of a vehicle 12 on which the vehicle periphery monitoring device 10 shown in FIG. 1 is mounted.

As shown in FIGS. 1 and 2, the vehicle periphery monitoring device 10 includes: a color camera (hereinafter simply "camera 14") that captures a color image composed of a plurality of color channels (hereinafter, captured image Im); a vehicle speed sensor 16 that detects the vehicle speed Vs of the vehicle 12; a yaw rate sensor 18 that detects the yaw rate Yr of the vehicle 12; a brake sensor 20 that detects the amount Br by which the driver operates the brake pedal; an electronic control unit (hereinafter "ECU 22") that controls the vehicle periphery monitoring device 10; a speaker 24 for issuing audible alarms and the like; and a display device 26 that displays the captured images output from the camera 14 and the like.

The camera 14 mainly uses light with wavelengths in the visible region and functions as imaging means for imaging the periphery of the vehicle 12. The camera 14 has the characteristic that the greater the amount of light reflected from the surface of a subject, the higher its output signal level and the greater the luminance (for example, RGB values) of the image. As shown in FIG. 2, the camera 14 is fixedly mounted at approximately the center of the front bumper of the vehicle 12.

The imaging means for imaging the surroundings of the vehicle 12 is not limited to the configuration described above (a so-called monocular camera) and may be, for example, a compound-eye camera (stereo camera). An infrared camera may be used in place of the color camera, or both may be provided. In the case of a monocular camera, separate ranging means (a radar device) may also be provided.

Returning to FIG. 1, the speaker 24 outputs alarm sounds and the like in response to commands from the ECU 22. The speaker 24 is provided on the dashboard (not shown) of the vehicle 12. Instead of the speaker 24, the audio output function of another device (for example, an audio device or a navigation device) may be used.

The display device 26 (see FIGS. 1 and 2) is a HUD (head-up display) arranged on the front windshield of the vehicle 12 at a position that does not obstruct the driver's forward view. The display device 26 is not limited to a HUD; a display that shows the map and other information of a navigation system mounted on the vehicle 12, or a display provided in the meter unit or elsewhere that shows fuel consumption and the like (MID; multi-information display), may also be used.

The ECU 22 basically includes an input/output unit 28, a calculation unit 30, a display control unit 32, and a storage unit 34.

Signals from the camera 14, the vehicle speed sensor 16, the yaw rate sensor 18, and the brake sensor 20 are input to the ECU 22 via the input/output unit 28. Signals from the ECU 22 are output to the speaker 24 and the display device 26 via the input/output unit 28. The input/output unit 28 includes an A/D conversion circuit (not shown) that converts input analog signals into digital signals.

The calculation unit 30 performs calculations based on the signals from the camera 14, the vehicle speed sensor 16, the yaw rate sensor 18, and the brake sensor 20, and generates signals for the speaker 24 and the display device 26 based on the results. The calculation unit 30 functions as a distance estimation unit 40, an identification target region determination unit 42 (identification target region extraction means), an object identification unit 44 (object identification means), and an object detection unit 46. The object identification unit 44 consists of a discriminator 50 generated using machine learning, which takes a feature data group as an image feature amount as input and outputs presence/absence information on the object.

The function of each unit in the calculation unit 30 is realized by reading and executing a program stored in the storage unit 34. Alternatively, the program may be supplied from outside via a wireless communication device (not shown) such as a mobile phone or smartphone.

The display control unit 32 is a control circuit that drives and controls the display device 26. The display device 26 is driven when the display control unit 32 outputs signals used for display control to it via the input/output unit 28. The display device 26 can thereby display various images (the captured image Im, marks, and so on).

The storage unit 34 consists of a RAM (Random Access Memory) that stores imaging signals converted into digital signals, temporary data used in various calculations, and the like, and a ROM (Read Only Memory) that stores execution programs, tables, maps, and the like.

The vehicle periphery monitoring device 10 according to the present embodiment is basically configured as described above. An outline of its operation is given below.

At each predetermined frame clock interval or period (for example, 30 frames per second), the ECU 22 converts the analog video signal output from the camera 14 into a digital signal and temporarily stores it in the storage unit 34.

As shown in FIG. 3A, suppose that the captured image Im of a first frame is obtained at imaging time T = T1. In this example, the captured image Im contains a road region on which the vehicle 12 travels (hereinafter simply "road 60"), a plurality of utility pole regions installed at approximately equal intervals along the road 60 (hereinafter simply "utility poles 62"), and a crossing pedestrian region on the road 60 (hereinafter simply "crossing pedestrian 64").

As shown in FIG. 3B, suppose that the captured image Im of a second frame is obtained at imaging time T = T2 (> T1). As in the captured image Im of FIG. 3A, this captured image Im contains the road 60, the plurality of utility poles 62, and the crossing pedestrian 64. However, as the frame interval elapses, the relative positional relationship between the camera 14 and each object changes from moment to moment. Consequently, even though the first and second frames (captured images Im) cover the same angle-of-view range, each object is imaged in a different form (shape, size, or color).

The ECU 22 then applies various calculations to the captured image Im (an image of the area ahead of the vehicle 12) read from the storage unit 34. The ECU 22 (in particular the calculation unit 30) comprehensively considers the processing results for the captured image Im and, as necessary, the signals indicating the traveling state of the vehicle 12 (vehicle speed Vs, yaw rate Yr, and operation amount Br), and detects human bodies, animals, and the like present ahead of the vehicle 12 as objects to be monitored (hereinafter "monitoring target" or simply "object").

When the calculation unit 30 determines that the vehicle 12 is highly likely to come into contact with a monitoring target, the ECU 22 controls the output units of the vehicle periphery monitoring device 10 to call the driver's attention. For example, the ECU 22 outputs an alarm sound (for example, a repeated beep) through the speaker 24 and highlights the part of the captured image Im shown on the display device 26 that corresponds to the monitoring target.

Next, the learning process performed by the discriminator 50 shown in FIG. 1 is described in detail with reference to the flowchart of FIG. 4. Any of supervised learning, unsupervised learning, and reinforcement learning may be adopted as the machine learning method. The present embodiment describes a specific example of supervised learning, in which learning is performed based on a data set given in advance.

In step S11, learning data used for machine learning is collected. The learning data is a data set pairing learning sample images that contain (or do not contain) an object with the type of that object (including the attribute "no object"). Examples of object types include human bodies, various animals (specifically, mammals such as deer, horses, sheep, dogs, and cats, as well as birds), and artificial structures (specifically, vehicles, signs, utility poles, guardrails, walls, and the like).

FIG. 5 is an image diagram showing an example of learning sample images 74 and 76 containing crossing pedestrians 70 and 72. Approximately at the center of the learning sample image 74 is the projected image of a human body walking across from left to right (hereinafter, crossing pedestrian 70). Approximately at the center of the other learning sample image 76 is the projected image of a human body walking across from the near right toward the far left (hereinafter, crossing pedestrian 72). Both learning sample images 74 and 76 in this example are positive images that contain an object. The collected learning sample images may also include negative images that contain no object. To unify the format of the data input to the discriminator 50 (see FIG. 1), the learning sample images 74 and 76 preferably have image regions 80 whose shapes are identical or similar to each other.

In step S12, a plurality of sub-regions 82 are defined by dividing the image region 80 of each of the learning sample images 74 and 76. In the example of FIG. 6, the rectangular image region 80 is divided evenly into a grid of eight rows and six columns regardless of its size. That is, 48 sub-regions 82 of identical shape are defined within the image region 80.
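The grid division of step S12 can be sketched as follows. This is only an illustration under assumptions: the image region is held as a NumPy array, the sample size of 128 x 96 pixels is hypothetical, and the function name split_into_subregions does not come from the source.

```python
import numpy as np

def split_into_subregions(image, n_rows=8, n_cols=6):
    """Return a dict mapping (row, col) to a view of the corresponding sub-region."""
    height, width = image.shape[:2]
    # Grid boundaries in pixels; the same row/column counts apply to any image size.
    row_edges = np.linspace(0, height, n_rows + 1).astype(int)
    col_edges = np.linspace(0, width, n_cols + 1).astype(int)
    return {
        (r, c): image[row_edges[r]:row_edges[r + 1], col_edges[c]:col_edges[c + 1]]
        for r in range(n_rows)
        for c in range(n_cols)
    }

sample = np.zeros((128, 96), dtype=np.uint8)  # hypothetical learning sample size
subregions = split_into_subregions(sample)
print(len(subregions), subregions[(0, 0)].shape)
```

Because the number of rows and columns is fixed rather than the cell size, image regions of different sizes but similar shape yield the same 48-cell layout.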

In step S13, the learning architecture of the discriminator 50 is constructed. Examples of learning architectures include boosting, the SVM (support vector machine), neural networks, and the EM (expectation maximization) algorithm. The present embodiment applies AdaBoost, which is a type of boosting.

As shown in FIG. 7, the discriminator 50 consists of N feature data generators 90 (N is a natural number of 2 or more), N weak learners 92, a weight updater 93, a weighting calculator 94, and a sample weight updater 95.

In this figure, for convenience of explanation, the feature data generators 90 are labeled from the top as the first data generator, second data generator, third data generator, ..., N-th data generator. Similarly, the weak learners 92 are labeled from the top as the first weak learner, second weak learner, third weak learner, ..., N-th weak learner.

As shown in this figure, each feature data generator 90 (for example, the first data generator) is connected to exactly one weak learner 92 (for example, the first weak learner), forming N subsystems. The output sides of the N subsystems (the weak learners 92) are each connected to the input side of the weight updater 93, and the weighting calculator 94 and the sample weight updater 95 are connected in series to its output side. The detailed operation of the discriminator 50 during machine learning is described later.
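For orientation, the following is a minimal sketch of textbook discrete AdaBoost with threshold-type weak learners, each tied to a single feature column in the same spirit as the one-to-one pairing of a feature data generator and a weak learner. It is not the procedure of the discriminator 50, whose details this passage defers; the function names train_adaboost and predict and the toy data are assumptions made for illustration.

```python
import numpy as np

def train_adaboost(features, labels, n_rounds=10):
    """features: (n_samples, n_features) array; labels: array of +1 / -1.

    Returns a list of (feature_index, threshold, polarity, alpha) tuples.
    """
    n_samples, n_features = features.shape
    sample_weights = np.full(n_samples, 1.0 / n_samples)
    ensemble = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively pick the threshold rule with the lowest weighted error.
        for j in range(n_features):
            for threshold in np.unique(features[:, j]):
                for polarity in (1, -1):
                    pred = np.where(polarity * (features[:, j] - threshold) >= 0, 1, -1)
                    error = sample_weights[pred != labels].sum()
                    if best is None or error < best[0]:
                        best = (error, j, threshold, polarity, pred)
        error, j, threshold, polarity, pred = best
        error = np.clip(error, 1e-10, 1 - 1e-10)       # avoid division by zero
        alpha = 0.5 * np.log((1 - error) / error)       # weight of this weak learner
        ensemble.append((j, threshold, polarity, alpha))
        # Raise the weights of misclassified samples, lower the others.
        sample_weights *= np.exp(-alpha * labels * pred)
        sample_weights /= sample_weights.sum()
    return ensemble

def predict(ensemble, features):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = np.zeros(features.shape[0])
    for j, threshold, polarity, alpha in ensemble:
        score += alpha * np.where(polarity * (features[:, j] - threshold) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)

# Toy data: only the second feature separates the two classes.
X = np.array([[5.0, 0.1], [1.0, 0.2], [4.0, 0.8], [2.0, 0.9]])
y = np.array([-1, -1, 1, 1])
model = train_adaboost(X, y, n_rounds=3)
print(predict(model, X))
```

In this sketch the per-learner weight alpha and the per-sample weights play roles loosely analogous to the weighting and sample weight update stages named above; the correspondence is an interpretation, not something stated in the source.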

In step S14, a mask condition for the sub-regions 82 is determined for each type of object. The mask condition is the selection condition specifying whether the image in each sub-region 82 is used when generating N pieces of feature data (hereinafter, feature data group) from one learning sample image 74.

For example, when captured images Im actually obtained with the camera 14 mounted on the vehicle 12 are used as the learning sample images 74 and 76 (positive images), various structures may appear behind the object. In the learning sample image 74 shown in the example of FIG. 5, the background portion 78 excluding the crossing pedestrian 70 shows three other human bodies, the road surface, building walls, and so on.

If, during machine learning, the feature data group is created from the background portion 78 as well as the crossing pedestrian 70, over-learning of the background portion 78 may occur depending on biases in the database that stores the learning sample images 74 and the like. In that case, the image information of the background portion 78 acts as a disturbance factor (noise information) that lowers the learning and identification accuracy for the crossing pedestrian 70 as the object. It is therefore effective to select, from all 48 sub-regions 82, only those sub-regions 82 suited to identification processing, and to train using the feature data group created from them.
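How a mask condition might restrict the feature data group to selected sub-regions can be sketched as follows. The boolean grid keep, the helper name masked_feature_group, and the use of the mean luminance as a stand-in feature are all assumptions for illustration; the source does not fix the feature computed per sub-region at this point.

```python
import numpy as np

def masked_feature_group(image, keep, feature_fn):
    """Concatenate feature_fn outputs over the sub-regions where keep is True.

    keep is a boolean array of shape (n_rows, n_cols); False cells are masked out.
    """
    n_rows, n_cols = keep.shape
    height, width = image.shape[:2]
    row_edges = np.linspace(0, height, n_rows + 1).astype(int)
    col_edges = np.linspace(0, width, n_cols + 1).astype(int)
    parts = []
    for r in range(n_rows):
        for c in range(n_cols):
            if keep[r, c]:
                block = image[row_edges[r]:row_edges[r + 1], col_edges[c]:col_edges[c + 1]]
                parts.append(np.atleast_1d(feature_fn(block)))
    return np.concatenate(parts) if parts else np.empty(0)

image = np.arange(16 * 12, dtype=float).reshape(16, 12)  # hypothetical sample
keep = np.zeros((8, 6), dtype=bool)
keep[0, 0] = keep[7, 5] = True                           # two non-masked sub-regions
print(masked_feature_group(image, keep, np.mean))
```

Masked sub-regions never reach the feature function, so background content inside them cannot influence training or identification.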

FIG. 8A is a schematic diagram of a typical contour image 100 obtained from a large number of learning sample images 74 and 76 containing crossing pedestrians 70 and 72. Specifically, an image processing device (not shown) applies a known edge extraction process to each learning sample image 74 and the like, creating for each one a contour extraction image (not shown) in which the contour of each object has been extracted. Each contour extraction image renders the parts where a contour was extracted in white and the parts where no contour was extracted in black.

Then, the image processing device (not shown) applies statistical processing (for example, averaging) to the contour extraction images to obtain a contour image that is typical of the learning sample images 74 (the typical contour image 100). Like each contour extraction image, the typical contour image 100 renders the parts where a contour was extracted in white and the parts where no contour was extracted in black. In other words, the typical contour image 100 corresponds to an image representing the contour of the object (crossing pedestrians 70 and 72) that the many learning sample images 74 and the like have in common.

For example, a sub-region 82 whose contour feature amount exceeds a predetermined threshold is adopted as a calculation target, and a sub-region 82 whose contour feature amount falls below the threshold is excluded from the calculation. Suppose that, as a result, as shown in FIG. 8B, of the 48 sub-regions 82 defined in the image region 80, the set of 20 sub-regions 82 (the regions showing partial images of the typical contour image 100) is determined as the non-mask region 102. The set of the remaining 28 sub-regions 82 (the regions filled in white) is then determined as the mask region 104.
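The selection of non-mask sub-regions can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the 8 x 6 grid (the text only states 48 sub-regions), the use of the block mean as the contour feature amount, the threshold value, and the function name `select_non_mask_blocks` are all assumptions.

```python
import numpy as np

def select_non_mask_blocks(edge_images, grid=(8, 6), threshold=0.2):
    """Return a boolean grid in which True marks a non-mask (adopted) sub-region.

    edge_images: list of equally sized binary contour images (contour = 1).
    grid, threshold: hypothetical values chosen only for illustration.
    """
    # Typical contour image: pixel-wise average of the contour extraction images.
    typical = np.mean(np.stack(edge_images).astype(float), axis=0)
    rows, cols = grid
    bh, bw = typical.shape[0] // rows, typical.shape[1] // cols
    # Split the image into rows x cols blocks and average inside each block.
    blocks = typical[:rows * bh, :cols * bw].reshape(rows, bh, cols, bw)
    contour_feature = blocks.mean(axis=(1, 3))
    # Adopt sub-regions above the threshold; the rest form the mask region.
    return contour_feature > threshold
```

A separate boolean grid would be computed for each object type or moving direction, for example one for crossing pedestrians and one for facing pedestrians.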

FIG. 9A is a schematic diagram of a typical contour image 106 obtained from a large number of learning sample images containing facing pedestrians. The typical contour image 106 is created in the same way as the typical contour image 100 of FIG. 8A, so its description is omitted.

FIG. 9B is a schematic explanatory diagram of the non-mask region 108 and the mask region 110 that are common to the learning sample images containing facing pedestrians. The mask region 110 is determined in the same way as the mask region 104 of FIG. 8B, so its description is omitted.

As can be seen from FIGS. 8B and 9B, the mask regions 104 and 110 differ even though the object is the same (a pedestrian). More specifically, they differ in whether the four hatched sub-regions 82 (see FIG. 9B) are masked. This is because the degree to which the image shape changes with the walking motion (the swinging of arms and legs) depends on the moving direction of the pedestrian. Given this tendency of the image shape to change with the moving direction, learning may be performed separately for each moving direction of the object. The moving direction may be any of a crossing direction (more specifically, rightward or leftward), a facing direction (more specifically, toward the near side or toward the far side), and an oblique direction with respect to the image plane.

In step S15, machine learning is performed by sequentially inputting the large amount of learning data collected in step S11 to the discriminator 50. This is described in detail below with reference to FIG. 7.

First, from the collected learning data, the discriminator 50 inputs a learning sample image 74 containing the crossing pedestrian 70 to each feature data generator 90. Each feature data generator 90 then creates its feature data (collectively, the feature data group) by applying a specific calculation to the learning sample image 74 in accordance with the mask condition determined in step S14. As a concrete masking method, the calculation may simply skip the values of all pixels belonging to the mask region 104 (see FIG. 8B), or the values of those pixels may be replaced with a predetermined value (for example, 0) before the calculation, which effectively invalidates the corresponding feature data.
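The second masking method, replacing the pixels of the mask region with a predetermined value, might look like the sketch below. The function name `apply_mask` and the representation of the mask condition as a boolean grid (True = non-mask sub-region) are assumptions made for illustration.

```python
import numpy as np

def apply_mask(image, non_mask, fill_value=0):
    """Return a copy of image with every masked sub-region set to fill_value.

    non_mask: boolean grid, True where the sub-region is adopted (hypothetical layout).
    """
    rows, cols = non_mask.shape
    bh, bw = image.shape[0] // rows, image.shape[1] // cols
    out = image.copy()  # leave the learning sample image itself untouched
    for r in range(rows):
        for c in range(cols):
            if not non_mask[r, c]:
                out[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw] = fill_value
    return out
```

The first method, skipping the masked pixels altogether, would instead drop the feature data of the masked sub-regions before they reach the weak learners.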

Each weak learner 92 (the i-th weak learner; 1 ≤ i ≤ N) then obtains an output result (the i-th output result) by applying a predetermined calculation to the feature data (the i-th feature data) acquired from the corresponding feature data generator 90 (the i-th data generator). The weight updater 93 receives the first to N-th output results from the weak learners 92, together with object information 96, which is the presence/absence information of the object in one piece of collected learning data. Here, the object information 96 indicates that the learning sample image 74 contains the crossing pedestrian 70.

The weight updater 93 then selects the one weak learner 92 whose output result has the smallest error relative to the output value corresponding to the object information 96, and determines an update amount Δα so that the weighting coefficient α of that weak learner increases. The weighting calculator 94 updates the weighting coefficient α by adding the update amount Δα supplied from the weight updater 93. At the same time, the sample load updater 95 updates the load assigned in advance to each learning sample image 74 and the like (hereinafter, the sample load 97) based on the updated weighting coefficient α and other values.

In this way, the input of learning data, the update of the weighting coefficient α, and the update of the sample load 97 are repeated in sequence, and the discriminator 50 is trained until a convergence condition is satisfied (step S15).
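The loop described above resembles boosting. The sketch below uses the standard discrete AdaBoost update as a stand-in, which is an assumption: the text does not give the formula for Δα or for the sample load update. The fixed-threshold stumps, the fixed number of rounds in place of a convergence condition, and the name `train_boosted` are likewise hypothetical.

```python
import numpy as np

def train_boosted(features, labels, rounds=10):
    """features: (samples, N) array, one column per weak learner.
    labels: +1 if the object is present, -1 otherwise.
    Returns the stump thresholds and the weighting coefficients alpha."""
    n_samples, n_weak = features.shape
    thresholds = features.mean(axis=0)                  # one fixed stump per feature (assumption)
    outputs = np.where(features > thresholds, 1, -1)    # weak learner outputs
    sample_load = np.full(n_samples, 1.0 / n_samples)   # initial sample loads
    alpha = np.zeros(n_weak)
    for _ in range(rounds):
        # Load-weighted error of every weak learner against the object information.
        wrong = outputs != labels[:, None]
        errors = (sample_load[:, None] * wrong).sum(axis=0)
        best = int(np.argmin(errors))                   # weak learner with the smallest error
        err = np.clip(errors[best], 1e-10, 1 - 1e-10)
        if err >= 0.5:
            break                                       # no weak learner beats chance
        delta = 0.5 * np.log((1 - err) / err)           # update amount (AdaBoost rule)
        alpha[best] += delta
        # Raise the load of misclassified samples, lower the others, renormalize.
        sample_load *= np.exp(-delta * labels * outputs[:, best])
        sample_load /= sample_load.sum()
    return thresholds, alpha
```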

By mounting the discriminator 50 trained as described above in the vehicle periphery monitoring device 10 (see FIG. 1), an object identification unit 44 is constructed that can identify whether at least one type of object exists in the identification target area of the captured image Im.

Next, the detailed operation of the vehicle periphery monitoring device 10 is described with reference to the flowchart of FIG. 10. This process is executed for each imaging frame while the vehicle 12 is traveling.

In step S21, the ECU 22 acquires, for each frame, a captured image Im, which is the output signal of the camera 14 imaging the area ahead of the vehicle 12 (a predetermined angle-of-view range). Suppose, for example, that a captured image Im of one frame is obtained at imaging time T = T1, as shown in FIG. 3A. The ECU 22 temporarily stores the acquired captured image Im in the storage unit 34. If, for example, an RGB camera is used as the camera 14, the captured image Im is a multi-gradation image consisting of three color channels.

In step S22, the distance estimation unit 40 estimates the distances Dis from the vehicle 12 by calculating the elevation/depression angle of the vehicle 12 from the captured image Im acquired in step S21.

FIG. 11 is a schematic explanatory diagram of the positional relationship among the vehicle 12, the camera 14, and a human body M. Assume that the vehicle 12 carrying the camera 14 and the human body M as the object are on a flat road surface S. Let Pc be the contact point between the human body M and the road surface S, L1 the optical axis of the camera 14, and L2 the straight line connecting the optical center C of the camera 14 and the contact point Pc.

Suppose, for example, that the angle that the optical axis L1 of the camera 14 makes with the road surface S (the elevation/depression angle) is β, the angle that the straight line L2 makes with the straight line L1 is γ, and the height of the camera 14 above the road surface S is Hc. From geometric considerations, the distance Dis is then calculated from the angles β and γ and the height Hc by the following equation (1).

$$\mathrm{Dis} = \frac{\mathrm{Hc}}{\tan(\beta + \gamma)} \tag{1}$$
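Equation (1) can be evaluated directly, as in the sketch below. The helper `gamma_from_row`, which derives γ from an image row with a pinhole camera model, is an assumption added for illustration, as are the parameter names `cy` (row of the optical axis) and `focal_px` (focal length in pixels).

```python
import math

def estimate_distance(camera_height, beta, gamma):
    """Equation (1): Dis = Hc / tan(beta + gamma), angles in radians."""
    angle = beta + gamma
    if angle <= 0:
        # The line L2 would not meet a flat road surface ahead of the camera.
        raise ValueError("beta + gamma must be positive")
    return camera_height / math.tan(angle)

def gamma_from_row(v, cy, focal_px):
    """Angle between L2 and the optical axis for image row v (pinhole model, assumption)."""
    return math.atan((v - cy) / focal_px)
```

For instance, with Hc = 1.2 m, β = 0, and a contact point 120 rows below the optical axis at a focal length of 1000 pixels, the estimate is about 10 m.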

In this way, the distance estimation unit 40 can estimate the distance Dis corresponding to each position on the road surface S (the road 60 in FIG. 3A) in the captured image Im. Alternatively, the distance estimation unit 40 may estimate the distance Dis using a known technique such as SfM (Structure from Motion), taking into account the change in attitude between the road surface S and the camera 14 that accompanies the motion of the vehicle 12. If the vehicle periphery monitoring device 10 has a ranging sensor, the distance Dis may instead be measured with that sensor.

In step S23, the identification target area determination unit 42 determines the size and other attributes of the identification target area 122, which is the image area to be identified. In this embodiment, the identification target area determination unit 42 determines them according to the distance Dis estimated in step S22 and/or the vehicle speed Vs acquired from the vehicle speed sensor 16. A specific example is described with reference to FIG. 12.

As shown in FIG. 12, the position on the captured image Im corresponding to the contact point Pc (see FIG. 11) between the road surface S (road 60) and the human body M (crossing pedestrian 64) is taken as a reference position 120. A rectangular identification target area 122 is then set so as to contain the whole of the crossing pedestrian 64.

The identification target area determination unit 42 determines the size of the identification target area 122 so that it is proportional to the distance Dis from the camera 14 (the optical center C in FIG. 11) to the object. If a crossing pedestrian 64f, drawn with a broken line, is assumed to be at a reference position 124 on the road surface S (road 60), an identification target area 126 geometrically similar to the identification target area 122 is set. As a result, the relative size relationship between the crossing pedestrian 64 (64f) and the identification target area 122 (126) is constant regardless of the distance Dis, and because the influence of disturbance factors (image information other than the projected image of the object) is suppressed uniformly, the learning and identification accuracy for the object improves further.

The identification target area determination unit 42 also determines a designated area 128, which is the range covered by the raster scan described later. For example, Dis1, a distance at which the object is reliably detected within the normal operating range, may be set as the lower limit, and Dis2, a distance at which there is no risk of an immediate collision with the object within the normal operating range, may be set as the upper limit. Omitting the scan in part of the captured image Im in this way not only reduces the computation amount and computation time of the identification process but also eliminates the false detections that could otherwise occur outside the designated area 128.

The identification target area determination unit 42 may change the shape of the identification target area 122 as appropriate for the type of object. In that case too, as in this embodiment, the size may be set in proportion to the distance Dis.

Even for the same distance Dis, the identification target area determination unit 42 may change the size of the identification target area 122 according to the height of the object. This allows an appropriate size to be set for each object height and, by uniformly suppressing the influence of disturbance factors, improves the learning and identification accuracy for the object still further.

In step S24, the calculation unit 30 starts a raster scan of the captured image Im within the designated area 128 determined in step S23. Here, a raster scan is a technique of successively identifying the presence or absence of an object while moving the reference position 120 (a pixel in the captured image Im) in a predetermined direction. From this point on, the identification target area determination unit 42 successively determines the reference position 120 currently being scanned and the position and size of the identification target area 122 specified from that reference position.

In step S25, the object identification unit 44 identifies whether at least one type of object exists in the determined identification target area 122.

As shown in FIG. 13, the object identification unit 44 is the discriminator 50 (see FIG. 7) generated using machine learning. An appropriate weighting coefficient αf obtained by machine learning (step S15 in FIG. 4) is set in the weighting calculator 94 in advance.

The object identification unit 44 inputs an evaluation image 130, whose image region 80 contains the identification target area 122, to each feature data generator 90. To raise the identification accuracy of the object identification unit 44, necessary image processing such as normalization (gradation processing and scaling) and alignment may be applied to the image of the identification target area 122 as appropriate. The object identification unit 44 then processes the evaluation image 130 in sequence through the feature data generators 90, the weak learners 92, the weighting calculator 94, and an integrated learner 98 that applies a step function to the weighted output result obtained from the weighting calculator 94, and outputs an identification result indicating that the crossing pedestrian 64 exists in the identification target area 122. In this case, the object identification unit 44 functions as a strong discriminator with high identification performance by combining N weak discriminators (the weak learners 92).
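The combination of weak outputs into a strong decision can be illustrated as a weighted sum followed by a step function. This is a sketch under assumptions: weak outputs are taken to be +1 or -1, the step threshold is taken to be 0, and the name `strong_classify` is hypothetical.

```python
import numpy as np

def strong_classify(weak_outputs, alpha, threshold=0.0):
    """Return True if the object is judged present.

    weak_outputs: outputs of the N weak learners (+1 or -1, assumption).
    alpha: the N weighting coefficients set in advance by machine learning.
    """
    weighted = float(np.dot(alpha, weak_outputs))  # weighted output result
    return weighted > threshold                    # step function
```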

Each feature data generator 90 calculates the image feature amount of each sub-region 82 (that is, the feature data group described above) using the same calculation method as in the learning process (see FIG. 7).

Various known methods can be used to calculate the image feature amount. The following describes the HOG (Histograms of Oriented Gradients) feature amount, which represents the intensity and gradient of luminance in local regions of an image.

Suppose that, as shown in FIG. 14A, one block, which is the unit for which a histogram is created, is selected from the image region 80. For ease of understanding, each block is defined below to correspond to one sub-region 82. Assume, for example, that a sub-region 82 as a block consists of 36 pixels 84 in total, 6 pixels high by 6 pixels wide.

As shown in FIG. 14B, a two-dimensional luminance gradient (Ix, Iy) is calculated for each pixel 84 constituting the block. The gradient intensity I and the spatial luminance gradient angle θ are then calculated according to the following equations (2) and (3).

$$I = \left(I_x^{2} + I_y^{2}\right)^{1/2} \tag{2}$$

$$\theta = \tan^{-1}\left(\frac{I_y}{I_x}\right) \tag{3}$$

The arrows drawn in the cells of the first row illustrate the direction of the two-dimensional luminance gradient. In practice, the gradient intensity I and the spatial luminance gradient angle θ are calculated for all pixels 84, but the arrows for the second and subsequent rows are omitted from the figure.

As shown in FIG. 14C, a histogram over the spatial luminance gradient angle θ is created for each block. The horizontal axis of the histogram is the spatial luminance gradient angle θ (eight bins in this example), and the vertical axis is the gradient intensity I. In this case, the histogram of each block is created based on the gradient intensity I given by equation (2).

As shown in FIG. 15A, the HOG feature amount of the evaluation image 130 is obtained by concatenating the histograms of the blocks (histograms of the spatial luminance gradient angle θ in the example of FIG. 14C) in a predetermined order, for example ascending order.
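The HOG computation described above can be sketched with NumPy as follows. Several details are assumptions: gradients are taken with `np.gradient`, the angle is folded into [0, π), which agrees with equation (3) up to a multiple of π, no block normalization is applied, and the name `hog_feature` is hypothetical.

```python
import numpy as np

def hog_feature(image, block=6, bins=8):
    """Concatenate per-block histograms of gradient angle, weighted by gradient intensity."""
    img = image.astype(float)
    iy, ix = np.gradient(img)                      # luminance gradient along rows and columns
    intensity = np.hypot(ix, iy)                   # equation (2)
    theta = np.arctan2(iy, ix) % np.pi             # equation (3), folded into [0, pi)
    bin_idx = np.minimum((theta / np.pi * bins).astype(int), bins - 1)
    rows, cols = img.shape[0] // block, img.shape[1] // block
    histograms = []
    for r in range(rows):                          # blocks are visited in row-major order
        for c in range(cols):
            sl = (slice(r * block, (r + 1) * block), slice(c * block, (c + 1) * block))
            histograms.append(np.bincount(bin_idx[sl].ravel(),
                                          weights=intensity[sl].ravel(),
                                          minlength=bins))
    return np.concatenate(histograms)
```

With 48 blocks and 8 bins the resulting vector has 384 elements; the histograms of masked sub-regions would be skipped or zeroed according to the mask condition.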

Thus, the image feature amount may include a histogram of luminance gradient directions in space (the HOG feature amount). Because the luminance gradient direction (θ) varies little with the exposure of the image, the features of the object can be captured accurately, and stable identification accuracy is obtained even in outdoor environments where the intensity of ambient light changes from moment to moment.

As the image feature amount, the STHOG (Spatio-Temporal Histograms of Oriented Gradients) feature amount, which extends the definition of the HOG feature amount, may also be used. In this case, a three-dimensional luminance gradient (Ix, Iy, It) is calculated for each pixel 84 constituting the block, in the same way as in FIG. 14B, using a plurality of captured images Im acquired in time series. The gradient intensity I and the temporal luminance gradient angle φ are then calculated according to the following equations (4) and (5).

$$I = \left(I_x^{2} + I_y^{2} + I_t^{2}\right)^{1/2} \tag{4}$$

$$\phi = \tan^{-1}\left\{\frac{I_t}{\left(I_x^{2} + I_y^{2}\right)^{1/2}}\right\} \tag{5}$$

As shown in FIG. 15B, concatenating a histogram of the temporal luminance gradient angle φ to the HOG feature amount yields the STHOG feature amount, a luminance gradient histogram in space-time. Considering the luminance gradient direction in time (φ) as well as in space (θ) in this way makes it easier to detect and track an object across a plurality of captured images Im acquired in time series.

For the STHOG feature amount, as for the HOG feature amount, the histogram of each block is created based on the gradient intensity I given by equation (2).
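The temporal part of the STHOG feature amount for one block can be sketched as below. Note the assumptions: the histogram is weighted here by the three-dimensional intensity of equation (4), whereas the text refers to equation (2) (substituting `np.hypot(ix, iy)` for the weight gives that variant); the eight bins over [-π/2, π/2] and the name `temporal_angle_histogram` are also hypothetical.

```python
import numpy as np

def temporal_angle_histogram(frames, bins=8):
    """Histogram of the temporal gradient angle phi for one block.

    frames: array of shape (T, H, W) with T >= 2 consecutive images of the block.
    """
    volume = np.asarray(frames, dtype=float)
    it, iy, ix = np.gradient(volume)                    # gradients along time, rows, columns
    intensity = np.sqrt(ix ** 2 + iy ** 2 + it ** 2)    # equation (4)
    phi = np.arctan2(it, np.hypot(ix, iy))              # equation (5), in [-pi/2, pi/2]
    bin_idx = np.minimum(((phi + np.pi / 2) / np.pi * bins).astype(int), bins - 1)
    return np.bincount(bin_idx.ravel(), weights=intensity.ravel(), minlength=bins)
```

Appending this histogram to the spatial histogram of the same block gives that block's contribution to the STHOG feature amount.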

In this way, the object identification unit 44 identifies whether at least one type of object exists in the determined identification target area 122 (step S25). This identifies not only the type of object, including the human body M, but also its moving direction, including the crossing direction and the facing direction.

In step S26, the identification target area determination unit 42 determines whether all scanning within the designated area 128 has been completed. If it determines that scanning is not complete (step S26: NO), the process proceeds to the next step (S27).

In step S27, the identification target area determination unit 42 changes the position or size of the identification target area 122. Specifically, it moves the reference position 120 being scanned by a predetermined amount (for example, one pixel) in a predetermined direction (for example, rightward). If the distance Dis changes, the size of the identification target area 122 is changed as well. Furthermore, because typical values of body height and body width differ with the type of object, the identification target area determination unit 42 may change the size of the identification target area 122 according to the type of object.

The process then returns to step S25, and the calculation unit 30 repeats steps S25 to S27 in sequence until all scanning within the designated area 128 is complete. When scanning is determined to be complete (step S26: YES), the calculation unit 30 ends the raster scan of the captured image Im (step S28).
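The loop of steps S24 to S28 can be sketched as a plain raster scan. The callbacks `window_for` (size of the identification target area for a given image row, which depends on the distance Dis) and `classify` (the identification of step S25) are hypothetical placeholders supplied by the caller, and the row and column ranges stand in for the designated area 128.

```python
def raster_scan(row_range, col_range, window_for, classify, step=1):
    """Visit every reference position in the designated area and collect detections.

    Returns a list of (row, col, height, width) tuples judged to contain an object.
    """
    detections = []
    for v in range(row_range[0], row_range[1], step):      # rows inside the designated area
        height, width = window_for(v)                       # area size for this row (step S27)
        for u in range(col_range[0], col_range[1], step):   # move the reference position rightward
            if classify(v, u, height, width):                # step S25
                detections.append((v, u, height, width))
    return detections                                        # scan complete (steps S26, S28)
```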

In step S29, the object detection unit 46 detects objects present in the captured image Im. The identification result of a single frame may be used, or the identification results of a plurality of frames may be considered together, in which case a motion vector can be calculated for the same object.

In step S30, the ECU 22 stores the data needed for the next calculation in the storage unit 34. Examples include the distance Dis obtained in step S22, the attributes of the object obtained in step S25 (such as the crossing pedestrian 64 in FIG. 3A), and the reference position 120.

By executing this operation successively, the vehicle periphery monitoring device 10 can monitor objects ahead of the vehicle 12 (for example, the human body M in FIG. 11) at predetermined time intervals.

As described above, the vehicle periphery monitoring device 10 according to the present invention comprises: a camera 14 that is mounted on the vehicle 12 and acquires captured images Im of the periphery of the vehicle 12 by imaging while the vehicle 12 is traveling; an identification target area determination unit 42 that extracts identification target areas 122 and 126 from the acquired captured image Im; and an object identification unit 44 that identifies, for each type of object, whether an object (for example, the crossing pedestrian 64) exists in the identification target areas 122 and 126 from the image feature amount in the extracted identification target areas 122 and 126. The object identification unit 44 is a discriminator 50 generated using machine learning that takes a feature data group as the image feature amount as input and outputs presence/absence information of the object.

The discriminator 50, as the object identification means, creates the feature data group from the image of at least one sub-region 82 (non-mask regions 102, 108) selected, according to the type of object, from the plurality of sub-regions 82 constituting each learning sample image 74, 76 used for machine learning, and takes that feature data group as input. The image information of the sub-regions 82 suited to the shape of the projected image of the object can therefore be used selectively in the learning process, which improves learning accuracy regardless of the type of object. Excluding the remaining sub-regions 82 (mask regions 104, 110) from the learning process prevents overfitting to image information other than the projected image of the object, which acts as a disturbance factor during identification, and thus improves the identification accuracy for the object. This is particularly effective because the scene (background, weather, road surface pattern, and so on) of the captured image Im obtained from the vehicle-mounted camera 14 changes from moment to moment.

The present invention is not limited to the embodiment described above and can of course be freely modified without departing from the gist of the invention.

In this embodiment, the identification process described above is performed on captured images Im obtained with a monocular camera (the camera 14), but it goes without saying that the same effects are obtained with a compound-eye camera (stereo camera).

In this embodiment, the learning process and the identification process of the object identification unit 44 (discriminator 50) are executed separately, but the two processes may be arranged so that they can be executed in parallel.

DESCRIPTION OF SYMBOLS
10 ... vehicle periphery monitoring device
12 ... vehicle
14 ... camera
22 ... ECU
30 ... calculation unit
34 ... storage unit
40 ... distance estimation unit
42 ... identification target area determination unit
44 ... object identification unit
46 ... object detection unit
50 ... discriminator
64, 70, 72 ... crossing pedestrian
74, 76 ... learning sample image
78 ... background portion
80 ... image region
82 ... sub-region
90 ... feature data generator
92 ... weak learner
94 ... weighting calculator
96 ... object information
98 ... integrated learner
100, 106 ... typical contour image
102, 108 ... non-mask region
104, 110 ... mask region
120, 124 ... reference position
122, 126 ... identification target area
128 ... designated area
130 ... evaluation image

Claims

A vehicle periphery monitoring device comprising:
imaging means mounted on a vehicle for acquiring a captured image of the periphery of the vehicle by capturing images while the vehicle is traveling;
identification target area extracting means for extracting an identification target area from the captured image acquired by the imaging means; and
object identification means for identifying, for each type of object, whether or not an object exists in the identification target area from an image feature amount in the identification target area extracted by the identification target area extracting means,
wherein the object identification means is a classifier generated using machine learning, into which a feature data group as the image feature amount is input and from which presence/absence information on the object is output, and
the classifier creates and inputs the feature data group from images of at least one sub-region selected, according to the type of the object, from among a plurality of sub-regions constituting each learning sample image used for the machine learning.

The vehicle periphery monitoring device according to claim 1,
wherein the object identification means identifies, for each moving direction of the object, whether or not the object exists in the identification target area.

The vehicle periphery monitoring device according to claim 1 or 2,
wherein the identification target area extracting means extracts the identification target area having a size proportional to the distance from the imaging means to the object.

The vehicle periphery monitoring device according to any one of claims 1 to 3,
wherein the image feature amount includes a histogram of luminance gradient directions in space.

The vehicle periphery monitoring device according to any one of claims 1 to 3,
wherein the image feature amount includes a histogram of luminance gradient directions in space-time.
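
As an informal illustration only, and not part of the claims, the following Python sketch shows one way a feature data group could be assembled from selected sub-regions using a spatial histogram of luminance gradient directions. The function names (`cell_histogram`, `masked_features`), the square sub-region layout, the 9-bin unsigned histogram, and the Boolean mask representation are all assumptions made for this sketch; they are not taken from the specification, and block normalization and the classifier itself are omitted.

```python
import math


def cell_histogram(img, top, left, size, bins=9):
    """Histogram of unsigned luminance gradient directions for one square
    sub-region, weighted by gradient magnitude (hypothetical helper)."""
    hist = [0.0] * bins
    h, w = len(img), len(img[0])
    for y in range(top, top + size):
        for x in range(left, left + size):
            # Central differences, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat pixel: no direction to vote for
            angle = math.atan2(gy, gx) % math.pi  # unsigned direction
            b = min(int(angle / math.pi * bins), bins - 1)
            hist[b] += mag
    return hist


def masked_features(img, mask, size, bins=9):
    """Concatenate histograms of the sub-regions whose mask entry is True
    (the non-mask regions); masked sub-regions contribute nothing."""
    feats = []
    for r, row in enumerate(mask):
        for c, keep in enumerate(row):
            if keep:
                feats.extend(cell_histogram(img, r * size, c * size, size, bins))
    return feats
```

In this sketch a different `mask` would be supplied for each object type, so that the same image yields a different feature data group depending on which sub-regions are selected.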