JP2018088234A

JP2018088234A - Information processing device, imaging device, apparatus control system, movable body, information processing method, and program

Info

Publication number: JP2018088234A
Application number: JP2017171533A
Authority: JP
Inventors: 元気渡邊; Genki Watanabe
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2016-11-22
Filing date: 2017-09-06
Publication date: 2018-06-07
Anticipated expiration: 2037-09-06
Also published as: JP6972798B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processing device that can continue accurate tracking. SOLUTION: An information processing device (apparatus control system 1) comprises: an acquisition part that acquires information in which the vertical position, horizontal position, and depth position of an object are associated; a judging part 142 that judges the type of the object on the basis of the information acquired by the acquisition part; a determination part 143 that determines the position of the object by a method according to the type of the object judged by the judging part; and a tracking part 144 that tracks the object on the basis of the position of the object determined by the determination part. SELECTED DRAWING: Figure 4

Description

The present invention relates to an information processing device, an imaging device, a device control system, a moving body, an information processing method, and a program.

Conventionally, automobile safety development has focused on body structures and the like, from the viewpoint of how to protect pedestrians and occupants in a collision with a pedestrian or another automobile. In recent years, however, advances in information processing and image processing have produced techniques for detecting people, automobiles, and other objects at high speed. Automobiles that apply these techniques to brake automatically before a collision, and so prevent it, are already on the market.

To brake automatically, the distance to an object such as a person or another vehicle must be measured. For this purpose, measurement using stereo camera images has been put into practical use.

In measurement using stereo camera images, a technique is known in which an object such as a vehicle ahead of the host vehicle is detected in the parallax image of one frame and then tracked in the parallax images of subsequent frames (see, for example, Patent Document 1).

However, in the prior art, when a person makes a sudden movement such as spreading their arms, the difference between the position of the object (the person) detected from the previous frame and the position of the same person, now with arms spread, detected from the current frame becomes relatively large. In this case, the two detections may no longer be tracked as the same object.

Therefore, an object of the invention is to provide a technique capable of continuing highly accurate tracking.

An information processing device includes: an acquisition unit that acquires information in which the vertical position, the horizontal position, and the depth position of an object are associated with one another; a judging unit that judges the type of the object based on the information acquired by the acquisition unit; a determination unit that determines the position of the object by a method corresponding to the type of the object judged by the judging unit; and a tracking unit that tracks the object based on the position of the object determined by the determination unit.

According to the disclosed technique, highly accurate tracking can be continued.

A diagram showing the configuration of the device control system according to the embodiment.
A diagram showing the configuration of the imaging unit and the image analysis unit according to the embodiment.
A diagram for explaining the principle of calculating a distance from a parallax value by triangulation.
A diagram showing an example of a functional block diagram of the device control system.
A diagram for explaining parallax image data and the V map generated from that parallax image data.
A diagram showing an example of a captured image, used as a reference image, taken by one imaging unit, and the V map corresponding to that captured image.
A diagram showing an image example that schematically represents an example of the reference image.
A diagram showing the U maps corresponding to the image example.
A diagram showing the real U map corresponding to the U map.
A diagram for explaining a method of obtaining the horizontal-axis value of the real U map from the horizontal-axis value of the U map.
A flowchart showing an example of isolated region detection processing.
A diagram showing a real frequency U map in which a rectangular region inscribing the isolated region detected by the isolated region detection unit is set.
A diagram showing a parallax image in which a scanning range corresponding to the rectangular region is set.
A diagram showing a parallax image in which an object region is set by searching the scanning range.
A flowchart showing the flow of processing performed by the corresponding-region detection unit and the object region extraction unit for the parallax image.
A diagram showing an example of table data for classifying object types.
A diagram showing an example of three-dimensional position determination processing.
A diagram explaining a method of calculating the position of an object such as a vehicle.
A diagram explaining a method of calculating the position of an object that is a pedestrian, a motorcycle, or a bicycle.

Hereinafter, a device control system having the image processing apparatus according to the embodiment will be described.

<Configuration of device control system>
FIG. 1 is a diagram illustrating the configuration of the device control system according to the embodiment.

The device control system 1 is mounted on a host vehicle 100, such as an automobile, that is a moving body, and includes an imaging unit 101, an image analysis unit 102, a display monitor 103, and a vehicle travel control unit 104. From a plurality of captured image data (frames) of the region ahead in the traveling direction of the host vehicle (the imaging region), captured by the imaging unit 101, the system detects and tracks objects ahead of the host vehicle, and uses the tracking result to control the moving body and various in-vehicle devices. Control of the moving body includes, for example, issuing a warning, controlling the steering wheel of the host vehicle 100 (own moving body), or braking the host vehicle 100 (own moving body).

The imaging unit 101 is installed, for example, near the rear-view mirror (not shown) on the windshield 105 of the host vehicle 100. Various data, such as the captured image data obtained by the imaging unit 101, are input to the image analysis unit 102, which serves as image processing means.

The image analysis unit 102 analyzes the data transmitted from the imaging unit 101 and detects the relative height (position information) of each point on the road surface ahead of the host vehicle with respect to the road surface portion on which the host vehicle 100 is traveling (the road surface portion directly below the host vehicle), thereby grasping the three-dimensional shape of the road surface ahead. It also recognizes recognition targets such as other vehicles, pedestrians, and various obstacles ahead of the host vehicle.

The analysis result of the image analysis unit 102 is sent to the display monitor 103 and the vehicle travel control unit 104. The display monitor 103 displays the captured image data obtained by the imaging unit 101 and the analysis result. The display monitor 103 may be omitted. Based on the result of the image analysis unit 102 recognizing recognition targets such as other vehicles, pedestrians, and various obstacles ahead of the host vehicle, the vehicle travel control unit 104 performs driving support control, for example warning the driver of the host vehicle 100 or controlling the steering wheel and brakes of the host vehicle.

<Configuration of Imaging Unit 101 and Image Analysis Unit 102>
FIG. 2 is a diagram illustrating the configurations of the imaging unit 101 and the image analysis unit 102 according to the embodiment.

The imaging unit 101 is a stereo camera including two imaging units 110a and 110b as imaging means, and the two imaging units 110a and 110b are identical. Each of the imaging units 110a and 110b includes an imaging lens 111a, 111b; a sensor substrate 114a, 114b containing an image sensor 113a, 113b in which light receiving elements are arranged two-dimensionally; and a signal processing unit 115a, 115b that converts the analog electrical signal output from the sensor substrate 114a, 114b (an electrical signal corresponding to the amount of light received by each light receiving element on the image sensor 113a, 113b) into a digital electrical signal and generates and outputs captured image data. The imaging unit 101 outputs luminance image data and parallax image data.

The imaging unit 101 also includes a processing hardware unit 120 composed of an FPGA (Field-Programmable Gate Array) or the like. To obtain a parallax image from the luminance image data output from the imaging units 110a and 110b, the processing hardware unit 120 includes a parallax calculation unit 121, serving as parallax image information generation means, that calculates the parallax value of corresponding image portions between the images captured by the imaging units 110a and 110b.

Here, one of the images captured by the imaging units 110a and 110b is taken as the reference image and the other as the comparison image. The parallax value of an image portion is the positional shift of that image portion on the comparison image relative to the image portion on the reference image that corresponds to the same point in the imaging region. Using the principle of triangulation, the distance to that point in the imaging region can be calculated from this parallax value.

FIG. 3 is a diagram for explaining the principle of calculating a distance from a parallax value by triangulation. In the figure, f is the focal length of each of the imaging lenses 111a and 111b, and D is the distance between the optical axes. Z is the distance from the imaging lenses 111a and 111b to the subject 301 (the distance in the direction parallel to the optical axes). In this figure, the point O on the subject 301 is imaged in the left and right images at distances Δ1 and Δ2, respectively, from the imaging centers. The parallax value d is then defined as d = Δ1 + Δ2.
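The relation described for FIG. 3 can be sketched in code. The text defines d = Δ1 + Δ2; the distance formula Z = D × f / d used below is the standard pinhole stereo relation and is consistent with the expression z = BF / (d − offset) given later in the description when offset is 0. The function name, the units (focal length in pixels, baseline in metres), and the numeric values are assumptions for illustration only.

```python
def distance_from_disparity(d_pixels, focal_px, baseline_m):
    """Distance Z along the optical axis from a parallax value.

    Standard stereo triangulation, Z = D * f / d, with the focal length f
    in pixels and the optical-axis spacing D in metres (assumed units).
    """
    if d_pixels <= 0:
        raise ValueError("parallax must be positive for a finite distance")
    return baseline_m * focal_px / d_pixels


# Offsets of point O from the imaging centres (illustrative values, pixels).
delta1, delta2 = 6.0, 4.0
d = delta1 + delta2  # d = delta1 + delta2, as defined in the text
z = distance_from_disparity(d, focal_px=1000.0, baseline_m=0.2)
```

Because Z is inversely proportional to d, halving the parallax doubles the estimated distance, which is why distant objects are more sensitive to small parallax errors.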

Returning to FIG. 2, the image analysis unit 102 is composed of an image processing board or the like, and includes: storage means 122, composed of RAM, ROM, and the like, that stores the luminance image data and parallax image data output from the imaging unit 101; a CPU (Central Processing Unit) 123 that executes computer programs for recognition processing of identification targets, parallax calculation control, and the like; a data I/F (interface) 124; and a serial I/F 125.

The FPGA constituting the processing hardware unit 120 performs processing on the image data that requires real-time performance, for example gamma correction, distortion correction (rectification of the left and right captured images), and parallax calculation by block matching, generates parallax image information, and writes it to the RAM of the image analysis unit 102. The CPU of the image analysis unit 102 controls the image sensor controllers of the imaging units 110a and 110b and the image processing board as a whole. It also loads from the ROM the programs that execute detection of the three-dimensional shape of the road surface, detection of guardrails and various other objects, and so on, executes these processes with the luminance image data and parallax image data stored in the RAM as input, and outputs the results to the outside through the data I/F 124 or the serial I/F 125. When executing these processes, vehicle operation information such as the vehicle speed, acceleration (mainly acceleration in the longitudinal direction of the host vehicle), steering angle, and yaw rate of the host vehicle 100 can be input through the data I/F 124 and used as parameters for the various processes. The data output to the outside is used as input data for controlling various devices of the host vehicle 100 (brake control, vehicle speed control, warning control, and the like).

Note that the imaging unit 101 and the image analysis unit 102 may be configured together as an imaging device 2, which is a single integrated device.

<Object detection processing>
Next, with reference to FIG. 4, the functions that perform object detection processing, implemented by the processing hardware unit 120 and the image analysis unit 102 in FIG. 2, will be described. FIG. 4 is a diagram illustrating an example of a functional block diagram of the device control system 1. The object detection processing in this embodiment is described below.

Luminance image data is output from the two imaging units 110a and 110b constituting the stereo camera. If the imaging units 110a and 110b are color units, color-to-luminance conversion that obtains a luminance signal (Y) from the RGB signals is performed using, for example, the following equation [1].

Y = 0.3R + 0.59G + 0.11B ... Equation [1]
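As a minimal sketch, equation [1] can be applied per pixel as follows. The function names and the representation of an image as rows of (R, G, B) tuples are assumptions chosen for illustration; the coefficients are those given in equation [1].

```python
def rgb_to_luminance(r, g, b):
    """Luminance Y from RGB using equation [1]: Y = 0.3R + 0.59G + 0.11B."""
    return 0.3 * r + 0.59 * g + 0.11 * b


def luminance_image(rgb_rows):
    """Convert rows of (R, G, B) tuples into rows of luminance values."""
    return [[rgb_to_luminance(r, g, b) for (r, g, b) in row] for row in rgb_rows]
```

The three coefficients sum to 1, so a neutral grey pixel keeps its level after conversion.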
<<Parallax image generation processing>>
Next, the parallax image generation unit 132, which is implemented by the parallax calculation unit 121, performs parallax image generation processing that generates parallax image data (parallax image information; an example of "information in which the vertical position, the horizontal position, and the depth position of a detection target are associated with one another"). In the parallax image generation processing, the luminance image data of one imaging unit 110a of the two imaging units 110a and 110b is first taken as reference image data and the luminance image data of the other imaging unit 110b as comparison image data. The parallax between the two is calculated from these, and parallax image data is generated and output. The parallax image data represents a parallax image in which the pixel value of each image portion is a value corresponding to the parallax value d calculated for that image portion on the reference image data.
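The description states that the FPGA computes parallax by block matching but does not specify a matching cost. The following sketch uses the sum of absolute differences (SAD), a common choice, purely as an assumption. It also assumes rectified images stored as lists of rows and that a point at column x in the reference image appears at column x + d in the comparison image; the actual search direction depends on which camera supplies the reference image. All names are hypothetical.

```python
def sad(ref, cmp_img, y, x_ref, x_cmp, half):
    """Sum of absolute differences between two (2*half+1)-square blocks."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(ref[y + dy][x_ref + dx] - cmp_img[y + dy][x_cmp + dx])
    return total


def disparity_at(ref, cmp_img, y, x, max_d, half=1):
    """Parallax of reference pixel (x, y): the shift with the lowest SAD cost.

    The caller must keep (x, y) at least `half` pixels away from the
    top, bottom, and left image borders.
    """
    width = len(ref[0])
    best_d, best_cost = 0, None
    for d in range(max_d + 1):
        x_cmp = x + d  # assumed search direction along the same row
        if x_cmp + half >= width:
            break  # block would fall outside the comparison image
        cost = sad(ref, cmp_img, y, x, x_cmp, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Comparison rows are the reference rows shifted right by two columns.
ref = [[1, 2, 9, 5, 7, 3, 0, 0, 0, 0]] * 3
cmp_img = [[0, 0, 1, 2, 9, 5, 7, 3, 0, 0]] * 3
d = disparity_at(ref, cmp_img, y=1, x=3, max_d=4)
```

Repeating this search for every pixel of the reference image yields the parallax image in which each pixel value corresponds to d.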

<<V map generation processing>>
Next, the V map generation unit 134 acquires the parallax image data from the parallax image generation unit 132 and executes V map generation processing that generates a V map. Each piece of parallax pixel data included in the parallax image data is expressed as a set (x, y, d) of an x-direction position, a y-direction position, and a parallax value d. This is converted into three-dimensional coordinate information (d, y, f) with d on the X axis, y on the Y axis, and the frequency f on the Z axis, or into that three-dimensional coordinate information restricted to entries exceeding a predetermined frequency threshold, and the result is generated as parallax histogram information. The parallax histogram information of this embodiment consists of the three-dimensional coordinate information (d, y, f), and this three-dimensional histogram information distributed on the X-Y two-dimensional coordinate system is called a V map (parallax histogram map, V-disparity map).

More specifically, the V map generation unit 134 calculates a parallax value frequency distribution for each row region of the parallax image data obtained by dividing the image into a plurality of parts in the vertical direction. The information indicating this parallax value frequency distribution is the parallax histogram information.

FIG. 5 is a diagram for explaining parallax image data and the V map generated from that parallax image data. FIG. 5A shows an example of the parallax value distribution of a parallax image, and FIG. 5B shows the V map representing the per-row parallax value frequency distribution of the parallax image of FIG. 5A.

When parallax image data having a parallax value distribution such as that shown in FIG. 5A is input, the V map generation unit 134 calculates the parallax value frequency distribution, that is, the distribution of the number of data items of each parallax value in each row, and outputs it as parallax histogram information. Plotting the parallax value frequency distribution of each row obtained in this way on a two-dimensional orthogonal coordinate system, with the y-direction position on the parallax image (the vertical position in the captured image) on the Y axis and the parallax value on the X axis, yields a V map such as that shown in FIG. 5B. The V map can also be expressed as an image in which pixels having pixel values corresponding to the frequency f are distributed on this two-dimensional orthogonal coordinate system.
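The per-row histogram described above can be sketched as follows. This is a minimal illustration under stated assumptions: parallax values are integers, a value of 0 marks a pixel with no valid parallax, and the map is stored as `v_map[y][d]`. The function name and the sample data are hypothetical.

```python
def build_v_map(disparity_image, max_d):
    """Row-wise parallax histogram: v_map[y][d] is the number of pixels
    in row y whose parallax value equals d."""
    v_map = []
    for row in disparity_image:
        hist = [0] * (max_d + 1)
        for d in row:
            if 0 < d <= max_d:  # 0 is treated as "no valid parallax" (assumption)
                hist[d] += 1
        v_map.append(hist)
    return v_map


# Small parallax image: values grow toward the bottom rows, as for a road.
disparity_image = [
    [1, 1, 2, 0],
    [3, 3, 3, 2],
    [5, 5, 5, 5],
]
v_map = build_v_map(disparity_image, max_d=5)
```

Applying a frequency threshold to each `v_map[y][d]` afterwards gives the restricted form of the parallax histogram information mentioned in the text.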

FIG. 6 is a diagram showing an example of a captured image, used as a reference image, taken by one imaging unit, and the V map corresponding to that captured image. FIG. 6A is the captured image and FIG. 6B is the V map. That is, the V map shown in FIG. 6B is generated from a captured image such as that shown in FIG. 6A.

The image example shown in FIG. 6A shows a road surface 401 on which the host vehicle is traveling, a preceding vehicle 402 ahead of the host vehicle, and a utility pole 403 located off the road. The V map shown in FIG. 6B contains, corresponding to the image example, a road surface 501, a preceding vehicle 502, and a utility pole 503.

<<Road surface shape detection processing>>
Next, in this embodiment, the road surface shape detection unit 135 executes road surface shape detection processing that detects the three-dimensional shape of the road surface ahead of the host vehicle 100 from the V map information (parallax histogram information) generated by the V map generation unit 134.

The image example shown in FIG. 6A is for a case where the road surface ahead of the host vehicle 100 is relatively flat, that is, where the road surface ahead coincides with a virtual reference road surface (virtual reference movement surface) obtained by extending, toward the front of the host vehicle, a plane parallel to the road surface portion directly below the host vehicle 100. In this case, in the lower part of the V map, which corresponds to the lower part of the image, the high-frequency points (road surface 501) are distributed in a substantially straight line inclined so that the parallax value d decreases toward the top of the image. Pixels showing such a distribution exist at almost the same distance in each row of the parallax image, have the highest occupancy, and show a detection target whose distance increases continuously toward the top of the image.

Since the imaging unit 110a captures the region ahead of the host vehicle, the parallax value d of the road surface in the captured image decreases toward the top of the image, as shown in FIG. 6B. Within the same row (horizontal line), the pixels showing the road surface have almost the same parallax value d. The high-frequency points distributed in a substantially straight line on the V map (road surface 501) therefore correspond to the characteristics of pixels showing the road surface (movement surface). Accordingly, pixels of points distributed on or near the approximate straight line obtained by linearly approximating the high-frequency points on the V map can be estimated, with high accuracy, to be pixels showing the road surface. The distance to the road surface portion shown in each pixel can also be obtained with high accuracy from the parallax value d of the corresponding point on the approximate straight line. Because estimating the road surface gives the height of the road surface, the height of an object on the road surface can also be obtained. This can be calculated by a known method. For example, a linear expression representing the estimated road surface is obtained, and the corresponding y coordinate y0 when the parallax value d = 0 is taken as the height of the road surface. Then, for example, when the parallax value is d and the y coordinate is y', y' − y0 indicates the height from the road surface at the parallax value d. The height H from the road surface at the coordinates (d, y') can be obtained by the expression H = (z × (y' − y0)) / f. In this expression, "z" is the distance calculated from the parallax value d (z = BF / (d − offset)), and "f" is the focal length of the imaging units 110a and 110b converted into the same unit as (y' − y0). Here, BF is the product of the baseline length B of the imaging units 110a and 110b and the focal length f, and offset is the parallax obtained when an object at infinity is photographed.
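The two steps in this paragraph, fitting a straight line to the road points of the V map and evaluating H = (z × (y' − y0)) / f with z = BF / (d − offset), can be sketched as below. The text calls the line fitting a known method without naming one; ordinary least squares is used here as an assumption. The road-surface coordinate y0 is passed in as a parameter, defined as in the text. All function names and numeric values are hypothetical.

```python
def fit_road_line(points):
    """Least-squares line y = slope * d + intercept through (d, y) points
    taken from the high-frequency region of the V map."""
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_d = sum((d - mean_d) ** 2 for d, _ in points)
    if var_d == 0:
        raise ValueError("need at least two distinct parallax values")
    cov = sum((d - mean_d) * (y - mean_y) for d, y in points)
    slope = cov / var_d
    return slope, mean_y - slope * mean_d


def z_from_parallax(d, bf, offset=0.0):
    """Distance z = BF / (d - offset)."""
    return bf / (d - offset)


def height_from_road(d, y_prime, y0, bf, focal_px, offset=0.0):
    """H = z * (y' - y0) / f. The sign of H follows the image y-axis
    orientation, which the text does not specify."""
    return z_from_parallax(d, bf, offset) * (y_prime - y0) / focal_px


slope, intercept = fit_road_line([(10, 300), (20, 350), (30, 400)])
h = height_from_road(d=20, y_prime=300, y0=350, bf=200.0, focal_px=1000.0)
```

With BF = 200 and d = 20 the distance is z = 10, so a 50-pixel offset from the road row at f = 1000 corresponds to a height of magnitude 0.5 in the same length unit as the baseline.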

<<U map generation processing>>
Next, the U map generation unit 137 executes frequency U map generation processing and height U map generation processing as U map generation processing that generates a U map (U-disparity map).

In the frequency U map generation processing, for the set (x, y, d) of the x-direction position, y-direction position, and parallax value d in each piece of parallax pixel data included in the parallax image data, x is set on the X axis, d on the Y axis, and the frequency on the Z axis to create X-Y two-dimensional histogram information. This is called a frequency U map. The U map generation unit 137 of this embodiment creates the frequency U map only for points (x, y, d) of the parallax image whose height H from the road surface is within a predetermined height range (for example, 20 cm to 3 m). This makes it possible to appropriately extract objects that exist within that height range above the road surface.

In the height U map generation processing, for the set (x, y, d) of the x-direction position, y-direction position, and parallax value d in each piece of parallax pixel data included in the parallax image data, x is set on the X axis, d on the Y axis, and the height from the road surface on the Z axis to create X-Y two-dimensional histogram information. This is called a height U map. The height value stored at each position is the greatest height from the road surface.
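A minimal sketch of both U maps follows. It assumes each parallax pixel has already been reduced to (x, d, H), with H the height from the road surface in metres computed as described earlier, and that parallax values are integers. The text states the height-range restriction only for the frequency U map; applying the same restriction to the height U map here is an assumption, as are the function name and the `[d][x]` storage layout.

```python
def build_u_maps(points, width, max_d, h_min=0.2, h_max=3.0):
    """Build frequency and height U maps from (x, d, height) points.

    Returns (freq, height), both indexed as [d][x]:
      freq[d][x]   = number of points at column x with parallax d
      height[d][x] = greatest height from the road among those points
    """
    freq = [[0] * width for _ in range(max_d + 1)]
    height = [[0.0] * width for _ in range(max_d + 1)]
    for x, d, h in points:
        if not (0 <= x < width and 0 < d <= max_d):
            continue  # outside the map
        if not (h_min <= h <= h_max):
            continue  # outside the height range (e.g. 20 cm to 3 m)
        freq[d][x] += 1
        height[d][x] = max(height[d][x], h)
    return freq, height


points = [(1, 2, 1.0), (1, 2, 1.5), (1, 2, 5.0), (3, 4, 0.1), (0, 1, 0.5)]
freq, height = build_u_maps(points, width=4, max_d=4)
```

In this example the points at heights 5.0 m and 0.1 m are discarded, which is how road-surface texture and overhead structures are kept out of the maps.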

FIG. 7 is an image example schematically representing an example of a reference image captured by the imaging unit 110a, and FIG. 8 shows the U maps corresponding to the image example of FIG. 7. FIG. 8A is the frequency U map and FIG. 8B is the height U map.

In the image example shown in FIG. 7, guardrails 413 and 414 exist on the left and right sides of the road surface, and the other vehicles present are one preceding vehicle 411 and one oncoming vehicle 412. In the frequency U map, as shown in FIG. 8A, the high-frequency points corresponding to the left and right guardrails 413 and 414 are distributed in substantially straight lines 603 and 604 that extend upward from the left and right ends toward the center. The high-frequency points corresponding to the preceding vehicle 411 and the oncoming vehicle 412 are distributed between the left and right guardrails as line segments 601 and 602 extending substantially parallel to the X axis. In a situation where a side portion of one of these vehicles is visible in addition to the rear portion of the preceding vehicle 411 or the front portion of the oncoming vehicle 412, parallax varies within the image region showing that vehicle. In such a case, as shown in FIG. 8A, the high-frequency points corresponding to the other vehicle form a distribution in which a line segment extending substantially parallel to the X axis is connected to a line segment inclined with respect to the X axis.

In the height U map, the points with the greatest height above the road surface for the left and right guardrails 413 and 414, the preceding vehicle 411, and the oncoming vehicle 412 are distributed in the same way as in the frequency U map. Here, the heights of the point distribution 701 corresponding to the preceding vehicle and the point distribution 702 corresponding to the oncoming vehicle are greater than those of the point distributions 703 and 704 corresponding to the guardrails. The height information of objects in the height U map can therefore be used for object detection.

《Real U map generation process》
Next, the real U map generation unit 138 is described. As U map generation processing for generating a real U map (Real U-disparity map; an example of "distribution data"), the real U map generation unit 138 executes real frequency U map generation processing and real height U map generation processing.

The real U map is obtained by converting the horizontal axis of the U map from image pixels to units corresponding to actual distance, and by converting the parallax values on the vertical axis to thinned parallax with a thinning rate that depends on distance. The real U map can also be regarded as an overhead (bird's-eye) image of real space.

In the real frequency U map generation processing, the real U map generation unit 138 creates X-Y two-dimensional histogram information from the set (x, y, d) of x-direction position, y-direction position, and parallax value d of each parallax pixel in the parallax image data, with the actual horizontal distance on the X axis, the thinned parallax on the Y axis, and the frequency on the Z axis. As with the U map generation unit 137, the real U map generation unit 138 of the present embodiment creates the real frequency U map only from points (x, y, d) of the parallax image whose height H above the road surface lies within a predetermined height range. The real U map generation unit 138 may instead generate the real U map from the U map generated by the U map generation unit 137.

FIG. 9 shows the real U map corresponding to the frequency U map of FIG. 8A (hereinafter, the real frequency U map). As illustrated, the left and right guardrails appear as vertical linear patterns 803 and 804, and the preceding vehicle and the oncoming vehicle appear as patterns 801 and 802 that are close to their actual shapes.

The thinned parallax on the vertical axis applies no thinning at long range (here, 50 m or more), thinning to 1/2 at medium range (20 m or more and less than 50 m), thinning to 1/3 at short range (10 m or more and less than 20 m), and thinning to 1/8 at the nearest range (given in the source, again, as 10 m or more and less than 20 m; presumably less than 10 m is intended).

In other words, the farther the distance, the less the data is thinned. A distant object appears small in the image, so it yields little parallax data and the distance resolution is low; thinning is therefore kept small. A nearby object appears large, so it yields a large amount of parallax data and the distance resolution is high; thinning is therefore increased.
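The distance bands above can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and because the text gives the fourth band the same limits as the third, the sketch assumes the 1/8 rate applies to distances under 10 m.

```python
def thinning_divisor(distance_m):
    """Return n such that the parallax axis is thinned to 1/n at this distance."""
    if distance_m >= 50.0:
        return 1  # long range: no thinning
    if distance_m >= 20.0:
        return 2  # medium range: thinned to 1/2
    if distance_m >= 10.0:
        return 3  # short range: thinned to 1/3
    return 8      # ASSUMPTION: nearest range (< 10 m) is thinned to 1/8
```

The distance itself would come from the stereo relation Z = Bf / d.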

An example of how to convert the horizontal axis from image pixels to actual distance, that is, how to obtain (X, d) of the real U map from (x, d) of the U map, is described with reference to FIG. 10.

The object detection range is set to 10 m on each side of the camera, that is, a width of 20 m. If one pixel of the real U map in the horizontal direction corresponds to 10 cm, the horizontal size of the real U map is 200 pixels.

Let f be the focal length of the camera, p the lateral position on the sensor measured from the camera center, Z the distance from the camera to the subject, and X the lateral position of the subject measured from the camera center. If s is the pixel size of the sensor, x and p are related by "x = p / s". From the characteristics of a stereo camera, "Z = Bf / d" also holds.

From the figure, "X = p × Z / f" also holds, and substituting p = s × x and Z = Bf / d gives "X = s × x × B / d". X is an actual distance, but because one pixel of the real U map in the horizontal direction corresponds to 10 cm, the position of X on the real U map is easily calculated.
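A minimal sketch of this conversion follows. The function and parameter names are hypothetical. It assumes x is measured in pixels from the image center, and that d in X = s × x × B / d is the disparity expressed as a length on the sensor (pixel disparity times s), so that Z = Bf / d is dimensionally consistent. The 20 m detection width and 10 cm bins are the values given in the text.

```python
def real_u_column(x_px, d_px, baseline_m, pixel_size_m,
                  half_width_m=10.0, bin_width_m=0.1):
    """Map a U-map point (x, d) to a real U map column index.

    Returns None when the disparity is invalid or the point lies outside
    the detection range of +/- half_width_m around the camera.
    """
    if d_px <= 0:
        return None  # no valid disparity, so the distance is undefined
    # ASSUMPTION: disparity converted from pixels to a length on the sensor.
    d_len = d_px * pixel_size_m
    # X = s * x * B / d, from x = p / s, Z = B f / d and X = p Z / f.
    x_m = pixel_size_m * x_px * baseline_m / d_len
    if not -half_width_m <= x_m < half_width_m:
        return None
    # Column 0 is the left edge (-10 m); each column spans bin_width_m.
    return int((x_m + half_width_m) / bin_width_m)
```

With these defaults the map has 200 columns, and a point 2.05 m to the right of the camera falls in column 120.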

The real U map corresponding to the height U map of FIG. 8B (hereinafter, the real height U map) can be created by the same procedure.

One advantage of the real U map is faster processing, because its vertical and horizontal dimensions can be made smaller than those of the U map. In addition, because the horizontal axis no longer depends on distance, the same object is detected with the same width whether it is far or near. This simplifies the later peripheral region removal and the decision whether to branch to horizontal or vertical separation (threshold processing on width).

The vertical length of the U map is determined by the shortest measurable distance. Since "d = Bf / Z", the maximum value of d is determined by the shortest measurable Z. The parallax value d is normally computed in pixel units because it is derived from stereo images, but it contains a fractional part, so the parallax value is multiplied by a predetermined value and rounded to the nearest integer before use.

When the shortest measurable Z is halved, d doubles, and the U map data grows correspondingly large. Therefore, when the real U map is created, pixels are thinned more heavily at shorter distances to compress the data, reducing the amount of data relative to the U map. As a result, object detection by labeling can be performed at high speed.

《Isolated region detection》
Next, the isolated region detection processing performed by the isolated region detection unit 139 is described. FIG. 11 is a flowchart showing an example of the isolated region detection processing. The isolated region detection unit 139 first smooths the real frequency U map generated by the real U map generation unit 138 (step S111).

Averaging the frequency values makes valid isolated regions easier to detect. Parallax values are scattered because of calculation errors and the like, and parallax is not computed for every pixel, so an actual real U map, unlike the schematic of FIG. 9, contains noise. The real U map is therefore smoothed to remove noise and to make the objects to be detected easier to separate. As with image smoothing, a smoothing filter (for example, a simple 3 × 3 pixel average) is applied to the frequency values of the real U map (the real frequency U map). Frequencies attributable to noise are reduced, while object portions form coherent groups whose frequency is higher than their surroundings, which makes the subsequent isolated region detection easier.
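The 3 × 3 simple average mentioned as an example can be sketched as follows. The function name is hypothetical, and the border handling (cells outside the map count as zero) is an assumption, since the text does not specify it.

```python
def smooth_3x3(freq):
    """Return a copy of a 2-D frequency grid smoothed by a 3x3 simple average.

    freq is a list of equal-length rows. ASSUMPTION: cells outside the
    grid are treated as zero.
    """
    rows, cols = len(freq), len(freq[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += freq[rr][cc]
            out[r][c] = total / 9.0
    return out
```

An isolated single-cell spike is spread over its neighbors and lowered to one ninth of its value, whereas a dense object region keeps a high average.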

Next, a binarization threshold is set (step S112). Initially a small value (= 0) is used to binarize the smoothed real U map (step S113). Coordinates that have a value are then labeled to detect isolated regions (step S114).

These two steps detect isolated regions (referred to here as islands) whose frequency is higher than their surroundings in the real frequency U map. For detection, the real frequency U map is first binarized (step S113), initially with a threshold of 0. This accounts for the fact that, depending on the height and shape of an object and how well it separates from the road surface parallax, some islands are isolated while others are connected to other islands. Binarizing the real frequency U map from a small threshold first detects islands that are already isolated and of appropriate size; increasing the threshold then separates connected islands so that they too can be detected as isolated islands of appropriate size.

Labeling is used to detect islands after binarization. Coordinates that are black after binarization (coordinates whose frequency value is higher than the binarization threshold) are labeled according to their connectivity, and each region sharing a label is treated as an island.

The size of each detected isolated region is then judged (step S115). Because the detection targets range from pedestrians to large vehicles, this step determines whether the width of the isolated region falls within that size range. If the region is too large (step S115: YES), the binarization threshold is incremented by 1 (step S112) and binarization is performed only within that isolated region of the real frequency U map (step S113). Labeling is then performed to detect smaller isolated regions (step S114), and their sizes are judged (step S115).

The processing from threshold setting through labeling is repeated until isolated regions of the desired size are detected. Once an isolated region of the desired size has been detected (step S115: NO), peripheral region removal is performed (step S116). For a distant object, road surface detection is less accurate, so road surface parallax may enter the real U map and the object and road surface parallax may be detected as a single mass. Peripheral region removal deletes the parts of such an isolated region, to the left and right of and near the object, whose height is close to the road surface (the peripheral portions within the isolated region). If a region was removed (step S117: YES), labeling is performed again to reset the isolated region (step S114).
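The loop over steps S112 to S115 can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's implementation: the function names are hypothetical, 8-connectivity is assumed for labeling, the size test is reduced to a single maximum width in map columns, and peripheral region removal (steps S116 and S117) is omitted.

```python
from collections import deque


def label_islands(freq, threshold, cells=None):
    """Group cells whose frequency exceeds threshold into 8-connected islands.

    cells optionally restricts the search to a set of (row, col) positions.
    """
    rows, cols = len(freq), len(freq[0])
    if cells is None:
        cells = {(r, c) for r in range(rows) for c in range(cols)}
    active = {rc for rc in cells if freq[rc[0]][rc[1]] > threshold}
    islands = []
    while active:
        start = active.pop()
        island, queue = {start}, deque([start])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in active:
                        active.remove(nb)
                        island.add(nb)
                        queue.append(nb)
        islands.append(island)
    return islands


def island_width(island):
    """Width of an island in map columns."""
    xs = [c for _, c in island]
    return max(xs) - min(xs) + 1


def detect_islands(freq, max_width):
    """Raise the threshold inside over-wide islands until they split or shrink."""
    accepted = []
    pending = [(island, 0) for island in label_islands(freq, 0)]
    while pending:
        island, threshold = pending.pop()
        if island_width(island) <= max_width:
            accepted.append(island)
            continue
        # Too large: increment the threshold by 1 and re-binarize this island only.
        for sub in label_islands(freq, threshold + 1, island):
            pending.append((sub, threshold + 1))
    return accepted
```

For a row of frequencies 3, 3, 1, 3, 3 and a maximum width of 2, the first pass finds one island of width 5; raising the threshold to 1 removes the weak middle cell and leaves two islands of width 2.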

《Corresponding region detection in the parallax image, and object region extraction》
Next, the parallax image corresponding region detection unit 140 and the object region extraction unit 141 are described. FIG. 12 shows a real frequency U map in which rectangular regions are set so that the isolated regions detected by the isolated region detection unit are inscribed in them. FIG. 13 shows a parallax image in which scanning ranges corresponding to the rectangular regions of FIG. 12 are set. FIG. 14 shows a parallax image in which object regions are set by searching the scanning ranges of FIG. 13.

For the isolated regions determined as object candidate regions by the isolated region detection unit 139, a first detection island 811 and a second detection island 812 are set, as shown in FIG. 12, as the rectangular regions in which the first vehicle 801 and the second vehicle 802 (the isolated regions) are inscribed. The width of each rectangular region (its length in the X-axis direction on the U map) corresponds to the width of the identification target (object) corresponding to the isolated region, and the height of the rectangular region corresponds to the depth of that object (its length in the traveling direction of the host vehicle). The height of the object corresponding to each isolated region, however, is unknown at this stage. To obtain it, the parallax image corresponding region detection unit 140 detects the region of the parallax image that corresponds to the isolated region.

Based on the isolated region information output from the isolated region detection unit 139, the parallax image corresponding region detection unit 140 can determine, from the position, width, and minimum parallax of the first detection island 811 and the second detection island 812 detected in the real U map, the x-direction range (xmin, xmax) of the first detection island corresponding region scanning range 481 and the second detection island corresponding region scanning range 482, which are the ranges to be searched in the parallax image shown in FIG. 13. It can also determine the height and position of the object in the parallax image, from ymin (the y coordinate corresponding to the maximum height above the road surface at the maximum parallax dmax) to ymax (the y coordinate indicating the road surface height obtained from the maximum parallax dmax).

Next, to detect the exact position of the object, the set scanning range is scanned, and pixels whose parallax lies within the depth range of the rectangle detected by the isolated region detection unit 139 (minimum parallax dmin to maximum parallax dmax) are extracted as candidate pixels. Any horizontal line in which the extracted candidate pixels make up at least a predetermined proportion of the detection width is treated as an object candidate line.

Next, the range is scanned in the vertical direction. If other object candidate lines are present around a given object candidate line at or above a predetermined density, that object candidate line is judged to be an object line.

Next, the object region extraction unit 141 searches for object lines in the search region of the parallax image and determines the lowermost and uppermost object lines. As shown in FIG. 14, the circumscribed rectangles 461 and 462 of the object line groups are determined as the regions 451 and 452 of the objects (the first vehicle and the second vehicle) in the parallax image.

FIG. 15 is a flowchart showing the flow of processing performed by the parallax image corresponding region detection unit 140 and the object region extraction unit 141. First, the search range in the x-axis direction of the parallax image is set from the position, width, and minimum parallax of the island in the real U map (step S161).

Next, the maximum search value ymax in the y-axis direction of the parallax image is set from the relationship between the maximum parallax dmax of the island and the road surface height (step S162). The minimum search value ymin in the y-axis direction is then obtained from the maximum height of the island in the real height U map together with the ymax set in step S162 and dmax, which completes the search range in the y-axis direction of the parallax image (step S163).

Next, the parallax image is searched within the set search range, and pixels whose parallax lies between the island's minimum parallax dmin and maximum parallax dmax are extracted as object candidate pixels (step S164). When the object candidate pixels make up at least a certain proportion of a line in the horizontal direction, that line is extracted as an object candidate line (step S165).

The density of object candidate lines is calculated, and where the density exceeds a predetermined value the line is determined to be an object line (step S166). Finally, the circumscribed rectangle of the object line group is detected as the object region in the parallax image (step S167).
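Steps S164 to S167 can be sketched as follows, given a search range already set in steps S161 to S163. The function name, the parameters min_fill, min_density, and window, and their default values are all hypothetical; the text only says "a certain proportion" and "a predetermined value". The density of a line is taken here as the fraction of neighboring rows, within window rows above and below, that are also candidate lines, and the sketch uses "at least" for that comparison.

```python
def extract_object_region(disparity, x_range, y_range, d_range,
                          min_fill=0.5, min_density=0.5, window=2):
    """Return (xmin, ymin, xmax, ymax) of the object lines, or None.

    disparity is a 2-D list indexed [y][x]; all ranges are inclusive.
    """
    x0, x1 = x_range
    y0, y1 = y_range
    dmin, dmax = d_range
    width = x1 - x0 + 1
    # Steps S164-S165: rows where enough pixels fall inside [dmin, dmax].
    candidate = {}
    for y in range(y0, y1 + 1):
        xs = [x for x in range(x0, x1 + 1) if dmin <= disparity[y][x] <= dmax]
        if len(xs) / width >= min_fill:
            candidate[y] = xs
    # Step S166: keep candidate lines with enough candidate lines nearby.
    object_lines = []
    for y in candidate:
        neighbours = [yy for yy in range(y - window, y + window + 1)
                      if yy != y and y0 <= yy <= y1]
        hits = sum(1 for yy in neighbours if yy in candidate)
        if neighbours and hits / len(neighbours) >= min_density:
            object_lines.append(y)
    if not object_lines:
        return None
    # Step S167: circumscribed rectangle of the object line group.
    xs_all = [x for y in object_lines for x in candidate[y]]
    return (min(xs_all), min(object_lines), max(xs_all), max(object_lines))
```

The density test is what discards a stray candidate line far from the object, for example one caused by a parallax error.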

The identification target (object) can thereby be recognized.

《Object type classification》
Next, the object type classification unit 142 is described.

From the height (yomax − yomin) of the object region extracted by the object region extraction unit 141, the actual height Ho of the identification target (object) shown in the image region corresponding to that object region can be calculated using equation [2] below. Here, "zo" is the distance between the host vehicle and the object corresponding to the object region, calculated from the minimum parallax value d within the object region, and "f" is the focal length of the camera converted to the same units as (yomax − yomin).

Ho = zo × (yomax − yomin) / f … Equation [2]
Similarly, from the width (xomax − xomin) of the object region extracted by the object region extraction unit 141, the actual width Wo of the identification target (object) shown in the image region corresponding to that object region can be calculated using equation [3] below.

Wo = zo × (xomax − xomin) / f … Equation [3]
The depth Do of the identification target (object) shown in the image region corresponding to the object region can be calculated using equation [4] below from the maximum parallax dmax and the minimum parallax dmin within the isolated region corresponding to that object region.

Do = BF × {1 / (dmin − offset) − 1 / (dmax − offset)} … Equation [4]
The object type classification unit 142 classifies the object type from the height, width, and depth of the object corresponding to the object region, calculated as described above. The table in FIG. 16 shows an example of table data used for classifying object types. In the example of FIG. 16, an object is judged to be a "motorcycle, bicycle" if, for example, its width is less than 1100 mm, its height is less than 250 mm, and its depth exceeds 1000 mm. An object is judged to be a "pedestrian" if its width is less than 1100 mm, its height is less than 250 mm, and its depth is 1000 mm or less. This makes it possible to distinguish whether an identification target (object) ahead of the host vehicle is a pedestrian, a bicycle or motorcycle, a small car, a truck, and so on. The method described above is only an example; in the present invention, any of various methods may be used as long as the object type can be classified and the object position can be identified.
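Equations [2] to [4] and the two table rows quoted above can be sketched as follows. The function names are hypothetical. The sketch assumes zo is given in millimetres, f in pixels, and BF in units that make equation [4] yield millimetres. Only the two rows of FIG. 16 quoted in the text are encoded, with the thresholds exactly as stated there; other object types return None because their rows are not reproduced here.

```python
def object_size_mm(zo_mm, f_px, x_range_px, y_range_px, bf, offset, d_range):
    """Return (width, height, depth) of an object from its image region."""
    xomin, xomax = x_range_px
    yomin, yomax = y_range_px
    dmin, dmax = d_range
    height = zo_mm * (yomax - yomin) / f_px                        # equation [2]
    width = zo_mm * (xomax - xomin) / f_px                         # equation [3]
    depth = bf * (1.0 / (dmin - offset) - 1.0 / (dmax - offset))   # equation [4]
    return width, height, depth


def classify(width_mm, height_mm, depth_mm):
    """Classify using only the two table rows quoted in the text."""
    if width_mm < 1100 and height_mm < 250:
        return "motorcycle, bicycle" if depth_mm > 1000 else "pedestrian"
    return None  # rows for other types are not reproduced in this sketch
```

Because the width and height thresholds of the two rows are identical, depth alone separates "pedestrian" from "motorcycle, bicycle" in this sketch.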

《Three-dimensional position determination》
Next, the processing of the three-dimensional position determination unit 143 is described with reference to FIG. 17. The three-dimensional position determination unit 143 determines the three-dimensional position of the identification target (object) relative to the host vehicle 100.

FIG. 17 shows an example of the three-dimensional position determination processing. FIG. 17 describes the processing for one object (the target object) among the objects extracted by the object region extraction unit 141; the three-dimensional position determination unit 143 performs the processing of FIG. 17 for each object extracted by the object region extraction unit 141.

In step S201, the three-dimensional position determination unit 143 determines whether the object type of the target object is a non-rigid type (an example of a "first type"). A rigid body is an object that does not deform; rigid types include, for example, "small car", "ordinary car", and "truck". Non-rigid types include, for example, "pedestrian" and "motorcycle, bicycle". Although a motorcycle or bicycle is itself a rigid body, its rider is not, so the type is treated as non-rigid.

If the object type is a rigid type (an example of a "second type") (NO in step S201), the three-dimensional position determination unit 143 calculates the center position of the target object (step S202). Based on the distance to the object corresponding to the detected object region and on the image distance between the center of the parallax image and the center of the object region in the parallax image, the unit calculates the center position of the object in three-dimensional coordinates, for example by the following equations.

Let the center coordinates of the object region in the parallax image be (region_centerX, region_centerY) and the image center coordinates of the parallax image be (image_centerX, image_centerY). The lateral center position Xo and the height-direction center position Yo of the identification target (object) relative to the imaging units 110a and 110b can then be calculated by equations [5] and [6] below.

Xo = Z × (region_centerX − image_centerX) / f … Equation [5]
Yo = Z × (region_centerY − image_centerY) / f … Equation [6]
The three-dimensional position determination unit 143 then takes the center position of the target object as the position of the target object (step S203) and ends the processing.
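Equations [5] and [6] can be sketched directly. The function name is hypothetical, and the sketch assumes that f is expressed in pixels, like the image coordinates, so that Xo and Yo come out in the same unit as Z.

```python
def region_center_3d(z, f, region_center, image_center):
    """Return (Xo, Yo, Z) for the center of an object region.

    region_center and image_center are (x, y) pixel coordinates.
    """
    xo = z * (region_center[0] - image_center[0]) / f  # equation [5]
    yo = z * (region_center[1] - image_center[1]) / f  # equation [6]
    return xo, yo, z
```

For example, an object region centered 100 pixels to the right of the image center, at Z = 20 m with f = 1000 pixels, lies 2 m to the right of the camera axis.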

FIG. 18 illustrates a method of calculating the position of an object such as a vehicle. When the object type is neither "pedestrian" nor "motorcycle, bicycle", the three-dimensional position determination unit 143 uses, for example, the center positions 901a, 901b, and 901c of the object regions as the positions of the target objects. Compared with using the positions 900a, 900b, and 900c of the centers of gravity of the target objects (a method described later), this reduces the influence of noise and the like.

If, on the other hand, the object type is "pedestrian" or "motorcycle, bicycle" (YES in step S201), the three-dimensional position determination unit 143 calculates the position of the center of gravity of the target object (step S204). The lateral position of the target object is thus determined by a method suited to the object type: for a non-rigid type, the lateral position of the object's center of gravity is taken as its lateral position; for a rigid type, the midpoint between the two lateral ends of the object is taken as its lateral position.

FIG. 19 illustrates a method of calculating the position of an object that is a pedestrian, motorcycle, or bicycle. FIG. 19(A) shows the position 910a of the center of gravity when the pedestrian's arms are held in, and FIG. 19(B) shows the position 910b of the center of gravity when the pedestrian's arms are spread.

In step S204, the three-dimensional position determination unit 143 uses the region of the parallax image corresponding to the target object (for example, the first detection island corresponding region scanning range 481 or the second detection island corresponding region scanning range 482 in FIG. 13) as the search range. Within that search range, it counts, for each position in the lateral direction (x direction), the number of pixels indicating the parallax of the object; for example, the number of pixels whose parallax value d lies between the object's minimum and maximum parallax values. The three-dimensional position determination unit 143 then takes the lateral position with the largest count as the lateral position of the target object. The vertical position of the target object may be obtained in the same way as the lateral position of the center of gravity, or may be taken as the center position in the vertical direction.
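The per-column count of step S204 can be sketched as follows. The function name is hypothetical, and ties are resolved in favor of the leftmost column, which is an assumption; the text does not say how ties are handled.

```python
def lateral_centroid_x(disparity, x_range, y_range, d_range):
    """Return the x position with the most pixels in the object's parallax range.

    disparity is a 2-D list indexed [y][x]; all ranges are inclusive.
    Returns None if no pixel in the search range matches.
    """
    x0, x1 = x_range
    y0, y1 = y_range
    dmin, dmax = d_range
    best_x, best_count = None, 0
    for x in range(x0, x1 + 1):
        count = sum(1 for y in range(y0, y1 + 1)
                    if dmin <= disparity[y][x] <= dmax)
        if count > best_count:  # ASSUMPTION: leftmost column wins ties
            best_x, best_count = x, count
    return best_x
```

For a pedestrian whose torso occupies one column and whose outstretched arm adds a single row of pixels to one side, the count still peaks at the torso column, whereas the midpoint of the bounding rectangle would shift toward the arm.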

The three-dimensional position determination unit 143 then takes the position of the center of gravity of the target object as the position of the target object (step S205) and ends the processing.

＜Modifications of the method for calculating the position of the center of gravity＞
In step S204, the position of the center of gravity may instead be calculated, for example, as follows.

In the search range described above, the three-dimensional position determination unit 143 may take the lateral position at which the pixels indicating the parallax of the object reach the greatest height as the lateral position of the center of gravity of the target object. This is because, for a pedestrian, a bicycle, or the like, the person's head is considered to be at the highest position.

Alternatively, the three-dimensional position determination unit 143 may recognize a person's head by performing image recognition on the reference image, and take the horizontal position of the recognized head as the horizontal position of the center of gravity of the target object. Various known methods can be applied to recognize a person's head.

Alternatively, instead of calculating from the search range in the parallax image, the three-dimensional position determination unit 143 may calculate the position of the center of gravity of the target object from the isolated region on the real U map that corresponds to the object region extracted by the object region extraction unit 141. In this case, for example, the three-dimensional position determination unit 143 calculates, for each X coordinate in the isolated region, the sum of the parallax frequencies along the Y axis. It then takes the X coordinate with the largest sum as the horizontal position of the center of gravity of the target object.
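The real U map variant can be sketched in the same way. Here the map is assumed to be a list of rows of parallax frequencies, `freq[y][x]`, and the isolated region is assumed to be a rectangle given by half-open ranges; both are simplifications for illustration.

```python
def centroid_x_from_real_u_map(freq, x_range, y_range):
    """Return the X coordinate with the largest parallax-frequency sum along Y."""
    x0, x1 = x_range
    y0, y1 = y_range
    # Sum the frequencies along the Y axis for every X coordinate of the region.
    sums = {x: sum(freq[y][x] for y in range(y0, y1)) for x in range(x0, x1)}
    # max() keeps the first (leftmost) X coordinate when sums are tied.
    return max(sums, key=sums.get)
```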

《Object tracking》
Next, the object tracking unit 144 will be described. The object tracking unit 144 executes processing for tracking an object detected from the parallax image of a previous (past) frame.

The object tracking unit 144 predicts the position of each object in the parallax image of the current frame from the positions of that object determined by the three-dimensional position determination unit 143 based on the parallax images of a plurality of previous frames. Specifically, the object tracking unit 144 uses the positions of each object in the previous frames to identify the moving speed and moving direction of the object relative to the host vehicle 100, and predicts the position of each object in the parallax image of the current frame from this moving speed and moving direction. Known techniques can be applied to this object tracking process. However, when object tracking uses the moving speed and moving direction, a non-rigid object such as a person may change its detected position between frames, for example by spreading an arm. The object is then regarded as having moved, and erroneous tracking results. Adopting the position of the center of gravity as described above prevents this. For a rigid object, on the other hand, adopting a position based on the object region reduces the influence of noise and the like.
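The embodiment leaves the prediction method to known techniques. One simple possibility is a constant-velocity extrapolation, sketched below under the assumption that each position is a (horizontal, depth) pair relative to the host vehicle and that frames are evenly spaced. The function name and the two-point velocity estimate are illustrative choices, not the method of the embodiment.

```python
def predict_position(history, frames_ahead=1):
    """Extrapolate the next relative position from earlier frames.

    history: list of (x, z) positions in consecutive frames, oldest first.
    The relative velocity per frame is estimated from the first and last entries.
    """
    if len(history) < 2:
        raise ValueError("at least two previous positions are required")
    steps = len(history) - 1
    (x_first, z_first), (x_last, z_last) = history[0], history[-1]
    vx = (x_last - x_first) / steps  # relative horizontal speed per frame
    vz = (z_last - z_first) / steps  # relative depth speed per frame
    return (x_last + vx * frames_ahead, z_last + vz * frames_ahead)
```

If the positions in `history` jump because an arm was spread rather than because the person moved, the estimated velocity is wrong, which is the failure the center-of-gravity position is meant to avoid.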

The object tracking unit 144 then continues tracking the object based on the similarity between the parallax image of the object region in the previous frame and the parallax image of the region at the predicted position in the current frame. As a result, the same object is recognized as the same object across a plurality of frames.

The object tracking unit 144 estimates the position of an object when the parallax of the object cannot be measured sufficiently, for example because of reflected sunlight or darkness, or when a pedestrian and an object adjacent to the pedestrian have been erroneously determined to be a single object. In this case, the object tracking unit 144 estimates the object region in the parallax image of the current frame from the moving speed and moving direction of the object relative to the host vehicle 100 in a plurality of previous frames. When the object is a non-rigid object such as a person, the object tracking unit 144 needs the position of the center of gravity. It therefore calculates, for the object regions of the object in the previous frames, the ratio between the horizontal distance from the right end to the center of gravity and the horizontal distance from the left end to the center of gravity. The object tracking unit 144 then takes the position that divides the horizontal extent of the estimated object region in the current frame by this ratio as the estimated position of the center of gravity of the object in the parallax image of the current frame.

For example, when the average ratio between the right-end-to-center-of-gravity distance and the left-end-to-center-of-gravity distance of the object region over the previous frames is 2:1, the position that divides the horizontal extent of the estimated object region in the current frame in the ratio 2:1 is taken as the estimated position of the center of gravity of the object in the parallax image of the current frame.
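The 2:1 example can be worked through with a short sketch. It reads the ratio as (right end to center of gravity) : (left end to center of gravity), and, as a simplification, averages the equivalent fraction of the region width measured from the left end rather than the ratio itself. All names are hypothetical.

```python
def left_fraction(left, right, centroid_x):
    """Fraction of the region width between the left end and the center of gravity."""
    width = right - left
    if width <= 0:
        raise ValueError("object region must have positive width")
    return (centroid_x - left) / width


def estimate_centroid_x(previous_regions, est_left, est_right):
    """Place the center of gravity inside the estimated region of the current frame.

    previous_regions: list of (left, right, centroid_x) from earlier frames.
    est_left, est_right: horizontal ends of the estimated object region.
    """
    fractions = [left_fraction(l, r, c) for (l, r, c) in previous_regions]
    mean_fraction = sum(fractions) / len(fractions)
    return est_left + mean_fraction * (est_right - est_left)
```

With a right-to-left ratio of 2:1 the fraction from the left end is 1/3, so an estimated region spanning x = 100 to x = 160 yields a center of gravity near x = 120.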

<Summary>
Conventionally, when a person such as a pedestrian or a cyclist suddenly spreads an arm sideways, the center position between the two lateral ends of the object changes abruptly, and tracking may fail.
In addition, when the position of an object in the current frame is predicted from its positions in a plurality of previous frames and the object is tracked accordingly, a person suddenly spreading an arm sideways may cause the direction of the extended arm to be mistaken for the person's direction of movement, which lowers the accuracy of the predicted position.

According to the embodiment described above, the position of an object in the parallax image is determined by a method that depends on the type of the object. This allows accurate tracking to continue.
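The type-dependent choice can be summarized in a few lines of Python. The type labels and the function name are hypothetical; the split into non-rigid objects (center of gravity) and rigid objects (center between the two lateral ends of the object region) follows the description above.

```python
NON_RIGID_TYPES = {"pedestrian", "bicycle"}  # hypothetical type labels


def object_x_position(obj_type, left, right, centroid_x):
    """Choose the horizontal position used for tracking according to object type."""
    if obj_type in NON_RIGID_TYPES:
        # Non-rigid object: the center of gravity barely moves when an arm is spread.
        return centroid_x
    # Rigid object: the center between both lateral ends of the object region.
    return (left + right) / 2
```

For a pedestrian whose region widens from 100..140 to 100..160 when an arm is extended, the center between the ends jumps from 120 to 130, whereas a center of gravity that shifts only from 112 to 113 keeps the tracked position almost unchanged.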

Because a distance value and a parallax value can be treated equivalently, this embodiment has been described using a parallax image as an example of a distance image, but the invention is not limited to this. For example, a distance image may be generated by integrating distance information generated by a detection device such as a millimeter wave radar or a laser radar with a parallax image generated by a stereo camera. The detection accuracy may also be further improved by using a stereo camera together with a detection device such as a millimeter wave radar or a laser radar and combining its output with the object detection result from the stereo camera described above.
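The equivalence of distance and parallax values follows from the standard relation for a rectified stereo pair, Z = B·f/d. The sketch below assumes this relation; the parameter names and the example values are illustrative and are not taken from the embodiment.

```python
def distance_from_parallax(d, baseline_m, focal_px):
    """Convert a parallax value d (pixels) into a distance (meters).

    baseline_m: distance between the two imaging units, in meters (assumed).
    focal_px: focal length expressed in pixels (assumed).
    """
    if d <= 0:
        raise ValueError("parallax must be positive")
    return baseline_m * focal_px / d
```

For instance, with a hypothetical 0.2 m baseline and a 1000-pixel focal length, a parallax of 20 pixels corresponds to a distance of about 10 m.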

The system configuration in the embodiment described above is only an example, and various other system configurations are possible depending on the application and purpose.

For example, a functional unit that performs at least part of the processing of the functional units of the processing hardware unit 120 and the image analysis unit 102 may be realized by cloud computing made up of one or more computers.

The embodiment above describes an example in which the device control system 1 is mounted on an automobile serving as the host vehicle 100, but the invention is not limited to this. For example, the system may be mounted on another vehicle such as a motorcycle, a bicycle, a wheelchair, or an agricultural cultivator. It may also be mounted not only on a vehicle, which is one example of a moving body, but on another moving body such as a robot.

The functional units of the processing hardware unit 120 and the image analysis unit 102 may be realized by hardware, or may be realized by a CPU executing a program stored in a storage device. The program may be distributed as a file in an installable or executable format recorded on a computer-readable recording medium. Examples of such recording media include a CD-R (Compact Disc Recordable), a DVD (Digital Versatile Disk), and a Blu-ray Disc. The program may also be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network, or it may be provided or distributed via a network such as the Internet.

1 Device control system
100 Host vehicle
101 Imaging unit
102 Image analysis unit (an example of the "information processing device")
103 Display monitor
104 Vehicle travel control unit (an example of the "control unit")
110a, 110b Imaging units
120 Processing hardware unit
132 Parallax image generation unit (an example of the "generation unit")
134 V map generation unit (an example of the "acquisition unit")
135 Road surface shape detection unit
137 U map generation unit
138 Real U map generation unit
139 Isolated region detection unit
140 Parallax image corresponding region detection unit
141 Object region extraction unit
142 Object type classification unit (an example of the "judging unit")
143 Three-dimensional position determination unit (an example of the "determining unit")
144 Object tracking unit (an example of the "tracking unit")
2 Imaging device

Japanese Unexamined Patent Application Publication No. H10-283462 (JP-A-10-283462)

Claims

An information processing apparatus comprising:
an acquisition unit that acquires information in which a vertical position, a horizontal position, and a depth-direction position of an object are associated with one another;
a judging unit that judges a type of the object based on the information acquired by the acquisition unit;
a determining unit that determines the position of the object by a method according to the type of the object judged by the judging unit; and
a tracking unit that tracks the object based on the position of the object determined by the determining unit.

The information processing apparatus according to claim 1, wherein the determining unit determines the position of the object by a first method using the position of the center of gravity of the object when the type of the object judged by the judging unit is a first type, and determines the position of the object by a second method different from the first method when the type of the object judged by the judging unit is a second type.

The information processing apparatus described, wherein, when the type of the object judged by the judging unit is the second type, the determining unit determines the center position between both ends of the object in the horizontal direction as the position of the object.

The information processing apparatus described, wherein the determining unit determines the horizontal position of the center of gravity of the object based on the number of pieces of depth-direction position information that correspond, in the information, to each horizontal position of the object.

An imaging device comprising:
a plurality of imaging units;
a generation unit that generates the information based on a plurality of images captured by the plurality of imaging units; and
the information processing apparatus according to any one of claims 1 to 4.

A device control system comprising:
the imaging device according to claim 5; and
a control unit that controls a moving body based on data of the object tracked by the tracking unit,
wherein the plurality of imaging units are mounted on the moving body and image the area ahead of the moving body.

A moving body comprising the device control system according to claim 6, the moving body being controlled by the control unit.

An information processing method in which a computer executes:
acquiring information in which a vertical position, a horizontal position, and a depth-direction position of an object are associated with one another;
judging a type of the object based on the acquired information;
determining the position of the object by a method according to the judged type of the object; and
tracking the object based on the determined position of the object.

A program that causes a computer to execute:
acquiring information in which a vertical position, a horizontal position, and a depth-direction position of an object are associated with one another;
judging a type of the object based on the acquired information;
determining the position of the object by a method according to the judged type of the object; and
tracking the object based on the determined position of the object.