JP6983334B2 - Image recognition device - Google Patents


Info

Publication number: JP6983334B2
Application number: JP2020546756A
Authority: JP (Japan)
Prior art keywords: dimensional object, recognition, image, area, recognition device
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Japanese (ja)
Other versions: JPWO2020054260A1 (en)
Inventors: 郭介 牛場, 正幸 小林, 都 堀田, 裕史 大塚
Current Assignee: Hitachi Astemo Ltd
Original Assignee: Hitachi Astemo Ltd
Application filed by Hitachi Astemo Ltd


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Description

The present invention relates to an image recognition device.

In recent years, demand has grown for higher performance in the image recognition devices required for driving assistance and autonomous driving. For example, pedestrian collision safety functions face rising requirements, such as the addition of nighttime pedestrian collision safety tests to automobile assessment programs. Meeting these requirements calls for high recognition performance for three-dimensional objects such as pedestrians.

Patent Document 1 proposes a recognition device that, in a situation where a moving three-dimensional object overlaps another three-dimensional object, detects the moving object (such as a pedestrian) present inside a predetermined region containing the three-dimensional object by tracking feature points within that region.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2017-142760

However, conventional devices have the problem that recognition performance deteriorates when the three-dimensional object cannot be detected in its entirety.

According to a first aspect of the present invention, an image recognition device comprises: a three-dimensional object region setting unit that sets a three-dimensional object region on an image captured by an imaging unit by enlarging or reducing, on the image, a detection region of a three-dimensional object set on the image, based on detection characteristic information of the three-dimensional object; and a recognition processing unit that performs recognition processing on the three-dimensional object region set by the three-dimensional object region setting unit to identify the type of the three-dimensional object.
According to a second aspect of the present invention, an image recognition device comprises: a three-dimensional object region setting unit that sets a three-dimensional object region on an image captured by an imaging unit by enlarging or reducing, on the image, a detection region of a three-dimensional object set on the image, based on first characteristic information of the three-dimensional object; a recognition magnification setting unit that, taking the three-dimensional object region obtained by the three-dimensional object region setting unit as a reference size, defines recognition regions of a plurality of sizes based on second characteristic information of the three-dimensional object; a scanning region setting unit that, for the plurality of recognition regions defined by the recognition magnification setting unit, sets a plurality of scanning regions wider than the recognition regions, based on third characteristic information of the three-dimensional object; and a recognition processing unit that performs recognition processing to identify the type of the three-dimensional object using the scanning regions set by the scanning region setting unit.
According to a third aspect of the present invention, an image recognition device comprises: a three-dimensional object region setting unit that sets a three-dimensional object region by enlarging or reducing a detection region of a three-dimensional object set on an image captured by an imaging unit, based on detection characteristic information of the three-dimensional object; a recognition processing unit that performs recognition processing on the three-dimensional object region set by the three-dimensional object region setting unit to identify the type of the three-dimensional object; and a recognition magnification setting unit that, taking the three-dimensional object region obtained by the three-dimensional object region setting unit as a reference size, defines recognition regions of a plurality of sizes based on recognition characteristic information of the three-dimensional object, wherein the recognition processing unit performs the recognition processing on each of the recognition regions of the plurality of sizes defined by the recognition magnification setting unit.

According to the present invention, it is possible to provide an image recognition device that detects three-dimensional objects accurately and has improved recognition performance.

FIG. 1 is a block diagram showing the overall configuration of the image recognition device.
FIG. 2 is a flowchart showing the operation of the image recognition device.
FIG. 3 is a diagram showing the three-dimensional object region set on the image by the detection process.
FIG. 4 is a flowchart showing the details of the recognition process.
FIG. 5 is a diagram explaining the principle of the three-dimensional object region setting process.
FIG. 6 is a diagram explaining the principle of the recognition magnification setting process.
FIG. 7 is a diagram explaining normalization in the recognition magnification setting process.
FIG. 8 is a diagram explaining the principle of the scanning region setting process.
FIG. 9 is a diagram explaining the principle of the per-magnification scanning recognition process.
FIG. 10 is a diagram explaining the principle of the optimum magnification setting process.
FIG. 11 is a diagram explaining the principle of the detailed recognition position determination process.
FIG. 12 is a diagram explaining the principle of the detailed recognition process.
FIG. 13 is a block diagram showing the overall configuration of an image recognition device according to a modification.

FIG. 1 is a block diagram showing the overall configuration of an image recognition device 100 according to the present embodiment. The image recognition device 100 is mounted on a vehicle and includes a left camera 101 and a right camera 102 arranged on the left and right at the front of the vehicle. The cameras 101 and 102 constitute a stereo camera and capture images of three-dimensional objects such as pedestrians, vehicles, traffic lights, signs, white lines, car tail lamps, and headlights. The image recognition device 100 recognizes the environment outside the vehicle based on the image information of the area ahead captured by the cameras 101 and 102. The vehicle (own vehicle) then controls its brakes, steering, and the like based on the recognition results of the image recognition device 100.

The image recognition device 100 takes in the images captured by the cameras 101 and 102 through an image input interface 103. The image information taken in from the image input interface 103 is sent to an image processing unit 104 via an internal bus 109 and processed by an arithmetic processing unit 105, and intermediate results and final results such as image information are stored in a storage unit 106.

The image processing unit 104 compares the first image obtained from the image sensor of the left camera 101 with the second image obtained from the image sensor of the right camera 102, applies to each image corrections for device-specific deviations caused by the image sensors as well as image corrections such as noise interpolation, and stores the results in the storage unit 106. It further computes mutually corresponding points between the first and second images to obtain parallax information, and stores this in the storage unit 106 as distance information corresponding to each pixel on the image. The image processing unit 104 is connected to the arithmetic processing unit 105, a CAN interface 107, and a control processing unit 108 via the internal bus 109.
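The search for mutually corresponding points can be illustrated with a minimal one-row sum-of-absolute-differences (SAD) sketch. This is an assumption for illustration only: the patent does not specify the matching algorithm, and the window size, search range, and pixel values here are invented.

```python
def match_disparity(left_row, right_row, x, window=2, max_disp=5):
    """Disparity at column x: shift the right row and minimize SAD over a window."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - window < 0:
            break  # shifted window would fall outside the image
        cost = sum(abs(left_row[x + i] - right_row[x - d + i])
                   for i in range(-window, window + 1))
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

# Synthetic rectified row pair: the left image sees the same edge 3 pixels to the right
right_row = [0, 0, 10, 50, 90, 50, 10, 0, 0, 0, 0, 0]
left_row  = [0, 0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0]
print(match_disparity(left_row, right_row, 7))  # 3
```

Real implementations match over 2-D windows with sub-pixel refinement and consistency checks; the principle of comparing shifted windows is the same.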

The arithmetic processing unit 105 uses the image information and distance information (parallax information) stored in the storage unit 106 to recognize three-dimensional objects in order to grasp the environment around the vehicle. Some of the recognition results and intermediate processing results are stored in the storage unit 106. After recognizing three-dimensional objects in the captured images, the arithmetic processing unit 105 performs vehicle control calculations using the recognition results. The vehicle control policy obtained from these calculations, together with part of the recognition results, is transmitted to the in-vehicle network CAN 110 via the CAN interface 107, whereby the vehicle is controlled.

The control processing unit 108 monitors whether each processing unit is operating abnormally, whether errors have occurred during data transfer, and so on, and prevents abnormal operation. The image processing unit 104, the arithmetic processing unit 105, and the control processing unit 108 may be configured as a single computer unit or as multiple computer units.

FIG. 2 is a flowchart showing the operation of the image recognition device 100.
Images are captured by the left camera 101 and the right camera 102 of the image recognition device 100, and for each of the captured image information 203 and 204, image processing 205 is performed, including corrections to absorb the inherent characteristics of the image sensors. The results of image processing 205 are stored in an image buffer 206, which is provided in the storage unit 106 of FIG. 1.

Next, parallax processing 207 is performed. Specifically, the two corrected images are matched against each other to obtain parallax information for the images captured by the left camera 101 and the right camera 102. From the parallax of the left and right images, the distance to a point of interest on the image of a three-dimensional object is obtained by the principle of triangulation. Image processing 205 and parallax processing 207 are performed by the image processing unit 104 of FIG. 1, and the resulting image information and parallax information are stored in the storage unit 106.
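The triangulation step mentioned here follows the standard stereo relation Z = f * B / d (distance from focal length, baseline, and disparity). A minimal sketch; the focal length, baseline, and disparity values are hypothetical, not taken from the patent:

```python
def disparity_to_distance(disparity_px, focal_length_px, baseline_m):
    """Distance to a point from stereo disparity via triangulation: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: 1280-px focal length, 0.25 m baseline, 8-px disparity
z = disparity_to_distance(8.0, 1280.0, 0.25)
print(z)  # 40.0 (meters)
```

Note the inverse relation: the same one-pixel disparity error corresponds to a far larger distance error for distant objects, which is why later steps scale regions by distance.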

Then, in the next detection process 208, three-dimensional objects in three-dimensional space are detected using the parallax information, in which the parallax or distance of each pixel of the left and right images was obtained by the parallax processing 207. FIG. 3 is a diagram showing the detection regions of three-dimensional objects set on the image by the detection process 208: a pedestrian detection region 301 and a vehicle detection region 302 detected by the cameras 101 and 102. These detection regions 301 and 302 may be rectangles as shown in FIG. 3, or irregular regions obtained from the parallax or distance; they are generally treated as rectangles to simplify handling by a computer in subsequent processing. In the following description of this embodiment, the regions are treated as rectangles, and a pedestrian is mainly used as the example of a three-dimensional object.
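Reducing an irregular detected region to the rectangle used downstream can be sketched as an axis-aligned bounding box over the object's pixels (the pixel coordinates below are invented for illustration):

```python
def bounding_rect(pixels):
    """Axis-aligned bounding rectangle (x, y, w, h) of detected object pixels."""
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)

# Pixels belonging to one detected object (an irregular region)
region = [(10, 5), (11, 5), (10, 6), (12, 7), (11, 8)]
print(bounding_rect(region))  # (10, 5, 3, 4)
```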

Next, the recognition process 209 performs recognition to identify the type of the three-dimensional object in the detection region set on the image by the detection process 208. The objects to be recognized by the recognition process 209 are, for example, pedestrians, vehicles, traffic lights, signs, white lines, and the tail lamps and headlights of cars, and the process identifies which of these the object is. For the recognition process 209 to recognize three-dimensional objects stably, the detection region on the image must match the region of the target to be recognized. With cameras 101 and 102, however, the brightness of the external environment and variations in imaging performance between the cameras can prevent the regions on the image from matching exactly. The same applies when a radar, such as a millimeter-wave radar, is combined with an image sensor such as a camera. The details of the recognition process 209, which solves this problem, are described later.

Next, in the vehicle control process 210, taking into account the recognition result for the three-dimensional object and the state of the own vehicle (speed, steering angle, etc.), the system, for example, issues a warning to the occupants and controls the braking and steering angle of the own vehicle. Alternatively, it determines avoidance control for the recognized three-dimensional object and outputs the result as automatic control information via the CAN interface 107. The recognition process 209 and the vehicle control process 210 are performed by the arithmetic processing unit 105 of FIG. 1.

The programs shown in the flowchart of FIG. 2 and in the flowchart of FIG. 4 described later can be executed by a computer equipped with a CPU, memory, and the like. All or part of the processing may be realized by hard-wired logic circuits. Furthermore, the program can be provided pre-stored in the storage medium of the image recognition device 100. Alternatively, the program can be provided stored on an independent storage medium, or recorded and stored in the storage medium of the image recognition device 100 over a network line. It may also be supplied as a computer-readable computer program product in various forms, such as a data signal (carrier wave).

FIG. 4 is a flowchart showing the details of the recognition process 209. As shown in FIG. 4, it consists of a three-dimensional object region setting process 401, a recognition magnification setting process 402, a scanning region setting process 403, a per-magnification scanning recognition process 404, an optimum magnification setting process 406, a detailed recognition position determination process 407, and a detailed recognition process 408. Each process is described below in order. These processes are described assuming a stereo camera.

[Three-dimensional object region setting process]
In the three-dimensional object region setting process 401, the detection region 301 of the three-dimensional object obtained by the detection process 208 is enlarged or reduced based on the detection characteristic information of the three-dimensional object to set a three-dimensional object region 501.

FIG. 5 is a diagram explaining the principle of the three-dimensional object region setting process 401. It shows an example in which the pedestrian detection region 301 is enlarged based on the detection characteristic information of the three-dimensional object to set the three-dimensional object region 501. The detection characteristic information includes, for example: (1) the distinguishability of the three-dimensional object, (2) the distance to the object, (3) the size of the object, (4) the assumed size of the object, (5) the brightness of the external environment, (6) the direction of the headlights, (7) the height of the road surface on which the object stands, and (8) the sensor resolution. Each of these is described below.

(1) Distinguishability of the three-dimensional object: with cameras 101 and 102, for example, an object may be hard to extract depending on its combination with the background region, such as a pedestrian whose clothing is the same color as the road surface, or the crown of a pedestrian's head at night. Part of the object may also be lost when raindrops or the like blur the cameras' view of the target. In such cases, the detection region is enlarged. Conversely, a detected object may merge a person with a non-person region in three-dimensional space, such as a utility pole or fence on the road shoulder, which is difficult to separate before identification; in such cases, the detection region is reduced based on the color, brightness, and edges of the image. Further, with a different type of sensor such as radar, or when the cameras 101 and 102 are mounted at different heights, occlusion occurs mainly in the upper direction and the object appears smaller; with such a configuration, the detection region is enlarged.

(2) Distance to the object: if the object is far away, the three-dimensional object region is enlarged by a large amount; if it is near, it is reduced by a small amount. The scaling factor may be determined by the sensor resolution, including that of cameras 101 and 102, because the farther away the object is, the larger the three-dimensional size represented by one pixel and the larger the resulting error.

(3) Size of the object: if the object is small, the three-dimensional object region is enlarged; if the object is large, the region is reduced.

(4) Assumed size of the object: for example, assuming the object is a pedestrian, the region of an object that is too small for a pedestrian is enlarged, and the region of one that is too large is reduced. How far to extend these bounds may be decided taking (5) the brightness of the external environment and (6) the direction of the headlights into account. For example, the region is reduced in a bright daytime environment and enlarged in a dark nighttime environment. Depending on the headlight direction: if the headlights are on low beam, the light falls at the feet, so the region is enlarged in the height direction; if they are on high beam, the whole body is illuminated, so the region is reduced. The region may also be enlarged or reduced according to the distance to the object and (7) the height of the road surface where the object stands. For example, when the road surface is low and the headlights are on high beam, the light does not reach the feet, so the region is enlarged downward.

(8) Sensor resolution: if the sensors are cameras 101 and 102, the real-world size represented by one pixel changes with distance, so the region is enlarged or reduced in combination with the object's size and distance. For example, when the object is nearby, the per-pixel resolution of three-dimensional space is high, so the region is enlarged; when the object is far away, the per-pixel resolution is low, so the region is reduced. The region is also reduced depending on the characteristics of the region acquired as the three-dimensional object region, for example when the parallax region or sensor response region from which it was obtained tends to be set larger.

An example of setting the three-dimensional object region 501 based on the detection characteristic information will be described. For a pedestrian, the limb regions, where the image changes greatly, tend to be lost, making the region smaller. At night, the head of a black-haired person blends into the background and is hard to detect. In such cases, the three-dimensional object region 501 on the image is adjusted based on the three-dimensional size represented by one pixel. For example, in bright daylight, the region considered to be the head is extended by 0 cm and the foot region by 10 cm; at night, the head region is extended by 10 cm and the foot region by 10 cm; and within the reach of the vehicle's low beam, the head region is extended by 10 cm and the foot region by 0 cm. The width is likewise scaled as appropriate, and a correction may be applied based on the change in width over time. The recognition region may also be reduced depending on the content of the subsequent recognition process, for example when a pedestrian is recognized from the upper body alone. The amount of enlargement or reduction may be determined as a fixed ratio or as a size on the image, but setting it with reference to sizes in three-dimensional space makes it possible to exclude sizes that cannot correspond to the recognition target.
Note that, depending on the relationship between distance in three-dimensional space and pixels, the size of the enlarged or reduced three-dimensional object region 501 may end up the same as that of the detection region 301.
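The margin logic above can be sketched as follows. This is a simplified illustration: the pinhole-model conversion, the rectangle layout, and the condition handling are assumptions, with only the example margin values (day: head 0 cm, feet 10 cm; night: head 10 cm, feet 10 cm; low beam reaching the object: head 10 cm, feet 0 cm) taken from the text.

```python
def cm_to_pixels(margin_cm, distance_m, focal_length_px):
    """Convert a real-world margin to an image margin via the pinhole model."""
    return int(round(focal_length_px * (margin_cm / 100.0) / distance_m))

def expand_region(rect, distance_m, focal_length_px, is_night, low_beam_reaches):
    """Expand a detection rectangle (x, y, w, h) at the head and feet.

    The cm margins follow the example values in the text; the region grows
    upward (smaller y) for the head and downward (larger h) for the feet.
    """
    head_cm = 10 if is_night else 0
    feet_cm = 0 if (is_night and low_beam_reaches) else 10
    head_px = cm_to_pixels(head_cm, distance_m, focal_length_px)
    feet_px = cm_to_pixels(feet_cm, distance_m, focal_length_px)
    x, y, w, h = rect
    return (x, y - head_px, w, h + head_px + feet_px)

# Pedestrian at 20 m, 1280-px focal length, at night, outside low-beam reach:
# 10 cm maps to 6 px, so the box grows 6 px up and 6 px down
print(expand_region((100, 50, 40, 120), 20.0, 1280.0, True, False))  # (100, 44, 40, 132)
```

Working in centimeters and converting per distance, rather than using fixed pixel margins, is what lets the same rule apply to near and far pedestrians alike.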

The detection characteristic information has been described above. In the three-dimensional object region setting process 401, the three-dimensional object region 501 is set more accurately by combining several of these pieces of detection characteristic information. For example, combining the distance, the brightness of the external environment, the light direction, the road surface height, and the sensor resolution sets a three-dimensional object region 501 that is less affected by the difference between day and night. The pixel counts and sizes given here are examples and are not limiting.

[Recognition magnification setting process]
Next, the recognition magnification setting process 402 shown in FIG. 4 will be described.
If the three-dimensional object region 501 set by the three-dimensional object region setting process 401 is taken as the reference size for the recognition process 209, this reference size is not necessarily the optimum recognition region. The recognition magnification setting process 402 therefore defines recognition regions of a plurality of sizes using the recognition characteristic information of the three-dimensional object. Since the optimum recognition region is unknown at this point, the recognition regions of a plurality of sizes are defined by enlarging or reducing the region around the reference size. The recognition characteristic information includes, for example: (1) the distance to the object, (2) the size of the object, (3) the limit size of the object, and (4) the sensor resolution. Each of these is described below.

(1) Distance to the object: this serves as an index for deciding the amount by which to enlarge or reduce the recognition region. For example, when the object is far away, the object size per pixel is large. Since the input to the classifier 901 (described later) that performs the recognition process is an image, the number of pixels by which the region is enlarged or reduced is smaller for a distant object than for a nearby one. Recognition regions of a plurality of sizes are therefore defined by enlarging or reducing the region around the reference size according to the distance to the object.

(2) The size of the three-dimensional object is likewise an index for determining the amount of enlargement or reduction of the recognition area. For example, when the three-dimensional object is large, the number of pixels of enlargement or reduction on the image is smaller than when it is small. Also, when the recognition areas of a plurality of sizes are determined based on sizes in real space, the number of pixels of enlargement or reduction on the image is smaller for a distant object than for a nearby one. If the recognition area is not set at sub-pixel precision, a resulting area may coincide with the reference size.

(3) The limit size of the three-dimensional object is the size assumed as a limit when the three-dimensional object is a recognition target. For example, when the three-dimensional object is a pedestrian, if its measured height is large, such as exceeding 2.5 meters, areas are set in the reduction direction; conversely, if it is small, such as below 0.8 meters, areas are set in the enlargement direction. For intermediate heights, areas are set in both directions. The upper and lower limits of the areas to be set may be determined from the three-dimensional object to be recognized, constraints of the recognition process, and the like.

(4) As for the sensor resolution, when the sensors are the cameras 101 and 102, for example, the real-world size represented by one pixel changes with distance. The range of enlargement or reduction can therefore be determined based on the sensor resolution. For example, at long range, where one pixel corresponds to more than 20 cm in three-dimensional space, the enlargement or reduction range is kept to a small number of pixels, such as one or two pixels. Conversely, at short range, where one pixel corresponds to less than 1 cm, the area is enlarged or reduced over a larger range, such as 10 or 20 pixels.

The size on the image may also be obtained by back-calculation from the size of the three-dimensional object in three-dimensional space. For detection characteristic information that was not considered when setting the three-dimensional object area 501 in the three-dimensional object area setting process 401, variations of the recognition area may instead be set using that information in the recognition magnification setting process 402. In that case, the conditions under which the recognition area is enlarged or reduced are the same as those described for the three-dimensional object area setting process 401. The numbers of pixels and sizes given here are examples and are not limiting.
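The distance- and resolution-dependent choice of the enlargement/reduction step described in items (1) and (4) can be sketched as follows. This is an illustrative sketch, not part of the disclosed implementation: the pinhole model, the focal length `focal_px`, and the log-scale interpolation between the 20 cm and 1 cm breakpoints are assumptions (the breakpoint values themselves follow the examples in the text).

```python
import math

def metres_per_pixel(distance_m: float, focal_px: float) -> float:
    # Pinhole model: at distance Z, one pixel spans roughly Z / f metres.
    return distance_m / focal_px

def scale_step_pixels(distance_m: float, focal_px: float = 1400.0) -> int:
    """Choose how many pixels to enlarge/reduce the recognition area per step."""
    mpp = metres_per_pixel(distance_m, focal_px)
    if mpp > 0.20:   # far: one pixel covers more than 20 cm -> one-pixel steps
        return 1
    if mpp < 0.01:   # near: one pixel covers less than 1 cm -> 20-pixel steps
        return 20
    # In between, interpolate on a log scale (a design choice, not from the text).
    t = (math.log(0.20) - math.log(mpp)) / (math.log(0.20) - math.log(0.01))
    return max(1, round(1 + t * 19))
```

A nearer object thus gets a coarser step in pixels, matching the observation that one pixel then covers only millimetres in space.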

FIG. 6 illustrates the principle of the recognition magnification setting process 402. The recognition magnification setting process 402 takes the three-dimensional object area 501 as the reference-size recognition area and defines recognition areas 601 and 602 by enlarging or reducing it. The recognition area 501 has the reference size; the recognition area 601 is a reduced area with a small recognition magnification; and the recognition area 602 is an enlarged area with a large recognition magnification. Although FIG. 6 shows two recognition areas enlarged or reduced relative to the reference size, more variations may be used if the recognition processing time allows. Depending on the settings of the detection process 208 and the three-dimensional object area setting process 401, only enlargement or only reduction may be applied. The amount of enlargement or reduction of the recognition area is set based on the recognition characteristic information. As in the three-dimensional object area setting process 401, depending on the image resolution, a resulting area may coincide with the reference-size recognition area.

FIG. 7 illustrates normalization in the recognition magnification setting process 402.
As shown in FIG. 7, a region to which each recognition area (501, 601, 602) is normalized for the subsequent recognition process is defined. A recognition area indicates the range over which the recognition process, described later, is performed. The recognition area 501 has the reference size; the recognition area 601 is a reduced area with a small recognition magnification; and the recognition area 602 is an enlarged area with a large recognition magnification.

The recognition process requires its inputs to have a common dimensionality. There is no guarantee that the reference-size recognition area 501 captures the target object cleanly, and how the object should be framed depends on the characteristics of the recognition process implemented in the device. The region to normalize to is therefore set in advance. In the example of FIG. 7, the recognition area 501 roughly contains the head and feet, the smaller-magnification recognition area 601 cuts off the top of the head and the limbs, and the larger-magnification recognition area 602 conversely leaves margins above the head and below the feet. Normalizing these recognition areas to the same size yields the normalized recognition areas 701, 702, and 703 shown in FIG. 7, so that the later recognition process can treat them uniformly. This normalization need not be performed in the recognition magnification setting process 402 itself; it may be carried out as part of the per-magnification scan recognition process 404 or the detailed recognition process 408 described later.
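The normalization to a common input dimensionality can be illustrated as a nearest-neighbour resize of each recognition area to a fixed patch size. This is a minimal sketch under assumed conventions (a box given as (x, y, w, h) on a row-major image); the actual device may use different resampling, possibly inside processes 404 or 408.

```python
def normalize_crop(image, box, out_w=32, out_h=64):
    """Resize the recognition area `box` = (x, y, w, h) of `image`
    (a list of pixel rows) to a fixed out_w x out_h patch by
    nearest-neighbour sampling, so every candidate area presents the
    same dimensionality to the classifier."""
    x, y, w, h = box
    return [[image[y + (j * h) // out_h][x + (i * w) // out_w]
             for i in range(out_w)]
            for j in range(out_h)]
```

Applied to the areas 501, 601, and 602, this yields same-size patches playing the role of the normalized areas 701, 702, and 703.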

[Scanning area setting process]
Next, the scanning area setting process 403 of FIG. 4 will be described. The scanning area setting process 403 sets, for each recognition area, a scanning area larger than the recognition area based on the arrangement characteristic information of the three-dimensional object. The scanning area is set as an area on the image, and in the recognition process the recognition area is swept over the set scanning area. That is, the recognition area indicates the range over which one evaluation of the later recognition process is performed, and the scanning area is the range within which the recognition area is moved; recognition is performed while moving the recognition area within the scanning area. The arrangement characteristic information that determines the size of the scanning area includes, for example, (1) the distance of the three-dimensional object and (2) the height of the road surface on which the three-dimensional object stands. These are described below.

(1) The distance of the three-dimensional object is an index for setting the scanning area. For example, when the three-dimensional object is near, the scanning area on the image is made large; when it is far, the scanning area is made small. This is because near the camera the sensor resolution is high and shifting the area by one pixel moves it only a few millimetres in three-dimensional space, whereas at long range one pixel corresponds to more than 10 cm. The scanning area is also determined by characteristics such as the positional error introduced by the three-dimensional object detection. For example, when using a recognition process that performs best when centred on the lateral centre of the object, the scanning area may be set so that the lateral centre of the actual recognition target falls within it, based on the offset and variance between the detected lateral centre of the three-dimensional object and the true lateral centre of the recognition target.

(2) The height of the road surface on which the three-dimensional object stands is another index for setting the scanning area. For example, when the road surface rises and the three-dimensional object (e.g., a pedestrian) is higher than the own vehicle, occlusion on the head side increases and the measured height comes out smaller than the actual height. When the object is at a low position, the feet may fall outside the field of view or be hidden by the bumper, depending on the angle of view. The scanning area is enlarged or reduced to match such conditions.

Detection characteristic information or recognition characteristic information that was not used when setting the three-dimensional object area or the recognition areas in the three-dimensional object area setting process 401 and the recognition magnification setting process 402 may instead be used in the scanning area setting process 403 to determine the scanning area. In that case, the conditions under which the scanning area is enlarged or reduced are the same as in the processes 401 and 402. The numbers of pixels and sizes given here are examples and are not limiting.

FIG. 8 illustrates the principle of the scanning area setting process 403. The scanning area setting process 403 defines scanning areas 801, 802, and 803 for the recognition areas 501, 601, and 602, respectively. The scanning areas 801, 802, and 803 are the same size as or larger than the recognition areas 501, 601, and 602. However, since the scanning areas 801, 802, and 803 are swept by the recognition areas 501, 601, and 602, a larger scanning area does not necessarily mean a large amount of scanning. The scanning areas 801, 802, and 803 are defined as areas on the image from the arrangement characteristic information. Depending on the image resolution, a recognition area and its scanning area may coincide on the image. A scanning area is determined individually for each recognition area, but if processing time allows, the single largest scanning area may be adopted for all of them; conversely, when processing time is tight, one small scanning area may be applied to every recognition area.
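Growing a recognition area into its scanning area, clipped to the image bounds, can be sketched as follows. The symmetric per-side margin `margin_px` stands in for the amount derived from the arrangement characteristic information and is an assumption of this sketch.

```python
def scanning_area(box, margin_px, img_w, img_h):
    """Grow the recognition area `box` = (x, y, w, h) by `margin_px`
    on every side to form the scanning area, clipped to the image."""
    x, y, w, h = box
    sx, sy = max(0, x - margin_px), max(0, y - margin_px)
    ex = min(img_w, x + w + margin_px)
    ey = min(img_h, y + h + margin_px)
    return (sx, sy, ex - sx, ey - sy)
```

With `margin_px = 0` (e.g., at coarse resolution) the scanning area coincides with the recognition area, as noted above.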

[Per-magnification scan recognition process]
Next, the per-magnification scan recognition process 404 shown in FIG. 4 will be described. The per-magnification scan recognition process 404 sweeps the recognition areas 501, 601, and 602 over the image and parallax (distance) regions corresponding to the scanning areas 801, 802, and 803, executes the recognition process at each scanning position of each size, and determines whether the object at that scanning position is the target three-dimensional object.

Here, if the performance of the recognition process is sufficient, the vehicle control process 210 may be executed using the result of the per-magnification scan recognition process 404, as shown by the broken line 405 in FIG. 4. The process 404 may yield multiple results depending on the magnification, the scanning position, and so on; these are narrowed down, for example, by selecting the single result for which recognition was best.

FIG. 9 illustrates the principle of the per-magnification scan recognition process 404. While sweeping each of the scanning areas 801, 802, and 803 with the recognition areas 501, 601, and 602, the response positions 902 recognized by the classifier 901, which performs the recognition process, are obtained. The response positions 902 are marked with x in FIG. 9. The larger the number of response positions 902, the better the recognition result. In the example of recognizing the scanning areas 801, 802, and 803 with the classifier 901, the scanning area 801' yields the most responses, as shown by the response positions 902 in the scanned areas 801', 802', and 803'.
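The sweep of one recognition area over its scanning area can be sketched as a sliding-window loop. The classifier 901 is abstracted as a hypothetical `score_fn` callback, and the step and threshold values are illustrative, not from the disclosure.

```python
def scan(window, scan_area, score_fn, step=2, threshold=0.5):
    """Slide a (w, h) recognition window over the scanning area
    (x, y, W, H) and collect the response positions where the
    classifier's score reaches the threshold."""
    w, h = window
    x0, y0, W, H = scan_area
    hits = []
    for y in range(y0, y0 + H - h + 1, step):
        for x in range(x0, x0 + W - w + 1, step):
            if score_fn(x, y, w, h) >= threshold:
                hits.append((x, y))
    return hits
```

Running this once per magnification yields the per-scale response sets that the later processes compare.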

The classifier 901 may use machine learning or a heuristic threshold decision. If this decision result is sufficient, recognition may end with this result, as shown by the broken line 405 in FIG. 4; in that case, for example, the result with the best recognition score is adopted.

When the performance of the per-magnification scan recognition process 404 is insufficient, for example because the computational cost of the recognition process has been reduced, detailed processing may be performed using its results. In the present embodiment, a case is described in which the optimum magnification setting process 406, the detailed recognition position determination process 407, and the detailed recognition process 408 are provided as the detailed processing.

[Optimum magnification setting process]
The optimum magnification setting process 406 shown in FIG. 4 selects, from the recognition areas of a plurality of sizes created by the recognition magnification setting process 402, the recognition area best suited to the detailed recognition process. The selection uses, for example, the number and confidence of positions judged to be the recognition target in the scan results, the number and confidence of positions judged not to be the target, and the distribution of the recognition results; the quantity and confidence of the responses are compared across the recognition areas of the different sizes, and the best recognition area is used. The optimum magnification setting process 406 may be omitted if there is not enough processing time.

FIG. 10 illustrates the principle of the optimum magnification setting process 406. From the recognition results at the plurality of magnifications, the optimum magnification, i.e., the one with the best response, is selected. As described above, the optimum magnification is selected using the number of responses in each scanning area and their confidence. In the example of FIG. 10, the scanned area 801', which had the most responses, is selected; this area corresponds to the reference-size scanning area 801, which in turn corresponds to the reference-size recognition area 501.
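Selecting the optimum magnification from the per-scale responses can be sketched as follows, scoring each scale by the confidence-weighted number of responses. This particular scoring rule is an assumption; the text only requires that response counts and confidences be compared across scales.

```python
def best_scale(results):
    """Pick the scale whose scan produced the strongest evidence.
    `results` maps a scale label to a list of (position, confidence)
    responses; the score is the sum of confidences, which grows with
    both the number of responses and their reliability."""
    def score(hits):
        return sum(conf for _, conf in hits)
    return max(results, key=lambda s: score(results[s]))
```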

[Detailed recognition position determination process]
The detailed recognition position determination process 407 shown in FIG. 4 determines, for the optimum magnification obtained by the optimum magnification setting process 406, a representative position at which detailed recognition is performed. For example, the position with the highest recognition confidence obtained by the per-magnification scan recognition process 404 is chosen, or the position may be determined using a clustering method such as the mean shift method. When the optimum magnification setting process 406 is not performed, the detailed recognition position determination process 407 may be carried out for each magnification.

FIG. 11 illustrates the principle of the detailed recognition position determination process 407. From the one or more response positions obtained by the per-magnification scan recognition process 404, the representative position 111 at which the detailed recognition process 408 is performed is determined. When a plurality of response points exist, a clustering technique such as the mean shift method is used. The area centred on the determined representative position 111 becomes the detailed identification area.
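A crude single-mode variant of the mean shift method mentioned above, applied to the 2-D response positions, could look like the following. The flat kernel, the bandwidth value, and the centroid initialization are assumptions of this sketch, not the disclosed algorithm.

```python
def mean_shift_mode(points, bandwidth=3.0, iters=20):
    """Estimate one dominant mode of the 2-D response positions:
    start at the centroid and repeatedly move to the mean of the
    points lying within `bandwidth` of the current estimate."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    for _ in range(iters):
        near = [p for p in points
                if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= bandwidth ** 2]
        if not near:
            break
        nx = sum(p[0] for p in near) / len(near)
        ny = sum(p[1] for p in near) / len(near)
        if (nx, ny) == (cx, cy):
            break
        cx, cy = nx, ny
    return cx, cy
```

The returned mode plays the role of the representative position 111; an isolated spurious response far from the cluster is effectively ignored.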

[Detailed recognition process]
The detailed recognition process 408 shown in FIG. 4 performs detailed recognition at the representative position 111 determined by the detailed recognition position determination process 407, and calculates the type of the target and the confidence. Alternatively, detailed recognition is performed using the recognition area of the optimum size selected based on the response positions from the per-magnification scan recognition process 404. The detailed recognition process 408 uses a classifier 120 whose classification performance is equal to or better than that of the recognition process used in the per-magnification scan recognition process 404.

FIG. 12 illustrates the principle of the detailed recognition process 408. Detailed recognition is performed with the classifier 120 at the representative position 111 obtained by the detailed recognition position determination process 407, and the type of the three-dimensional object is determined. Types of three-dimensional objects include, for example, pedestrians, vehicles, traffic lights, signs, white lines, and vehicle tail lamps and headlights.

Examples of recognition techniques usable in the per-magnification scan recognition process 404 and the detailed recognition process 408 include: template matching, which compares the recognition area with a template prepared in advance that characterizes the recognition target; and classifiers that combine features such as luminance images, HOG, and Haar-like features with machine learning methods such as support vector machines, AdaBoost, and deep learning. Edge shapes and the like may also be recognized with manually determined threshold decisions. The per-magnification scan recognition process 404 and the detailed recognition process 408 include the image processing needed to carry these out, such as resizing, smoothing, edge extraction, normalization, isolated-point removal, gradient extraction, color conversion, and histogram creation.
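As a toy stand-in for the template matching mentioned above, a sum-of-absolute-differences score between a normalized recognition area and a template can be written as follows (lower score = better match); the patch/template representation as lists of pixel rows is an assumption of the sketch.

```python
def template_match(patch, template):
    """Score a normalized recognition area against a template of the
    same size by the sum of absolute pixel differences."""
    return sum(abs(a - b)
               for prow, trow in zip(patch, template)
               for a, b in zip(prow, trow))
```

In practice a feature-based classifier (e.g., HOG features with an SVM) would typically replace this raw-pixel scoring.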

(Modification)
The present embodiment has been described using the image recognition device 100 with a stereo camera. However, it may also be realized with an image recognition device 100' that does not use a stereo camera.
FIG. 13 shows the processing operations in the image recognition device 100'. Parts identical to those of the image recognition device 100 shown in FIG. 2 are given the same reference numerals, and their description is omitted.

The image recognition device 100' includes an optical camera 1301 and a radar sensor 1302, with which it detects three-dimensional objects. An image is captured by the optical camera 1301, and the captured image information undergoes image processing 205, such as corrections that absorb the characteristics peculiar to the image sensor. The result of the image processing 205 is stored in the image buffer 206. The radar sensor 1302 provides the distance to a three-dimensional object, and the detection process 1303 detects the three-dimensional object in three-dimensional space based on that distance. The recognition process 209 then performs recognition that identifies the type of the three-dimensional object within the detection area set by the detection process 1303.

The detection process 1303, which takes as input the distance to the three-dimensional object output by the radar sensor 1302, must account for the sensor characteristics of the radar sensor 1302 used for distance measurement; however, the processing after the detection area has been determined can be the same as in the stereo-camera configuration described for the image recognition device 100. The image recognition device 100' also does not require a plurality of images in the image processing 205.

According to the embodiment described above, the following effects are obtained.
(1) The image recognition devices 100 and 100' include: a three-dimensional object area setting process 401 that sets a three-dimensional object area 501 by enlarging or reducing, based on detection characteristic information of the three-dimensional object, the detection area 301 of the three-dimensional object set on the image captured by the cameras 101 and 102; and a recognition process 209 that identifies the type of the three-dimensional object within the three-dimensional object area 501 set by the process 401. The detection characteristic information is, for example, at least one of the distinguishability of the three-dimensional object, the distance to the three-dimensional object, the size of the three-dimensional object, the assumed size of the three-dimensional object, the brightness of the external environment, the direction of the headlights, the height of the road surface on which the three-dimensional object exists, and the sensor resolution of the imaging unit. This provides an image recognition device that detects three-dimensional objects accurately and has improved recognition performance.

(2) The image recognition devices 100 and 100' include: a three-dimensional object area setting process 401 that sets a three-dimensional object area 501 by enlarging or reducing, based on first characteristic information of the three-dimensional object, the detection area 301 of the three-dimensional object set on the image captured by the cameras 101 and 102; a recognition magnification setting process 402 that, taking the three-dimensional object area 501 obtained by the process 401 as a reference size, defines recognition areas 601 and 602 of a plurality of sizes based on second characteristic information of the three-dimensional object; a scanning area setting process 403 that, for the plurality of recognition areas 601 and 602 defined by the process 402, sets a plurality of scanning areas 802 and 803 wider than the recognition areas 601 and 602 based on third characteristic information of the three-dimensional object; and a recognition process 209 that performs recognition using the scanning areas 802 and 803 set by the process 403. The first to third characteristic information is, for example, at least one of the distinguishability of the three-dimensional object, the distance to the three-dimensional object, the size of the three-dimensional object, the assumed size of the three-dimensional object, the brightness of the external environment, the direction of the headlights, the height of the road surface on which the three-dimensional object exists, the sensor resolution of the imaging unit, the limit size of the three-dimensional object, and the distance position of the three-dimensional object. This provides an image recognition device that detects three-dimensional objects accurately and has improved recognition performance.

The present invention is not limited to the above embodiment; other forms conceivable within the scope of the technical idea of the present invention are also included within its scope, as long as the features of the present invention are not impaired. The above embodiment and the modification may also be combined.

100, 100' image recognition device; 101, 102 camera; 103 image input interface; 104 image processing unit; 105 arithmetic processing unit; 106 storage unit; 107 CAN interface; 108 control processing unit; 109 internal bus; 110 in-vehicle network CAN

Claims (21)

1. An image recognition device comprising:
a three-dimensional object area setting unit that sets a three-dimensional object area on an image captured by an imaging unit by enlarging or reducing, on the image, a detection area of a three-dimensional object set on the image, based on detection characteristic information of the three-dimensional object; and
a recognition processing unit that performs recognition processing on the three-dimensional object area set by the three-dimensional object area setting unit to identify a type of the three-dimensional object.
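The claimed three-dimensional object area setting amounts to resizing a detected bounding box about its center. The patent contains no code, so the following Python sketch is purely illustrative; `Box`, `set_object_area`, and the scale value are invented for the example, and a real device would derive the scale from the detection characteristic information listed in claim 2.

```python
from dataclasses import dataclass

@dataclass
class Box:
    x: int      # left edge in pixels
    y: int      # top edge in pixels
    w: int      # width in pixels
    h: int      # height in pixels

def set_object_area(detection: Box, scale: float) -> Box:
    """Enlarge (scale > 1) or reduce (scale < 1) a detection box about
    its center, as the three-dimensional object area setting unit does."""
    cx = detection.x + detection.w / 2
    cy = detection.y + detection.h / 2
    w = detection.w * scale
    h = detection.h * scale
    return Box(round(cx - w / 2), round(cy - h / 2), round(w), round(h))

# Example: a distant or poorly distinguished object gets a larger margin.
area = set_object_area(Box(100, 80, 40, 80), scale=1.25)
```

Resizing about the center keeps the candidate object centered while adding (or trimming) margin for the later recognition pass.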
2. The image recognition device according to claim 1, wherein the detection characteristic information is at least one of: distinctiveness of the three-dimensional object, a distance to the three-dimensional object, a size of the three-dimensional object, an assumed size of the three-dimensional object, brightness of the external environment, a direction of a headlight, a height of a road surface on which the three-dimensional object exists, and a sensor resolution of the imaging unit.
3. The image recognition device according to claim 1 or claim 2, further comprising a recognition magnification setting unit that defines recognition areas of a plurality of sizes based on recognition characteristic information of the three-dimensional object, using the three-dimensional object area obtained by the three-dimensional object area setting unit as a reference size,
wherein the recognition processing unit performs the recognition processing on each of the recognition areas of the plurality of sizes defined by the recognition magnification setting unit.
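The recognition areas of a plurality of sizes in claim 3 can be pictured as several candidate window sizes generated around the reference size. A minimal sketch, assuming invented magnification factors (the patent does not specify any):

```python
def recognition_sizes(reference_h: int, magnifications=(0.8, 1.0, 1.25)) -> list[int]:
    """Derive candidate recognition-window heights from the reference
    three-dimensional object area; one recognition pass runs per size."""
    return [max(1, round(reference_h * m)) for m in magnifications]

sizes = recognition_sizes(100)   # → [80, 100, 125]
```

In the claimed device the magnification set would itself depend on the recognition characteristic information of claim 4, such as distance and sensor resolution.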
4. The image recognition device according to claim 3, wherein the recognition characteristic information is at least one of: a distance to the three-dimensional object, a size of the three-dimensional object, a limit size of the three-dimensional object, and a sensor resolution of the imaging unit.
5. The image recognition device according to claim 3, further comprising a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas, based on arrangement characteristic information of the three-dimensional object,
wherein the recognition processing unit performs the recognition processing using the scanning areas set by the scanning area setting unit.
6. The image recognition device according to claim 5, wherein the arrangement characteristic information is at least one of: a perspective position of the three-dimensional object and a road surface height at which the three-dimensional object exists.
7. The image recognition device according to claim 5, further comprising a per-magnification scan recognition processing unit that scans the plurality of scanning areas with the plurality of recognition areas to obtain response positions of recognition results.
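Claims 5 and 7 together describe sliding each recognition area over a wider scanning area and recording where the classifier responds most strongly. The sketch below is an assumption-laden toy: `score_fn` stands in for a real recognition classifier, and the sizes and stride are invented.

```python
def scan_area(score_fn, scan_w, scan_h, win_w, win_h, stride=4):
    """Slide a recognition window over a wider scanning area and return
    the window position with the strongest classifier response."""
    best_pos, best_score = None, float("-inf")
    for y in range(0, scan_h - win_h + 1, stride):
        for x in range(0, scan_w - win_w + 1, stride):
            s = score_fn(x, y)
            if s > best_score:
                best_pos, best_score = (x, y), s
    return best_pos, best_score

# Toy response map peaked at (8, 4); a real unit would score image patches.
peak = lambda x, y: -((x - 8) ** 2 + (y - 4) ** 2)
pos, score = scan_area(peak, scan_w=20, scan_h=12, win_w=4, win_h=4)
```

Running one such scan per recognition-area size yields the per-magnification response positions that claim 7's scan recognition processing unit collects.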
8. The image recognition device according to claim 7, further comprising an optimum magnification setting unit that selects the recognition area of an optimum size based on the response positions obtained by the per-magnification scan recognition processing unit.
9. The image recognition device according to claim 8, further comprising a detailed recognition position determination processing unit that determines, based on the response position in the scanning area corresponding to the recognition area selected by the optimum magnification setting unit, a representative position at which the recognition processing is performed.
10. The image recognition device according to claim 9, further comprising a detailed recognition processing unit that performs the recognition processing using the recognition area of the optimum size or the representative position to identify the type of the three-dimensional object to be recognized.
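Claims 8 to 10 then reduce the per-magnification results to a single choice: the magnification whose scan responded most strongly wins, and its response position becomes the representative position for detailed recognition. A minimal sketch with invented scores (the patent does not prescribe this selection rule in code form):

```python
def best_magnification(responses: dict) -> float:
    """Select the magnification whose scan produced the strongest
    classifier response, as the optimum magnification setting unit does."""
    return max(responses, key=responses.get)

# Invented peak scores for three candidate magnifications.
m = best_magnification({0.8: 0.42, 1.0: 0.77, 1.25: 0.61})  # → 1.0
```

The recognition area at the selected magnification, or the representative position derived from its response, then feeds the detailed recognition processing unit of claim 10.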
11. An image recognition device comprising:
a three-dimensional object area setting unit that sets a three-dimensional object area on an image captured by an imaging unit by enlarging or reducing, on the image, a detection area of a three-dimensional object set on the image, based on first characteristic information of the three-dimensional object;
a recognition magnification setting unit that defines recognition areas of a plurality of sizes based on second characteristic information of the three-dimensional object, using the three-dimensional object area obtained by the three-dimensional object area setting unit as a reference size;
a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas, based on third characteristic information of the three-dimensional object; and
a recognition processing unit that performs recognition processing using the scanning areas set by the scanning area setting unit to identify a type of the three-dimensional object.
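Claim 11 chains the stages: detection area to object area, object area to multi-size recognition areas, recognition areas to wider scanning areas, then recognition. As a rough one-dimensional ordering sketch (window heights only; every number and name is invented, not taken from the patent):

```python
def pipeline(det_h, first_scale, mags, margin):
    """Order of operations in the combined device of claim 11, reduced
    to window heights: area setting, magnification setting, scan areas."""
    area_h = round(det_h * first_scale)           # object area setting unit
    rec_hs = [round(area_h * m) for m in mags]    # recognition magnification setting unit
    scan_hs = [h + 2 * margin for h in rec_hs]    # scanning area setting unit
    return area_h, rec_hs, scan_hs

area_h, rec_hs, scan_hs = pipeline(80, 1.25, (0.8, 1.0, 1.25), 8)
```

The three characteristic-information inputs of claim 11 would govern `first_scale`, `mags`, and `margin` respectively.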
12. The image recognition device according to claim 11, wherein each of the first characteristic information, the second characteristic information, and the third characteristic information is at least one of: distinctiveness of the three-dimensional object, a distance to the three-dimensional object, a size of the three-dimensional object, an assumed size of the three-dimensional object, brightness of the external environment, a direction of a headlight, a height of a road surface on which the three-dimensional object exists, a sensor resolution of the imaging unit, a limit size of the three-dimensional object, a perspective position of the three-dimensional object, and a road surface height at which the three-dimensional object exists.
13. An image recognition device comprising:
a three-dimensional object area setting unit that sets a three-dimensional object area by enlarging or reducing, based on detection characteristic information of a three-dimensional object, a detection area of the three-dimensional object set on an image captured by an imaging unit;
a recognition processing unit that performs recognition processing on the three-dimensional object area set by the three-dimensional object area setting unit to identify a type of the three-dimensional object; and
a recognition magnification setting unit that defines recognition areas of a plurality of sizes based on recognition characteristic information of the three-dimensional object, using the three-dimensional object area obtained by the three-dimensional object area setting unit as a reference size,
wherein the recognition processing unit performs the recognition processing on each of the recognition areas of the plurality of sizes defined by the recognition magnification setting unit.
14. The image recognition device according to claim 13, wherein the detection characteristic information is at least one of: distinctiveness of the three-dimensional object, a distance to the three-dimensional object, a size of the three-dimensional object, an assumed size of the three-dimensional object, brightness of the external environment, a direction of a headlight, a height of a road surface on which the three-dimensional object exists, and a sensor resolution of the imaging unit.
15. The image recognition device according to claim 13, wherein the recognition characteristic information is at least one of: a distance to the three-dimensional object, a size of the three-dimensional object, a limit size of the three-dimensional object, and a sensor resolution of the imaging unit.
16. The image recognition device according to claim 13, further comprising a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas, based on arrangement characteristic information of the three-dimensional object,
wherein the recognition processing unit performs the recognition processing using the scanning areas set by the scanning area setting unit.
17. The image recognition device according to claim 16, wherein the arrangement characteristic information is at least one of: a perspective position of the three-dimensional object and a road surface height at which the three-dimensional object exists.
18. The image recognition device according to claim 16, further comprising a per-magnification scan recognition processing unit that scans the plurality of scanning areas with the plurality of recognition areas to obtain response positions of recognition results.
19. The image recognition device according to claim 18, further comprising an optimum magnification setting unit that selects the recognition area of an optimum size based on the response positions obtained by the per-magnification scan recognition processing unit.
20. The image recognition device according to claim 19, further comprising a detailed recognition position determination processing unit that determines, based on the response position in the scanning area corresponding to the recognition area selected by the optimum magnification setting unit, a representative position at which the recognition processing is performed.
21. The image recognition device according to claim 20, further comprising a detailed recognition processing unit that performs the recognition processing using the recognition area of the optimum size or the representative position to identify the type of the three-dimensional object to be recognized.
JP2020546756A 2018-09-12 2019-08-06 Image recognition device Active JP6983334B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2018170737 2018-09-12
JP2018170737 2018-09-12
PCT/JP2019/030823 WO2020054260A1 (en) 2018-09-12 2019-08-06 Image recognition device

Publications (2)

Publication Number Publication Date
JPWO2020054260A1 JPWO2020054260A1 (en) 2021-08-30
JP6983334B2 true JP6983334B2 (en) 2021-12-17

Family

ID=69777082

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2020546756A Active JP6983334B2 (en) 2018-09-12 2019-08-06 Image recognition device

Country Status (3)

Country Link
JP (1) JP6983334B2 (en)
CN (1) CN112639877A (en)
WO (1) WO2020054260A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022113470A1 (en) * 2020-11-30 2022-06-02 日立Astemo株式会社 Image processing device and image processing method

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005316607A (en) * 2004-04-27 2005-11-10 Toyota Motor Corp Image processor and image processing method
JP5090321B2 (en) * 2008-11-28 2012-12-05 日立オートモティブシステムズ株式会社 Object detection device
JP5210233B2 (en) * 2009-04-14 2013-06-12 日立オートモティブシステムズ株式会社 Vehicle external recognition device and vehicle system using the same
JP2013161241A (en) * 2012-02-03 2013-08-19 Toyota Motor Corp Object recognition device and object recognition method
BR112014020407B1 (en) * 2012-03-01 2021-09-14 Nissan Motor Co., Ltd THREE-DIMENSIONAL OBJECT DETECTION DEVICE
EP2940656B1 (en) * 2012-12-25 2024-01-31 Honda Motor Co., Ltd. Vehicle periphery monitoring device
JP6163453B2 (en) * 2014-05-19 2017-07-12 本田技研工業株式会社 Object detection device, driving support device, object detection method, and object detection program
JP6397801B2 (en) * 2015-06-30 2018-09-26 日立オートモティブシステムズ株式会社 Object detection device
CN107993256A (en) * 2017-11-27 2018-05-04 广东工业大学 Dynamic target tracking method, apparatus and storage medium

Also Published As

Publication number Publication date
JPWO2020054260A1 (en) 2021-08-30
WO2020054260A1 (en) 2020-03-19
CN112639877A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
US8908924B2 (en) Exterior environment recognition device and exterior environment recognition method
US7957559B2 (en) Apparatus and system for recognizing environment surrounding vehicle
JP4708124B2 (en) Image processing device
JP5690688B2 (en) Outside world recognition method, apparatus, and vehicle system
US8848980B2 (en) Front vehicle detecting method and front vehicle detecting apparatus
KR101848019B1 (en) Method and Apparatus for Detecting Vehicle License Plate by Detecting Vehicle Area
JP5223675B2 (en) Vehicle detection device, vehicle detection method, and vehicle detection program
JP5886809B2 (en) Outside environment recognition device
JP2015143979A (en) Image processor, image processing method, program, and image processing system
US20200074212A1 (en) Information processing device, imaging device, equipment control system, mobile object, information processing method, and computer-readable recording medium
JP6687039B2 (en) Object detection device, device control system, imaging device, object detection method, and program
JP7032280B2 (en) Pedestrian crossing marking estimation device
JP6983334B2 (en) Image recognition device
US11054245B2 (en) Image processing apparatus, device control system, imaging apparatus, image processing method, and recording medium
JP4969359B2 (en) Moving object recognition device
JP7229032B2 (en) External object detection device
JP7261006B2 (en) External environment recognition device
JP7201706B2 (en) Image processing device
JP5170058B2 (en) Object detection device
JP6378547B2 (en) Outside environment recognition device
JP6273156B2 (en) Pedestrian recognition device
JP6582891B2 (en) Empty vehicle frame identification system, method and program
WO2018097269A1 (en) Information processing device, imaging device, equipment control system, mobile object, information processing method, and computer-readable recording medium
JP2021051348A (en) Object distance estimation apparatus and object distance estimation method
JP7460282B2 (en) Obstacle detection device, obstacle detection method, and obstacle detection program

Legal Events

Date Code Title Description
A621 Written request for application examination (JAPANESE INTERMEDIATE CODE: A621); effective date: 2020-12-25
A131 Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131); effective date: 2021-08-24
A521 Written amendment (JAPANESE INTERMEDIATE CODE: A523); effective date: 2021-10-18
TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model) (JAPANESE INTERMEDIATE CODE: A01); effective date: 2021-11-02
A61 First payment of annual fees (during grant procedure) (JAPANESE INTERMEDIATE CODE: A61); effective date: 2021-11-22
R150 Certificate of patent or registration of utility model (JAPANESE INTERMEDIATE CODE: R150); ref document number: 6983334; country of ref document: JP