JP4692344B2 - Image recognition device

Info

Publication number: JP4692344B2
Application number: JP2006075386A
Authority: JP
Inventors: 映夫深町; 貴司服部; 守古田
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 2006-03-17
Filing date: 2006-03-17
Publication date: 2011-06-01
Anticipated expiration: 2026-03-17
Also published as: JP2007249841A

Description

The present invention relates to an image recognition apparatus, and more particularly to an image recognition apparatus that identifies a subject by pattern recognition.

A system is known in which a camera mounted on a vehicle photographs the road, pedestrians and the like in the frame are detected from the acquired image information, and the driver is notified (see, for example, Patent Document 1). Patent Document 1 describes a technique that acquires an infrared image and a visible-light image, identifies the road region from the visible-light image, and detects moving objects (pedestrians, etc.) within the road region from the infrared image of that region by template matching or a similar method.
Patent Document 1: JP 2002-99997 A

In this technique, a moving body region that appears to be a pedestrian is cut out from the infrared camera image, and pedestrians are detected by template matching. With this method, however, a person riding a bicycle or motorcycle cannot be distinguished from a pedestrian: only the rider's region is cut out as a moving body region, separately from the bicycle or motorcycle, and is treated as a pedestrian candidate. As a result, the template matching processing time increases, and riders who are not pedestrians may be misrecognized as pedestrians.

An object of the present invention is therefore to provide an image recognition apparatus that distinguishes bicycles, motorcycles, and their riders from pedestrians with improved accuracy.

To solve the above problem, an image recognition apparatus according to the present invention recognizes a subject in an image and comprises: a dictionary group including a plurality of dictionaries, each storing templates of subjects; dictionary selection means that extracts optical flow vectors in the image, extracts a moving body region by extracting pixel points whose vectors are similar, and selects from the dictionary group a dictionary to be used for subject recognition based on the moving speed of that region; and recognition means that recognizes the subject by comparing the extracted moving body region with the templates stored in the dictionary selected by the dictionary selection means.

According to the present invention, a dictionary (that is, a set of templates) appropriate to the moving speed of the subject is used for template matching. The templates are stored in dictionaries organized by subject type, and the dictionary selection means preferably either switches the dictionary to be used based on the moving speed of the subject, or switches the combination of dictionaries to be used and their priority order based on that speed. Switching the dictionary or changing the priority order effectively reduces the number of templates that must be matched.

The dictionary selection means preferably calculates the moving speed of the subject from a plurality of time-series images whose capture times are known. For example, the optical flow may be calculated, and the actual moving speed of the subject may be obtained from the distance moved within the image and the distance between the subject and the camera.

According to the present invention, because a template dictionary appropriate to the moving speed of the subject is used, misrecognition between subject types that move at different speeds can be reduced, for example misrecognizing the rider of a moving bicycle or motorcycle as a pedestrian. This improves the reliability of alarms and behavior control that use the recognition result, and also improves safety.

Preparing a plurality of template dictionaries and switching between them is effective in itself. Combining a plurality of dictionaries and changing their priority order, however, allows each template dictionary to be smaller and raises the probability that a pattern matches early, which has the advantage of speeding up identification.

Furthermore, because the moving speed is calculated from the time-series images, no separate means (for example, a radar device) is needed to determine the moving speed of the object, and the system can be simplified.

Description of exemplary embodiments

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. To make the description easier to follow, the same components are given the same reference numerals in each drawing wherever possible, and duplicate descriptions are omitted.

FIG. 1 is a block diagram showing the configuration of an image recognition apparatus according to the present invention. The example described here applies the apparatus to an alarm system that recognizes pedestrians and the like from acquired images and notifies the driver. The camera 1 continuously acquires images of the area ahead of the vehicle; for example, a camera that is placed in front of the rear-view mirror at the front of the vehicle interior and captures moving images at a television frame rate is used. The alarm ECU 2 is composed of a CPU, ROM, RAM, and the like, and includes an image recognition unit 20 and a determination unit 21. The image recognition unit 20 and the determination unit 21 may be implemented on separate hardware, or they may share some or all of their hardware and be implemented in software. In the latter case the software components may be completely independent, may share part of a program, or need not be clearly separated within the shared program.

An alarm device 4, which is an output device, and a storage device 3 are connected to the alarm ECU 2. The alarm device 4 may be any one or a combination of the following: a device that gives a visual alarm, such as a display device in the instrument panel or a display shared with a navigation device; a device that gives an audible alarm, such as a bell, a beeper, or a speaker; or means for vibrating the driver's seat. The storage device 3 stores the dictionaries for template matching and is preferably a recording device that uses a readable and writable medium, for example a hard disk drive or flash memory. Part of the dictionary data may be recorded on a read-only storage medium, with learning results and the like recorded separately on a readable and writable medium. The storage device 3 can also be used to store camera images and their processing results.

Next, the operation of this embodiment is described in detail. FIG. 2 is a flowchart showing the image recognition processing in this apparatus. The image recognition unit 20 executes this processing repeatedly at predetermined timing while the switch of the alarm system (not shown) is on and the vehicle is powered on. The processing may run on every frame in synchronization with the frame rate of the camera 1, or once every predetermined number of frames.

First, the camera image Im(t_1) acquired by the camera 1 is read (step S1). Hereinafter, t_1 denotes the current time. Next, the past image Im(t_2) acquired by the camera 1 is read (step S3); this image is held in a temporary storage area in the image recognition unit 20. FIG. 3 illustrates an example of the input images (moving image). For example, t_2 is the time one image-processing time step Δt before the current time t_1. In step S5, the current image Im(t_1) is stored in the temporary storage area. The stored image Im(t_1) will be read in step S3 of the next time step.

Next, the optical flow is calculated from these two images captured at different times (step S7). Any of various known methods can be used to calculate the optical flow. A calculation based on the gradient method is described here.

Let E(x, y, t) be the luminance at time t of the pixel point (x, y) in the image. If the x and y components of the optical flow vector at this pixel point are u(x, y) and v(x, y), respectively, then the luminance at the point (x + uΔt, y + vΔt) after a sufficiently small time interval Δt equals E(x, y, t). That is,

$$E(x + u\Delta t,\ y + v\Delta t,\ t + \Delta t) = E(x, y, t) \qquad (1)$$

holds. If the luminance E varies smoothly with respect to x, y, and t, the left side of equation (1) can be expanded in a Taylor series, with dx/dt = u and dy/dt = v. Rearranging equation (1) using this relationship gives the optical flow constraint equation expressed by equation (2):

$$E_x u + E_y v + E_t = 0 \qquad (2)$$

Here, E_x and E_y are the spatial derivatives in the x and y directions and E_t is the temporal derivative; all three are calculated from the time-series images. Because the constraint equation provides only one equation per pixel, the flow vector cannot be determined uniquely from it alone. The condition that "motion is smooth in the neighborhood of a pixel of interest" is therefore added, and the optical flow is obtained as the least-squares solution of the constraint equations of the pixels near the pixel of interest.
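The least-squares step can be illustrated with a small pure-Python sketch. It solves the constraint equation E_x u + E_y v + E_t = 0 over a square window around one pixel, in the style of the well-known Lucas-Kanade local method. The function name `flow_at`, the window half-width `half`, and the finite-difference scheme are assumptions made for illustration; the source does not specify them.

```python
def flow_at(prev, curr, x, y, half=1, dt=1.0):
    """Estimate the flow (u, v) at pixel (x, y) by least squares over a window.

    prev, curr: 2-D lists of luminance values (row-major, indexed [y][x]).
    Returns None when the window has too little texture to solve for (u, v).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            # Central differences for E_x and E_y, forward difference for E_t.
            ex = (prev[j][i + 1] - prev[j][i - 1]) / 2.0
            ey = (prev[j + 1][i] - prev[j - 1][i]) / 2.0
            et = (curr[j][i] - prev[j][i]) / dt
            sxx += ex * ex
            sxy += ex * ey
            syy += ey * ey
            sxt += ex * et
            syt += ey * et
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None  # gradients are degenerate (aperture problem)
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [u, v] = -[sxt, syt]
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v


# Synthetic check: the pattern E = x * y shifted by half a pixel in x.
prev = [[float(i * j) for i in range(11)] for j in range(11)]
curr = [[(i - 0.5) * j for i in range(11)] for j in range(11)]
u, v = flow_at(prev, curr, 5, 5)
```

For this synthetic pattern the sketch recovers a flow close to (0.5, 0). A production system would more likely use a tested library routine and a larger, weighted window.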

FIG. 4 is a conceptual diagram showing the optical flow vectors obtained in this way as arrows superimposed on the camera image. The figure shows flow computed from images taken while the vehicle is stopped. While the vehicle is traveling, flow vectors relative to a stationary coordinate system are preferably obtained by subtracting the host vehicle's own movement (movement vector), obtained from a vehicle speed sensor or the like.

Once the optical flow vectors have been calculated, moving objects are extracted from the image Im(t_1) (step S9). The extraction is based on the optical flow vectors. As an example, consider extracting objects that move horizontally in the image. If the optical flow vector components at a pixel point are (u, v) as above, then when |u| > 0 and v ≈ 0 the object can be regarded as having moved along the X axis of the image, that is, horizontally. Searching the whole image for points (pixels) that satisfy this relationship extracts horizontally moving objects. Alternatively, the flow direction (slope) θ = tan⁻¹(v/u) may be computed from the vector components (u, v), and the object may be regarded as moving horizontally if θ is within a predetermined range (for example, within ±15°). By taking the direction of the flow vector into account, objects moving in any direction can be extracted.
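The direction test for horizontal motion can be sketched as follows. The ±15° limit comes from the text; the function names and the `min_speed` cutoff used to decide that |u| is meaningfully greater than zero are assumptions for illustration.

```python
import math


def is_lateral(u, v, min_speed=0.5, max_angle_deg=15.0):
    """Return True if the flow (u, v) points roughly along the image x axis."""
    if math.hypot(u, v) < min_speed:
        return False  # treat near-zero flow as stationary (assumed cutoff)
    # Angle to the x axis, folded so leftward and rightward motion both count;
    # this is equivalent to |atan(v / u)| <= max_angle_deg for u != 0.
    theta = math.degrees(math.atan2(abs(v), abs(u)))
    return theta <= max_angle_deg


def lateral_mask(flow):
    """flow: 2-D grid of (u, v) tuples -> 2-D grid of booleans."""
    return [[is_lateral(u, v) for (u, v) in row] for row in flow]
```

Testing against a different reference direction instead of the x axis would extend the same check to motion in an arbitrary direction, as the text notes.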

Next, the region of each moving body is estimated (step S11). For example, suppose that a pedestrian is crossing the road in the vehicle's path from the left of the screen to the right, as shown in FIG. 5. The pixel points that make up the pedestrian image S_1 can be expected to have flow vectors of nearly the same magnitude and direction and to be contiguous. Contiguous pixel points whose flow vectors have nearly the same magnitude and direction are therefore extracted, and the smallest rectangular region R_1 enclosing them is set as the moving body region. Some moving objects have complicated shapes, but treating them as rectangular regions simplifies the subsequent processing. Information on the estimated moving body region is stored in the temporary storage area (step S13). The coordinates of the pixels at the upper-left and lower-right corners of the moving body region are sufficient for this information. In the example shown in FIG. 6, regions R_1 and R_2 are extracted from the images of the pedestrian S_1 and the preceding vehicle S_2, respectively.
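One possible way to realize the region estimation is a flood fill that groups 4-connected pixels whose flow is close to that of a seed pixel, then records the enclosing rectangle. This is a minimal sketch: the function name, the thresholds `min_mag` and `tol`, and the choice of comparing against the seed pixel are assumptions, not details given in the source.

```python
from collections import deque


def moving_regions(flow, min_mag=0.5, tol=0.5):
    """Return bounding boxes (left, top, right, bottom) of connected pixels
    whose flow vectors are nearly identical to the seed pixel's flow."""
    h, w = len(flow), len(flow[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            u0, v0 = flow[y0][x0]
            if seen[y0][x0] or (u0 * u0 + v0 * v0) ** 0.5 < min_mag:
                continue  # already grouped, or not moving
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left, top, right, bottom = x0, y0, x0, y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and not seen[ny][nx]:
                        u, v = flow[ny][nx]
                        # Keep neighbours whose flow matches the seed's flow.
                        if abs(u - u0) <= tol and abs(v - v0) <= tol:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

Each box holds the upper-left and lower-right pixel coordinates, which is the representation the text suggests storing in step S13.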

Next, the lower-left corner points N_1(x_1, y_1) and N_2(x_2, y_2) are chosen as the representative points of the estimated rectangular regions R_1, R_2, and so on, and the flow vector (u, v) at each point is taken as the representative quantity (step S15). Instead of the flow vector at the representative point, the representative quantity may be the mean, median, or mode of the flow vectors within the rectangular region. When such statistical processing is used, pixels in the rectangular region whose flow vector magnitude is approximately zero are preferably excluded. Once the representative point and representative quantity have been determined, they are stored in the temporary storage area (step S17).
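The statistical variant of the representative quantity might look like the sketch below, which takes the component-wise median of the flow inside a rectangle while skipping near-zero vectors. The function name, the `min_mag` cutoff, and the component-wise treatment of the median are assumptions for illustration.

```python
from statistics import median


def representative_flow(flow, box, min_mag=0.5):
    """Component-wise median flow inside box = (left, top, right, bottom),
    ignoring pixels whose flow magnitude is approximately zero."""
    left, top, right, bottom = box
    us, vs = [], []
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            u, v = flow[y][x]
            if (u * u + v * v) ** 0.5 >= min_mag:
                us.append(u)
                vs.append(v)
    if not us:
        return None  # no moving pixels inside the rectangle
    return median(us), median(vs)
```

Excluding the near-zero pixels matters because a bounding rectangle usually contains background pixels, which would otherwise pull the statistic toward zero.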

Next, the distance between the camera 1 and the object is obtained (step S19), as described with reference to FIG. 7. If the vertical resolution of the camera 1 is T and its vertical viewing angle is θ, the change in viewing angle per pixel, Δθ, is θ/T. Let the coordinates of the representative point of the object be A(i, j), and let α be the angle between the line Lt connecting A to the camera center O and the lower edge line Ld of the field of view; then α = j × Δθ. Let β be the angle between the perpendicular dropped from the camera center O and the lower edge line Ld, and let ρ be the camera installation angle, that is, the angle between the camera's field-of-view center line C and the perpendicular dropped from the camera center O; then β = ρ − θ/2. If the installation height of the camera 1 is H, the distance L from the camera 1 to the representative point A of the object follows from these relationships as L = H × tan(α + β).
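The distance formula translates directly into code. In this sketch the row index `j` is assumed to be counted in pixels upward from the bottom edge of the image (so that α is measured from Ld), and angles are passed in degrees; both conventions, and the function name, are assumptions for illustration.

```python
import math


def distance_to_point(j, T, theta_deg, rho_deg, H):
    """Ground distance L = H * tan(alpha + beta) to the representative point.

    j: row of the point, counted in pixels from the bottom edge of the image.
    T: vertical resolution in pixels; theta_deg: vertical viewing angle.
    rho_deg: angle between the optical axis and the downward vertical.
    H: camera installation height (L is returned in the same unit).
    """
    d_theta = theta_deg / T            # viewing angle per pixel
    alpha = j * d_theta                # angle above the lower edge line Ld
    beta = rho_deg - theta_deg / 2.0   # angle between the vertical and Ld
    return H * math.tan(math.radians(alpha + beta))
```

The result is only meaningful while α + β stays below 90°, that is, for points below the horizon. A point at the bottom edge of the image (j = 0) is at distance H × tan(β).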

Next, the distance the object moves in the small time Δt (the time step) is calculated (step S21). As an example, consider FIG. 8, in which an object that was at point P(i1, j1) in the previous time step has moved to point Q(i2, j2), and the distance between the two points is to be found. If the horizontal resolution of the camera is W and its horizontal viewing angle is φ, the change in viewing angle per pixel, Δφ, is φ/W. If γ is the angle between the two straight lines connecting the camera center O to points P and Q, then γ = Δφ × (i2 − i1). Now suppose the camera's field-of-view center line C is the perpendicular bisector of the line segment PQ. The distance K that the object moved in the time Δt equals the length of segment PQ, and if L is the distance from point O to Cx, then, since L was already calculated in step S19, K can be calculated as K = 2 × L × tan(γ/2). The same approach can be used to calculate the distance moved in any direction, not only horizontally. The actual moving speed V (= K/Δt) is calculated by dividing the moving distance K by the time step Δt (step S23).
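Steps S21 and S23 can be combined into one short function. The sketch below follows the formulas in the text under the same simplifying assumption that the center line C bisects PQ; the function name and the choice to return a signed speed are assumptions for illustration.

```python
import math


def lateral_speed(i1, i2, W, phi_deg, L, dt):
    """Speed V = K / dt of an object that moved from column i1 to column i2.

    W: horizontal resolution in pixels; phi_deg: horizontal viewing angle.
    L: distance to the object from step S19; dt: time step in seconds.
    The sign of the result follows the sign of (i2 - i1).
    """
    d_phi = phi_deg / W                        # viewing angle per pixel
    gamma = math.radians(d_phi * (i2 - i1))    # angle subtended by PQ at O
    K = 2.0 * L * math.tan(gamma / 2.0)        # length of segment PQ
    return K / dt


# Example: 640-pixel-wide image, 64 degree field of view, object 10 m away
# moving 40 pixels in 0.1 s.
v = lateral_speed(300, 340, 640, 64.0, 10.0, 0.1)
```

With these example values γ is 4°, so K = 2 × 10 × tan(2°) metres.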

Next, the pattern dictionary for recognition is set based on the speed V of the object (step S25). For example, as shown in FIG. 9, when the speed V is TH1 or less, a dictionary containing pedestrian template images is used; when V exceeds TH2 and is TH3 or less, a dictionary containing bicycle template images (including riders) is used; and when V exceeds TH4, a dictionary containing automobile template images (including motorcycles and their riders) is used. When V exceeds TH1 and is TH2 or less, a dictionary containing both pedestrian and bicycle template images is preferably used, and when V exceeds TH3 and is TH4 or less, a dictionary containing both bicycle and motorcycle template images is preferably used.
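The threshold scheme of FIG. 9 amounts to a lookup over five speed bands. In the sketch below the dictionary labels and the function name are hypothetical, and the thresholds are passed in by the caller because the source gives no numerical values.

```python
def select_dictionaries(v, th1, th2, th3, th4):
    """Return the template sets to use for an object moving at speed v.

    Thresholds must satisfy th1 < th2 < th3 < th4; labels are illustrative.
    """
    if v <= th1:
        return ["pedestrian"]
    if v <= th2:
        return ["pedestrian", "bicycle"]   # overlap band
    if v <= th3:
        return ["bicycle"]
    if v <= th4:
        return ["bicycle", "motorcycle"]   # overlap band
    return ["automobile"]                  # includes motorcycles and riders
```

The two overlap bands are what keep a slow cyclist or a fast runner from being forced into a single class by speed alone.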

Alternatively, three pattern dictionaries (hereinafter, dictionaries A to C) containing template images of pedestrians, bicycles, and automobiles (including motorcycles), respectively, may be prepared, and the combination and priority order of the dictionaries may be switched according to speed. For example, when the speed V is TH or less, dictionary A is used as the highest-priority dictionary, dictionary B as the next, and dictionary C as the lowest-priority dictionary. When V exceeds TH2 and is TH4 or less, dictionary B is used as the highest-priority dictionary and dictionary C as the lower-priority dictionary; dictionary A is not used. When V exceeds TH4, only dictionary C is used, and dictionaries A and B are not used.
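The combination-and-priority scheme can be sketched as a function that returns the dictionaries in the order they should be tried. The source writes the first bound only as "TH"; this sketch assumes it means TH2 so that the three bands are contiguous, which is a guess rather than something the text states. The function name is also hypothetical.

```python
def prioritized_dictionaries(v, th2, th4):
    """Return dictionary names in descending priority for speed v.

    A = pedestrians, B = bicycles, C = automobiles (including motorcycles).
    """
    if v <= th2:          # assumption: the source's "TH" is read as TH2
        return ["A", "B", "C"]
    if v <= th4:
        return ["B", "C"]  # dictionary A is not used
    return ["C"]           # dictionaries A and B are not used
```

Compared with switching between merged dictionaries, each of A to C stays small, and the ordering decides which templates are compared first.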

The priority may be used in either of two ways. In one, pattern recognition proceeds from the highest-priority dictionary, and once the degree of correlation reaches or exceeds a threshold, no further patterns are matched. In the other, a weighting coefficient is set according to the priority order, and matching uses the degree of correlation multiplied by this weighting coefficient as the correlation coefficient.
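Both uses of priority can be sketched generically. The functions below take the similarity measure as a parameter; `agreement` is a toy stand-in for a real correlation measure, and all names, the data layout of the dictionaries, and the default threshold are assumptions for illustration.

```python
def agreement(region, template):
    """Toy similarity: fraction of positions where two equal-length
    sequences hold the same value (stand-in for a correlation measure)."""
    same = sum(1 for a, b in zip(region, template) if a == b)
    return same / len(template)


def recognize_early_exit(region, dictionaries, order, score, threshold=0.8):
    """Try dictionaries in priority order and stop at the first template
    whose score reaches the threshold."""
    for name in order:
        for label, template in dictionaries[name]:
            if score(region, template) >= threshold:
                return label
    return None


def recognize_weighted(region, dictionaries, weights, score):
    """Score every template, scale by its dictionary's priority weight,
    and return the label with the highest weighted score."""
    best_label, best_score = None, float("-inf")
    for name, weight in weights.items():
        for label, template in dictionaries[name]:
            s = weight * score(region, template)
            if s > best_score:
                best_label, best_score = label, s
    return best_label
```

The early-exit variant saves time when the highest-priority dictionary matches, while the weighted variant always scores every template in the selected dictionaries and lets priority bias the outcome.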

Once the dictionary has been set, the object is recognized by pattern recognition based on that dictionary (step S27). In this embodiment an appropriate dictionary is selected based on the speed of each object, so cases in which the rider of a bicycle or motorcycle, which moves faster than a pedestrian, is misrecognized as a pedestrian can be reduced, and discrimination accuracy improves. In particular, the method that switches the combination and priority of dictionaries can accurately identify even bicycles and motorcycles moving at walking pace. The method that uses only the pedestrian dictionary at low speed, on the other hand, reliably suppresses cases in which the rider of a motorcycle moving at relatively high speed is misrecognized as a pedestrian. In the low-speed range, the rider of a bicycle or motorcycle moving at walking pace may still be misrecognized as a pedestrian, but in such cases the bicycle or motorcycle is likely to behave much like a pedestrian, so treating it the same way is not expected to cause serious problems.

After recognition, the result is output (step S29) and the processing ends. For example, as shown in FIG. 10, a display device may be lit according to the distance to the object (pedestrian, bicycle, automobile, etc.), or the presence of the moving body may be announced by voice from a speaker. A vehicle behavior control device (not shown) may also be notified so that it can perform obstacle avoidance, pre-crash control, and the like.

The above description covers a method that determines the type, speed, distance, and so on of an object by image processing of time-series images acquired by a single camera. The speed and distance of the object may instead be acquired using a stereo camera or a radar device or the like. Various other modifications are also possible.

FIG. 1 is a block diagram showing the configuration of the image recognition apparatus according to the present invention.
FIG. 2 is a flowchart showing the image recognition processing in the apparatus of FIG. 1.
FIG. 3 is a diagram illustrating an example of input images (moving image) for the image recognition processing.
FIG. 4 is a conceptual diagram of optical flow vectors.
FIG. 5 is a diagram illustrating the extraction of a moving object.
FIG. 6 is a diagram illustrating the extracted moving body regions.
FIG. 7 is a diagram illustrating the method of calculating the distance between the camera and the object.
FIG. 8 is a diagram illustrating the method of calculating the moving speed of the object.
FIG. 9 is a diagram illustrating the criteria for switching dictionaries according to object speed.
FIG. 10 shows an example of the alarm display.

Explanation of symbols

1 ... camera, 2 ... alarm ECU, 3 ... storage device, 4 ... alarm device, 20 ... image recognition unit, 21 ... determination unit.

Claims

An image recognition apparatus that recognizes a subject in an image, comprising:
a dictionary group having a plurality of dictionaries storing templates of subjects;
dictionary selection means for extracting optical flow vectors in the image, extracting a moving body region by extracting pixel points whose vectors are similar, and selecting, from the dictionary group, a dictionary to be used for subject recognition based on the moving speed of the moving body region; and
recognition means for recognizing the subject by comparing the extracted moving body region with a template stored in the dictionary selected by the dictionary selection means.

The image recognition apparatus according to claim 1, wherein the template is stored in a dictionary corresponding to a type of a subject, and the dictionary selection unit switches a dictionary to be used based on a moving speed of the subject.

The image recognition apparatus according to claim 1, wherein the template is stored in a dictionary corresponding to a type of subject, and the dictionary selection unit switches the combination of dictionaries to be used and their priority order based on the moving speed of the subject.

The image recognition apparatus according to claim 1, wherein the dictionary selection unit calculates a moving speed of the subject using a plurality of time-series images whose imaging times are known.

In an image recognition apparatus that recognizes a subject in an image,
  A moving speed detecting means for detecting the moving speed of the subject;
  A dictionary group having a plurality of dictionaries for storing subject templates according to the subject type;
  Dictionary selection means for switching and selecting a combination of dictionaries used for subject recognition from the dictionary group and their priority order based on the movement speed of the subject detected by the movement speed detection means;
  An image recognition apparatus comprising: a recognizing unit for recognizing a subject by comparing a partial image region including the subject in the image with a template stored in the dictionary selected by the selecting unit.