JP6342874B2

JP6342874B2 - Image recognition device

Info

Publication number: JP6342874B2
Application number: JP2015228898A
Authority: JP
Inventors: 督之市原; 久貴加藤; 泰典川口
Original assignee: Yazaki Corp
Current assignee: Yazaki Corp
Priority date: 2015-11-24
Filing date: 2015-11-24
Publication date: 2018-06-13
Anticipated expiration: 2035-11-24
Also published as: JP2017097608A

Description

The present invention relates to an image recognition device that recognizes human behavior based on a signal input from a distance image sensor capable of three-dimensional recognition of an imaging target.

For example, when a driver operates the various in-vehicle devices mounted on a vehicle, the driver usually has to operate dedicated buttons provided at various locations for each device. Each time, however, the driver must find the position of the target button and bring a finger to that position. To do so, the driver has to shift his or her line of sight away from the road ahead temporarily and pay particular attention to the movement of the hand and fingers, which hinders safe driving.

Therefore, techniques for eliminating the need for button operation by the driver have been developed. For example, an operation input device is known in which a camera provided in the vehicle interior recognizes the driver's behavior by image recognition and, when a predetermined behavior is detected, a device mounted on the vehicle operates according to that behavior (Patent Document 1).

Patent Document 1: JP 2013-218391 A

When human behavior is detected by recognizing images captured by a camera as in Patent Document 1, an ordinary camera can capture only two-dimensional images, so positions and behavior in the depth direction cannot be detected. As a result, the degree of freedom of the operation patterns (gestures) cannot be increased, and behavior may be detected even when detection is not needed.

In recent years, TOF (Time Of Flight) distance image sensors (hereinafter referred to as TOF cameras) capable of three-dimensional recognition of an imaging target have become commercially available. Besides TOF cameras, other cameras capable of three-dimensional recognition of an imaging target also exist. Because a TOF camera can detect, for each pixel, the time taken for light from its light source to strike the measurement target and return, it can capture a three-dimensional image that includes position information corresponding to distance in the depth direction.

Meanwhile, in an operation input device such as that of Patent Document 1, it is assumed that the driver's behavior is detected only in a specific predetermined two-dimensional or three-dimensional space (hereinafter referred to as a behavior detection space) and reflected in the operation of the in-vehicle device. For example, if control is performed so that the in-vehicle device is operated according to the behavior of the driver's hand or fingers only when the driver's hand is near the steering wheel, as in a normal driving state, the in-vehicle device will not be activated when the driver moves a hand unconsciously.

However, because an operation input device such as that of Patent Document 1 monitors only the behavior of the driver's hand and fingers, it is difficult to increase the number of recognizable operation patterns and also difficult to raise recognition accuracy. For example, if the device tries to distinguish subtle differences in the shape of the driver's hand or fingers, it may misrecognize the shape, and a malfunction may occur because the gesture operation pattern the driver intends differs from the operation pattern the device actually recognizes.

It is therefore conceivable, for example, to detect the behavior of other body parts at the same time as the driver's hand. To detect the behavior of various parts of the human body, however, it is necessary either to widen the angle of view of the camera or to increase the number of cameras installed in the vehicle. Widening the angle of view lowers the accuracy with which shape and position are detected, and thus lowers recognition accuracy. Increasing the number of cameras installed in the vehicle inevitably raises the cost of the device significantly.

In addition, errors can occur in the distance measured by a TOF camera. In particular, in a high-temperature environment the irradiation angle of the light source changes with temperature, and the error tends to increase. Therefore, when a TOF camera is installed in a place with large temperature swings, such as a vehicle interior, the effect of errors in the distances and three-dimensional coordinates it measures is a concern.

When the error in the distance measured by the TOF camera becomes large, the behavior detection space is effectively enlarged, reduced, or displaced. As a result, even when the driver's hand is outside the behavior detection space, the in-vehicle device may react to an unconscious hand movement and operate. In other words, an operation the driver did not intend is performed. Because such a malfunction is unexpected, it may confuse the driver and interfere with driving.

The present invention has been made in view of the above circumstances, and its object is to provide an image recognition device that can increase the number of recognizable gesture operation patterns and improve operation detection accuracy while suppressing an increase in device cost.

In order to achieve the above object, an image recognition device according to the present invention is characterized by the following (1) to (5).
(1) An image recognition device that recognizes human behavior based on a signal input from a distance image sensor capable of three-dimensional recognition of an imaging target, wherein
a first behavior detection space and
a second behavior detection space, at least part of which is located on a line connecting the distance image sensor and the first behavior detection space and which does not overlap the first behavior detection space,
are determined in advance as spaces in which human behavior is recognized,
the first behavior detection space and the second behavior detection space are assigned to positions separated from each other on the line connecting the distance image sensor and the first behavior detection space, and
a result of recognizing human behavior in at least one of the first behavior detection space and the second behavior detection space is reflected in an output.

According to the image recognition device having the configuration of (1) above, it is possible to increase the number of recognizable gesture operation patterns and to improve operation detection accuracy while suppressing an increase in device cost. That is, recognizing human behavior in each of the first and second behavior detection spaces increases the number of available operation patterns. In addition, because the first and second behavior detection spaces are arranged on the same line as viewed from the distance image sensor, both spaces can be monitored at the same time without widening the angle of view of the distance image sensor. Consequently, more pixels of image data correspond to each captured recognition target, and recognition accuracy improves.

(2) The image recognition device according to (1) above, wherein
the first behavior detection space is assigned to a region in which a human facial expression can be recognized, and
the second behavior detection space is assigned to a region in which movement of a human hand or finger can be recognized.

According to the image recognition device having the configuration of (2) above, behavior relating to a human facial expression can be detected by processing the image of the first behavior detection space, and at the same time behavior relating to movement of a human hand or finger can be detected by processing the image of the second behavior detection space.

(3) The image recognition device according to (1) above, wherein human behavior is recognized in each of the first behavior detection space and the second behavior detection space, and an operation performed by the person is identified according to the combination of the recognized behaviors.

According to the image recognition device having the configuration of (3) above, the combination of the behavior in the first behavior detection space and the behavior in the second behavior detection space is used, so the type of operation corresponding to the overall behavior can be recognized with higher accuracy. That is, the probability that the driver simultaneously performs several distinctive behaviors other than as a specific gesture is very low, so whether or not a behavior is a specific gesture can be distinguished easily without recognizing subtle changes in each behavior detection space.
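
The combination logic of (3) can be pictured as a lookup keyed on a pair of recognized behaviors. The following Python sketch is illustrative only: the behavior labels ("nod", "swipe_up", "swipe_down") and the operation names are hypothetical and do not appear in the specification.

```python
from typing import Dict, Optional, Tuple

# Hypothetical (face behavior, hand behavior) pairs mapped to operations.
# The labels and operation names are assumptions made for illustration.
COMBINED_GESTURES: Dict[Tuple[str, str], str] = {
    ("nod", "swipe_up"): "hud_start",
    ("nod", "swipe_down"): "hud_stop",
}


def identify_operation(face_behavior: Optional[str],
                       hand_behavior: Optional[str]) -> Optional[str]:
    """Return the operation registered for a behavior pair, or None."""
    # A combined gesture needs a recognized behavior in both spaces.
    if face_behavior is None or hand_behavior is None:
        return None
    return COMBINED_GESTURES.get((face_behavior, hand_behavior))
```

Because an operation is issued only when both behaviors match a registered pair, a coarse classification in each space may be enough, which is the point made in the paragraph above.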

(4) The image recognition device according to (1) above, further comprising a measurement value correction unit that calculates a ratio between a first measured distance, which is the distance to a predetermined specific imaging target recognized based on the signal input from the distance image sensor, and a reference distance, which is obtained by actually measuring in advance the distance from the distance image sensor to the specific imaging target, and that corrects, by a correction amount based on the ratio, either parameters specifying the first behavior detection space and the second behavior detection space or a second measured distance, which is the distance to an arbitrary point on a behavior monitoring target recognized based on the signal input from the distance image sensor.

According to the image recognition device having the configuration of (4) above, even when results such as the distance measured by the distance image sensor contain a large error, as can happen when a TOF camera is used, the error can be corrected automatically. This prevents malfunctions caused by behavior detected in regions other than the first and second behavior detection spaces.
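
As a rough illustration of the correction in (4), the Python sketch below derives a scale factor from a fixed target whose true distance is known and applies it to another measurement. The specification states only that the correction amount is "based on the ratio"; treating it as a simple multiplicative scale, and taking the ratio as reference distance divided by measured distance, are assumptions of this sketch. The function names and numeric values are hypothetical.

```python
def correction_ratio(measured_reference_m: float, true_reference_m: float) -> float:
    """Ratio of the surveyed reference distance to the sensor's reading of it."""
    if measured_reference_m <= 0.0:
        raise ValueError("measured reference distance must be positive")
    return true_reference_m / measured_reference_m


def correct_distance(measured_m: float, ratio: float) -> float:
    """Scale a raw sensor distance by the correction ratio (assumed linear)."""
    return measured_m * ratio


# Hypothetical values: a fixed target surveyed at 0.60 m reads 0.66 m when hot.
ratio = correction_ratio(measured_reference_m=0.66, true_reference_m=0.60)
corrected_hand_m = correct_distance(0.55, ratio)  # raw hand reading of 0.55 m
```

With these example numbers the 0.55 m reading is corrected to about 0.50 m. The alternative named in (4), correcting the parameters of the detection spaces instead, would under the same assumption scale the depth bounds of each space by the inverse of this ratio.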

(5) The image recognition device according to (1) or (2) above, wherein at least one of the first behavior detection space and the second behavior detection space is assigned, with a specific fixed part in the vehicle interior as a reference, to a region adjacent to the fixed part or a region around the fixed part.

According to the image recognition device having the configuration of (5) above, the first and second behavior detection spaces are defined with reference to a specific fixed part in the vehicle interior, so the driver's gestures are recognized only under specific conditions, such as when the driver is in a normal driving posture. This prevents the device from misrecognizing behavior that the driver performed unconsciously as a specific gesture.

According to the image recognition device of the present invention, it is possible to increase the number of recognizable gesture operation patterns and to improve operation detection accuracy while suppressing an increase in device cost. That is, recognizing human behavior in each of the first and second behavior detection spaces increases the number of available operation patterns. In addition, because the first and second behavior detection spaces are arranged on the same line as viewed from the distance image sensor, both spaces can be monitored at the same time without widening the angle of view of the distance image sensor. Consequently, more pixels of image data correspond to each captured recognition target, and recognition accuracy improves.

The present invention has been briefly described above. The details of the present invention will become clearer on reading the modes for carrying out the invention described below (hereinafter referred to as "embodiments") with reference to the accompanying drawings.

FIG. 1 is a side view showing an example of the positional relationship of the parts in the interior of a vehicle equipped with an image recognition device according to an embodiment of the present invention, as viewed from the right side of the vehicle.
FIG. 2 is a block diagram showing a configuration example of an in-vehicle system including the image recognition device according to the embodiment of the present invention.
FIG. 3 is a front view showing a configuration example of the interior of a vehicle equipped with the image recognition device according to the embodiment of the present invention.
FIG. 4 is a flowchart showing an operation example of the image recognition device according to the embodiment of the present invention.
FIGS. 5(A) and 5(B) are front views each showing a specific example of a gesture based on the behavior of the driver's hands.
FIG. 6 is a front view showing a configuration example of the interior of a vehicle equipped with an image recognition device according to a modification.

Specific embodiments of the present invention are described below with reference to the drawings.

<Specific example of an environment in which the image recognition device is used>
<Configuration example of the vehicle interior>
FIG. 3 shows an example of the arrangement of the main components near the driver's seat in the vehicle interior. The image recognition device of the present invention is used mounted on a vehicle such as that shown in FIG. 3.

As shown in FIG. 3, a driver's seat 34 and a passenger seat 35 are provided in the vehicle interior, and a steering wheel 31 is arranged in front of the driver's seat 34. A display unit 32 having a rectangular display screen is installed near the center of the dashboard 33 at the front. The display unit 32 can be used as the display of a car navigation device or a car audio device.

As a display device that does not use the display unit 32, the main body of a head-up display (HUD) is housed below the dashboard 33. When the head-up display is used, a transparent combiner (reflector) that reflects the display light rises from inside the dashboard 33, appears above the dashboard 33, and is positioned where the driver can see it.

A movable operation lever 36 is arranged to the front left of the driver's seat 34, at a position where the driver can operate it. The operation lever 36 is used to switch the shift mode of the vehicle's transmission.

A TOF camera 21 is fixed on the dashboard 33 slightly forward of the steering wheel 31. The TOF camera 21 is a TOF (Time Of Flight) distance image sensor capable of three-dimensional recognition of an imaging target. That is, the TOF camera 21 detects, for each pixel, the time taken for light from its light source to strike the measurement target and return, and can capture a three-dimensional image that includes position information corresponding to distance in the depth direction.

<Specific example of positional relationships>
FIG. 1 shows an example of the positional relationship of the parts in the interior of a vehicle equipped with the image recognition device of the embodiment of the present invention, as viewed from the right side of the vehicle.

The TOF camera 21 shown in FIG. 3 is installed at a position facing the steering wheel 31, and its imaging direction and angle of view are adjusted in advance so that it captures a range in which the entire steering wheel 31 and the hands of the driver operating the steering wheel 31 appear at the same time.

More specifically, as shown in FIG. 1, the camera is positioned so that the imaging range 25A of the TOF camera 21 includes the face 30f and hand 30h of the driver 30 and specific fixed parts of the vehicle such as the steering wheel 31.

In this embodiment, two behavior detection spaces ArF and ArH are defined in advance as three-dimensional behavior detection spaces for detecting the driver's gestures. The behavior detection space ArF is a space for recognizing the behavior of the face 30f of the driver 30 and, as shown in FIG. 1, is assigned to a region that includes the position of the face 30f when the driver 30 is in a normal driving posture. The behavior detection space ArH is a space for recognizing the behavior of the hand 30h of the driver 30 and, as shown in FIG. 1, is assigned to a region that includes the position of the hand 30h when the driver 30 is in a normal driving posture. Each of the behavior detection spaces ArF and ArH is a rectangular parallelepiped three-dimensional space of fixed size.
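
A fixed-size rectangular parallelepiped space of this kind can be sketched in Python as an axis-aligned box with a containment test. The coordinate frame (camera-centered, in meters, with Z along the imaging direction) and every numeric bound below are assumptions chosen for illustration; the specification gives no dimensions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionSpace:
    """Axis-aligned box in an assumed camera frame (meters, Z = depth)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float

    def contains(self, x: float, y: float, z: float) -> bool:
        """True if the point lies inside the box (bounds inclusive)."""
        return (self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max
                and self.z_min <= z <= self.z_max)


# Hypothetical bounds: ArH near the steering wheel, ArF farther away at the face.
AR_H = DetectionSpace(-0.25, 0.25, -0.20, 0.20, 0.30, 0.50)
AR_F = DetectionSpace(-0.15, 0.15, -0.05, 0.25, 0.70, 0.95)

# The two spaces do not overlap and are separated along the depth direction.
SEPARATED_IN_DEPTH = AR_H.z_max < AR_F.z_min
```

A point recognized at a depth of 0.40 m on the optical axis would fall inside the hypothetical `AR_H` but not `AR_F`, so it would be treated as a hand candidate only.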

In an actual vehicle, the driving posture and position of the driver 30 can be adjusted, so the behavior detection spaces ArF and ArH are adjusted automatically to match the adjusted posture and position. For example, the behavior detection space ArF for detecting the face 30f is assigned, after the actual position of the headrest 34h of the driver's seat 34 has been calculated, to a region a predetermined distance away from that position. Likewise, the behavior detection space ArH for detecting the hand 30h is assigned, after the actual attitude (tilt angle, etc.) and position of the steering wheel 31 have been calculated, to a region a predetermined distance away from that position.
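
One way to picture this anchoring is to place a fixed-size box at a predetermined offset from the measured position of the fixed part. The Python sketch below assumes a camera-centered frame in meters and ignores the steering wheel tilt for simplicity; the function name, offset, and size values are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def space_from_anchor(anchor: Vec3, offset: Vec3, size: Vec3) -> Tuple[Vec3, Vec3]:
    """Return (min_corner, max_corner) of a box centered at anchor + offset."""
    center = tuple(a + o for a, o in zip(anchor, offset))
    min_corner = tuple(c - s / 2.0 for c, s in zip(center, size))
    max_corner = tuple(c + s / 2.0 for c, s in zip(center, size))
    return min_corner, max_corner


# Hypothetical headrest position, with the face space 0.20 m closer to the camera.
headrest = (0.0, 0.20, 1.00)
arf_min, arf_max = space_from_anchor(headrest,
                                     offset=(0.0, 0.0, -0.20),
                                     size=(0.30, 0.30, 0.25))
```

When the seat is moved, only `headrest` changes and the box follows it, which keeps the space tied to the normal driving posture.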

As shown in FIG. 1, the two behavior detection spaces ArH and ArF are assigned to positions offset from each other in the depth direction (Z direction) along the same axis Az, which lies in the imaging direction of the TOF camera 21. Viewed from the TOF camera 21, the two behavior detection spaces ArH and ArF overlap only partially, so that the hand and the face in the respective spaces can be captured at the same time.

In this case, because the two behavior detection spaces ArH and ArF are arranged on the same axis Az, a single TOF camera 21 can capture the imaging targets in both spaces at the same time without particularly widening its angle of view, which corresponds to the imaging range 25A. Keeping the angle of view small improves image recognition accuracy. In addition, because there is no need to install multiple TOF cameras 21, an increase in cost can be suppressed.
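
The link between angle of view and recognition accuracy can be illustrated with a simple pinhole-camera estimate of how many pixels span an object. This model and the sensor resolution used below are assumptions of the sketch, not values from the specification.

```python
import math


def pixels_across(object_width_m: float, distance_m: float,
                  fov_deg: float, sensor_pixels: int) -> float:
    """Approximate pixels spanning an object under a pinhole model."""
    # Width of the scene covered by the sensor at the given distance.
    view_width_m = 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)
    return sensor_pixels * object_width_m / view_width_m


# Hypothetical case: a 0.20 m wide face at 0.80 m, sensor 320 pixels wide.
narrow = pixels_across(0.20, 0.80, fov_deg=60.0, sensor_pixels=320)
wide = pixels_across(0.20, 0.80, fov_deg=90.0, sensor_pixels=320)
```

Under these assumptions the face spans roughly 69 pixels at a 60 degree angle of view but only about 40 pixels at 90 degrees, so the narrower view leaves more image data per recognition target.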

Furthermore, recognizing behavior in each of the behavior detection spaces ArH and ArF makes it possible to increase the number of usable gestures. It also becomes possible to raise gesture recognition accuracy and to adopt gestures that are easy for the user to perform.

<Configuration example of the in-vehicle system>
FIG. 2 shows a configuration example of an in-vehicle system including the image recognition device 10 according to the embodiment of the present invention. The in-vehicle system of FIG. 2 is mounted on the vehicle shown in FIGS. 1 and 3.

The image recognition device 10 shown in FIG. 2 can automatically recognize an operation instruction for an in-vehicle device from a predetermined gesture of the driver, that is, from behavior such as a body or hand movement, and control the corresponding in-vehicle device.

In practice, the TOF camera 21 captures the driver's hands, fingers, and face, the steering wheel 31, and so on, and the behavior of the hands, fingers, face, and the like is recognized as a gesture from the resulting three-dimensional image. The driver can therefore operate the in-vehicle devices by gesture without operating any special buttons. As a result, the driver no longer needs to look for the target button or take a hand off the steering wheel 31 to operate an in-vehicle device, and a function that contributes to safer driving can be provided.

In this embodiment, however, the image recognition device 10 detects the behavior of the hands, fingers, face, and the like only when they are inside the predetermined behavior detection spaces ArH, ArF, and so on. This prevents, for example, a hand movement that the driver made unconsciously from being misrecognized as a specific gesture, and avoids unexpected operation of the in-vehicle devices.

As shown in FIG. 2, the TOF camera 21 includes a light source unit 21a and a light receiving unit 21b. The light source unit 21a can irradiate the imaging target with pulsed light. The light receiving unit 21b includes a two-dimensional image sensor such as a CMOS sensor. The TOF camera 21 also has a built-in circuit that detects, for each pixel of the two-dimensional image detected by the light receiving unit 21b, distance information corresponding to the time (Time Of Flight) taken for light from the light source unit 21a to strike an imaging target such as a hand and return to the light receiving unit 21b. The TOF camera 21 can therefore capture a three-dimensional image.
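
The per-pixel conversion from round-trip time to distance follows from the speed of light: the light travels to the target and back, so the one-way distance is half the product of speed and time. A minimal Python sketch of that relationship follows; the function name is hypothetical, and the internal circuit of the TOF camera 21 is not described at this level in the specification.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_distance_m(round_trip_time_s: float) -> float:
    """One-way distance for a measured round-trip time of a light pulse."""
    if round_trip_time_s < 0.0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


hand_distance_m = tof_distance_m(4e-9)  # a 4 ns round trip
```

A round trip of 4 ns corresponds to about 0.60 m, which shows why distances inside a vehicle cabin require timing resolution well below a nanosecond.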

The three-dimensional image information captured by the TOF camera 21 is supplied to the input of the image recognition device 10. As shown in FIG. 2, the image recognition device 10 includes an image recognition processing unit 11 and a gesture monitoring control unit 12.

The image recognition processing unit 11 processes the image information output from the TOF camera 21 at high speed. It has functions for recognizing pre-registered patterns of specific shapes and for measuring the position in three-dimensional coordinates, dimensions, color, movement, shape changes, and so on of each recognized pattern.

The gesture monitoring control unit 12 determines whether or not the driver's hand is inside the predetermined behavior detection space. When the driver's hand is inside the predetermined behavior detection space, the unit monitors the behavior of the driver's hands, fingers, face, and the like based on the recognition results of the image recognition processing unit 11 and determines whether the behavior matches a pre-registered gesture pattern. When it detects behavior that matches a specific gesture, it carries out the predetermined control.
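
The two-stage decision described above (first a position gate, then pattern matching) can be sketched in Python as follows. The box representation, the behavior labels, and the control names are hypothetical; the actual matching performed by the gesture monitoring control unit 12 is not detailed in this passage.

```python
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner), assumed camera frame in meters


def inside(point: Vec3, box: Box) -> bool:
    """True if the point lies within the axis-aligned box."""
    lo, hi = box
    return all(l <= p <= h for p, l, h in zip(point, lo, hi))


def monitor_step(hand_position: Vec3, hand_space: Box,
                 observed_behavior: str,
                 registered: Dict[str, str]) -> Optional[str]:
    """Return the control to perform, or None if nothing should happen."""
    # Stage 1: ignore all behavior while the hand is outside the space.
    if not inside(hand_position, hand_space):
        return None
    # Stage 2: compare the observed behavior with the registered gestures.
    return registered.get(observed_behavior)


# Hypothetical space and gesture table for illustration.
ARH_BOX: Box = ((-0.25, -0.20, 0.30), (0.25, 0.20, 0.50))
GESTURES = {"swipe_up": "hud_start", "swipe_down": "hud_stop"}
```

The same upward swipe thus yields a control only when it happens inside the space, for example `monitor_step((0.0, 0.0, 0.40), ARH_BOX, "swipe_up", GESTURES)`, and is ignored when the hand is elsewhere.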

A host ECU (electronic control unit) 22 can supply the gesture monitoring control unit 12 with information indicating the position and attitude of the steering wheel 31, information indicating the position and attitude of the driver's seat 34, information indicating whether the vehicle's ignition is on or off, and the like.

In the in-vehicle system shown in FIG. 2, a HUD unit 23, a car navigation device 24, and a car audio device 26 are connected to the output of the image recognition device 10. The image recognition device 10 can control each of the HUD unit 23, the car navigation device 24, and the car audio device 26 based on the driver's gestures.

For example, when the HUD unit 23 is activated by a specific gesture, a transparent combiner (not shown) included in the HUD unit 23 rises from below the dashboard 33 shown in FIG. 3 and appears above the dashboard 33, where the driver can see it. In this state, display light projected from the HUD unit 23 is reflected by the combiner, forming a virtual image visible from the position of the driver's eyes. When the operation of the HUD unit 23 is ended by a specific gesture, the combiner descends and is stored below the dashboard 33.

Functions equivalent to the various buttons used to operate the car navigation device 24 and the car audio device 26 can also be assigned to the various gestures that the image recognition device 10 can recognize.

<Specific examples of gestures>
Specific examples of gestures based on the behavior of the driver's hands are shown in FIG. 5(A) and FIG. 5(B).

For example, to perform the gesture that starts the HUD unit 23, the driver 30 keeps the left hand LH and the right hand RH in contact with the steering wheel 31 as shown in FIG. 5(A) and slides the left hand LH along the wheel from bottom to top. While doing so, the driver keeps the left hand LH and the right hand RH inside the behavior detection space ArH. The image recognition device 10 recognizes this operation as a specific gesture based on the image captured by the TOF camera 21 and sends a start control signal to the HUD unit 23.

To perform the gesture that ends the operation of the HUD unit 23, the driver 30 keeps the left hand LH and the right hand RH in contact with the steering wheel 31 as shown in FIG. 5(B) and slides the left hand LH along the wheel from top to bottom. While doing so, the driver keeps the left hand LH and the right hand RH inside the behavior detection space ArH. The image recognition device 10 recognizes this operation as a specific gesture based on the image captured by the TOF camera 21 and sends an end-of-operation control signal to the HUD unit 23.

To make gestures easier to distinguish from normal driving operations, gestures more complex than those in FIG. 5(A) and FIG. 5(B) may be used. For example, a gesture pattern may express a special hand shape by bending or extending specific fingers, or may repeat the tracing motion several times.

In this embodiment, the image recognition device 10 can also recognize the expression of the face 30f in the behavior detection space ArF, so gestures that combine hand behavior and face behavior can also be adopted. For example, the hand behavior shown in FIG. 5(A) is recognized as the gesture for starting the HUD unit 23 only when it is detected while the gaze of the face 30f in the behavior detection space ArF is directed forward. Likewise, the hand behavior shown in FIG. 5(B) is recognized as the gesture for ending the operation of the HUD unit 23 only when it is detected while the gaze of the face 30f in the behavior detection space ArF is directed forward.

Behavior in the behavior detection space ArF and behavior in the behavior detection space ArH may also be assigned to the control of separate devices. For example, hand behavior in ArH such as that in FIG. 5(A) and FIG. 5(B) can be assigned to gestures for operating the HUD unit 23, while face behavior in ArF can be assigned to gestures for operating devices such as the car navigation device 24, the car audio device 26, or an air conditioner.

<Explanation of measurement error>
The image recognition device 10 shown in FIG. 2 recognizes the three-dimensional image captured by the TOF camera 21, so it can determine whether the hand or face to be recognized is located within the predetermined behavior detection spaces ArF and ArH.

However, the distance measured from the TOF camera 21 to the recognition target can contain a relatively large error. In practice, the irradiation angle of the light from the light source unit 21a and the angle of view vary considerably in a high-temperature environment, so errors arise both in the distance measured in the imaging direction (depth direction Z) and in the coordinate positions along the other axes (X, Y). Because the ambient temperature in a vehicle cabin can vary widely, the measurement error of the TOF camera 21 becomes too large to ignore.

When the distance measured by the TOF camera 21 contains a large error, the position of the hand recognized from the three-dimensional image deviates greatly from its actual position. As a result, the image recognition device 10 may respond to hand or face behavior and detect a gesture even when the actual hand position is outside the behavior detection spaces ArH and ArF. In other words, the image recognition device 10 falsely detects a gesture when the driver does not intend to make one, and in-vehicle devices such as the HUD unit 23 operate in a way the driver does not expect. To prevent such malfunctions, the image recognition device 10 includes a function, described below, that automatically corrects the position of the recognition target recognized from the three-dimensional image.

<Operation of the image recognition device 10>
<Outline of the processing procedure>
FIG. 4 shows an example of the main operation of the image recognition device 10 in the embodiment of the present invention. That is, a computer (not shown) built into the gesture monitoring control unit 12 of the image recognition device 10 shown in FIG. 2, or the image recognition processing unit 11, performs the control for responding to the driver's gestures according to the procedure shown in FIG. 4.

The procedure shown in FIG. 1 includes processing that corrects the position of the hand recognized from the three-dimensional image output by the TOF camera 21. Specifically, a correction ratio R1 is obtained using the position of the steering wheel 31 as a reference, and the recognized positions of the hand and face are corrected.

The steering wheel 31 is basically fixed to the vehicle body, so the distance from the TOF camera 21 to a specific position on the steering wheel 31 can be treated as known. This distance is therefore measured in advance and registered as the distance reference value Lref in the constant table TB1 of the image recognition device 10.

However, in actual vehicles the steering wheel 31 often has a posture adjustment function for changing the tilt angle and the length of the steering shaft, so the distance from the TOF camera 21 to the specific position on the steering wheel 31 is also variable. Therefore, multiple distance reference values Lref, each measured with the steering wheel 31 in one of several postures, are registered in the constant table TB1, and the most suitable Lref is selected according to the actual posture of the steering wheel 31.
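The posture-dependent selection of Lref can be sketched as a table lookup. The snippet below is a minimal illustration, not the patented implementation: the table contents, the posture parameters (tilt angle and telescopic extension), and the nearest-entry rule are all assumptions made for the example.

```python
# Hypothetical contents of the constant table TB1:
# (tilt angle in degrees, telescopic extension in mm) -> Lref in cm.
# All values are invented for illustration.
TB1_LREF = {
    (0.0, 0.0): 30.0,
    (0.0, 20.0): 28.5,
    (5.0, 0.0): 30.8,
    (5.0, 20.0): 29.2,
}


def select_lref(tilt_deg: float, telescopic_mm: float) -> float:
    """Return the Lref registered for the posture closest to the actual one."""

    def posture_gap(key):
        reg_tilt, reg_tele = key
        # Crude squared gap; a real system would weight the two units sensibly.
        return (reg_tilt - tilt_deg) ** 2 + (reg_tele - telescopic_mm) ** 2

    return TB1_LREF[min(TB1_LREF, key=posture_gap)]
```

A denser table or interpolation between entries would be an equally plausible design; the source only states that the most suitable registered value is selected.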

The distance measurement value L1 from the TOF camera 21 to the specific position can be calculated from the three-dimensional coordinates of the specific position on the steering wheel 31, as recognized from the three-dimensional image captured by the TOF camera 21. If a measurement error has occurred, a difference appears between the distance reference value Lref and the distance measurement value L1. The ratio R1 of these two values is therefore calculated and used as a correction coefficient for correcting the distance error.
R1 = L1 / Lref ... (1)
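Equation (1) can be illustrated with a short sketch. The source does not say how L1 is derived from the three-dimensional coordinates; the Euclidean distance from a camera-centered origin used here is an assumption, as are the function names and sample coordinates.

```python
import math


def measured_distance(point_xyz):
    """Distance from the camera origin to a 3D point (assumed camera coordinates)."""
    x, y, z = point_xyz
    return math.sqrt(x * x + y * y + z * z)


def correction_ratio(l1: float, lref: float) -> float:
    """Equation (1): R1 = L1 / Lref."""
    if lref <= 0:
        raise ValueError("Lref must be positive")
    return l1 / lref


# Hypothetical reference point on the steering wheel, in centimetres.
l1 = measured_distance((3.0, 4.0, 12.0))
r1 = correction_ratio(l1, lref=10.0)
```

A ratio above 1 means the camera is overestimating distance; a ratio of exactly 1 means the measurement agrees with the reference.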

That is, the distance measurement value L2 from the TOF camera 21 to the monitored hand, as recognized from the captured three-dimensional image, contains a distance measurement error caused by the characteristics of the TOF camera 21. To reduce this error, the distance measurement value L2 and related values are corrected using the ratio R1.

By comparing the positions of the monitored hand and face, expressed with the corrected distance measurement values, against the thresholds that bound the behavior detection spaces ArH and ArF, the device can correctly determine whether each position is inside the corresponding space.

If the behavior detection spaces ArF and ArH are fixed in advance, they may shift away from the desired spaces when, for example, the driver adjusts the driving posture. For instance, when the driver's seat 34 is moved forward or backward, the actual position of the face 30f during driving also moves forward or backward, and the face may no longer be detectable inside the behavior detection space ArF.

Therefore, the position of the behavior detection space ArF is determined as a space in the vicinity of the headrest 34h of the driver's seat 34, which is fixed to the vehicle, using the position of the headrest 34h as a reference. However, the position of the headrest 34h changes when the driver's seat 34 is moved or its posture is adjusted, so information on the position and posture of the driver's seat 34 is acquired, the position of the headrest 34h is calculated, and the result is reflected in the position of the behavior detection space ArF.

Similarly, the position of the behavior detection space ArH is determined as a space in the vicinity of the steering wheel 31, which is fixed to the vehicle, using the position of the steering wheel 31 as a reference. However, the position and height of the steering wheel 31 change with posture adjustments such as the tilt angle, so information on the posture of the steering wheel 31 is acquired, the actual position of the steering wheel 31 is identified, and the result is reflected in the position of the behavior detection space ArH.

<Details of the processing procedure>
When the vehicle ignition is turned on, the processing executed by the gesture monitoring control unit 12 proceeds from step S11 to step S12 in FIG. 1. In step S12, the gesture monitoring control unit 12 acquires from the host ECU 22 information such as the tilt angle of the steering wheel 31 (which corresponds to a difference in height) and the position and posture of the driver's seat 34.

In the next step S13, the gesture monitoring control unit 12 identifies the actual position of each part of the steering wheel 31 based on the steering wheel posture information acquired in S12, and determines the position of the behavior detection space ArH using that position as a reference. It also calculates the actual position of the headrest 34h based on the position and posture information of the driver's seat 34 acquired in S12, and determines the position of the behavior detection space ArF using that position as a reference. The relationships between posture and the position of each part, and the constants representing the relative distances between parts, which are needed to identify the positions of the parts of the steering wheel 31 and the headrest 34h, are obtained from the constant table TB1.

For example, the position of the behavior detection space ArF in the front-rear direction (Z direction) is determined so that the space adjoins, on its front side, a position offset from the headrest 34h by about half the dimension of a standard human head. The positions of ArF in the lateral direction (X) and the vertical direction (Y) are determined so that they coincide with approximately the center of the headrest 34h.
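The placement rule for ArF can be sketched as follows. This is an illustrative reading only: the axis convention (Z grows with distance from a camera mounted in front of the driver, so "in front of the headrest" means smaller Z), the head depth, and the box dimensions are assumptions, and `Box` and `place_face_space` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned detection space in camera coordinates (cm)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float


def place_face_space(headrest_center, head_depth=20.0, size=(30.0, 30.0, 25.0)) -> Box:
    """Place ArF in front of the headrest, centered on it in X and Y."""
    hx, hy, hz = headrest_center
    sx, sy, sz = size
    # Rear face of ArF sits about half a head depth in front of the headrest.
    z_far = hz - head_depth / 2.0
    z_near = z_far - sz
    return Box(hx - sx / 2, hx + sx / 2, hy - sy / 2, hy + sy / 2, z_near, z_far)


# Hypothetical headrest center computed from the seat position and posture.
arf = place_face_space((0.0, 10.0, 90.0))
```

Because the headrest position is recomputed from the seat information obtained in S12, calling the function again after a seat adjustment moves ArF along with the seat.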

In step S14, the gesture monitoring control unit 12 uses the posture information acquired in S12 as a parameter and acquires the one distance reference value Lref associated with it from the constant table TB1. That is, it acquires, as Lref, the value representing the actual distance from the TOF camera 21 to the specific position on the fixed steering wheel 31.

In the next step S15, the gesture monitoring control unit 12 controls the TOF camera 21 so that it starts capturing images. After that, the three-dimensional image data captured by the TOF camera 21 is input frame by frame to the image recognition processing unit 11 and the gesture monitoring control unit 12.

In step S16, the image recognition processing unit 11 processes the input three-dimensional image data and performs predetermined image recognition. Specifically, it compares various feature quantities extracted from the input three-dimensional image with pre-registered reference data such as the shape of the steering wheel 31, hand shapes, and finger shapes, and thereby recognizes each recognition object, including the steering wheel 31, hands, fingers, and face.

In step S17, based on the recognition result of the image recognition processing unit 11 in S16, the gesture monitoring control unit 12 identifies the coordinates of the specific position on the steering wheel 31 that was defined in advance as the reference position, and calculates the distance measurement value L1 from the TOF camera 21 to that specific position. The specific position can be identified easily by using a distinctive feature such as a special shape (for example, a protrusion), coloring, or a mark.

In step S18, the gesture monitoring control unit 12 calculates the ratio R1 of equation (1) from the distance reference value Lref acquired in S14 and the distance measurement value L1 calculated in S17.

For example, if the actual distance from the TOF camera 21 to the steering wheel 31 is 30 cm, the distance reference value Lref is 30 cm. If the distance measurement value L1 obtained by image recognition is 50 cm, it differs from Lref and therefore contains an error. Using the ratio R1 (50/30) as a correction value makes it possible to correct the measurement so that the difference between L1 and Lref is eliminated.

In step S19, the gesture monitoring control unit 12 acquires the position coordinates of the hand based on the recognition result of the image recognition processing unit 11, and calculates the distance measurement value L2, which represents the distance in the depth direction (Z direction) from the TOF camera 21 to the hand. It likewise acquires the position coordinates of the face and calculates the distance measurement value L3, which represents the distance in the depth direction (Z direction) from the TOF camera 21 to the face.

In step S20, the gesture monitoring control unit 12 corrects the distance measurement value L2 to the hand, acquired in S19, using the ratio R1 obtained in S18, and obtains the corrected distance measurement value L21. It also corrects the distance measurement value L3 to the face, acquired in S19, using the ratio R1 obtained in S18, and obtains the corrected distance measurement value L31. The coordinate positions other than the distance are also corrected using the ratio R1.
L21 = L2 / R1 ... (2)
L31 = L3 / R1 ... (3)
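Equations (2) and (3) can be sketched together with the 30 cm / 50 cm example given earlier. The function names are hypothetical, and dividing the X and Y coordinates by R1 is an assumption: the source states only that the other coordinates are also corrected using R1, not how.

```python
def correct_distance(measured: float, r1: float) -> float:
    """Equations (2) and (3): corrected value = measured value / R1."""
    if r1 <= 0:
        raise ValueError("R1 must be positive")
    return measured / r1


def correct_point(point_xyz, r1: float):
    """Assumed handling of the other coordinates: scale every axis by 1 / R1."""
    return tuple(c / r1 for c in point_xyz)


# Example from the text: Lref = 30 cm, L1 = 50 cm, so R1 = 50 / 30.
r1 = 50.0 / 30.0
# A hypothetical hand measured at L2 = 75 cm corrects to about 45 cm.
l21 = correct_distance(75.0, r1)
```

Applying the same correction to L1 itself returns approximately Lref (30 cm), which is the consistency property the ratio is designed to provide.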

In step S21, the gesture monitoring control unit 12 compares the three-dimensional coordinates of the driver's hand position, including the corrected distance measurement value L21, with the thresholds that define the range of the behavior detection space ArH, and determines whether the hand is inside ArH. If the hand is inside ArH, processing proceeds to S22; otherwise it proceeds to S23.
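The range check in S21 can be sketched as a per-axis threshold comparison. The box representation, the threshold values, and the sample hand position below are assumptions for illustration; the example also shows how an uncorrected depth value could place the hand outside ArH while the corrected value places it inside.

```python
from typing import NamedTuple, Tuple


class DetectionSpace(NamedTuple):
    lo: Tuple[float, float, float]  # minimum X, Y, Z thresholds (cm)
    hi: Tuple[float, float, float]  # maximum X, Y, Z thresholds (cm)


def is_inside(space: DetectionSpace, point) -> bool:
    """True when every coordinate lies within the thresholds of the space."""
    return all(l <= p <= h for l, p, h in zip(space.lo, point, space.hi))


# Hypothetical thresholds for ArH around the steering wheel.
ARH = DetectionSpace(lo=(-20.0, -20.0, 25.0), hi=(20.0, 20.0, 40.0))

r1 = 50.0 / 30.0                   # ratio from the 30 cm / 50 cm example
raw_hand = (0.0, 0.0, 55.0)        # measured depth L2 = 55 cm
corrected_hand = (0.0, 0.0, 55.0 / r1)  # L21 is about 33 cm
```

With these numbers, `is_inside(ARH, raw_hand)` is false and `is_inside(ARH, corrected_hand)` is true, so skipping the correction would cause a gesture made inside ArH to be missed.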

In step S22, the gesture monitoring control unit 12 monitors the behavior of the driver's hands and fingers in the behavior detection space ArH, such as the behavior shown in FIG. 5(A) and FIG. 5(B), based on the image recognition result of the image recognition processing unit 11.

In step S23, the gesture monitoring control unit 12 compares the three-dimensional coordinates of the driver's face position, including the corrected distance measurement value L31, with the thresholds that define the range of the behavior detection space ArF, and determines whether the face is inside ArF. If the face is inside ArF, processing proceeds to S24; otherwise it proceeds to S25.

In step S24, the gesture monitoring control unit 12 monitors behavior such as the facial expression of the driver in the behavior detection space ArF, based on the image recognition result of the image recognition processing unit 11. For example, it monitors the gaze direction of the left and right eyes, the shape of the mouth, the orientation of the face, and the tilt of the head.

In step S25, the gesture monitoring control unit 12 compares the hand and finger behavior pattern detected in S22 and the behavior pattern, such as facial expression, detected in S24 with the gesture reference patterns registered in advance, and determines whether they match.

If the comparison in S25 shows that the gesture monitoring control unit 12 has detected a registered gesture, processing proceeds from S26 to S27. In S27, the gesture monitoring control unit 12 outputs a control signal to the in-vehicle device so that the control associated with the detected gesture is executed.

For example, when the hand gesture shown in FIG. 5(A) is detected and the gaze direction of the face is detected to be forward, the gesture monitoring control unit 12 outputs a signal to the HUD unit 23 to start its operation. When the gesture shown in FIG. 5(B) is detected and the gaze direction of the face is detected to be forward, the gesture monitoring control unit 12 outputs a signal to the HUD unit 23 to end its operation.
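The decision flow of steps S21 through S26 for a single frame can be sketched as below. The pattern labels, command strings, and the dictionary-based registry are hypothetical; the source does not specify how patterns are represented or matched.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry: (hand pattern, face pattern) -> command for the HUD unit 23.
REGISTERED_GESTURES: Dict[Tuple[Optional[str], Optional[str]], str] = {
    ("trace_up", "gaze_forward"): "HUD_START",   # FIG. 5(A) plus forward gaze
    ("trace_down", "gaze_forward"): "HUD_STOP",  # FIG. 5(B) plus forward gaze
}


def process_frame(hand_in_arh: bool, hand_pattern: Optional[str],
                  face_in_arf: bool, face_pattern: Optional[str]) -> Optional[str]:
    """Return the command for a registered gesture, or None if nothing matches."""
    # S21/S22: hand behavior is monitored only while the hand is inside ArH.
    observed_hand = hand_pattern if hand_in_arh else None
    # S23/S24: face behavior is monitored only while the face is inside ArF.
    observed_face = face_pattern if face_in_arf else None
    # S25/S26: compare with the registered patterns.
    return REGISTERED_GESTURES.get((observed_hand, observed_face))
```

In this sketch the caller would perform S27 by forwarding any non-None result to the target device as a control signal.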

<Description of the modification>
FIG. 6 shows a configuration example of the interior of a vehicle equipped with an image recognition device 10B according to a modification. The modification shown in FIG. 6 assumes that behavior detection spaces ArL and ArD are used in place of the behavior detection spaces ArF and ArH described above.

The behavior detection space ArL shown in FIG. 6 is assigned as a rectangular parallelepiped region that surrounds the knob of the operation lever 36, using the position of the operation lever 36 as a reference. The behavior detection space ArD is assigned as a rectangular parallelepiped region that adjoins and faces the screen of the display unit 32, using the position of the display unit 32 as a reference.

In the example shown in FIG. 6, the TOF camera 21B is installed on the ceiling of the vehicle cabin with its imaging direction pointing downward. That is, the imaging range 25B and the orientation of the TOF camera 21B are adjusted so that the vicinity of the behavior detection spaces ArD and ArL, which lie below the cabin ceiling, can be captured at the same time.

Because the positions of the operation lever 36 and the display unit 32 do not change, the distances from the TOF camera 21 to the operation lever 36 and to the display unit 32 can be registered in the constant table TB1 as constants similar to the distance reference value Lref described above, and used as reference values for calculating the ratio R1.

Therefore, the image recognition device 10B shown in FIG. 6 can recognize the following gestures by the driver 30.
(1) The shape and movement pattern of the hand and fingers when the driver 30 brings a hand close to the operation lever 36
(2) The shape and movement pattern of the hand and fingers when the driver 30 brings a hand close to the screen of the display unit 32

<Advantages of the image recognition device 10>
In the image recognition device 10 shown in FIG. 1, the multiple behavior detection spaces ArH and ArF are arranged on an extension of the same axis Az that indicates the imaging direction of the TOF camera 21. The recognition objects in both ArH and ArF can therefore be captured at the same time without widening the angle of view of the TOF camera 21, which prevents a loss of recognition accuracy. In addition, because multiple cameras are not required, cost increases can be suppressed.

Because human behavior can be detected in each of the behavior detection spaces ArH and ArF, the number of recognizable gesture types can be increased. A combination of multiple behaviors can also be assigned to a single gesture, which allows highly flexible gestures and makes it possible to provide, for example, an easy-to-use user interface that can be operated intuitively.

The image recognition device 10 described above uses the TOF camera 21 as the distance image sensor capable of three-dimensional recognition of the imaging target, but another type of distance image sensor may be used. The number of behavior detection spaces may also be increased to three or more where possible.

The features of the embodiment of the image recognition device according to the present invention described above are briefly summarized in [1] to [5] below.
The characteristic items relating to the image recognition device 10 described above are listed below.
[1] An image recognition device (10) that recognizes human behavior based on a signal input from a distance image sensor (TOF camera 21) capable of three-dimensional recognition of an imaging target, wherein
a first behavior detection space (ArF or ArH), and
a second behavior detection space (ArH or ArF), at least part of which is located on a line connecting the distance image sensor and the first behavior detection space and which does not overlap the first behavior detection space,
are determined in advance as spaces in which human behavior is recognized, and
a result of recognizing human behavior in at least one of the first behavior detection space and the second behavior detection space is reflected in an output (S26, S27).

[2] The image recognition device according to [1] above, wherein
the first behavior detection space is assigned to a region (ArF) in which human facial expressions can be recognized, and
the second behavior detection space is assigned to a region (ArH) in which movement of a human hand or fingers can be recognized.

[3] The image recognition device according to [1] above, wherein human behavior is recognized in each of the first behavior detection space and the second behavior detection space (S22, S24), and the operation performed by the person is identified according to the combination of recognized behaviors (S25, S26).

[4] The image recognition apparatus according to [1] above, further comprising a measurement value correction unit (S20) that
calculates a ratio (R1) between a first measurement distance (distance measurement value L1) to a predetermined specific imaging target, recognized based on the signal input from the distance image sensor, and a reference distance (distance reference value Lref) obtained by actually measuring in advance the distance from the distance image sensor to the specific imaging target, and
corrects, by a correction amount based on the ratio, either the parameters that specify the first behavior detection space and the second behavior detection space, or a second measurement distance (distance measurement value L2) to an arbitrary point of a behavior monitoring target, recognized based on the signal input from the distance image sensor.
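
One possible reading of the ratio-based correction is sketched below. The text states only that the correction amount is based on the ratio R1; the definition R1 = L1 / Lref, the division of L2 by R1, and the scaling of the depth bounds by R1 are assumptions of this sketch, and the function names are hypothetical.

```python
from typing import Tuple


def compute_ratio(l1: float, lref: float) -> float:
    """R1 = L1 / Lref (assumed definition); both distances must be positive."""
    if l1 <= 0 or lref <= 0:
        raise ValueError("distances must be positive")
    return l1 / lref


def correct_distance(l2: float, r1: float) -> float:
    """Option A: correct a measured distance L2 back onto the reference scale."""
    return l2 / r1


def scale_depth_bounds(z_range: Tuple[float, float], r1: float) -> Tuple[float, float]:
    """Option B: leave measurements as they are and scale the depth bounds
    that specify a detection space onto the sensor's current scale."""
    return (z_range[0] * r1, z_range[1] * r1)
```

For example, if a fixed target known to be at Lref = 1.00 m is measured as L1 = 1.05 m, then R1 = 1.05, and a point measured at L2 = 0.84 m is corrected to about 0.80 m under option A. Only one of the two options would be applied, matching the "either ... or" wording of the text.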

[5] The image recognition apparatus according to [1] or [2] above, wherein
at least one of the first behavior detection space and the second behavior detection space is assigned, with reference to a specific fixed portion in the vehicle interior (steering wheel 31, headrest 34h, operation lever 36, display unit 32), to a region adjacent to the fixed portion or a region around the fixed portion.

DESCRIPTION OF SYMBOLS
10 Image recognition apparatus
11 Image recognition processing unit
12 Gesture monitoring control unit
21 TOF camera
21a Light source unit
21b Light receiving unit
22 Host ECU
23 HUD unit
24 Car navigation device
25A, 25B Imaging range
26 Car audio device
30 Driver
30f Face
30h Hand
31 Steering wheel
32 Display unit
33 Dashboard
34 Driver's seat
34h Headrest
35 Passenger seat
36 Operation lever
LH Left hand
RH Right hand
TB1 Constant table
Az Axis in the imaging direction
ArH, ArF, ArL, ArD Behavior detection spaces
Lref Distance reference value
L1, L2, L3 Distance measurement values
R1 Ratio

Claims

An image recognition apparatus that recognizes human behavior based on a signal input from a distance image sensor capable of three-dimensional recognition of an imaging target, wherein
a first behavior detection space, and
a second behavior detection space that is at least partially located on a line connecting the distance image sensor and the first behavior detection space and that does not overlap the first behavior detection space,
are determined in advance as spaces in which human behavior is recognized,
the first behavior detection space and the second behavior detection space are assigned to positions separated from each other on the line connecting the distance image sensor and the first behavior detection space, and
a result of recognizing human behavior in at least one of the first behavior detection space and the second behavior detection space is reflected in an output.

The image recognition apparatus according to claim 1, wherein
the first behavior detection space is assigned to a region in which human facial expressions can be recognized, and
the second behavior detection space is assigned to a region in which movement of a human hand or finger can be recognized.

The image recognition apparatus according to claim 1, wherein
human behavior is recognized in each of the first behavior detection space and the second behavior detection space, and an operation performed by the human is identified according to the combination of the recognized human behaviors.

The image recognition apparatus according to claim 1, further comprising a measurement value correction unit that
calculates a ratio between a first measurement distance to a predetermined specific imaging target, recognized based on the signal input from the distance image sensor, and a reference distance obtained by actually measuring in advance the distance from the distance image sensor to the specific imaging target, and
corrects, by a correction amount based on the ratio, either the parameters that specify the first behavior detection space and the second behavior detection space, or a second measurement distance to an arbitrary point of a behavior monitoring target, recognized based on the signal input from the distance image sensor.

The image recognition apparatus according to claim 1 or 2, wherein
at least one of the first behavior detection space and the second behavior detection space is assigned, with reference to a specific fixed portion in the vehicle interior, to a region adjacent to the fixed portion or a region around the fixed portion.