JP2023012283A

JP2023012283A - face detection device

Info

Publication number: JP2023012283A
Application number: JP2021115824A
Authority: JP
Inventors: 将幸山崎; Masayuki Yamazaki
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 2021-07-13
Filing date: 2021-07-13
Publication date: 2023-01-25

Abstract

To provide a face detection device that can accurately obtain the likelihood that a face is represented in a prescribed region of an image, even when a portion of the face is not represented in the image. SOLUTION: A face detection device comprises: a segmentation unit 33 that classifies each pixel in a prescribed region of an image as a face pixel that represents a face or a non-face pixel that does not represent a face; a feature point detection unit 34 that detects, for each organ of the face, a plurality of feature points of the organ from the prescribed region; and a confidence calculation unit 35 that sets, for each organ of the face, a reliability for each of the plurality of feature points detected for the organ so that the reliability of a feature point that is a face pixel is higher than the reliability of a feature point that is a non-face pixel, obtains the sum of the reliabilities set for the plurality of feature points, and calculates a confidence representing the likelihood that the face is represented in the prescribed region so that the confidence becomes higher as the obtained sum for each organ increases. SELECTED DRAWING: Figure 3

Description

The present invention relates to a face detection device that detects a face represented in an image.

Techniques are being studied for monitoring a person by detecting the person's face from a time series of images obtained by continuously photographing the person's face with a driver monitor camera, a web camera, or the like. In such techniques, it has been proposed to use tracking processing to detect the person's face from each image in the series (see, for example, Patent Document 1).

In the image analysis device disclosed in Patent Document 1, while the tracking flag is on, the search control unit determines whether the amount of change in the position coordinates of the facial feature points, the amount of change in the face orientation, and the amount of change in the line-of-sight direction in the current frame relative to the previous frame are each within a predetermined range. If the conditions are satisfied in all of these determinations, the image analysis device regards the change in the detection result of the current frame relative to the previous frame as being within the allowable range, and in subsequent frames continues to detect the face image according to the face image area stored in the tracking information storage unit.

Patent Document 1: JP 2019-185557 A

If tracking fails for a region in which a person's face is represented (hereinafter referred to as a face region) that was detected from an image obtained at a certain point in time, the face region in a subsequent image may not actually contain the person's face. In particular, depending on the orientation of the person's face relative to the camera, or the positional relationship between the camera and the person's face, part of the person's face may not appear in the image. If part of the face is missing in this way from any of the subsequent images, it becomes difficult to accurately estimate the positions of the feature points of the person's face, the face orientation, and the line-of-sight direction, and as a result it becomes difficult to correctly obtain the amounts of change in the feature point positions, the face orientation, and the line-of-sight direction. Consequently, the above technique may be unable to determine accurately whether to apply the tracking processing, and detection of the person's face may fail.

Accordingly, an object of the present invention is to provide a face detection device capable of accurately obtaining the likelihood that a face is represented in a predetermined region of an image, even when part of the face is not represented in the image.

According to one embodiment, a face detection device is provided. The face detection device includes: a segmentation unit that classifies each pixel in a predetermined region of an image as a face pixel representing a face or a non-face pixel not representing a face; a feature point detection unit that detects, from the predetermined region, a plurality of feature points of each individual organ of the face; and a confidence calculation unit that, for each individual organ of the face, sets a reliability for each of the plurality of feature points detected for that organ so that the reliability of a feature point that is a face pixel is higher than the reliability of a feature point that is a non-face pixel, obtains the sum of the reliabilities set for the plurality of feature points, and calculates a confidence representing the likelihood that a face is represented in the predetermined region so that the confidence becomes higher as the obtained sum for each organ becomes larger.

The face detection device according to the present invention has the effect of being able to accurately obtain the likelihood that a face is represented in a predetermined region of an image, even when part of the face is not represented in the image.

FIG. 1 is a schematic configuration diagram of a vehicle control system in which a face detection device is mounted.
FIG. 2 is a hardware configuration diagram of an electronic control unit that is one embodiment of the face detection device.
FIG. 3 is a functional block diagram of the processor of the electronic control unit relating to face detection processing.
FIG. 4 is a diagram showing an example of the relationship between face pixels, non-face pixels, and the reliability of detected feature points.
FIG. 5 is a diagram showing an example of the relationship, given in a reference table, between the sum of the reliabilities of feature points and the confidence.
FIG. 6 is an operation flowchart of the face detection processing.

A face detection device, and a face detection method and a face detection computer program executed on the face detection device, are described below with reference to the drawings. In an image obtained by photographing the face of a person to be photographed, the face detection device classifies each pixel included in a predetermined region in which the person's face is assumed to be represented as a face pixel representing the face or a non-face pixel not representing the face. The face detection device also detects, from the predetermined region, a plurality of feature points of each individual organ of the face (for example, the eyes, nose, and mouth), and sets, for each detected feature point, a reliability representing the likelihood that the feature point represents the corresponding organ. In doing so, the face detection device sets the reliability of each feature point so that the reliability of a feature point that is a face pixel is higher than the reliability of a feature point that is a non-face pixel. The face detection device then obtains, for each individual organ of the face, the sum of the reliabilities of the feature points detected for that organ, and calculates a confidence representing the likelihood that a face is represented in the predetermined region so that the confidence becomes higher as the obtained sum for each organ becomes larger.

The following describes an example in which the face detection device is applied to a driver monitoring device that monitors a driver based on a time series of images obtained by continuously photographing the face of the driver of a vehicle. The driver monitoring device detects a face region in which the driver's face is represented from an image obtained at a predetermined timing, such as when the ignition switch of the vehicle is turned on, and tracks the face region by applying tracking processing to the subsequent images obtained thereafter. The driver monitoring device then executes the above face detection processing on the face region in each subsequent image, thereby calculating the confidence that the driver's face is represented in the face region of that image. Based on the confidence, the driver monitoring device determines whether to re-detect the driver's face.

The face detection device according to the present embodiment is not limited to a driver monitoring device, and is suitable for various applications that require detecting a person's face from an image obtained by a camera that photographs the face of the person to be photographed, such as a web camera or other surveillance camera.

FIG. 1 is a schematic configuration diagram of a vehicle control system in which the face detection device is mounted. FIG. 2 is a hardware configuration diagram of an electronic control unit that is one embodiment of the face detection device. In the present embodiment, a vehicle control system 1, which is mounted on a vehicle 10 and controls the vehicle 10, includes a driver monitor camera 2, a user interface 3, and an electronic control unit (ECU) 4, which is an example of the face detection device. The driver monitor camera 2 and the user interface 3 are communicably connected to the ECU 4 via an in-vehicle network conforming to a standard such as Controller Area Network. The vehicle control system 1 may further include a GPS receiver (not shown) for measuring the position of the vehicle 10. The vehicle control system 1 may also further include at least one of a camera (not shown) for photographing the surroundings of the vehicle 10 and a distance sensor (not shown), such as a LiDAR or radar, that measures the distance from the vehicle 10 to objects around the vehicle 10. Furthermore, the vehicle control system 1 may include a wireless communication terminal (not shown) for wireless communication with other devices. Furthermore, the vehicle control system 1 may include a navigation device (not shown) for searching for a travel route of the vehicle 10.

The driver monitor camera 2 is an example of a camera or an in-vehicle imaging unit, and includes a two-dimensional detector composed of an array of photoelectric conversion elements sensitive to visible or infrared light, such as a CCD or C-MOS, and an imaging optical system that forms an image of the region to be photographed on the two-dimensional detector. The driver monitor camera 2 may further include a light source for illuminating the driver, such as an infrared LED. The driver monitor camera 2 is mounted, for example, on or near the instrument panel, facing the driver, so that the head of the driver seated in the driver's seat of the vehicle 10 is included in the region to be photographed, that is, so that the driver's head can be photographed. The driver monitor camera 2 photographs the driver's head at a predetermined capture period (for example, 1/30 second to 1/10 second) and generates an image in which the driver's face is represented (hereinafter referred to as a face image for convenience of explanation). The face image obtained by the driver monitor camera 2 may be a color image or a grayscale image. Each time it generates a face image, the driver monitor camera 2 outputs the generated face image to the ECU 4 via the in-vehicle network.

The user interface 3 is an example of a notification unit, and includes a display device such as a liquid crystal display or an organic EL display. The user interface 3 is installed in the cabin of the vehicle 10, for example, on the instrument panel, facing the driver. The user interface 3 displays various kinds of information received from the ECU 4 via the in-vehicle network, thereby notifying the driver of that information. The user interface 3 may further include a speaker installed in the cabin. In this case, the user interface 3 outputs the various kinds of information received from the ECU 4 via the in-vehicle network as an audio signal, thereby notifying the driver of that information.

The ECU 4 detects the orientation of the driver's face based on the face image, and determines the driver's state based on that orientation. When the driver is in a state unsuitable for driving, such as looking away, the ECU 4 warns the driver via the user interface 3.

As shown in FIG. 2, the ECU 4 includes a communication interface 21, a memory 22, and a processor 23. The communication interface 21, the memory 22, and the processor 23 may each be configured as separate circuits, or may be integrated into a single integrated circuit.

The communication interface 21 includes an interface circuit for connecting the ECU 4 to the in-vehicle network. Each time it receives a face image from the driver monitor camera 2, the communication interface 21 passes the received face image to the processor 23. When the communication interface 21 receives from the processor 23 information to be displayed on the user interface 3, it outputs that information to the user interface 3.

The memory 22 is an example of a storage unit, and includes, for example, volatile semiconductor memory and nonvolatile semiconductor memory. The memory 22 stores various algorithms and various data used in the driver monitor processing, including the face detection processing, executed by the processor 23 of the ECU 4. For example, the memory 22 stores various parameters used for detecting the face region and feature points and for determining the face orientation, as well as reference tables representing the relationship between the sum of the reliabilities and the confidence. The memory 22 also temporarily stores the face images received from the driver monitor camera 2 and various data generated during the driver monitor processing.

The processor 23 includes one or more CPUs (Central Processing Units) and their peripheral circuits. The processor 23 may further include other arithmetic circuits, such as a logical operation unit, a numerical operation unit, or a graphics processing unit. The processor 23 executes the driver monitor processing, including the face detection processing.

FIG. 3 is a functional block diagram of the processor 23 relating to the driver monitor processing. The processor 23 includes a face detection unit 31, a tracking unit 32, a segmentation unit 33, a feature point detection unit 34, a confidence calculation unit 35, a re-detection determination unit 36, and a state determination unit 37. These units of the processor 23 are, for example, functional modules implemented by a computer program running on the processor 23. Alternatively, these units may be dedicated arithmetic circuits provided in the processor 23. Of these units, the face detection unit 31, the tracking unit 32, the segmentation unit 33, the feature point detection unit 34, the confidence calculation unit 35, and the re-detection determination unit 36 relate to the face detection processing.

The face detection unit 31 detects a face region in which the driver's face is represented from a face image that the ECU 4 receives from the driver monitor camera 2 at a predetermined timing.

The predetermined timing can be, for example, the timing at which the ignition switch of the vehicle 10 is turned on, or the timing at which an initial face detection sequence is performed immediately after photographing of the driver's face begins. Alternatively, the predetermined timing may recur at a predetermined period longer than the capture period of the driver monitor camera 2 (for example, several tens of seconds to several minutes). The predetermined timing may also be the timing at which the re-detection determination unit 36 instructs detection of the face region.

The face detection unit 31 detects the face region by, for example, inputting the face image into a classifier trained in advance to detect the driver's face from an image. As such a classifier, the face detection unit 31 can use a deep neural network (DNN) having a convolutional neural network (CNN) architecture, such as Single Shot MultiBox Detector (SSD) or Faster R-CNN. Alternatively, the face detection unit 31 may use an AdaBoost classifier. In this case, the face detection unit 31 sets a window on the face image and calculates from that window features useful for determining the presence or absence of a face, such as Haar-like features. The face detection unit 31 then inputs the calculated features into the classifier to determine whether the driver's face is represented in the window. The face detection unit 31 performs this processing while varying the position, size, aspect ratio, orientation, and so on of the window on the face image, and takes the window in which the driver's face is detected as the face region. The classifier is trained in advance according to a predetermined learning method using training data that includes images in which a face is represented and images in which no face is represented. The parameter set defining the classifier is stored in the memory 22 in advance. The face detection unit 31 may also detect the face region from the face image according to any other method for detecting a face region from an image.

Each time a face region is detected from a face image, the face detection unit 31 notifies the tracking unit 32 and the state determination unit 37 of information indicating the detected face region (for example, the coordinates of the upper left corner of the face region on the face image, together with its horizontal width and vertical height).

The tracking unit 32 tracks the face region in each of the series of face images that follow the face image in which the face detection unit 31 detected the face region. For example, the tracking unit 32 estimates the position and extent of the face region in a subsequent face image by applying a tracking method based on optical flow, such as the KLT method, or a tracking method based on a predictive filter, such as a Kalman filter or a particle filter. Each time it estimates the face region in a subsequent face image, the tracking unit 32 notifies the segmentation unit 33 and the feature point detection unit 34 of information representing the position and extent of the estimated face region.

In general, the computational load of the face region tracking processing by the tracking unit 32 is smaller than that of the face detection processing by the face detection unit 31. Therefore, once the face detection unit 31 has detected a face region, the processor 23 can reduce the computational load required to estimate the face region by having the tracking unit 32 execute the face region tracking processing on each subsequent face image.

In each face image that follows the face image in which the face detection unit 31 detected the face region, the segmentation unit 33 classifies each pixel included in the face region estimated by the tracking unit 32 as a face pixel representing the face or a non-face pixel not representing the face. The face region estimated by the tracking unit 32 is an example of the predetermined region.

The segmentation unit 33 classifies each pixel in the estimated face region as a face pixel or a non-face pixel by, for example, inputting the face region into a classifier trained in advance to classify each pixel as a face pixel or a non-face pixel. As such a classifier, the segmentation unit 33 can use a DNN for semantic segmentation, such as Fully Convolutional Network, U-Net, or SegNet. Alternatively, the segmentation unit 33 may use a classifier based on another segmentation method, such as a random forest.

For each estimated face region, the segmentation unit 33 outputs information representing the classification result of each pixel to the confidence calculation unit 35. This information can be, for example, a binary image that has the same size as the face region and in which face pixels and non-face pixels have different values.
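The binary image described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the segmentation classifier yields a per-pixel face probability; the function name `to_binary_mask`, the threshold of 0.5, and the sample values are hypothetical and are not taken from the source.

```python
# Hypothetical sketch: convert per-pixel face probabilities (an assumed
# classifier output) into a binary image of the same size as the face region.
def to_binary_mask(face_probability, threshold=0.5):
    """Return a mask with 1 for face pixels and 0 for non-face pixels."""
    return [[1 if p >= threshold else 0 for p in row] for row in face_probability]


# Illustrative 2 x 3 face region; the right-hand column is background.
probabilities = [
    [0.9, 0.8, 0.2],
    [0.7, 0.6, 0.1],
]
mask = to_binary_mask(probabilities)
```

The mask keeps the dimensions of the face region, so a feature point's coordinates within the region can index it directly.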

In each face image that follows the face image in which the face detection unit 31 detected the face region, the feature point detection unit 34 detects, from the face region estimated by the tracking unit 32, a plurality of feature points of each individual organ of the face.

To detect the plurality of feature points of each individual organ of the face, the feature point detection unit 34 applies to the face region a detector designed to detect those feature points. As such a detector, the feature point detection unit 34 can use a detector that uses information about the entire face, such as an Active Shape Model (ASM) or an Active Appearance Model (AAM). Alternatively, the feature point detection unit 34 may use, as the detector, a DNN trained in advance to detect the feature points of the individual organs of the face.

When the driver's face is not directly facing the driver monitor camera 2 but is turned at an angle, the entire face of the driver may not appear in the face image. Depending on the driver's posture, part of the driver's face may also be hidden from the driver monitor camera 2 by the driver's hand or the like. In such cases, some organs may not be visible in the face image. However, because the detector uses information about the entire face to detect the feature points of each organ, feature points corresponding to organs that are not represented in the face image may be erroneously detected at positions where the face is not actually represented. In such a case, it can be difficult to determine the state of the driver's face accurately.

For each estimated face region, the feature point detection unit 34 outputs to the confidence calculation unit 35 information representing the plurality of feature points of each organ detected from that face region. This information includes, for example, the position coordinates of each feature point and an identification number indicating the facial organ that the feature point represents.

In each face image that follows the face image in which the face detection unit 31 detected the face region, the confidence calculation unit 35 sets, for each feature point detected from the estimated face region, a reliability representing the likelihood of that feature point. The confidence calculation unit 35 further obtains, for each individual organ of the face, the sum of the reliabilities of the feature points of that organ. The confidence calculation unit 35 then calculates a confidence representing the likelihood that the driver's face is represented in the face region so that the confidence becomes higher as the obtained sum for each organ becomes larger.

In the present embodiment, the confidence calculation unit 35 sets the reliability of each feature point so that, for each individual organ of the face, among the plurality of feature points detected for that organ, the reliability of a feature point that is a face pixel is higher than the reliability of a feature point that is a non-face pixel. To this end, the confidence calculation unit 35 refers to the information representing the classification result of each pixel in the face region received from the segmentation unit 33 and to the position of each feature point received from the feature point detection unit 34, and determines, for each feature point, whether the feature point is located at a face pixel or at a non-face pixel. The confidence calculation unit 35 then sets a first reliability (for example, 1.0) for a feature point located at a face pixel, that is, a feature point that is a face pixel. For a feature point located at a non-face pixel, that is, a feature point that is a non-face pixel, the confidence calculation unit 35 sets a second reliability (for example, 0.5) that is lower than the first reliability.
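The reliability assignment described above can be sketched in Python as follows. The example values 1.0 and 0.5 come from the text; the function name `set_reliabilities`, the tuple layout of a feature point, the organ labels, and the choice to treat a point that falls outside the mask as a non-face pixel are assumptions made only for this illustration.

```python
FACE = 1  # mask value assumed to mark a face pixel (0 marks a non-face pixel)


def set_reliabilities(mask, feature_points, first=1.0, second=0.5):
    """Assign a reliability to each feature point from the binary mask.

    mask: rows of 0/1 values covering the estimated face region.
    feature_points: (x, y, organ_id) tuples in face-region coordinates.
    Returns a list of (organ_id, reliability) pairs.
    """
    if not second < first:
        raise ValueError("the second reliability must be lower than the first")
    scored = []
    for x, y, organ_id in feature_points:
        # Assumption: a point outside the mask counts as a non-face pixel.
        inside = 0 <= y < len(mask) and 0 <= x < len(mask[0])
        is_face = inside and mask[y][x] == FACE
        scored.append((organ_id, first if is_face else second))
    return scored


# Illustrative region: the left half is face, the right half is not.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
points = [(0, 0, "left_eye"), (1, 1, "left_eye"),
          (3, 0, "right_eye"), (2, 2, "right_eye")]
scored = set_reliabilities(mask, points)
```

In this sketch, feature points detected on the non-face half still receive a nonzero reliability, which matches the example values in the text, where the second reliability is lower than the first rather than zero.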

FIG. 4 is a diagram showing an example of the relationship between face pixels, non-face pixels, and the reliabilities of detected feature points. In this example, within the estimated face region 400, each pixel included in the partial region 401 is a face pixel, whereas each pixel included in the partial region 402 is a non-face pixel. Accordingly, among the plurality of feature points 411 of the individual facial organs detected by the feature point detection unit 34, the first reliability (1.0 in this example) is set for each feature point 411a located within the partial region 401, that is, each feature point that is a face pixel. Conversely, the second reliability (0.5 in this example) is set for each feature point 411b located within the partial region 402, that is, each feature point that is a non-face pixel.
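The reliability assignment described above can be illustrated with a minimal sketch. It is not part of the described embodiment: the function name `assign_reliabilities`, the boolean mask layout `face_mask[y][x]`, and the use of integer pixel coordinates are assumptions made for illustration; only the example values 1.0 and 0.5 come from the description.

```python
def assign_reliabilities(face_mask, points, first=1.0, second=0.5):
    """Return one reliability per feature point.

    face_mask[y][x] is True for a face pixel and False for a non-face pixel
    (hypothetical layout of the segmentation result).
    points is a list of (x, y) integer pixel coordinates of feature points.
    """
    if second >= first:
        # The description requires the second reliability to be lower.
        raise ValueError("second reliability must be lower than the first")
    return [first if face_mask[y][x] else second for (x, y) in points]


# Toy 2x2 region: the top-right pixel is a non-face pixel.
mask = [[True, False],
        [True, True]]
reliabilities = assign_reliabilities(mask, [(0, 0), (1, 0), (1, 1)])
```

In this toy case, the feature point at (1, 0) falls on a non-face pixel and receives the lower value, while the other two receive the higher value.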

After setting a reliability for each feature point, the confidence calculation unit 35 calculates, for each organ of the face, the sum of the reliabilities set for the feature points detected for that organ. Then, for each organ of the face, the confidence calculation unit 35 obtains the confidence for that organ by referring to a reference table that represents the relationship between the sum of reliabilities for that organ and the confidence representing the likelihood of that organ. The reference table for each organ is stored in the memory 22 in advance.

FIG. 5 is a diagram showing an example of the relationship, given in the reference table, between the sum of the reliabilities of feature points and the confidence. In FIG. 5, the horizontal axis represents the sum of reliabilities and the vertical axis represents the confidence. The curve 500 represents the relationship between the sum of reliabilities and the confidence. In this example, the relationship is set such that the confidence increases monotonically with the sum of reliabilities, and increases more steeply as the sum of reliabilities becomes larger.

The relationship between the sum of reliabilities and the confidence is not limited to the example shown in FIG. 5. For example, the relationship may be set such that the confidence increases linearly as the sum of reliabilities increases. Alternatively, the relationship may be set such that the rate of increase of the confidence becomes more gradual as the sum of reliabilities becomes larger. Furthermore, the relationship between the sum of reliabilities and the confidence may differ from one facial organ to another. For example, for the eyes, the relationship may be set such that the confidence increases monotonically with the sum of reliabilities and increases more steeply as the sum becomes larger. For the nose or the mouth, on the other hand, the relationship may be set such that the rate of increase of the confidence becomes more gradual as the sum of reliabilities becomes larger.
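As a hedged illustration of such a reference table, the sketch below stores a few (sum of reliabilities, confidence) pairs per organ and looks up the confidence. The table values, the names `EYE_TABLE`, `NOSE_TABLE`, and `lookup_confidence`, and the use of linear interpolation between entries are all assumptions; the description only states that a monotonic relationship is stored in a table.

```python
from bisect import bisect_right

# Hypothetical tables: (sum of reliabilities, confidence), sorted by sum.
EYE_TABLE = [(0.0, 0.0), (2.0, 0.1), (4.0, 0.4), (6.0, 1.0)]   # steeper at large sums
NOSE_TABLE = [(0.0, 0.0), (2.0, 0.6), (4.0, 0.9), (6.0, 1.0)]  # flattens at large sums


def lookup_confidence(table, reliability_sum):
    """Return the confidence for a sum of reliabilities.

    Values outside the table are clamped to the first or last entry;
    values between entries are linearly interpolated (an assumption).
    """
    sums = [s for s, _ in table]
    if reliability_sum <= sums[0]:
        return table[0][1]
    if reliability_sum >= sums[-1]:
        return table[-1][1]
    i = bisect_right(sums, reliability_sum)
    (s0, c0), (s1, c1) = table[i - 1], table[i]
    return c0 + (c1 - c0) * (reliability_sum - s0) / (s1 - s0)


eye_confidence = lookup_confidence(EYE_TABLE, 3.0)
nose_confidence = lookup_confidence(NOSE_TABLE, 3.0)
```

With these made-up tables, the same sum of reliabilities yields a lower confidence for the eyes than for the nose, which mirrors the organ-dependent relationships described above.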

The confidence calculation unit 35 uses a statistical representative value of the confidences of the individual facial organs as the confidence representing the likelihood that the driver's face is actually represented in the estimated face region. As the statistical representative value, the confidence calculation unit 35 can use, for example, the average or the minimum of the confidences of the individual facial organs.
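The choice of statistical representative value can be sketched as follows. The function name `representative_confidence`, the `mode` parameter, and the example per-organ values are hypothetical; only the two options (average and minimum) come from the description.

```python
from statistics import mean


def representative_confidence(organ_confidences, mode="mean"):
    """Combine per-organ confidences into one confidence for the face region.

    organ_confidences maps an organ name to its confidence.
    mode is "mean" or "min", the two representative values named in the text.
    """
    values = list(organ_confidences.values())
    if not values:
        raise ValueError("at least one organ confidence is required")
    if mode == "mean":
        return mean(values)
    if mode == "min":
        return min(values)
    raise ValueError(f"unsupported mode: {mode}")


organs = {"eye": 0.8, "nose": 0.6, "mouth": 0.4}  # made-up values
by_mean = representative_confidence(organs, "mean")
by_min = representative_confidence(organs, "min")
```

Using the minimum makes the overall confidence drop as soon as any single organ is poorly supported, whereas the average tolerates one weak organ if the others are well supported.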

Each time the confidence calculation unit 35 calculates the confidence for an estimated face region, it notifies the re-detection determination unit 36 and the state determination unit 37 of the confidence calculated for that face region.

The re-detection determination unit 36 determines, based on the confidence calculated for the estimated face region, whether to re-detect the face region from the face image. For example, when the confidence is lower than the detection determination threshold, the re-detection determination unit 36 determines that the estimated face region most likely does not actually represent the driver's face. Accordingly, when the confidence is equal to or less than the detection determination threshold, the re-detection determination unit 36 instructs the face detection unit 31 to execute the processing for detecting a face region on the face image containing the estimated face region, or on the face image obtained next. In this way, when tracking of the face region fails, the processor 23 can detect the driver's face from the face image anew.

The state determination unit 37 determines the state of the driver based on the face region represented in each face image or on the estimated face region. However, when the confidence calculated by the confidence calculation unit 35 is lower than the detection determination threshold, the estimated face region most likely does not actually represent the driver's face. In this case, therefore, the state determination unit 37 need not execute the processing for determining the driver's state. The state determination unit 37 may instead use the state determination result for the past face image obtained immediately before the latest face image, that is, the face image containing the face region whose confidence fell below the detection determination threshold, as the driver's state at the time the latest face image was acquired.

In this embodiment, the state determination unit 37 determines whether the driver is in a state suitable for driving the vehicle 10 by comparing the orientation of the driver's face represented in the face region with a reference direction for the driver's face. The reference direction of the face is stored in the memory 22 in advance.

The state determination unit 37 fits the facial feature points detected by the feature point detection unit 34 to a three-dimensional face model representing the three-dimensional shape of a face. The state determination unit 37 then detects, as the orientation of the driver's face, the orientation of the three-dimensional face model at which the feature points best fit the model. Alternatively, the state determination unit 37 may detect the orientation of the driver's face from the face image using another technique for determining the orientation of a face represented in an image. The orientation of the driver's face is expressed, for example, as a combination of pitch, yaw, and roll angles relative to the direction directly facing the driver monitor camera 2.

The state determination unit 37 calculates the absolute value of the difference between the orientation of the driver's face represented in the face region and the reference direction of the driver's face, and compares that absolute value with a predetermined allowable face orientation range. When the absolute value of the difference falls outside the allowable face orientation range, the state determination unit 37 determines that the driver is looking away, that is, that the driver is not in a state suitable for driving the vehicle 10.

The driver may face a direction other than the front of the vehicle 10 in order to check the situation around the vehicle 10. Even in such a case, however, a driver who is concentrating on driving the vehicle 10 will not keep facing a direction other than the front of the vehicle 10. According to a modified example, therefore, the state determination unit 37 may determine that the driver is not in a state suitable for driving the vehicle 10 when the period during which the absolute value of the difference between the orientation of the driver's face and the reference direction of the driver's face falls outside the allowable face orientation range has continued for a predetermined time (for example, several seconds) or longer.
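The duration-based modified example can be sketched as a small stateful check. The class name `LookAwayMonitor`, the representation of an orientation as a (pitch, yaw, roll) tuple in degrees, the single tolerance applied to each angle, and the timestamps in seconds are assumptions for illustration and are not specified in the description.

```python
class LookAwayMonitor:
    """Flags the driver as looking away only after a sustained deviation."""

    def __init__(self, tolerance_deg, min_duration_s):
        self.tolerance_deg = tolerance_deg    # hypothetical allowable range per angle
        self.min_duration_s = min_duration_s  # e.g. a few seconds
        self._out_since = None                # time at which the deviation started

    def update(self, t, face_dir, ref_dir):
        """Return True if the deviation has lasted min_duration_s or longer."""
        out_of_range = any(
            abs(a - b) > self.tolerance_deg for a, b in zip(face_dir, ref_dir)
        )
        if not out_of_range:
            self._out_since = None  # deviation ended, restart the timer
            return False
        if self._out_since is None:
            self._out_since = t
        return t - self._out_since >= self.min_duration_s


monitor = LookAwayMonitor(tolerance_deg=20.0, min_duration_s=2.0)
front = (0.0, 0.0, 0.0)
results = [
    monitor.update(0.0, (0.0, 30.0, 0.0), front),  # deviation starts
    monitor.update(1.0, (0.0, 30.0, 0.0), front),  # still shorter than 2 s
    monitor.update(2.0, (0.0, 30.0, 0.0), front),  # 2 s reached
    monitor.update(2.5, (0.0, 5.0, 0.0), front),   # back within range
]
```

A brief glance to the side therefore resets the timer and never triggers a warning, which matches the intent of allowing short checks of the surroundings.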

When the state determination unit 37 determines that the driver is not in a state suitable for driving the vehicle 10, it generates warning information containing a warning message that warns the driver to face the front of the vehicle 10. The state determination unit 37 then outputs the generated warning information to the user interface 3 via the communication interface 21, thereby causing the user interface 3 to display the warning message. Alternatively, the state determination unit 37 causes a speaker of the user interface 3 to output a voice message warning the driver to face the front of the vehicle 10.

FIG. 6 is an operation flowchart of the driver monitor processing executed by the processor 23. The processor 23 may execute the driver monitor processing, which includes the face detection processing, in accordance with the following operation flowchart. Of the steps in the operation flowchart described below, the processing of steps S101 to S109 corresponds to the face detection processing.

The face detection unit 31 of the processor 23 detects a face region from a face image obtained by the driver monitor camera 2 at a predetermined timing (step S101). The tracking unit 32 of the processor 23 executes face region tracking processing on a face image subsequent to the face image obtained at the predetermined timing, thereby estimating the face region in that subsequent face image (step S102).

The segmentation unit 33 of the processor 23 classifies each pixel included in the face region estimated in the subsequent face image as a face pixel or a non-face pixel (step S103). The feature point detection unit 34 of the processor 23 detects a plurality of feature points for each facial organ from the face region estimated in the subsequent face image (step S104).

For the face region estimated in the subsequent face image, the confidence calculation unit 35 of the processor 23 sets the first reliability for feature points that are face pixels and the second reliability for feature points that are non-face pixels (step S105). As described above, the second reliability is set lower than the first reliability. Next, for the face region estimated in the subsequent face image, the confidence calculation unit 35 calculates, for each facial organ, the sum of the reliabilities of the feature points detected for that organ, and calculates the confidence of that organ such that it becomes higher as the sum of reliabilities becomes larger (step S106). The confidence calculation unit 35 then calculates a statistical representative value of the confidences calculated for the individual facial organs as the confidence that the driver's face is actually represented in the estimated face region (step S107).

The re-detection determination unit 36 of the processor 23 determines whether the confidence is less than the detection determination threshold (step S108). When the confidence is less than the detection determination threshold (step S108: Yes), the re-detection determination unit 36 instructs the face detection unit 31 to detect a face region from the latest face image or from the face image obtained next (step S109).

On the other hand, when the confidence is equal to or greater than the detection determination threshold (step S108: No), the state determination unit 37 of the processor 23 detects the orientation of the driver's face from the estimated face region and determines whether the driver is in a state suitable for driving the vehicle 10 (step S110). The state determination unit 37 then executes warning processing or the like according to the determination result. After step S109 or S110, the processor 23 ends the driver monitor processing.
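Steps S105 to S108 of the flow above can be condensed into a short, non-authoritative sketch. The function names `face_region_confidence` and `next_step`, the mask layout, the linear mapping from the sum of reliabilities to a per-organ confidence (used here in place of a stored reference table), and the choice of the minimum as the representative value are assumptions.

```python
def face_region_confidence(face_mask, organ_points, first=1.0, second=0.5):
    """Compute the confidence of an estimated face region (steps S105 to S107).

    face_mask[y][x] is True for a face pixel (hypothetical layout).
    organ_points maps an organ name to a non-empty list of (x, y) points.
    """
    organ_confidences = []
    for points in organ_points.values():
        # S105: per-point reliability; S106: sum per organ.
        total = sum(first if face_mask[y][x] else second for x, y in points)
        # S106: assumed linear mapping, 1.0 when every point is a face pixel.
        organ_confidences.append(total / (first * len(points)))
    # S107: minimum used as the statistical representative value.
    return min(organ_confidences)


def next_step(confidence, detection_threshold):
    """S108: choose between re-detection (S109) and state determination (S110)."""
    if confidence < detection_threshold:
        return "redetect_face_region"
    return "determine_driver_state"


mask = [[True, True],
        [False, False]]  # bottom row is occluded
points = {"eye": [(0, 0), (1, 0)], "mouth": [(0, 1), (1, 1)]}
confidence = face_region_confidence(mask, points)
action = next_step(confidence, detection_threshold=0.6)
```

In this toy case the mouth points all fall on non-face pixels, so the region's confidence is pulled down and the flow branches to re-detection.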

As described above, in an image obtained by photographing the face of a person to be photographed, this face detection device classifies each pixel included in a predetermined region, in which the person's face is assumed to be represented, as a face pixel that represents the face or a non-face pixel that does not. The face detection device also detects, from the predetermined region, a plurality of feature points for each organ of the face and sets a reliability for each detected feature point. In doing so, the face detection device sets the reliability of each feature point such that the reliability of a feature point that is a face pixel is higher than that of a feature point that is a non-face pixel. The face detection device then obtains, for each organ of the face, the sum of the reliabilities of the feature points detected for that organ, and calculates a confidence representing the likelihood that a face is represented in the predetermined region such that the confidence becomes higher as the sum of reliabilities obtained for each organ becomes larger. As a result, the face detection device can accurately determine the likelihood that a face is represented in the predetermined region of the image even when part of the face is not represented in the image.

Depending on the assumed positional relationship between the driver monitor camera 2 and the driver's face, some facial organs tend to receive high reliabilities and others tend to receive low ones. For example, if the driver monitor camera 2 is installed so as to photograph the driver's face from the right while the driver faces the front of the vehicle 10, the organs located on the right half of the driver's face are turned toward the driver monitor camera 2, so their reliabilities tend to be high. Conversely, an organ located on the left half of the driver's face may be partly hidden by another part of the face and therefore only partly represented in the face image. As a result, its reliability tends to be low. The confidence calculation unit 35 may therefore obtain the distribution of the sum of reliabilities for each organ from a plurality of face images and, based on that distribution, multiply the set reliabilities by a bias or add an offset to them for an organ whose reliability tends to be low. Conversely, for an organ whose reliability tends to be high, the confidence calculation unit 35 may, for example, subtract an offset from the reliabilities set for that organ's feature points. The confidence calculation unit 35 may also generate, in accordance with the distribution of the sum of reliabilities, a reference table that defines a reliability offset value for each organ such that the mean or median of the reliabilities for each organ becomes constant, and may correct the reliability of each feature point in accordance with that reference table.
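One way to read the offset table described above is sketched below, under stated assumptions. The function names `build_offset_table` and `correct_reliability`, the use of per-organ mean reliabilities collected over past frames as the input, and the choice of the average of the organ means as the common target are hypothetical; the description only says that offsets are chosen so that the mean or median per organ becomes constant.

```python
from statistics import mean


def build_offset_table(history):
    """Return a per-organ reliability offset.

    history maps an organ name to a list of mean feature-point reliabilities
    observed for that organ in past face images (hypothetical input format).
    The offsets shift every organ's mean to the same target value.
    """
    organ_means = {organ: mean(values) for organ, values in history.items()}
    target = mean(list(organ_means.values()))  # assumed common target
    return {organ: target - m for organ, m in organ_means.items()}


def correct_reliability(reliability, organ, offsets):
    """Apply the organ's offset to a single feature-point reliability."""
    return reliability + offsets[organ]


# Camera on the driver's right: the left eye is often partly hidden.
history = {"right_eye": [1.0, 1.0], "left_eye": [0.5, 0.7]}
offsets = build_offset_table(history)
corrected = correct_reliability(0.5, "left_eye", offsets)
```

Here the organ that is habitually occluded receives a positive offset and the well-exposed organ a negative one, so that a systematic camera placement effect does not by itself lower the confidence.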

The state determination unit 37 may also obtain an index representing the driver's state other than the orientation of the driver's face from a face region whose confidence is equal to or greater than the detection determination threshold, that is, a face region that most likely represents a face. For example, the state determination unit 37 may estimate the driver's gaze direction or degree of wakefulness. When estimating the gaze direction, the state determination unit 37 detects, from the face region, the pupil of either of the driver's eyes and the corneal reflection image (Purkinje image) of a light source provided in the driver monitor camera 2. The state determination unit 37 detects the pupil and the Purkinje image by, for example, template matching. The state determination unit 37 may then estimate the driver's gaze direction from the detected position of the pupil centroid and the position of the Purkinje image by referring to a reference table that represents the relationship between the gaze direction and the positional relationship between the pupil centroid and the Purkinje image. In this case, as in the embodiment described above, the state determination unit 37 calculates the absolute value of the difference between the driver's gaze direction and a reference direction for the driver's gaze, and compares that absolute value with a predetermined allowable gaze direction range. When the absolute value of the difference falls outside the allowable gaze direction range, the state determination unit 37 may determine that the driver is looking away, that is, that the driver is not in a state suitable for driving the vehicle 10.

When estimating the driver's degree of wakefulness, the state determination unit 37 calculates the degree of eye opening based on the distance between the feature point corresponding to the upper eyelid and the feature point corresponding to the lower eyelid, both detected by the feature point detection unit 34. The state determination unit 37 may then estimate the driver's degree of wakefulness based on the change over time in the degree of eye opening.
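The eye opening calculation can be sketched as follows. The function names `eye_opening` and `closed_eye_ratio` and the closed-eye threshold are hypothetical, and the fraction of frames with nearly closed eyes is merely one possible way to summarize the change over time; the description does not specify how wakefulness is derived from it.

```python
import math


def eye_opening(upper_lid, lower_lid):
    """Distance in pixels between the upper and lower eyelid feature points."""
    return math.dist(upper_lid, lower_lid)


def closed_eye_ratio(openings, closed_threshold):
    """Fraction of frames whose eye opening is below closed_threshold.

    A high ratio over a time window is treated here as a sign of low
    wakefulness (an assumption, not stated in the description).
    """
    if not openings:
        raise ValueError("at least one frame is required")
    closed = sum(1 for o in openings if o < closed_threshold)
    return closed / len(openings)


# Made-up eyelid points for four consecutive frames.
frames = [((10, 20), (10, 24)), ((10, 20), (10, 24)),
          ((10, 22), (10, 23)), ((10, 22), (10, 23))]
openings = [eye_opening(upper, lower) for upper, lower in frames]
ratio = closed_eye_ratio(openings, closed_threshold=2.0)
```

In practice the opening would likely need normalizing by the eye width or face size so that the measure does not depend on the distance to the camera.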

When the face detection device is used for an application other than a driver monitor device, the state determination unit 37 may execute processing different from that of the above embodiment based on a face region whose confidence is equal to or greater than the detection determination threshold, that is, a face region that most likely represents the face of the person being photographed. For example, the state determination unit 37 may execute processing such as personal authentication, skin quality determination, determination of attributes of the person being photographed such as gender, body temperature estimation, face-based autofocus control, exposure control, or color correction. In this case as well, the state determination unit 37 may skip such processing for a face region whose confidence is less than the detection determination threshold.

According to another modified example, the face detection unit 31 may be able to apply a plurality of face detection methods. For example, the face detection unit 31 may be able to use both a face detection method based on a DNN-based classifier and a face detection method based on an AdaBoost-based classifier. When the confidence is less than the re-detection determination threshold for a predetermined number (two or more, for example, 3 to 10) of consecutively obtained face images, the re-detection determination unit 36 may instruct the face detection unit 31 to use a face detection method different from the one applied up to that point. Alternatively, when the confidence is less than the re-detection determination threshold for the predetermined number of consecutively obtained face images, the re-detection determination unit 36 may conclude that the face of the person to be photographed is not represented in the face images and stop the face detection processing. The re-detection determination unit 36 may further power off the face detection device or place it in a standby state.
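The switching behavior in this modified example can be sketched with a counter of consecutive low-confidence frames. The class name `DetectorSwitcher`, the method labels, and the round-robin choice of the next method are assumptions; the description only requires that a different method be used after the predetermined number of consecutive low-confidence face images.

```python
class DetectorSwitcher:
    """Switches face detection methods after repeated low confidence."""

    def __init__(self, methods, threshold, max_low_frames):
        self.methods = list(methods)          # e.g. labels of available detectors
        self.threshold = threshold            # re-detection determination threshold
        self.max_low_frames = max_low_frames  # predetermined number, two or more
        self._index = 0
        self._low_count = 0

    def update(self, confidence):
        """Record one frame's confidence and return the method to use next."""
        if confidence < self.threshold:
            self._low_count += 1
        else:
            self._low_count = 0  # the run of low-confidence frames is broken
        if self._low_count >= self.max_low_frames:
            self._index = (self._index + 1) % len(self.methods)
            self._low_count = 0
        return self.methods[self._index]


switcher = DetectorSwitcher(["dnn", "adaboost"], threshold=0.5, max_low_frames=3)
chosen = [switcher.update(c) for c in (0.4, 0.4, 0.4, 0.9)]
```

The same counter could instead trigger stopping the face detection processing or entering standby, as the alternative described above.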

A computer program that implements the functions of the processor 23 of the ECU 4 according to the above embodiment or modified examples may be provided in a form recorded on a portable computer-readable recording medium such as a semiconductor memory, a magnetic recording medium, or an optical recording medium.

As described above, those skilled in the art can make various modifications within the scope of the present invention to suit the mode of implementation.

1 Vehicle control system
10 Vehicle
2 Driver monitor camera
3 User interface
4 Electronic control unit (ECU)
21 Communication interface
22 Memory
23 Processor
31 Face detection unit
32 Tracking unit
33 Segmentation unit
34 Feature point detection unit
35 Confidence calculation unit
36 Re-detection determination unit
37 State determination unit

Claims

A face detection device comprising:
a segmentation unit that classifies each pixel in a predetermined region of an image as a face pixel representing a face or a non-face pixel not representing a face;
a feature point detection unit that detects, from the predetermined region, a plurality of feature points for each organ of the face; and
a confidence calculation unit that, for each organ of the face, sets a reliability for each of the plurality of feature points detected for the organ such that the reliability of a feature point that is a face pixel is higher than the reliability of a feature point that is a non-face pixel, obtains the sum of the reliabilities set for the plurality of feature points, and calculates a confidence representing the likelihood that a face is represented in the predetermined region such that the confidence becomes higher as the sum of reliabilities obtained for each organ becomes larger.