JP7230710B2

JP7230710B2 - Image processing device, monitoring device, control system, image processing method, and program

Info

Publication number: JP7230710B2
Application number: JP2019121136A
Authority: JP
Inventors: 知禎相澤
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2019-06-28
Filing date: 2019-06-28
Publication date: 2023-03-01
Anticipated expiration: 2039-06-28
Also published as: WO2020261832A1; JP2021006972A

Description

The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, and a program.

Patent Literature 1 below discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of the target (person) to whom the service is provided.

The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU. The CPU includes a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, a movement vector detection unit, and the like.

According to the robot device, when the target of service provision is a group of persons between whom a relationship such as mutual communication has been established, the device decides to perform a first service that provides information based on close interaction. On the other hand, when the target is a group of persons for whom it is unknown whether such a relationship has been established, the device decides to perform a second service that provides information unilaterally, without interaction. Patent Literature 1 states that this makes it possible to provide an appropriate service according to the situation of the persons receiving the service.

[Problems to be solved by the invention]
In the robot device, the face detection unit is configured to detect a person's face using the first camera, and Patent Literature 1 states that known techniques can be used for this face detection.
However, conventional face detection techniques have the problem that detection accuracy decreases for certain specific individuals: for example, when part of the facial organs such as the eyes, nose, or mouth is missing or greatly deformed due to injury; when the face bears a large mole or wart, or body decoration such as a tattoo; or when the arrangement of the facial organs deviates from the average position due to a condition such as a hereditary disease. In other words, a specific individual here is a particular person whose features differ from the general facial features that people share regardless of differences in age, gender, race, and the like.

JP 2014-14899 A (Japanese Unexamined Patent Application Publication No. 2014-14899)

[Means for solving the problems and their effects]

The present invention has been made in view of the above problems, and its object is to provide an image processing device, a monitoring device, a control system, an image processing method, and a program that can accurately detect, in real time, even the face of a specific individual as described above.

To achieve the above object, an image processing device (1) according to the present disclosure is an image processing device that processes an image input from an imaging unit, the device comprising:
a facial feature storage unit that stores, as learned facial feature amounts trained to detect a face from the image, a facial feature amount of a specific individual and a normal facial feature amount; and
a face detection unit that detects a face region while scanning a search area over the image,
wherein the face detection unit comprises:
a first feature extraction unit that extracts a facial feature amount from the search area;
a normal face discriminator with a hierarchical structure that discriminates whether the search area is a face or a non-face by using the feature amount extracted from the search area and the normal facial feature amount; and
a specific individual face determination unit that, when the search area is discriminated as a non-face at any layer of the normal face discriminator, determines whether the search area is the face of the specific individual or a non-face by using the feature amount extracted from the search area and the facial feature amount of the specific individual.

According to the image processing device (1), the normal face discriminator hierarchically discriminates whether the search area is a face or a non-face by using the normal facial feature amount, whereby the face region is detected. Even when the search area is discriminated as a non-face at any layer of the normal face discriminator, the specific individual face determination unit determines whether the search area is the face of the specific individual or a non-face by using the facial feature amount of the specific individual, whereby a face region containing the face of the specific individual is detected. As a result, the face region can be detected accurately whether the face is a normal face or the face of the specific individual. In addition, because the normal face discriminator and the specific individual face determination unit use the same feature amount extracted from the search area, the real-time performance of face region detection can be maintained.

An image processing device (2) according to the present disclosure is the image processing device (1), wherein the specific individual face determination unit calculates an index indicating the correlation between the feature amount used in the one layer of the normal face discriminator that discriminated the search area as a non-face and the facial feature amount of the specific individual corresponding to that layer, and determines, based on the calculated index, whether the search area is the face of the specific individual or a non-face.

According to the image processing device (2), whether the search area is the face of the specific individual or a non-face can be determined efficiently based on the index indicating the correlation between the feature amount used in the one layer of the normal face discriminator that discriminated the search area as a non-face and the facial feature amount of the specific individual corresponding to that layer. Even when the normal face discriminator discriminates the search area as a non-face, the case where it is in fact the face of the specific individual can be determined accurately. The index may be an index value that increases as the relationship becomes stronger, for example a correlation coefficient or the reciprocal of the squared error, or it may be another index value indicating the degree of similarity between the feature amount used in that layer and the facial feature amount of the specific individual corresponding to that layer.
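The two example indices named above, the correlation coefficient and the reciprocal of the squared error, can be sketched as follows. This is a minimal illustration only: the function names, the flat list representation of a feature amount, the handling of constant vectors, and the small `eps` term are assumptions made for the example and are not taken from the patent.

```python
import math
from typing import Sequence


def correlation_index(window_features: Sequence[float],
                      individual_features: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length feature vectors."""
    n = len(window_features)
    if n == 0 or n != len(individual_features):
        raise ValueError("feature vectors must be non-empty and of equal length")
    mean_w = sum(window_features) / n
    mean_i = sum(individual_features) / n
    cov = sum((w - mean_w) * (i - mean_i)
              for w, i in zip(window_features, individual_features))
    var_w = sum((w - mean_w) ** 2 for w in window_features)
    var_i = sum((i - mean_i) ** 2 for i in individual_features)
    if var_w == 0 or var_i == 0:
        # Assumption for this sketch: a constant vector is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_w * var_i)


def inverse_squared_error_index(window_features: Sequence[float],
                                individual_features: Sequence[float],
                                eps: float = 1e-9) -> float:
    """Reciprocal of the summed squared error; eps (an assumption) avoids division by zero."""
    if len(window_features) != len(individual_features):
        raise ValueError("feature vectors must be of equal length")
    squared_error = sum((w - i) ** 2
                        for w, i in zip(window_features, individual_features))
    return 1.0 / (squared_error + eps)
```

Both functions return larger values for more similar inputs, which matches the property the text requires of the index.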

An image processing device (3) according to the present disclosure is the image processing device (2), wherein the specific individual face determination unit determines that the search area is the face of the specific individual when the index is greater than a predetermined threshold, and determines that the search area is a non-face when the index is less than or equal to the predetermined threshold.

According to the image processing device (3), when the index is greater than the predetermined threshold, the search area is determined to be the face of the specific individual, and when the index is less than or equal to the predetermined threshold, the search area is determined to be a non-face. Because the determination consists only of comparing the index with the predetermined threshold, it can be processed efficiently.
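The threshold rule can be written in one line; the point worth noting is the boundary case, where an index exactly equal to the threshold counts as a non-face. The function name and the example values below are hypothetical.

```python
def is_specific_individual_face(index: float, threshold: float) -> bool:
    """True only when the index is strictly greater than the threshold."""
    return index > threshold


# Hypothetical values for illustration only.
print(is_specific_individual_face(0.8, 0.7))  # index above the threshold
print(is_specific_individual_face(0.7, 0.7))  # index equal to the threshold
```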

An image processing device (4) according to the present disclosure is any one of the image processing devices (1) to (3), further comprising:
a discrimination progression unit that advances discrimination to the next layer of the normal face discriminator when the specific individual face determination unit determines that the search area is the face of the specific individual; and
a discrimination termination unit that terminates discrimination by the normal face discriminator when the specific individual face determination unit determines that the search area is a non-face.

According to the image processing device (4), when the specific individual face determination unit determines that the search area is the face of the specific individual, discrimination advances to the next layer of the normal face discriminator, so the discrimination process continues without delay. On the other hand, when the specific individual face determination unit determines that the search area is a non-face, discrimination by the normal face discriminator is terminated. The face of the specific individual can therefore be determined while the efficiency of the normal face discriminator is maintained.
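The progress-or-terminate control flow described above can be sketched as a loop over layers. This is an illustrative sketch under stated assumptions: each layer is modelled as a pair of hypothetical callables (a normal-face test and a specific-individual test for the same layer), and the return value, a flag plus the number of layers evaluated, is chosen for the example rather than taken from the patent.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]
LayerTest = Callable[[Features], bool]


def run_cascade(features: Features,
                layers: List[Tuple[LayerTest, LayerTest]]) -> Tuple[bool, int]:
    """Return (is_face, layers_evaluated) for one search area.

    Each layer is (normal_test, specific_test). Both tests receive the same
    feature amount, so features are extracted once per search area.
    """
    for depth, (normal_test, specific_test) in enumerate(layers, start=1):
        if normal_test(features):
            continue  # normal face so far: go to the next layer
        if specific_test(features):
            continue  # rescued as the specific individual: go to the next layer
        return False, depth  # non-face: terminate discrimination early
    return True, len(layers)


# Hypothetical two-layer cascade for illustration.
layers = [
    (lambda f: f[0] > 0.5, lambda f: f[1] > 0.9),
    (lambda f: f[0] > 0.2, lambda f: False),
]
print(run_cascade([0.3, 0.95], layers))  # rejected by layer 1, rescued, passes layer 2
```

Most non-face areas still exit at an early layer, which is what keeps the added determination cheap.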

An image processing device (5) according to the present disclosure is any one of the image processing devices (1) to (4), wherein the face detection unit comprises a face region integration unit that integrates one or more face region candidates discriminated as a face by the normal face discriminator, and a second feature extraction unit that extracts a facial feature amount from the integrated face region, and wherein the image processing device comprises a specific individual determination unit that determines whether the face in the face region is the face of the specific individual by using the feature amount extracted from the integrated face region and the facial feature amount of the specific individual.

According to the image processing device (5), the one or more face region candidates discriminated as a face are integrated, and whether the face in the face region is the face of the specific individual is determined by using the feature amount extracted from the integrated face region and the facial feature amount of the specific individual. It is therefore possible to determine accurately whether the face region integrated by the face region integration unit is the face of the specific individual or the face of a normal person.
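The text does not say how the face region candidates are integrated. One common approach, shown here purely as an assumed stand-in, is to group candidates that overlap strongly and average each group. The box format `(x, y, width, height)`, the intersection-over-union criterion, the greedy grouping against the first member of each group, and the `min_iou` value are all assumptions of this sketch.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def integrate_candidates(candidates: List[Box], min_iou: float = 0.5) -> List[Box]:
    """Greedily group overlapping candidates and return one averaged box per group."""
    groups: List[List[Box]] = []
    for box in candidates:
        for group in groups:
            if iou(box, group[0]) >= min_iou:
                group.append(box)
                break
        else:
            groups.append([box])
    return [tuple(sum(values) / len(group) for values in zip(*group))
            for group in groups]
```

The second feature extraction would then run once per integrated region rather than once per candidate.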

A monitoring device (1) according to the present disclosure comprises any one of the image processing devices (1) to (5), an imaging unit that captures the image to be input to the image processing device, and an output unit that outputs information based on the image processing performed by the image processing device.

According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be accurately detected and monitored. In addition, because information based on the image processing can be output from the output unit, a monitoring system or the like that uses this information can be constructed easily.

A control system (1) according to the present disclosure comprises the monitoring device (1) and one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.

According to the control system (1), the one or more control devices can be made to execute predetermined processing based on the information output from the monitoring device. It is therefore possible to construct a system that can use not only the monitoring results for a normal person but also the monitoring results for the specific individual.

A control system (2) according to the present disclosure is the control system (1), wherein the monitoring device (1) is a device for monitoring the driver of a vehicle, and the control device includes an electronic control unit mounted on the vehicle.

According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be monitored accurately, and the electronic control unit can be made to execute predetermined control appropriately based on the monitoring result. This makes it possible to construct a highly safe in-vehicle system with which even the specific individual can drive with peace of mind.

An image processing method according to the present disclosure is an image processing method for processing an image input from an imaging unit, the method comprising:
a face detection step of detecting a face region while scanning a search area over the image,
wherein the face detection step comprises:
a feature extraction step of extracting a facial feature amount from the search area;
a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face by using the feature amount extracted in the feature extraction step and a learned normal facial feature amount trained to detect a face; and
a specific individual face determination step of, when the search area is discriminated as a non-face at any layer of the normal face discrimination step, determining whether the search area is the face of a specific individual or a non-face by using the extracted feature amount and a learned facial feature amount of the specific individual trained to detect the face of the specific individual.

According to the image processing method, the normal face discrimination step hierarchically discriminates whether the search area is a face or a non-face by using the normal facial feature amount, whereby the face region is detected. Even when the search area is discriminated as a non-face at any layer of the normal face discrimination step, the specific individual face determination step determines whether the search area is the face of the specific individual or a non-face by using the facial feature amount of the specific individual, whereby a face region containing the face of the specific individual is detected. As a result, the face region can be detected accurately whether the face is a normal face or the face of the specific individual. In addition, because the normal face discrimination step and the specific individual face determination step use the same feature amount extracted from the search area, the real-time performance of face region detection can be maintained. The face of the specific individual can therefore be detected accurately in real time.

A program according to the present disclosure is a program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute:
a face detection step of detecting a face region while scanning a search area over the image,
wherein the face detection step comprises:
a feature extraction step of extracting a facial feature amount from the search area;
a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face by using the feature amount extracted in the feature extraction step and a learned normal facial feature amount trained to detect a face; and
a specific individual face determination step of, when the search area is discriminated as a non-face at any layer of the normal face discrimination step, determining whether the search area is the face of a specific individual or a non-face by using the extracted feature amount and a learned facial feature amount of the specific individual trained to detect the face of the specific individual.

According to the program, the at least one computer can be made to detect the face region by hierarchically discriminating, in the normal face discrimination step, whether the search area is a face or a non-face by using the normal facial feature amount. Even when the search area is discriminated as a non-face at any layer of the normal face discrimination step, the computer can be made to detect a face region containing the face of the specific individual by determining, in the specific individual face determination step, whether the search area is the face of the specific individual or a non-face by using the facial feature amount of the specific individual. As a result, the face region can be detected accurately whether the face is a normal face or the face of the specific individual. In addition, because the normal face discrimination step and the specific individual face determination step use the same feature amount extracted from the search area, the real-time performance of face region detection can be maintained. It is therefore possible to construct a device or system that can accurately sense the face of both the specific individual and normal persons other than the specific individual. The program may be a program stored in a storage medium, a program that can be transferred via a communication network, or a program that is executed via a communication network.

FIG. 1 is a schematic diagram showing an example of an in-vehicle system including a driver monitoring device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system including the driver monitoring device according to the embodiment.
FIG. 3 is a block diagram showing an example of the functional configuration of the image processing unit of the driver monitoring device according to the embodiment.
FIG. 4 is a block diagram showing an example of the functional configuration of the face detection unit included in the image processing unit.
FIG. 5 is a schematic diagram for explaining an example of the processing operations performed by the face detection unit.
FIG. 6 is a schematic diagram for explaining an example of the processing operations performed by the face detection unit.
FIG. 7 is a flowchart showing an example of the processing operations performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 8 is a flowchart showing an example of the face detection processing operation and the specific individual determination processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 9 is a flowchart showing an example of the discrimination processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.

Embodiments of an image processing device, a monitoring device, a control system, an image processing method, and a program according to the present invention are described below with reference to the drawings.
The image processing device according to the present invention is widely applicable to, for example, devices and systems that use a camera to monitor a target such as a person. In addition to devices and systems that monitor the driver (operator) of various moving bodies such as vehicles, it is also applicable to devices and systems that monitor persons who operate or supervise equipment such as machines and devices in a factory, or who perform predetermined work.

[Application example]
FIG. 1 is a schematic diagram showing an example of an in-vehicle system including a driver monitoring device according to an embodiment. This application example describes a case in which the image processing device according to the present invention is applied to a driver monitoring device 10.

The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2 (for example, the behavior of the face), one or more ECUs (Electronic Control Units) 40 that control the traveling, steering, braking, and the like of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state of the vehicle's surroundings, and the like. These are connected via a communication bus 43. The in-vehicle system 1 is configured, for example, as an in-vehicle network system that communicates according to the CAN (Controller Area Network) protocol. A communication standard other than CAN may also be adopted for the in-vehicle system 1. The driver monitoring device 10 is an example of the "monitoring device" of the present invention, and the in-vehicle system 1 is an example of the "control system" of the present invention.

The driver monitoring device 10 includes a camera 11 for capturing an image of the face of the driver 3, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 that performs processing such as outputting information based on the image processing by the image processing unit 12 to a predetermined ECU 40 via the communication bus 43. The image processing unit 12 is an example of the "image processing device" of the present invention. The camera 11 is an example of the "imaging unit" of the present invention.

The driver monitoring device 10 detects the face of the driver 3 from the image captured by the camera 11 and detects the behavior of the detected face, such as the orientation of the face, the direction of the line of sight, or whether the eyes are open or closed. Based on these detection results, the driver monitoring device 10 may determine the state of the driver 3, for example, looking ahead, looking aside, dozing, facing backward, or slumped forward. The driver monitoring device 10 may also be configured to output a signal based on this state determination to the ECU 40, and the ECU 40 may be configured to execute, based on the signal, attention or warning processing for the driver 3 or operation control of the vehicle 2 (for example, deceleration control or guidance control to the road shoulder).

One of the objects of the driver monitoring device 10 is to perform face sensing of the driver 3, and in particular to detect the face accurately in real time whether the driver 3 is a specific individual or a normal person other than a specific individual.

Conventional driver monitoring devices have the problem that the accuracy of detecting a face from an image captured by the camera decreases when, for example, part of the facial organs of the driver 3 of the vehicle 2, such as the eyes, nose, or mouth, is missing or greatly deformed due to injury; when the face bears a large mole or wart, or body decoration such as a tattoo; or when the arrangement of the facial organs deviates from the average position due to a condition such as a hereditary disease.

Furthermore, when face detection accuracy decreases, processing that follows face detection, such as face orientation estimation, is also not performed properly. As a result, states of the driver 3 such as looking aside or dozing can no longer be determined properly, and there is a risk that the various controls the ECU 40 should execute based on that state determination cannot be performed properly either.

To solve this problem, the driver monitoring device 10 according to the embodiment adopts the following configuration so that face detection can be performed accurately in real time for a specific individual, that is, a particular person whose features differ from the facial features of general people (also called normal people), which are common regardless of differences (individual differences) in age, gender, race, and the like.

The image processing unit 12 stores a facial feature amount of a specific individual and a normal facial feature amount (in other words, a facial feature amount used to detect the face of a normal person) as learned facial feature amounts trained to detect a face from an image.

The image processing unit 12 performs face detection processing that detects a face region while scanning a search area of a predetermined size over the input image from the camera 11 and extracting, from the search area, a feature amount for detecting a face. The image processing unit 12 detects the face region from the input image by normal face discrimination processing, which hierarchically discriminates whether the search area is a face or a non-face by using the feature amount extracted from the search area and the normal facial feature amount.
When the search area is discriminated as a non-face at any layer of the normal face discrimination processing, the image processing unit 12 also performs specific individual face determination processing, which determines whether the search area is the face of the specific individual or a non-face by using the feature amount extracted from the search area and the facial feature amount of the specific individual.

前記特定個人顔判定処理では、前記通常顔判別処理において非顔であると判別した一の階層で用いた前記特徴量と、前記一の階層に対応する前記特定個人の顔特徴量との関係を示す指標、例えば、相関係数を算出し、算出した前記相関係数に基づいて、前記探索領域が前記特定個人の顔であるか、非顔であるかを判定してもよい。 In the specific individual face determination process, the relationship between the feature amount used in the one layer determined to be non-face in the normal face determination process and the face feature amount of the specific individual corresponding to the one layer is determined. An indicative index, such as a correlation coefficient, may be calculated, and whether the search area is the face or non-face of the specific individual may be determined based on the calculated correlation coefficient.

例えば、前記相関係数が所定の閾値より大きい場合、前記探索領域が前記特定個人の顔であると判定し、前記通常顔判別処理における次の階層に判別処理を進めてもよい。また、前記相関係数が前記所定の閾値以下の場合、前記探索領域が前記非顔であると判定して、当該探索領域に対する前記通常顔判別処理を打ち切ってもよい。なお、前記特定個人顔判定処理では、前記相関係数以外の指標を用いてもよい。 For example, when the correlation coefficient is greater than a predetermined threshold, it may be determined that the search area is the face of the specific individual, and determination processing may proceed to the next layer in the normal face determination processing. Further, when the correlation coefficient is equal to or less than the predetermined threshold value, it may be determined that the search area is the non-face, and the normal face determination processing for the search area may be terminated. In addition, in the specific individual face determination process, an index other than the correlation coefficient may be used.
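The threshold rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the index is a Pearson correlation coefficient between two equal-length feature vectors, and the function names and the default threshold of 0.8 are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant vector is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def is_specific_individual(
    window_features: Sequence[float],
    individual_features: Sequence[float],
    threshold: float = 0.8,  # hypothetical value; the text only says "predetermined"
) -> bool:
    """True: continue to the next layer. False: abort this search area."""
    # Strictly greater than the threshold, matching the rule in the text;
    # a coefficient equal to the threshold counts as non-face.
    return pearson(window_features, individual_features) > threshold
```

Because the text allows indices other than the correlation coefficient, `pearson` could be swapped for another similarity measure without changing the decision rule.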

このようにドライバモニタリング装置１０では、前記通常顔判別処理により、前記通常の顔特徴量を用いて、前記探索領域が顔であるか、非顔であるかが階層的に判別され、前記顔領域が検出される。また、前記通常顔判別処理のいずれかの階層で前記非顔であると判別された場合であっても、前記特定個人顔判定処理により、前記特定個人の顔特徴量を用いて、前記探索領域が前記特定個人の顔であるか、非顔であるかが判定されることにより、前記特定個人の顔を含む前記顔領域が検出される。 As described above, in the driver monitoring device 10, the normal face determination process uses the normal facial feature amount to determine hierarchically whether the search area is a face or a non-face, and the face area is thereby detected. Further, even when the search area is determined to be a non-face at any layer of the normal face determination process, the specific individual face determination process uses the facial feature amount of the specific individual to determine whether the search area is the face of the specific individual or a non-face, so that a face area containing the face of the specific individual is detected.

また、前記通常顔判別処理のいずれかの階層で前記非顔であると判別され、かつ、前記特定個人顔判定処理により前記非顔であると判定された場合は、前記探索領域は非顔（顔以外）であるとして、処理が打ち切られ、次の探索領域の処理に進む。 Further, when the search area is determined to be a non-face at any layer of the normal face determination process and is also determined to be a non-face by the specific individual face determination process, the search area is treated as a non-face (something other than a face), processing of it is terminated, and processing proceeds to the next search area.

これらの構成により、前記通常の顔であっても、前記特定個人の顔であっても、前記顔領域を精度良く検出することが可能となる。また、前記通常顔判別処理と前記特定個人顔判定処理では、前記探索領域から抽出された、共通の前記特徴量を用いるため、前記顔領域の検出に係るリアルタイム性を維持することが可能となる。 With these configurations, the face area can be detected accurately whether the face is a normal face or the face of the specific individual. In addition, because the normal face determination process and the specific individual face determination process use the same feature amounts extracted from the search area, the real-time performance of face area detection can be maintained.
したがって、ドライバ３が、特定個人であっても、特定個人以外の通常の人であっても、それぞれの顔をリアルタイムで（換言すれば、高速な処理で）精度良く検出することが可能となる。 Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, the face can be detected accurately in real time (in other words, by high-speed processing).

［ハードウェア構成例］ [Hardware configuration example]
図２は、実施の形態に係るドライバモニタリング装置１０を含む車載システム１のハードウェア構成の一例を示すブロック図である。 FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.

車載システム１は、車両２のドライバ３の状態をモニタリングするドライバモニタリング装置１０、１以上のＥＣＵ４０、及び１以上のセンサ４１を含んで構成され、これらが通信バス４３を介して接続されている。また、ＥＣＵ４０には、１以上のアクチュエータ４２が接続されている。 The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2 , one or more ECUs 40 , and one or more sensors 41 , which are connected via a communication bus 43 . One or more actuators 42 are connected to the ECU 40 .

ドライバモニタリング装置１０は、カメラ１１と、カメラ１１から入力される画像を処理する画像処理部１２と、外部のＥＣＵ４０などとデータや信号のやり取りを行うための通信部１６とを含んで構成されている。 The driver monitoring device 10 includes a camera 11, an image processing unit 12 that processes images input from the camera 11, and a communication unit 16 for exchanging data and signals with an external ECU 40 and the like.

カメラ１１は、運転席に着座しているドライバ３の顔を含む画像を撮像する装置であり、例えば、レンズ部、撮像素子部、光照射部、インターフェース部、これら各部を制御するカメラ制御部などを含んで構成され得る。前記撮像素子部は、ＣＣＤ(Charge Coupled Device)、ＣＭＯＳ(Complementary Metal Oxide Semiconductor)などの撮像素子、フィルタ、マイクロレンズなどを含んで構成され得る。前記撮像素子部は、可視領域の光を受けて撮像画像を形成できる素子でもよいし、近赤外領域の光を受けて撮像画像を形成できる素子でもよい。前記光照射部は、ＬＥＤ(Light Emitting Diode)などの発光素子を含んで構成され、昼夜を問わずドライバ３の顔を撮像できるように近赤外線ＬＥＤなどを含んでもよい。カメラ１１は、所定のフレームレート（例えば、毎秒数十フレーム）で画像を撮像し、撮像された画像のデータが画像処理部１２に入力される。カメラ１１は、一体式の他、外付け式のものであってもよい。 The camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat, and may include, for example, a lens unit, an imaging element unit, a light irradiation unit, an interface unit, and a camera control unit that controls these units. The imaging element unit may include an imaging element such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, a filter, a microlens, and the like. The imaging element unit may be an element that forms a captured image from light in the visible region or an element that forms a captured image from light in the near-infrared region. The light irradiation unit includes a light emitting element such as an LED (Light Emitting Diode), and may include a near-infrared LED or the like so that the face of the driver 3 can be imaged day or night. The camera 11 captures images at a predetermined frame rate (for example, several tens of frames per second), and the captured image data is input to the image processing unit 12. The camera 11 may be of an integrated type or an external type.

画像処理部１２は、１以上のＣＰＵ（Central Processing Unit）１３、ＲＯＭ（Read Only Memory）１４、及びＲＡＭ（Random Access Memory）１５を含む画像処理装置として構成されている。ＲＯＭ１４は、プログラム記憶部１４１と顔特徴量記憶部１４２とを含み、ＲＡＭ１５は、カメラ１１からの入力画像を記憶する画像メモリ１５１を含んで構成されている。なお、ドライバモニタリング装置１０は、さらに別の記憶部を装備してもよく、前記別の記憶部を、プログラム記憶部１４１、顔特徴量記憶部１４２、及び画像メモリ１５１として用いてもよい。前記別の記憶部は、半導体メモリでもよいし、ディスクドライブなどで読み込み可能な記憶媒体でもよい。 The image processing unit 12 is configured as an image processing device including one or more CPU (Central Processing Unit) 13 , ROM (Read Only Memory) 14 , and RAM (Random Access Memory) 15 . The ROM 14 includes a program storage unit 141 and a face feature amount storage unit 142 , and the RAM 15 includes an image memory 151 for storing an input image from the camera 11 . The driver monitoring device 10 may be equipped with another storage unit, and the other storage unit may be used as the program storage unit 141 , the facial feature amount storage unit 142 and the image memory 151 . The separate storage unit may be a semiconductor memory or a storage medium readable by a disk drive or the like.

ＣＰＵ１３は、ハードウェアプロセッサの一例であり、ＲＯＭ１４のプログラム記憶部１４１に記憶されているプログラム、顔特徴量記憶部１４２に記憶されている顔特徴量などのデータを読み込み、解釈し実行することで、カメラ１１から入力された画像の処理、例えば、顔検出処理などの顔画像処理を行う。また、ＣＰＵ１３は、該顔画像処理により得られた結果（例えば、処理データ、判定信号、又は制御信号など）を、通信部１６を介してＥＣＵ４０などに出力する処理などを行う。 The CPU 13 is an example of a hardware processor. It reads, interprets, and executes the program stored in the program storage unit 141 of the ROM 14 and data such as the facial feature amounts stored in the facial feature amount storage unit 142, and thereby processes images input from the camera 11, for example by performing face image processing such as face detection processing. The CPU 13 also performs processing such as outputting the results of the face image processing (for example, processed data, determination signals, or control signals) to the ECU 40 and the like via the communication unit 16.

顔特徴量記憶部１４２には、画像から顔を検出するための学習（例えば、機械学習）を行った学習済みの顔特徴量として、特定個人の顔特徴量１４２ａと、通常の顔特徴量１４２ｂと（図３、図４を参照）が記憶されている。 The facial feature amount storage unit 142 stores, as learned facial feature amounts obtained through learning (for example, machine learning) for detecting a face from an image, the facial feature amount 142a of the specific individual and the normal facial feature amount 142b (see FIGS. 3 and 4).
学習済みの顔特徴量には、画像から顔を検出するのに有効な各種の特徴量を用いることができる。例えば、顔の局所的な領域の明暗差（平均輝度の差）に着目した特徴量（Haar-like特徴量）を用いてもよい。又は、顔の局所的な領域の輝度の分布の組み合わせに着目した特徴量（LBP (Local Binary Pattern) 特徴量）を用いてもよいし、顔の局所的な領域の輝度の勾配方向の分布の組み合わせに着目した特徴量（HOG (Histogram of Oriented Gradients) 特徴量）などを用いてもよい。 Various feature amounts that are effective for detecting a face from an image can be used as the learned facial feature amounts. For example, a feature amount that focuses on the light-dark difference (difference in average luminance) between local regions of the face (Haar-like feature amount) may be used. Alternatively, a feature amount that focuses on combinations of luminance distributions in local regions of the face (LBP (Local Binary Pattern) feature amount) may be used, or a feature amount that focuses on combinations of distributions of luminance gradient directions in local regions of the face (HOG (Histogram of Oriented Gradients) feature amount) may be used.

顔特徴量記憶部１４２に記憶される顔特徴量は、例えば、各種の機械学習による手法を用いて、顔検出に有効な特徴量として抽出されたものである。機械学習とは、データ（学習データ）に内在するパターンをコンピュータにより見つけ出す処理である。例えば、統計的な学習手法の一例としてＡｄａＢｏｏｓｔを用いてもよい。ＡｄａＢｏｏｓｔは、判別能力の低い判別器（弱判別器）を多数選び出し、これら多数の弱判別器の中からエラー率が小さい弱判別器を選択し、重みなどのパラメータを調整し、階層的な構造にすることで、強判別器を構築することのできる学習アルゴリズムである。なお、判別器は、識別器、分類器、又は学習器と称されてもよい。 The facial feature amounts stored in the facial feature amount storage unit 142 are, for example, extracted as feature amounts effective for face detection by using various machine learning techniques. Machine learning is processing in which a computer finds patterns inherent in data (learning data). For example, AdaBoost may be used as one example of a statistical learning technique. AdaBoost is a learning algorithm that can construct a strong discriminator by picking out many discriminators with low discrimination ability (weak discriminators), selecting from among them the weak discriminators with small error rates, adjusting parameters such as weights, and arranging them in a hierarchical structure. Note that a discriminator may also be called an identifier, a classifier, or a learner.

強判別器は、例えば、顔の検出に有効な１つの特徴量を１つの弱判別器によって判別する構成とし、ＡｄａＢｏｏｓｔにより多数の弱判別器とその組み合わせを選び出し、これらを用いて、階層的な構造を構築したものとしてもよい。なお、１つの弱判別器は、例えば、顔の場合は１、非顔の場合は０という情報を出力してもよい。また、学習手法には、顔らしさを０または１ではなく、０から１の実数で出力可能なＲｅａｌＡｄａＢｏｏｓｔという学習手法を用いてもよい。また、これら学習手法には、入力層、中間層、及び出力層を有するニューラルネットワークを用いてもよい。 The strong classifier is configured, for example, to classify one feature quantity effective for face detection by one weak classifier, select a large number of weak classifiers and their combinations by AdaBoost, and use these to perform hierarchical classification. It is good also as what constructed the structure. Note that one weak discriminator may output information such as 1 for a face and 0 for a non-face, for example. Also, as a learning method, a learning method called Real AdaBoost, which can output a real number between 0 and 1 instead of 0 or 1, may be used. Moreover, a neural network having an input layer, an intermediate layer, and an output layer may be used for these learning methods.
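How weak discriminators that each output 1 (face) or 0 (non-face) can be combined into a strong discriminator may be easier to see in code. The sketch below uses the textbook weighted-vote form known from Viola-Jones style detectors, which the text does not confirm as the form used here; the class name, fields, and the half-of-total-weight decision rule are assumptions for illustration, and the weights are taken as already learned.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class WeakClassifier:
    feature_index: int  # which extracted feature amount this classifier looks at
    threshold: float    # decision threshold on that feature
    polarity: int       # +1: face if value >= threshold, -1: face if value < threshold
    alpha: float        # weight assigned during training (assumed already learned)

    def predict(self, features: Sequence[float]) -> int:
        """Return 1 for face, 0 for non-face, as described in the text."""
        value = features[self.feature_index]
        if self.polarity > 0:
            return 1 if value >= self.threshold else 0
        return 1 if value < self.threshold else 0


def strong_classify(weak: List[WeakClassifier], features: Sequence[float]) -> bool:
    """Weighted vote: face if the weighted sum reaches half of the total weight."""
    score = sum(w.alpha * w.predict(features) for w in weak)
    return score >= 0.5 * sum(w.alpha for w in weak)
```

A Real AdaBoost variant would replace the 0/1 output of `predict` with a real-valued confidence, leaving the summation unchanged.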

このような学習アルゴリズムが搭載された学習装置に、さまざまな条件で撮像された多数の顔画像と多数の顔以外の画像（非顔画像）とを学習データとして与え、学習を繰り返し、重みなどのパラメータを調整して最適化を図ることにより、顔を高精度に検出可能な階層構造を有する強判別器を構築することが可能となる。そして、このような強判別器を構成する各階層の弱判別器で用いられる１以上の特徴量を、学習済みの顔特徴量として用いることができる。 A large number of face images captured under various conditions and a large number of non-face images (non-face images) are given as learning data to a learning device equipped with such a learning algorithm. By adjusting parameters for optimization, it is possible to construct a strong discriminator having a hierarchical structure capable of detecting faces with high accuracy. One or more feature amounts used in the weak classifiers of each layer constituting such a strong classifier can be used as learned face feature amounts.

特定個人の顔特徴量１４２ａは、例えば、予め所定の場所で、特定個人の顔画像をさまざまな条件（さまざまな顔の向き、視線の方向、又は目の開閉状態などの条件）で個別に撮像し、これら多数の撮像画像を教師データとして、上記学習装置に入力し、学習処理によって調整された、特定個人の顔の特徴を示すパラメータである。特定個人の顔特徴量１４２ａは、例えば、学習処理によって得られた、顔の局所的な領域の明暗差の組み合わせパターンなどでもよい。顔特徴量記憶部１４２に記憶される特定個人の顔特徴量１４２ａは、１人の特定個人の顔特徴量だけでもよいし、複数の特定個人が車両２を運転する場合などに対応できるように、複数人の特定個人の顔特徴量が記憶されてもよい。 The facial feature amount 142a of the specific individual is, for example, a parameter that indicates the facial features of the specific individual and is obtained as follows: face images of the specific individual are captured individually in advance at a predetermined location under various conditions (such as various face orientations, gaze directions, or eye open/closed states), the many captured images are input to the above learning device as teacher data, and the parameter is adjusted by the learning process. The facial feature amount 142a of the specific individual may be, for example, a combination pattern of light-dark differences in local regions of the face obtained by the learning process. The facial feature amount 142a stored in the facial feature amount storage unit 142 may be that of a single specific individual, or facial feature amounts of a plurality of specific individuals may be stored so as to handle cases such as a plurality of specific individuals driving the vehicle 2.

通常の顔特徴量１４２ｂは、通常の人の顔画像をさまざまな条件（さまざまな顔の向き、視線の方向、又は目の開閉状態などの条件）で撮像した画像を教師データとして、上記学習装置に入力し、学習処理によって調整された、通常の人の顔の特徴を示すパラメータである。通常の顔特徴量１４２ｂは、例えば、学習処理によって得られた、顔の局所的な領域の明暗差の組み合わせパターンなどでもよい。また、通常の顔特徴量１４２ｂは、所定の顔特徴量データベースに登録されている情報を用いてもよい。 The normal facial feature amount 142b is a parameter that indicates the facial features of a normal person. It is obtained by inputting, as teacher data, face images of normal people captured under various conditions (such as various face orientations, gaze directions, or eye open/closed states) to the above learning device and adjusting the parameter through the learning process. The normal facial feature amount 142b may be, for example, a combination pattern of light-dark differences in local regions of the face obtained by the learning process. Information registered in a predetermined facial feature amount database may also be used as the normal facial feature amount 142b.

顔特徴量記憶部１４２に記憶される学習済みの顔特徴量は、例えば、クラウド上のサーバなどからインターネット、携帯電話網などの通信ネットワークを介して取り込んで、顔特徴量記憶部１４２に記憶される構成としてもよい。 The learned facial feature amounts may, for example, be acquired from a server on the cloud or the like via a communication network such as the Internet or a mobile phone network and then stored in the facial feature amount storage unit 142.

ＥＣＵ４０は、１以上のプロセッサ、メモリ、及び通信モジュールなどを含むコンピュータ装置で構成されている。そして、ＥＣＵ４０に搭載されたプロセッサが、メモリに記憶されたプログラムを読み込み、解釈し実行することで、アクチュエータ４２などに対する所定の制御が実行されるようになっている。 The ECU 40 is composed of a computer device including one or more processors, memories, communication modules, and the like. A processor installed in the ECU 40 reads, interprets, and executes the program stored in the memory, thereby performing predetermined control on the actuator 42 and the like.

ＥＣＵ４０は、例えば、走行系ＥＣＵ、運転支援系ＥＣＵ、ボディ系ＥＣＵ、及び情報系ＥＣＵのうちの少なくともいずれかを含んで構成されている。 The ECU 40 includes, for example, at least one of a driving system ECU, a driving support system ECU, a body system ECU, and an information system ECU.

前記走行系ＥＣＵには、例えば、駆動系ＥＣＵ、シャーシ系ＥＣＵなどが含まれている。前記駆動系ＥＣＵには、例えば、エンジン制御、モータ制御、燃料電池制御、EV（Electric Vehicle）制御、又はトランスミッション制御等の「走る」機能に関する制御ユニットが含まれている。前記シャーシ系ＥＣＵには、例えば、ブレーキ制御、又はステアリング制御等の「止まる、曲がる」機能に関する制御ユニットが含まれている。 The travel system ECU includes, for example, a drive system ECU, a chassis system ECU, and the like. The drive system ECU includes, for example, a control unit related to a "running" function such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, or transmission control. The chassis system ECU includes, for example, a control unit for "stop, turn" functions such as brake control or steering control.

前記運転支援系ＥＣＵは、例えば、自動ブレーキ支援機能、車線維持支援機能（ＬＫＡ／Lane Keep Assistともいう）、定速走行・車間距離支援機能（ＡＣＣ／Adaptive Cruise Controlともいう）、前方衝突警告機能、車線逸脱警報機能、死角モニタリング機能、交通標識認識機能等、走行系ＥＣＵなどとの連携により自動的に安全性の向上、又は快適な運転を実現する機能（運転支援機能、又は自動運転機能）に関する制御ユニットを少なくとも１つ以上含んで構成され得る。 The driving support system ECU may include at least one control unit for functions that, in cooperation with the travel system ECU and the like, automatically improve safety or realize comfortable driving (driving support functions or automated driving functions), for example an automatic braking support function, a lane keeping support function (also called LKA, Lane Keep Assist), a constant speed driving and inter-vehicle distance support function (also called ACC, Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.

前記運転支援系ＥＣＵには、例えば、米国自動車技術会（SAE）が提示している自動運転レベルにおけるレベル１（ドライバ支援）、レベル２（部分的自動運転）、及びレベル３（条件付自動運転）の少なくともいずれかの機能が装備されてもよい。さらに、自動運転レベルのレベル４（高度自動運転）、又はレベル５（完全自動運転）の機能が装備されてもよいし、レベル１、２のみ、又はレベル２、３のみの機能が装備されてもよい。また、車載システム１を自動運転システムとして構成してもよい。 The driving support system ECU may be equipped with, for example, at least one of the functions of Level 1 (driver assistance), Level 2 (partial driving automation), and Level 3 (conditional driving automation) among the automated driving levels presented by SAE (the Society of Automotive Engineers in the United States). It may further be equipped with Level 4 (high driving automation) or Level 5 (full driving automation) functions, or with only the Level 1 and 2 functions or only the Level 2 and 3 functions. The in-vehicle system 1 may also be configured as an automated driving system.

前記ボディ系ＥＣＵは、例えば、ドアロック、スマートキー、パワーウインドウ、エアコン、ライト、メーターパネル、又はウインカ等の車体の機能に関する制御ユニットを少なくとも１つ以上含んで構成され得る。 The body system ECU may include at least one or more control units related to vehicle body functions such as door locks, smart keys, power windows, air conditioners, lights, meter panels, and blinkers.

前記情報系ＥＣＵは、例えば、インフォテイメント装置、テレマティクス装置、又はＩＴＳ（Intelligent Transport Systems）関連装置を含んで構成され得る。前記インフォテイメント装置には、例えば、ユーザインターフェースとして機能するＨＭＩ（Human Machine Interface）装置の他、カーナビゲーション装置、オーディオ機器などが含まれてもよい。前記テレマティクス装置には、外部と通信するための通信ユニットなどが含まれてもよい。前記ＩＴＳ関連装置には、ＥＴＣ（Electronic Toll Collection System）、又はＩＴＳスポットなどの路側機との路車間通信、若しくは車々間通信などを行うための通信ユニットなどが含まれてもよい。 The information system ECU may include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems)-related device. The infotainment device may include, for example, an HMI (Human Machine Interface) device that functions as a user interface, a car navigation device, an audio device, and the like. The telematics device may include a communication unit or the like for communicating with the outside. The ITS-related device may include an ETC (Electronic Toll Collection System), a communication unit for performing road-to-vehicle communication with a roadside device such as an ITS spot, or vehicle-to-vehicle communication.

センサ４１には、ＥＣＵ４０でアクチュエータ４２の動作制御を行うために必要となるセンシングデータを取得する各種の車載センサが含まれ得る。例えば、車速センサ、シフトポジションセンサ、アクセル開度センサ、ブレーキペダルセンサ、ステアリングセンサなどの他、車外撮像用カメラ、ミリ波等のレーダー（Ｒａｄａｒ）、ライダー（ＬＩＤＥＲ）、超音波センサなどの周辺監視センサなどが含まれてもよい。 The sensors 41 may include various in-vehicle sensors that acquire the sensing data the ECU 40 needs in order to control the operation of the actuators 42. For example, in addition to a vehicle speed sensor, a shift position sensor, an accelerator opening sensor, a brake pedal sensor, and a steering sensor, they may include surroundings monitoring sensors such as a camera for imaging outside the vehicle, a radar using millimeter waves or the like, a LiDAR, and an ultrasonic sensor.

アクチュエータ４２は、ＥＣＵ４０からの制御信号に基づいて、車両２の走行、操舵、又は制動などに関わる動作を実行する装置であり、例えば、エンジン、モータ、トランスミッション、油圧又は電動シリンダー等が含まれる。 The actuator 42 is a device that executes operations related to running, steering, braking, etc. of the vehicle 2 based on control signals from the ECU 40, and includes, for example, an engine, a motor, a transmission, a hydraulic or electric cylinder, and the like.

［機能構成例］ [Functional configuration example]
図３は、実施の形態に係るドライバモニタリング装置１０の画像処理部１２の機能構成例を示すブロック図である。 FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
画像処理部１２は、画像入力部２１、顔検出部２２、特定個人判定部２５、第１顔画像処理部２６、第２顔画像処理部３０、出力部３４、及び顔特徴量記憶部１４２を含んで構成されている。 The image processing unit 12 includes an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and the facial feature amount storage unit 142.

画像入力部２１は、カメラ１１で撮像されたドライバ３の顔を含む画像を取り込む処理を行う。 The image input unit 21 performs processing for capturing an image including the face of the driver 3 captured by the camera 11 .

顔検出部２２は、入力画像に対して所定サイズの探索領域を走査しながら、かつ該探索領域から顔の特徴量を抽出しながら顔領域を検出する処理を行う。顔検出部２２は、第１特徴量抽出部２２１、通常顔判別器２２２、及び特定個人顔判定部２２３を含んで構成されている。顔検出部２２は、さらに、顔領域統合部２２４、及び第２特徴量抽出部２２５を含んで構成してもよい。 The face detection unit 22 detects a face area while scanning a search area of a predetermined size in an input image and extracting facial features from the search area. The face detection unit 22 includes a first feature amount extraction unit 221 , a normal face discriminator 222 and a specific individual face determination unit 223 . The face detection section 22 may further include a face area integration section 224 and a second feature amount extraction section 225 .

図４は、顔検出部２２の機能構成例を示すブロック図である。 FIG. 4 is a block diagram showing a functional configuration example of the face detection unit 22.
図５、図６は、顔検出部２２で行われる処理動作例を説明するための模式図である。 FIGS. 5 and 6 are schematic diagrams for explaining examples of the processing operations performed by the face detection unit 22.

本実施の形態においては、顔検出部２２は、画像特徴として、例えば、Haar-like特徴を用い、ＡｄａＢｏｏｓｔの学習アルゴリズムを用いて構築された階層構造の通常顔判別器２２２と、通常顔判別器２２２に付加された特定個人顔判定部２２３とを用いるように構成されている。 In the present embodiment, the face detection unit 22 uses, for example, Haar-like features as image features, and is configured to use a normal face discriminator 222 with a hierarchical structure constructed using the AdaBoost learning algorithm together with a specific individual face determination unit 223 added to the normal face discriminator 222.

Haar-like特徴は、矩形特徴とも称され、例えば、２つの矩形領域の平均輝度の差を特徴量とするものであり、例えば、画像中の目の領域は輝度が低く、目の周囲（目の下、目の横）は輝度が高くなるという特徴を利用するものである。Haar-like特徴には、２つ、３つ、又は４つの矩形を組み合わせた矩形特徴を用いてもよい。顔検出に有効な（重要度が高い）特徴量と、その組み合わせが学習アルゴリズムを用いて選び出され、顔特徴量記憶部１４２に記憶される。顔特徴量記憶部１４２には、通常顔判別器２２２での処理に用いられる通常の顔特徴量１４２ｂと、特定個人顔判定部２２３での処理に用いられる特定個人の顔特徴量１４２ａとが記憶されている。 A Haar-like feature, also called a rectangle feature, uses as its feature amount, for example, the difference in average luminance between two rectangular regions. It exploits characteristics such as the eye region in an image having low luminance while the area around the eyes (below and beside the eyes) has high luminance. Rectangle features combining two, three, or four rectangles may be used as Haar-like features. The feature amounts effective for face detection (those of high importance) and their combinations are selected using a learning algorithm and stored in the facial feature amount storage unit 142. The facial feature amount storage unit 142 stores the normal facial feature amount 142b used in processing by the normal face discriminator 222 and the facial feature amount 142a of the specific individual used in processing by the specific individual face determination unit 223.
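A two-rectangle Haar-like feature of the kind described above can be computed cheaply with an integral image (summed-area table). The following is a minimal sketch under assumptions: the integral image is a standard technique that this text does not mention, the image is a plain list of grayscale rows, and all function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel rows


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table with one extra row and column of zeros."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_mean(ii: List[List[int]], x: int, y: int, w: int, h: int) -> float:
    """Average luminance of the w-by-h rectangle whose top-left corner is (x, y)."""
    total = ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
    return total / (w * h)


def haar_two_rect_vertical(ii: List[List[int]], x: int, y: int, w: int, h: int) -> float:
    """Mean of the lower half minus mean of the upper half of the rectangle.

    A large positive value matches the pattern in the text: a dark eye
    region above a brighter region below the eyes.
    """
    half = h // 2
    upper = rect_mean(ii, x, y, w, half)
    lower = rect_mean(ii, x, y + half, w, half)
    return lower - upper
```

For instance, a 2-by-4 patch whose upper two rows have luminance 10 and whose lower two rows have luminance 200 yields a feature value of 190.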

通常顔判別器２２２は、図４に示すように、第１判別器２２２ａから第Ｎ判別器２２２ｎを含み、これらが複数連結された階層構造（カスケード構造ともいう）を備えている。これら各判別器は、画像から切り出された所定サイズの探索領域２１０から抽出された、顔検出に有効な１以上の特徴量２２１ａを用いて、探索領域２１０が顔であるか、非顔（顔以外）であるかを判別する。第１判別器２２２ａなどの階層構造の最初の方の判別器では、例えば、目があるかどうか、というような、顔を大まかに捉える特徴量２２１ａが用いられる。また、第Ｎ判別器２２２ｎなどの階層構造の深い方の判別器では、例えば、目、鼻、口があるか、正面顔であるか、斜め顔であるか、横顔であるか、というような、顔の細部を捉える特徴量２２１ａが用いられる。 As shown in FIG. 4, the normal face discriminator 222 includes first discriminators 222a to Nth discriminators 222n, and has a hierarchical structure (also referred to as a cascade structure) in which a plurality of these discriminators are connected. Each of these discriminators uses one or more feature values 221a effective for face detection, which are extracted from a search region 210 of a predetermined size cut out from an image, to determine whether the search region 210 is a face or a non-face (face). other than). Discriminators at the beginning of the hierarchical structure, such as the first discriminator 222a, use the feature quantity 221a that roughly captures the face, such as whether or not the face has eyes. Further, in classifiers deeper in the hierarchical structure such as the N-th classifier 222n, for example, whether there are eyes, a nose, a mouth, whether the face is frontal, whether the face is oblique, or whether the face is in profile, etc. , the feature amount 221a that captures the details of the face is used.

第１特徴量抽出部２２１は、通常顔判別器２２２を構成する各判別器で判別を行うように設定されている１以上の特徴量２２１ａを探索領域２１０から抽出する。 The first feature amount extraction unit 221 extracts from the search area 210 one or more feature amounts 221 a that are set to be determined by each classifier constituting the normal face classifier 222 .

通常顔判別器２２２は、第１特徴量抽出部２２１により探索領域２１０から抽出された特徴量２２１ａと、通常の顔特徴量１４２ｂとを用いて、探索領域２１０が顔であるか、非顔であるかを、第１判別器２２２ａから第Ｎ判別器２２２ｎの順に階層的に判別していく。 The normal face discriminator 222 uses the feature amounts 221a extracted from the search area 210 by the first feature amount extraction unit 221 and the normal facial feature amount 142b to determine hierarchically, in order from the first discriminator 222a to the N-th discriminator 222n, whether the search area 210 is a face or a non-face.

特定個人顔判定部２２３は、図４に示すように、第１判定部２２３ａから第Ｎ判定部２２３ｎを含み、通常顔判別器２２２のいずれかの階層で探索領域２１０が非顔であると判別された場合に、探索領域２１０から抽出された特徴量２２１ａと、特定個人の顔特徴量１４２ａとを用いて、探索領域２１０が特定個人の顔であるか、非顔であるかを判定する。 As shown in FIG. 4, the specific individual face determination unit 223 includes a first determination unit 223a to an N-th determination unit 223n. Then, using the feature amount 221a extracted from the search area 210 and the face feature amount 142a of the specific individual, it is determined whether the search area 210 is the face of the specific individual or not.

例えば、第２判別器２２２ｂで探索領域２１０が非顔であると判別されると、第２判定部２２３ｂが、探索領域２１０から抽出された特徴量２２１ａ（第２判別器２２２ｂで用いたもの）と、当該階層に対応する特定個人の顔特徴量１４２ａとを用いて、探索領域２１０が特定個人の顔であるか、非顔であるかを判定する。 For example, when the second discriminator 222b determines that the search area 210 is a non-face, the second determination unit 223b uses the feature amounts 221a extracted from the search area 210 (those used by the second discriminator 222b) and the facial feature amount 142a of the specific individual corresponding to that layer to determine whether the search area 210 is the face of the specific individual or a non-face.

第２判定部２２３ｂが、探索領域２１０が特定個人の顔であると判定した場合、第３判別器２２２ｃでの判別処理に進む一方、探索領域２１０が非顔であると判定した場合、第２判別器２２２ｂで判別処理を打ち切り、次の探索領域に対する顔検出処理に進む。 When the second determination unit 223b determines that the search area 210 is the face of a specific individual, the process proceeds to determination processing in the third discriminator 222c. The discrimination process is terminated by the discriminator 222b, and the process proceeds to face detection process for the next search area.

そして、判別処理が進み、第Ｎ判別器２２２ｎで、探索領域２１０が顔であると判別された場合、又は第Ｎ判定部２２３ｎで、探索領域２１０が特定個人の顔であると判定した場合、当該探索領域２１０が顔領域の候補として記憶される。 Then, the determination process proceeds, and when the N-th discriminator 222n determines that the search area 210 is a face, or when the N-th determination unit 223n determines that the search area 210 is the face of a specific individual, The search area 210 is stored as a face area candidate.
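The control flow of the N-stage cascade with its per-stage fallback can be summarized in a short sketch. It assumes each stage is a boolean function of that stage's feature amounts; the function and parameter names are hypothetical and the stage tests themselves are left abstract.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]
StageTest = Callable[[Features], bool]


def detect_window(
    stage_features: List[Features],
    normal_stages: List[StageTest],
    individual_stages: List[StageTest],
) -> bool:
    """Return True if the search area survives all N stages (face-area candidate)."""
    if not (len(stage_features) == len(normal_stages) == len(individual_stages)):
        raise ValueError("one feature set and one fallback test are needed per stage")
    for feats, normal, individual in zip(stage_features, normal_stages, individual_stages):
        if normal(feats):
            continue  # normal discriminator says face: go to the next stage
        # The fallback reuses the same feature amounts, so no extra extraction is needed.
        if individual(feats):
            continue  # specific individual's face: go to the next stage
        return False  # both say non-face: abort and move to the next search area
    return True
```

Note that a search area rescued by the fallback at one stage is still examined by the normal discriminator at the next stage, which is what lets the two paths share one pass over the cascade.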

顔検出部２２は、さまざまな大きさの顔を検出できるようにするために、例えば、図５に示すように、入力画像２０を複数の倍率で縮小した縮小画像２０ａ、２０ｂを生成し、それぞれの縮小画像２０ａ、２０ｂから所定サイズの探索領域２１０を切り出し、通常顔判別器２２２を用いて探索領域２１０が顔であるか、非顔であるかを判別してもよい。そして、入力画像２０、縮小画像２０ａ、２０ｂ内で探索領域２１０を走査することにより、画像２０中のさまざまな大きさの顔とその顔の位置とを検出してもよい。なお、探索領域２１０は、矩形以外の任意の形状であってもよい。 In order to detect faces of various sizes, the face detection unit 22 generates reduced images 20a and 20b by reducing an input image 20 by a plurality of magnifications, as shown in FIG. A search area 210 of a predetermined size may be cut out from the reduced images 20a and 20b, and a normal face discriminator 222 may be used to determine whether the search area 210 is a face or a non-face. Then, by scanning the search area 210 in the input image 20 and the reduced images 20a, 20b, faces of various sizes and their positions in the image 20 may be detected. Note that the search area 210 may have any shape other than a rectangle.
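The multi-scale scan can be sketched as a generator of window positions over the input image and its reduced copies. The window size, step, and scale factors below are hypothetical values chosen for illustration, and the sketch only enumerates positions; it does not resize pixels.

```python
from typing import Iterator, Sequence, Tuple


def scan_windows(
    width: int,
    height: int,
    window: int = 24,  # hypothetical fixed search-area size in pixels
    step: int = 4,     # hypothetical scan stride
    scales: Sequence[float] = (1.0, 0.75, 0.5),  # hypothetical reduction factors
) -> Iterator[Tuple[float, int, int]]:
    """Yield (scale, x, y) for every search-area position in every scaled image."""
    for scale in scales:
        scaled_w, scaled_h = int(width * scale), int(height * scale)
        # range() is empty when the scaled image is smaller than the window.
        for y in range(0, scaled_h - window + 1, step):
            for x in range(0, scaled_w - window + 1, step):
                yield scale, x, y


def to_original(scale: float, x: int, y: int, window: int = 24) -> Tuple[int, int, int]:
    """Map a window found in a reduced image back to input-image coordinates."""
    return round(x / scale), round(y / scale), round(window / scale)
```

Keeping the window size fixed and shrinking the image means a window found at scale 0.5 corresponds to a face twice as large in the input image, which is how one discriminator size covers faces of different sizes.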

また、顔検出部２２は、さまざまな方向に向いた（回転した）顔やさまざまな角度に傾いた顔を検出できるように構成してもよい。例えば、第１特徴量抽出部２２１が、探索領域２１０から顔の向きや顔の傾きを判別するのに有効な特徴量２２１ａを抽出し、通常顔判別器２２２が、顔の向きや顔の傾き毎の特徴量について学習した学習済みの判別器を用いて、探索領域２１０が、顔であるか、非顔であるかを判別できるように構成してもよい。 Further, the face detection unit 22 may be configured to detect faces facing (rotated) in various directions and faces tilted at various angles. For example, the first feature quantity extraction unit 221 extracts a feature quantity 221a effective for determining the orientation and tilt of the face from the search area 210, and the normal face discriminator 222 extracts the orientation and tilt of the face. It may be configured such that it can be determined whether the search area 210 is a face or a non-face using a learned discriminator that has learned about each feature amount.

例えば、図６に示すように、正面顔、斜め顔、及び横顔をそれぞれ検出するために、通常顔判別器２２２が、正面（０度）、左斜め（４５度）、及び左横（９０度）のそれぞれの特徴量を学習した判別器を備えてもよい。この場合、１つの判別器で所定角度（例えば、２２．５度）以上をカバーできるように学習してもよい。また、右斜め（４５度）の判別は、左斜め（４５度）を左右反転、右横（９０度）の判別は、左横（９０度）を左右反転させることで対応するようにしてもよい。また、図６に示すように、顔の傾きを検出できるように、通常顔判別器２２２が、所定の傾き毎にそれぞれの特徴量を学習した判別器を備えてもよい。 For example, as shown in FIG. 6, in order to detect frontal, oblique, and profile faces, the normal face discriminator 222 may include discriminators that have learned the feature amounts for the front (0 degrees), left oblique (45 degrees), and left profile (90 degrees), respectively. In this case, each discriminator may be trained so that it covers a predetermined angle (for example, 22.5 degrees) or more. The right oblique (45 degrees) case may be handled by horizontally flipping the left oblique (45 degrees) case, and the right profile (90 degrees) case by horizontally flipping the left profile (90 degrees) case. Further, as shown in FIG. 6, the normal face discriminator 222 may include discriminators that have learned the feature amounts for each predetermined tilt so that the tilt of the face can be detected.
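One way to realize the left-right flip described above is to mirror the search area and reuse the left-facing discriminator. The text does not say whether the image or the discriminator's features are mirrored, so flipping the image here is an assumption, and the function names are hypothetical.

```python
from typing import Callable, List, Optional

Window = List[List[int]]  # grayscale pixel rows of one search area


def flip_horizontal(window: Window) -> Window:
    """Mirror the search area left to right."""
    return [list(reversed(row)) for row in window]


def detect_side(window: Window, left_classifier: Callable[[Window], bool]) -> Optional[str]:
    """Use one left-facing classifier for both directions."""
    if left_classifier(window):
        return "left"
    if left_classifier(flip_horizontal(window)):
        return "right"  # the mirrored window looks like a left-facing face
    return None
```

The benefit is that only the left-side discriminators need to be trained and stored.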

顔領域統合部２２４は、通常顔判別器２２２により顔であると判別された１以上の顔領域の候補を統合する処理を行う。１以上の顔領域の候補を統合する方法は特に限定されない。例えば、１以上の顔領域の候補の領域中心の平均値と、領域サイズの平均値とに基づいて統合してもよい。 The face area integration unit 224 performs processing to integrate one or more face area candidates determined to be faces by the normal face determination unit 222 . The method of integrating one or more face area candidates is not particularly limited. For example, integration may be performed based on an average value of area centers of one or more face area candidates and an average value of area sizes.
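The averaging example mentioned above can be written in a few lines. This sketch assumes each candidate is a square region given as (center x, center y, size) and that all candidates belong to the same face; grouping candidates by overlap, which a multi-face scene would need, is outside its scope.

```python
from typing import List, Tuple

Box = Tuple[float, float, float]  # (center_x, center_y, size); representation is assumed


def merge_candidates(boxes: List[Box]) -> Box:
    """Integrate face-area candidates by averaging their centers and sizes."""
    if not boxes:
        raise ValueError("at least one candidate is required")
    n = len(boxes)
    center_x = sum(b[0] for b in boxes) / n
    center_y = sum(b[1] for b in boxes) / n
    size = sum(b[2] for b in boxes) / n
    return center_x, center_y, size
```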

第２特徴量抽出部２２５は、顔領域統合部２２４により統合された顔領域から顔の特徴量を抽出する処理を行う。 The second feature quantity extraction unit 225 performs a process of extracting a face feature quantity from the face regions integrated by the face region integration unit 224 .

特定個人判定部２５は、顔検出部２２で検出された顔領域の特徴量と、顔特徴量記憶部１４２から読み込んだ特定個人の顔特徴量１４２ａとを用いて、検出された顔領域の顔が特定個人の顔であるか、特定個人以外の通常の人の顔であるかを判定する処理を行う。 The specific individual determination unit 25 uses the feature amounts of the face area detected by the face detection unit 22 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142 to determine whether the face in the detected face area is the face of the specific individual or the face of a normal person other than the specific individual.

特定個人判定部２５は、顔領域から抽出された特徴量と特定個人の顔特徴量１４２ａとの関係を示す指標、例えば、相関を示す指標として、相関係数を算出し、算出した相関係数に基づいて、顔領域の顔が特定個人の顔であるか否かを判定してもよい。そして、相関係数が所定の閾値より大きい場合、検出した顔領域の顔が特定個人の顔であると判定し、相関係数が所定の閾値以下の場合、検出した顔領域の顔が特定個人の顔ではないと判定してもよい。 The specific individual determination unit 25 may calculate a correlation coefficient as an index indicating the relationship (for example, the correlation) between the feature amount extracted from the face area and the specific individual's face feature amount 142a, and may determine, based on the calculated correlation coefficient, whether the face in the face area is the face of the specific individual. If the correlation coefficient is greater than a predetermined threshold, the face in the detected face area may be determined to be the face of the specific individual; if the correlation coefficient is equal to or less than the threshold, the face may be determined not to be the face of the specific individual.
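As a rough illustration of this threshold test, the Python sketch below uses the Pearson correlation coefficient between two feature vectors. The choice of Pearson correlation, the default threshold of 0.8, and the handling of constant vectors are all assumptions; the embodiment only states that a correlation coefficient is compared with a predetermined threshold.

```python
import math
from typing import Sequence


def correlation_coefficient(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("feature vectors must be non-empty and equal in length")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    # Assumption: a constant vector carries no correlation information.
    return cov / denom if denom else 0.0


def is_specific_individual(extracted: Sequence[float],
                           registered: Sequence[float],
                           threshold: float = 0.8) -> bool:
    """True only when the correlation is strictly greater than the threshold."""
    return correlation_coefficient(extracted, registered) > threshold
```

The strict inequality mirrors the text: a coefficient exactly equal to the threshold is treated as "not the specific individual".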

また、特定個人判定部２５では、カメラ１１からの入力画像の１フレームに対する判定の結果に基づいて、検出した顔領域の顔が特定個人の顔であるか否かを判定してもよいし、カメラ１１からの入力画像の複数フレームに対する判定の結果に基づいて、検出した顔領域の顔が特定個人の顔であるか否かを判定してもよい。 Further, the specific individual determination unit 25 may determine whether or not the face in the detected face area is the face of a specific individual based on the determination result for one frame of the input image from the camera 11, It may be determined whether or not the face in the detected face area is the face of a specific individual, based on the determination results for a plurality of frames of the input image from the camera 11 .
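The embodiment does not say how per-frame results would be combined when several frames are used. One possibility, shown purely as an assumption, is a sliding-window vote in which the face is accepted as the specific individual once enough recent frames agree; the names `MultiFrameJudge`, `window`, and `min_hits` are hypothetical.

```python
from collections import deque


class MultiFrameJudge:
    """Combine per-frame determinations over a sliding window of frames."""

    def __init__(self, window: int = 5, min_hits: int = 3) -> None:
        self._results = deque(maxlen=window)  # oldest results drop out automatically
        self._min_hits = min_hits

    def update(self, frame_is_specific: bool) -> bool:
        """Record one frame's result and return the combined decision."""
        self._results.append(frame_is_specific)
        return sum(self._results) >= self._min_hits
```

Setting `window=1` and `min_hits=1` reduces this to the single-frame case described in the same paragraph.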

第１顔画像処理部２６は、特定個人判定部２５により特定個人の顔であると判定された場合、特定個人用の顔画像処理を行う。第１顔画像処理部２６は、特定個人の顔向き推定部２７と、特定個人の目開閉検出部２８と、特定個人の視線方向推定部２９とを含んで構成されているが、さらに別の顔挙動を推定したり、検出したりする構成を含んでもよい。また、第１顔画像処理部２６は、特定個人の顔特徴量１４２ａを用いて、特定個人用の顔画像処理のいずれかの処理を行ってもよい。また、顔特徴量記憶部１４２に、特定個人用の顔画像処理を行うための学習を行った学習済みの特徴量を記憶しておき、該学習済みの特徴量を用いて、特定個人用の顔画像処理のいずれかの処理を行ってもよい。 The first face image processing unit 26 performs face image processing for a specific individual when the specific individual determining unit 25 determines that the face is that of a specific individual. The first face image processing unit 26 includes a specific individual's face orientation estimation unit 27, a specific individual's eye open/closed detection unit 28, and a specific individual's gaze direction estimation unit 29. A configuration for estimating or detecting facial behavior may also be included. Further, the first face image processing unit 26 may perform any one of face image processing for a specific individual using the face feature amount 142a of the specific individual. Further, a learned feature amount that has undergone learning for performing face image processing for a specific individual is stored in the face feature amount storage unit 142, and the learned feature amount is used to perform facial image processing for a specific individual. Any processing of face image processing may be performed.

特定個人の顔向き推定部２７は、特定個人の顔の向きを推定する処理を行う。特定個人の顔向き推定部２７は、例えば、顔検出部２２で検出された顔領域から目、鼻、口、眉などの顔器官の位置や形状を検出し、検出した顔器官の位置や形状に基づいて、顔の向きを推定する処理を行う。 The specific individual's face direction estimating unit 27 performs processing for estimating the face direction of the specific individual. The specific individual face direction estimation unit 27 detects the positions and shapes of facial features such as eyes, nose, mouth, and eyebrows from the face area detected by the face detection unit 22, and detects the positions and shapes of the detected facial features. Based on this, the process of estimating the orientation of the face is performed.

画像中の顔領域から顔器官を検出する手法は特に限定されないが、高速で高精度に顔器官を検出できる手法を採用することが好ましい。例えば、３次元顔形状モデルを作成し、これを２次元画像上の顔の領域にフィッティングさせ、顔の各器官の位置と形状を検出する手法が採用され得る。画像中の人の顔に３次元顔形状モデルをフィッティングさせる技術として、例えば、特開２００７－２４９２８０号公報に記載された技術を適用することができるが、これに限定されるものではない。 A method for detecting facial features from a facial region in an image is not particularly limited, but it is preferable to employ a method that can detect facial features at high speed and with high accuracy. For example, a method of creating a three-dimensional face shape model, fitting it to a face region on a two-dimensional image, and detecting the position and shape of each organ of the face can be adopted. As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but it is not limited to this.

また、特定個人の顔向き推定部２７は、特定個人の顔の向きの推定データとして、例えば、上記３次元顔形状モデルのパラメータに含まれている、上下回転（Ｘ軸回り）のピッチ角、左右回転（Ｙ軸回り）のヨー角、及び全体回転（Ｚ軸回り）のロール角を出力してもよい。 Further, the specific individual's face orientation estimation unit 27 uses, as estimation data of the specific individual's face orientation, for example, the pitch angle of vertical rotation (around the X axis), which is included in the parameters of the three-dimensional face shape model, A yaw angle for left-right rotation (around the Y-axis) and a roll angle for overall rotation (around the Z-axis) may be output.

特定個人の目開閉検出部２８は、特定個人の目の開閉状態を検出する処理を行う。特定個人の目開閉検出部２８は、例えば、特定個人の顔向き推定部２７で求めた顔器官の位置や形状、特に目の特徴点（瞼、瞳孔）の位置や形状に基づいて、目の開閉状態、例えば、目を開けているか、閉じているかを検出する。目の開閉状態は、例えば、さまざまな目の開閉状態における目の画像の特徴量（瞼の位置、瞳孔（黒目）の形状、又は、白目部分と黒目部分の領域サイズなど）を予め学習器を用いて学習し、これら学習済みの特徴量データとの類似度を評価することで検出してもよい。 The specific individual's eye open/close detection unit 28 detects the open/closed state of the specific individual's eyes. For example, based on the positions and shapes of the facial organs obtained by the specific individual's face orientation estimation unit 27, in particular the positions and shapes of the eye feature points (eyelids, pupils), it detects the open/closed state of the eyes, such as whether the eyes are open or closed. The open/closed state may also be detected by, for example, learning in advance with a learner the feature amounts of eye images in various open/closed states (eyelid position, shape of the pupil (dark part of the eye), or the area sizes of the white and dark portions of the eye) and evaluating the similarity to the learned feature amount data.

特定個人の視線方向推定部２９は、特定個人の視線の方向を推定する処理を行う。特定個人の視線方向推定部２９は、例えば、ドライバ３の顔の向き、及びドライバ３の顔器官の位置や形状、特に目の特徴点（目尻、目頭、瞳孔）の位置や形状に基づいて、視線の方向を推定する。視線の方向とは、ドライバ３が見ている方向のことであり、例えば、顔の向きと目の向きとの組み合わせによって求められる。 The specific individual's gaze direction estimation unit 29 estimates the direction of the specific individual's line of sight. For example, it estimates the gaze direction based on the orientation of the driver 3's face and the positions and shapes of the driver 3's facial organs, in particular the positions and shapes of the eye feature points (outer eye corners, inner eye corners, pupils). The gaze direction is the direction in which the driver 3 is looking and is obtained from, for example, a combination of the face orientation and the eye orientation.

また、視線の方向は、例えば、さまざまな顔の向きと目の向きとの組み合わせにおける目の画像の特徴量（目尻、目頭、瞳孔の相対位置、又は白目部分と黒目部分の相対位置、濃淡、テクスチャーなど）とを予め学習器を用いて学習し、これら学習した特徴量データとの類似度を評価することで検出してもよい。また、特定個人の視線方向推定部２９は、前記３次元顔形状モデルのフィッティング結果などを用いて、顔の大きさや向きと目の位置などから眼球の大きさと中心位置とを推定するとともに、瞳孔の位置を検出し、眼球の中心と瞳孔の中心とを結ぶベクトルを視線方向として検出してもよい。 The gaze direction may also be detected by, for example, learning in advance with a learner the feature amounts of eye images for various combinations of face orientation and eye orientation (the relative positions of the outer eye corner, inner eye corner, and pupil, or the relative positions, shading, and texture of the white and dark portions of the eye) and evaluating the similarity to the learned feature amount data. Alternatively, the specific individual's gaze direction estimation unit 29 may use the fitting result of the three-dimensional face shape model to estimate the size and center position of the eyeball from the size and orientation of the face and the positions of the eyes, detect the position of the pupil, and take the vector connecting the center of the eyeball and the center of the pupil as the gaze direction.
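The last variant, taking the vector from the eyeball center to the pupil center as the gaze direction, can be sketched in Python as follows. The 3D coordinate tuples and the normalization to unit length are assumptions made for illustration; how the two centers are estimated is outside the scope of this sketch.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def gaze_direction(eyeball_center: Point3D, pupil_center: Point3D) -> Point3D:
    """Unit vector pointing from the eyeball center through the pupil center."""
    dx = pupil_center[0] - eyeball_center[0]
    dy = pupil_center[1] - eyeball_center[1]
    dz = pupil_center[2] - eyeball_center[2]
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError("eyeball center and pupil center coincide")
    return (dx / norm, dy / norm, dz / norm)
```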

第２顔画像処理部３０は、特定個人判定部２５により特定個人の顔ではないと判定された場合、通常の顔画像処理を行う。第２顔画像処理部３０は、通常の顔向き推定部３１と、通常の目開閉検出部３２と、通常の視線方向推定部３３とを含んで構成されているが、さらに別の顔挙動を推定したり、検出したりする構成を含んでもよい。また、第２顔画像処理部３０は、通常の顔特徴量１４２ｂを用いて、通常の顔画像処理のいずれかの処理を行ってもよい。また、顔特徴量記憶部１４２に、通常の顔画像処理を行うための学習を行った学習済みの特徴量を記憶しておき、該学習済みの特徴量を用いて、通常の顔画像処理のいずれかの処理を行ってもよい。なお、通常の顔向き推定部３１と、通常の目開閉検出部３２と、通常の視線方向推定部３３とで行われる処理は、特定個人の顔向き推定部２７と、特定個人の目開閉検出部２８と、特定個人の視線方向推定部２９と基本的に同様であるので、ここではその説明を省略する。 When the specific individual determination unit 25 determines that the face is not that of the specific individual, the second face image processing unit 30 performs normal face image processing. The second face image processing unit 30 includes a normal face orientation estimation unit 31, a normal eye open/close detection unit 32, and a normal gaze direction estimation unit 33, and may further include a configuration that estimates or detects other facial behavior. The second face image processing unit 30 may perform any of the normal face image processing using the normal face feature amount 142b. Alternatively, learned feature amounts trained for normal face image processing may be stored in the face feature amount storage unit 142 and used to perform any of the normal face image processing. The processing performed by the normal face orientation estimation unit 31, the normal eye open/close detection unit 32, and the normal gaze direction estimation unit 33 is basically the same as that of the specific individual's face orientation estimation unit 27, the specific individual's eye open/close detection unit 28, and the specific individual's gaze direction estimation unit 29, so its description is omitted here.

出力部３４は、画像処理部１２による画像処理に基づく情報をＥＣＵ４０などに出力する処理を行う。画像処理に基づく情報は、例えば、ドライバ３の顔の向き、視線の方向、又は目の開閉状態などの顔の挙動に関する情報でもよいし、顔の挙動の検出結果に基づいて判定されたドライバ３の状態（例えば、前方注視、脇見、居眠り、後ろ向き、突っ伏しなどの状態）に関する情報でもよい。また、画像処理に基づく情報は、ドライバ３の状態判定に基づく、所定の制御信号（注意や警告処理を行うための制御信号、又は車両２の動作制御を行うための制御信号など）でもよい。 The output unit 34 outputs information based on the image processing by the image processing unit 12 to the ECU 40 or the like. This information may be, for example, information on the facial behavior of the driver 3, such as the face orientation, the gaze direction, or the open/closed state of the eyes, or information on the state of the driver 3 determined from the detected facial behavior (for example, looking ahead, looking aside, dozing, facing backward, or slumped forward). The information may also be a predetermined control signal based on the determined state of the driver 3 (such as a control signal for issuing a caution or warning, or a control signal for controlling the operation of the vehicle 2).

［処理動作例］ [Processing operation example]
図７は、実施の形態に係るドライバモニタリング装置１０における画像処理部１２のＣＰＵ１３が行う処理動作の一例を示すフローチャートである。カメラ１１では、例えば、毎秒数十フレームの画像が撮像され、各フレーム、又は一定間隔のフレーム毎に本処理が行われる。 FIG. 7 is a flowchart showing an example of processing operations performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. The camera 11 captures, for example, several tens of frames per second, and this processing is performed for each frame or for frames at regular intervals.

まず、ステップＳ１では、ＣＰＵ１３は、画像入力部２１として動作し、カメラ１１で撮像された画像（ドライバ３の顔を含む画像）を読み込む処理を行い、ステップＳ２に処理を進める。 First, in step S1, the CPU 13 operates as the image input unit 21, reads an image captured by the camera 11 (an image including the face of the driver 3), and proceeds to step S2.

ステップＳ２では、ＣＰＵ１３は、顔検出部２２として動作し、入力画像に対して探索領域を走査しながら顔領域を検出する顔検出処理を行い、ステップＳ３に処理を進める。なお、ステップＳ２の顔検出処理の具体例については後述する。 In step S2, the CPU 13 operates as the face detection unit 22, performs face detection processing for detecting a face region while scanning the search region of the input image, and proceeds to step S3. A specific example of the face detection process in step S2 will be described later.

ステップＳ３では、ＣＰＵ１３は、特定個人判定部２５として動作し、ステップＳ２で検出された顔領域の特徴量と、顔特徴量記憶部１４２から読み込んだ特定個人の顔特徴量１４２ａとを用いて、検出された顔領域の顔が特定個人の顔であるか否かを判定する処理を行い、ステップＳ４に処理を進める。 In step S3, the CPU 13 operates as the specific individual determining unit 25, and uses the feature amount of the face area detected in step S2 and the specific individual's facial feature amount 142a read from the facial feature amount storage unit 142, A process of determining whether or not the face in the detected face area is the face of a specific individual is performed, and the process proceeds to step S4.

ステップＳ４では、ＣＰＵ１３は、ステップＳ３での判定処理の結果が、特定個人の顔であるか否かを判断し、特定個人の顔であると判断すれば、ステップＳ５に処理を進める。 In step S4, the CPU 13 checks whether the result of the determination in step S3 is the face of the specific individual, and if so, proceeds to step S5.

ステップＳ５では、ＣＰＵ１３は、特定個人の顔向き推定部２７として動作し、例えば、ステップＳ２で検出した顔領域から目、鼻、口、眉などの顔器官の位置や形状を検出し、検出した顔器官の位置や形状に基づいて、顔の向きを推定し、ステップＳ６に処理を進める。 In step S5, the CPU 13 operates as the face direction estimation unit 27 of the specific individual, and detects, for example, the positions and shapes of facial organs such as eyes, nose, mouth, and eyebrows from the face area detected in step S2. The orientation of the face is estimated based on the position and shape of the facial features, and the process proceeds to step S6.

ステップＳ６では、ＣＰＵ１３は、特定個人の目開閉検出部２８として動作し、例えば、ステップＳ５で求めた顔器官の位置や形状、特に目の特徴点（瞼、瞳孔）の位置や形状に基づいて、目の開閉状態、例えば、目を開けているか、閉じているかを検出し、ステップＳ７に処理を進める。 In step S6, the CPU 13 operates as the eye opening/closing detection unit 28 of the specific individual, and, for example, based on the position and shape of the facial features obtained in step S5, particularly the position and shape of the characteristic points of the eyes (eyelids, pupils). , the open/closed state of the eyes, for example, whether the eyes are open or closed, is detected, and the process proceeds to step S7.

ステップＳ７では、ＣＰＵ１３は、特定個人の視線方向推定部２９として動作し、例えば、ステップＳ５で求めた顔の向き、顔器官の位置や形状、特に目の特徴点（目尻、目頭、瞳孔）の位置や形状に基づいて、視線の方向を推定し、その後処理を終える。 In step S7, the CPU 13 operates as the gaze direction estimating unit 29 of the specific individual, for example, the direction of the face obtained in step S5, the position and shape of the facial organs, especially the characteristic points of the eyes (outer corners, inner corners, pupils). Based on the position and shape, the direction of the line of sight is estimated, and then the process ends.

一方ステップＳ４において、ＣＰＵ１３は、特定個人の顔ではない、換言すれば、通常の顔であると判断すれば、ステップＳ８に処理を進める。 On the other hand, if the CPU 13 determines in step S4 that the face is not that of the specific individual, in other words, that it is a normal face, the process proceeds to step S8.
ステップＳ８では、ＣＰＵ１３は、通常の顔向き推定部３１として動作し、例えば、ステップＳ２で検出した顔領域から目、鼻、口、眉などの顔器官の位置や形状を検出し、検出した顔器官の位置や形状に基づいて、顔の向きを推定し、ステップＳ９に処理を進める。 In step S8, the CPU 13 operates as the normal face orientation estimation unit 31. For example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face area detected in step S2, estimates the face orientation based on the detected positions and shapes, and proceeds to step S9.

ステップＳ９では、ＣＰＵ１３は、通常の目開閉検出部３２として動作し、例えば、ステップＳ８で求めた顔器官の位置や形状、特に目の特徴点（瞼、瞳孔）の位置や形状に基づいて、目の開閉状態、例えば、目を開けているか、閉じているかを検出し、ステップＳ１０に処理を進める。 In step S9, the CPU 13 operates as a normal eye open/close detection unit 32. For example, based on the position and shape of the facial features obtained in step S8, particularly the position and shape of the characteristic points of the eyes (eyelids, pupils), The open/closed state of the eyes, for example, whether the eyes are open or closed is detected, and the process proceeds to step S10.

ステップＳ１０では、ＣＰＵ１３は、通常の視線方向推定部３３として動作し、例えば、ステップＳ８で求めた顔の向き、顔器官の位置や形状、特に目の特徴点（目尻、目頭、瞳孔）の位置や形状に基づいて、視線の方向を推定し、その後処理を終える。 In step S10, the CPU 13 operates as a normal gaze direction estimating unit 33, for example, the orientation of the face, the positions and shapes of facial organs, particularly the positions of eye feature points (outer corners, inner corners, pupils) obtained in step S8. Based on the shape and shape, the direction of the line of sight is estimated, and then the process ends.
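The branch structure of FIG. 7 (steps S1 to S10) can be summarized in a short Python sketch. All callables here are hypothetical placeholders for the units described above, and returning `None` when no face is detected is an assumption, since the flowchart description does not cover that case.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Pipeline:
    """One set of face image processing steps (specific individual or normal)."""
    estimate_orientation: Callable[[Any], Any]
    detect_eye_state: Callable[[Any, Any], Any]
    estimate_gaze: Callable[[Any, Any], Any]


def process_frame(frame: Any,
                  detect_face: Callable[[Any], Optional[Any]],
                  is_specific_individual: Callable[[Any], bool],
                  specific: Pipeline,
                  normal: Pipeline) -> Optional[Dict[str, Any]]:
    region = detect_face(frame)                                  # S2
    if region is None:
        return None                                              # assumption: no face found
    # S3, S4: choose the processing path for this face.
    pipeline = specific if is_specific_individual(region) else normal
    orientation = pipeline.estimate_orientation(region)          # S5 or S8
    eye_state = pipeline.detect_eye_state(region, orientation)   # S6 or S9
    gaze = pipeline.estimate_gaze(region, orientation)           # S7 or S10
    return {"orientation": orientation, "eye_state": eye_state, "gaze": gaze}
```

Both paths run the same three steps in the same order; only the set of estimators differs, which reflects the statement that the normal units operate basically in the same way as the specific individual's units.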

図８は、実施の形態に係るドライバモニタリング装置１０における画像処理部１２のＣＰＵ１３が行う処理動作の一例を示すフローチャートである。本処理動作は、図７に示すステップＳ２の顔検出処理動作とステップＳ３の特定個人判定処理動作の一例であり、入力画像１枚（１フレーム）に対する処理動作例である。 FIG. 8 is a flowchart showing an example of processing operations performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. This processing operation is an example of the face detection processing operation of step S2 and the specific individual determination processing operation of step S3 shown in FIG. 7, and is an example of processing operation for one input image (one frame).

ＣＰＵ１３は、まず、ステップＳ２１で、顔のサイズ（大きさ）を検出するループ処理Ｌ１を開始し、次のステップＳ２２で、顔の回転角度（向きや傾き）を検出するループ処理Ｌ２を開始し、次のステップＳ２３では、顔の位置を検出するループ処理Ｌ３を開始して、ステップＳ２４に処理を進める。 First, in step S21, the CPU 13 starts loop processing L1 for detecting the face size (magnitude), and in the next step S22, starts loop processing L2 for detecting the rotation angle (orientation or inclination) of the face. , in the next step S23, a loop process L3 for detecting the position of the face is started, and the process proceeds to step S24.

ループ処理Ｌ１は、さまざまな大きさの顔を検出するために生成された縮小画像（例えば、図５に示す２０ａ、２０ｂ）の数に応じて繰り返される。ループ処理Ｌ２は、顔の回転角度（例えば、図６に示す正面顔、斜め顔、横顔、傾き）を判別する判別器の設定に応じて繰り返される。ループ処理Ｌ３は、顔の位置を検出するために探索領域２１０を走査する位置の数だけ繰り返される。 The loop process L1 is repeated according to the number of reduced images (eg, 20a, 20b shown in FIG. 5) generated to detect faces of various sizes. The loop processing L2 is repeated according to the setting of the discriminator that discriminates the rotation angle of the face (for example, front face, oblique face, side face, tilt shown in FIG. 6). The loop process L3 is repeated as many times as the search area 210 is scanned to detect the position of the face.

ステップＳ２４では、ＣＰＵ１３は、第１特徴量抽出部２２１として動作し、ループ処理Ｌ１、Ｌ２、Ｌ３の各条件で、探索領域２１０から顔の特徴量２２１ａを抽出する処理を行う。 In step S24, the CPU 13 operates as the first feature amount extraction unit 221, and performs processing for extracting the facial feature amount 221a from the search area 210 under each condition of the loop processes L1, L2, and L3.

ステップＳ２５では、ＣＰＵ１３は、通常顔判別器２２２及び特定個人顔判定部２２３として動作し、探索領域２１０が顔であるか（顔であるか）、非顔（顔以外）であるかを階層的に判別し、いずれかの階層で非顔であると判別された場合に、探索領域２１０が特定個人の顔であるか、非顔であるかを判定する処理を行う。 In step S25, the CPU 13 operates as the normal face discriminator 222 and the specific individual face determination unit 223. It hierarchically determines whether the search area 210 is a face or a non-face (something other than a face), and if any layer determines that it is a non-face, it determines whether the search area 210 is the face of the specific individual or a non-face.

ＣＰＵ１３は、例えば、全ての縮小画像に対して探索領域２１０の走査を行い、全ての探索領域２１０での顔検出（換言すれば、顔領域の候補の検出）を終えると、ステップＳ２６でループ処理Ｌ１を終え、ステップＳ２７でループ処理Ｌ２を終え、ステップＳ２８でループ処理Ｌ３を終え、その後ステップＳ２９に処理を進める。 For example, the CPU 13 scans the search area 210 for all the reduced images, and after completing face detection (in other words, detection of face area candidates) in all the search areas 210, performs loop processing in step S26. After completing L1, loop processing L2 is completed in step S27, loop processing L3 is completed in step S28, and then the process proceeds to step S29.
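The three nested loops L1 to L3 described above amount to a sliding-window scan over scale, rotation angle, and position. The Python sketch below shows that structure only; the window size, stride, and the `is_face` callback (standing in for the feature extraction and discrimination of steps S24 and S25) are assumptions.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Position = Tuple[int, int]


def sliding_positions(width: int, height: int,
                      window: int, stride: int) -> Iterator[Position]:
    """Yield the top-left corner of every search area placement."""
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            yield (x, y)


def scan(image_sizes: Sequence[Tuple[int, int]],
         angles: Sequence[int],
         window: int,
         stride: int,
         is_face: Callable[[int, int, Position], bool]
         ) -> List[Tuple[int, int, Position]]:
    """Collect (scale index, angle, position) for every placement judged a face."""
    candidates = []
    for scale_index, (width, height) in enumerate(image_sizes):  # loop L1
        for angle in angles:                                     # loop L2
            for pos in sliding_positions(width, height, window, stride):  # loop L3
                if is_face(scale_index, angle, pos):             # S24, S25
                    candidates.append((scale_index, angle, pos))
    return candidates
```

The returned list corresponds to the face area candidates that are later integrated in step S29.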

ステップＳ２９では、ＣＰＵ１３は、顔領域統合部２２４として動作し、ステップＳ２１～Ｓ２８の処理で顔であると判別された１以上の顔領域の候補を統合する処理を行い、ステップＳ３０に処理を進める。１以上の顔領域の候補を統合する方法は特に限定されない。例えば、１以上の顔領域の候補の領域中心の平均値と、領域サイズの平均値とに基づいて統合してもよい。 In step S29, the CPU 13 operates as the face area integration unit 224, performs processing to integrate one or more face area candidates determined to be faces in the processes of steps S21 to S28, and proceeds to step S30. . The method of integrating one or more face area candidates is not particularly limited. For example, integration may be performed based on an average value of area centers of one or more face area candidates and an average value of area sizes.

ステップＳ３０では、ＣＰＵ１３は、第２特徴量抽出部２２５として動作し、ステップＳ２９で統合された顔領域から顔の特徴量を抽出する処理を行い、ステップＳ３１に処理を進める。 In step S30, the CPU 13 operates as the second feature amount extraction unit 225, performs processing for extracting facial feature amounts from the face area integrated in step S29, and proceeds to step S31.

ステップＳ３１では、ＣＰＵ１３は、特定個人判定部２５として動作し、ステップＳ３０で、統合された顔領域から抽出された特徴量と、顔特徴量記憶部１４２から読み込んだ特定個人の顔特徴量１４２ａとの相関係数を算出する処理を行い、ステップＳ３２に処理を進める。 In step S31, the CPU 13 operates as the specific individual determination unit 25, and in step S30, the feature quantity extracted from the integrated face region and the face feature quantity 142a of the specific individual read from the face feature quantity storage unit 142. , and the process proceeds to step S32.

ステップＳ３２では、ＣＰＵ１３は、算出した相関係数が、特定個人の顔か否かを判定するための所定の閾値より大きいか否かを判断し、相関係数が所定の閾値よりも大きい、換言すれば、顔領域から抽出された特徴量と、特定個人の顔特徴量１４２ａとの相関性が高い（換言すれば、類似度が高い）と判断すれば、ステップＳ３３に処理を進める。 In step S32, the CPU 13 determines whether the calculated correlation coefficient is greater than a predetermined threshold for determining whether the face is that of the specific individual. If the correlation coefficient is greater than the threshold, in other words, if the feature amount extracted from the face area is highly correlated with (highly similar to) the specific individual's face feature amount 142a, the process proceeds to step S33.
ステップＳ３３では、ＣＰＵ１３は、顔領域に検出された顔が特定個人の顔であると判定し、その後処理を終える。 In step S33, the CPU 13 determines that the face detected in the face area is the face of the specific individual, and then ends the process.

一方ステップＳ３２において、相関係数が所定の閾値以下である、換言すれば、顔領域から抽出された特徴量と、特定個人の顔特徴量１４２ａとの相関性が低い（換言すれば、類似度が低い）と判断すれば、ステップＳ３４に処理を進める。 On the other hand, if it is determined in step S32 that the correlation coefficient is equal to or less than the threshold, in other words, that the feature amount extracted from the face area has a low correlation with (low similarity to) the specific individual's face feature amount 142a, the process proceeds to step S34.
ステップＳ３４では、ＣＰＵ１３は、特定個人の顔ではない、換言すれば、通常の顔であると判定し、その後処理を終える。 In step S34, the CPU 13 determines that the face is not that of the specific individual, in other words, that it is a normal face, and then ends the process.

図９は、実施の形態に係るドライバモニタリング装置１０における画像処理部１２のＣＰＵ１３が行う顔検出処理動作の一例を示すフローチャートである。本処理動作は、図８に示すステップＳ２５の判別処理動作の一例である。 FIG. 9 is a flow chart showing an example of the face detection processing operation performed by the CPU 13 of the image processing section 12 in the driver monitoring device 10 according to the embodiment. This processing operation is an example of the determination processing operation of step S25 shown in FIG.

まず、ステップＳ４１では、ＣＰＵ１３は、図８のステップＳ２４で探索領域２１０から抽出した特徴量２２１ａを読み込み、ステップＳ４２に処理を進める。 First, in step S41, the CPU 13 reads the feature amount 221a extracted from the search area 210 in step S24 of FIG. 8, and proceeds to step S42.
ステップＳ４２では、ＣＰＵ１３は、通常顔判別器２２２がどの階層にあるのかをカウントする判別器カウンタ（ｉ）に０をセットし、また、通常顔判別器２２２の階層構造の数を示すｎには、初期値として１をセットし、ステップＳ４３に処理を進める。 In step S42, the CPU 13 sets the discriminator counter (i), which counts which layer the normal face discriminator 222 has reached, to 0, sets n, which indicates the layer within the hierarchical structure of the normal face discriminator 222, to an initial value of 1, and proceeds to step S43.

ステップＳ４３では、ＣＰＵ１３は、通常顔判別器２２２として動作し、第ｎ判別器により、探索領域２１０が顔であるか、非顔であるかを判別する処理を行う。 In step S43, the CPU 13 operates as the normal face discriminator 222, and performs processing for discriminating whether the search area 210 is a face or a non-face by the n-th discriminator.

ステップＳ４４では、ＣＰＵ１３は、探索領域２１０が顔であるか、非顔であるかを判別し、顔であると判別すれば、ステップＳ４５に処理を進める。 In step S44, the CPU 13 determines whether the search area 210 is a face or a non-face, and if it is determined to be a face, proceeds to step S45.
ステップＳ４５では、ＣＰＵ１３は、判別器カウンタ（ｉ）に１を加算して、ステップＳ４６に処理を進める。 In step S45, the CPU 13 adds 1 to the discriminator counter (i), and proceeds to step S46.

ステップＳ４６では、ＣＰＵ１３は、判別器カウンタ（ｉ）が、判別器の数を示すＮ未満であるか否かを判断し、判別器数Ｎ未満であると判断すれば、ステップＳ４７に処理を進め、ステップＳ４７では、次の階層の判別器による処理に進むために、ｎに１を加算し、その後、ステップＳ４３に戻り、処理を繰り返す。 In step S46, the CPU 13 determines whether or not the discriminator counter (i) is less than N, which indicates the number of discriminators. , in step S47, 1 is added to n in order to proceed to the processing by the discriminator of the next layer, and then the processing returns to step S43 to repeat the processing.

一方ステップＳ４６において、ＣＰＵ１３が、判別器カウンタ（ｉ）が判別器数Ｎ未満ではない、換言すれば、判別器カウンタ（ｉ）が判別器数Ｎになったと判断すれば、ステップＳ４８に処理を進める。 On the other hand, in step S46, if the CPU 13 determines that the discriminator counter (i) is not less than the discriminator number N, in other words, if the discriminator counter (i) has reached the discriminator number N, the process proceeds to step S48. proceed.

ステップＳ４８では、ＣＰＵ１３は、探索領域２１０が顔領域の候補であると判定し、当該探索領域の情報を顔領域の候補として記憶した後、当該探索領域に対する顔検出処理を終え、次の探索領域に対する顔検出処理を繰り返す。 In step S48, the CPU 13 determines that the search area 210 is a candidate for the face area, stores the information of the search area as a candidate for the face area, finishes the face detection process for the search area, and then selects the next search area. Repeat the face detection process for .

一方ステップＳ４４において、ＣＰＵ１３が、探索領域２１０が非顔であると判別すれば、ステップＳ４９に処理を進める。 On the other hand, if the CPU 13 determines in step S44 that the search area 210 is a non-face, the process proceeds to step S49.
ステップＳ４９では、ＣＰＵ１３は、特定個人顔判定部２２３として動作し、第ｎ判別器で用いた特徴量（探索領域２１０から抽出された特徴量）と、第ｎ判別器の階層に対応する特定個人の顔特徴量１４２ａとの相関係数を算出する処理を行い、ステップＳ５０に処理を進める。 In step S49, the CPU 13 operates as the specific individual face determination unit 223, calculates the correlation coefficient between the feature amount used by the n-th discriminator (the feature amount extracted from the search area 210) and the specific individual's face feature amount 142a corresponding to the layer of the n-th discriminator, and proceeds to step S50.

ステップＳ５０では、ＣＰＵ１３は、算出した相関係数が、特定個人の顔であるか否かを判定するための所定の閾値より大きいか否かを判断し、相関係数が所定の閾値よりも大きい（すなわち、相関性が高い）と判断すれば（換言すれば、探索領域２１０が特定個人の顔であると判断すれば）、ステップＳ４５に処理を進め、通常顔判別器２２２による判別処理を進める。 In step S50, the CPU 13 determines whether the calculated correlation coefficient is greater than a predetermined threshold value for determining whether or not the face is of a specific individual, and determines whether the correlation coefficient is greater than the predetermined threshold value. (That is, if the correlation is high) (in other words, if it is determined that the search area 210 is the face of a specific individual), the process proceeds to step S45, and the discrimination process by the normal face discriminator 222 proceeds. .

一方ステップＳ５０において、ＣＰＵ１３が、相関係数が所定の閾値以下である（すなわち、相関性が低い）と判断すれば（換言すれば、探索領域２１０が非顔であると判断すれば）、ステップＳ５１に処理を進める。 On the other hand, in step S50, if the CPU 13 determines that the correlation coefficient is equal to or less than a predetermined threshold value (that is, the correlation is low) (in other words, if it determines that the search area 210 is non-face), step The process proceeds to S51.

ステップＳ５１では、ＣＰＵ１３は、探索領域が非顔（顔以外）であると判定し、次のステップＳ５２で、通常顔判別器２２２による第ｎ判別器より後の判別処理を打ち切り、その後、当該探索領域に対する顔検出処理を終え、次の探索領域に対する顔検出処理を繰り返す。 In step S51, the CPU 13 determines that the search area is a non-face (other than a face), and in the next step S52, terminates the discrimination processing after the n-th discriminator by the normal face discriminator 222, and then the search is performed. After finishing the face detection processing for the region, the face detection processing for the next search region is repeated.
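Steps S41 to S52 describe a cascade in which a rejection by any layer can be overridden when the features correlate strongly with the specific individual's registered features for that layer. The Python sketch below follows that flow under several assumptions: the Pearson correlation is used as the correlation coefficient, features and registered features are supplied as one vector per layer, layers are indexed from 0, and all function and parameter names are hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def pearson(a: Vector, b: Vector) -> float:
    """Pearson correlation; returns 0.0 for constant input (an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0


def is_face_candidate(stage_features: Sequence[Vector],
                      stages: Sequence[Callable[[Vector], bool]],
                      registered: Sequence[Vector],
                      threshold: float) -> bool:
    """Run the layered discriminator with the specific individual check."""
    passed = 0                        # discriminator counter i (S42)
    n = 0                             # current layer, 0-based in this sketch
    while passed < len(stages):       # S46: continue until all N layers are passed
        feature = stage_features[n]
        if not stages[n](feature):    # S43, S44: layer n says non-face
            # S49, S50: compare with the registered feature for this layer.
            if pearson(feature, registered[n]) <= threshold:
                return False          # S51, S52: non-face, stop the cascade
        passed += 1                   # S45
        n += 1                        # S47
    return True                       # S48: store as a face area candidate
```

Because the correlation is computed only for layers that reject the search area, ordinary faces that pass every layer incur no extra cost, which is in line with the efficiency argument made for this arrangement.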

［作用・効果］ [Action/effect]
上記した実施の形態に係るドライバモニタリング装置１０によれば、顔特徴量記憶部１４２に学習済みの顔特徴量として、特定個人の顔特徴量１４２ａと、通常の顔特徴量１４２ｂとが記憶され、通常顔判別器２２２が、通常の顔特徴量１４２ｂを用いて、画像から切り出された探索領域２１０が顔であるか、非顔であるかを階層的に判別することにより、顔領域が検出される。 According to the driver monitoring device 10 of the above embodiment, the face feature amount storage unit 142 stores the specific individual's face feature amount 142a and the normal face feature amount 142b as learned face feature amounts, and a face area is detected by the normal face discriminator 222 hierarchically determining, using the normal face feature amount 142b, whether the search area 210 cut out from the image is a face or a non-face.

また、通常顔判別器２２２のいずれかの階層で非顔であると判別された場合であっても、特定個人顔判定部２２３が、特定個人の顔特徴量１４２ａを用いて、探索領域２１０が特定個人の顔であるか、非顔であるかを判定することにより、特定個人の顔を含む顔領域が検出される。 In addition, even if the normal face discriminator 222 determines that it is a non-face at any level, the specific individual face determination unit 223 uses the specific individual's face feature amount 142a to determine whether the search area 210 is A face area containing the specific individual's face is detected by determining whether it is the specific individual's face or not.

これにより、通常の顔であっても、特定個人の顔であっても、画像中から顔領域を精度良く検出することができる。また、通常顔判別器２２２と特定個人顔判定部２２３では、探索領域２１０から抽出された共通の特徴量を用いるので、別途特徴量を抽出したりする処理が必要ないので、顔領域の検出に係るリアルタイム性を維持することができる。 As a result, a face area can be accurately detected from an image whether the face is a normal face or the face of the specific individual. In addition, because the normal face discriminator 222 and the specific individual face determination unit 223 use the same feature amount extracted from the search area 210, no separate feature extraction is required, so the real-time performance of face area detection can be maintained.

したがって、ドライバ３が、特定個人であっても、特定個人以外の通常の人であっても、それぞれの顔をリアルタイムで（換言すれば、高速な処理で）精度良く検出することができる。 Therefore, regardless of whether the driver 3 is a specific individual or a normal person other than the specific individual, each face can be accurately detected in real time (in other words, by high-speed processing).

また、特定個人顔判定部２２３によって、非顔であると判別された通常顔判別器２２２の一の階層で用いた特徴量と、前記一の階層に対応する特定個人の顔特徴量１４２ａとの相関性に基づいて、探索領域が特定個人の顔であるか、非顔であるかを効率良く判定することができる。したがって、通常顔判別器２２２で非顔であると判別された場合であっても、特定個人の顔である場合を精度良く判定することができる。 In addition, the specific individual face determination unit 223 can efficiently determine whether the search area is the face of the specific individual or a non-face, based on the correlation between the feature amount used in the layer of the normal face discriminator 222 that returned a non-face result and the specific individual's face feature amount 142a corresponding to that layer. Therefore, even when the normal face discriminator 222 determines that the search area is a non-face, a face of the specific individual can be accurately identified.

また、上記ドライバモニタリング装置１０によれば、特定個人顔判定部２２３により特定個人の顔であると判定された場合、通常顔判別器２２２の次の階層に判別が進められ、判別処理が速やかに継続される。一方、特定個人顔判定部２２３により非顔であると判定された場合、通常顔判別器２２２での判別が打ち切られる。したがって、通常顔判別器２２２の効率を維持しつつ、特定個人の顔を判定する処理を行うことができる。 Further, according to the driver monitoring device 10, when the specific individual face determination unit 223 determines that the face is that of a specific individual, the determination proceeds to the next layer of the normal face discriminator 222, and the determination process can be performed quickly. Continued. On the other hand, when the specific individual face determining unit 223 determines that the face is a non-face, the determination by the normal face determining unit 222 is terminated. Therefore, while maintaining the efficiency of the normal face discriminator 222, it is possible to perform the process of discriminating the face of a specific individual.

また、上記ドライバモニタリング装置１０によれば、顔領域統合部２２４によって、通常顔判別器２２２を介して顔であると判別された１以上の顔領域の候補が統合され、特定個人判定部２５によって、統合された顔領域から抽出された特徴量と、特定個人の顔特徴量１４２ａとを用いて、顔領域の顔が特定個人の顔であるか否かが判定される。したがって、顔領域統合部２２４により統合された顔領域に基づいて、特定個人の顔であるか、通常の人の顔であるかを精度良く判定することができる。 Further, according to the driver monitoring device 10, the facial area integration unit 224 integrates one or more facial area candidates determined to be a face through the normal face discriminator 222, and the specific individual determination unit 25 , using the feature quantity extracted from the integrated face region and the face feature quantity 142a of the specific individual, it is determined whether or not the face in the face region is the face of the specific individual. Therefore, based on the face area integrated by the face area integration unit 224, it is possible to accurately determine whether the face is that of a specific individual or that of an ordinary person.

また、車載システム１が、ドライバモニタリング装置１０と、ドライバモニタリング装置１０から出力されるモニタリングの結果に基づいて、所定の処理を実行する１以上のＥＣＵ４０とを備えている。したがって、前記モニタリングの結果に基づいて、ＥＣＵ４０に所定の制御を適切に実行させることが可能となる。これにより、特定個人であっても安心して運転することができる安全性の高い車載システムを構築することが可能となる。 The in-vehicle system 1 also includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring results output from the driver monitoring device 10. Therefore, the ECUs 40 can be made to appropriately execute predetermined control based on the monitoring results. This makes it possible to build a highly safe in-vehicle system in which even the specific individual can drive with peace of mind.

［変形例］
以上、本発明の実施の形態を詳細に説明したが、前述までの説明はあらゆる点において本発明の例示に過ぎない。本発明の範囲を逸脱することなく、種々の改良や変更を行うことができることは言うまでもない。
上記実施の形態では、本発明に係る画像処理装置をドライバモニタリング装置１０に適用した場合について説明したが、適用例はこれに限定されない。例えば、工場内の機械や装置などの各種設備を操作したり、監視したり、所定の作業をしたりする人などをモニタリングする装置やシステムなどにおいて、モニタリング対象者に上記した特定個人が含まれる場合に、本発明に係る画像処理装置を適用可能である。
[Modifications]
Although the embodiment of the present invention has been described in detail above, the foregoing description is in all respects merely illustrative of the present invention. It goes without saying that various improvements and modifications can be made without departing from the scope of the present invention.
In the above embodiment, the image processing device according to the present invention is applied to the driver monitoring device 10, but its application is not limited to this. For example, the image processing device according to the present invention is also applicable to devices and systems that monitor people who operate or supervise equipment such as machines and devices in a factory, or who perform predetermined work there, when the persons to be monitored include the specific individual described above.

［付記］
本発明の実施の形態は、以下の付記の様にも記載され得るが、これらに限定されない。
（付記１）
撮像部（１１）から入力される画像を処理する画像処理装置（１２）であって、
前記画像から顔を検出するための学習を行った学習済みの顔特徴量として、特定個人の顔特徴量（１４２ａ）と、通常の顔特徴量（１４２ｂ）とが記憶される顔特徴量記憶部（１４２）と、
前記画像に対して探索領域を走査しながら顔領域の検出を行う顔検出部（２２）とを備え、
該顔検出部（２２）が、
前記探索領域から顔の特徴量を抽出する第１特徴量抽出部（２２１）と、
前記探索領域から抽出された前記特徴量と、前記通常の顔特徴量（１４２ｂ）とを用いて、前記探索領域が顔であるか、非顔であるかを判別する階層構造の通常顔判別器（２２２）と、
該通常顔判別器（２２２）のいずれかの階層で前記非顔であると判別された場合に、前記探索領域から抽出された前記特徴量と、前記特定個人の顔特徴量とを用いて、前記探索領域が前記特定個人の顔であるか、前記非顔であるかを判定する特定個人顔判定部（２２３）とを備えていることを特徴とする画像処理装置。
[Appendix]
Embodiments of the present invention can also be described as in the following appendices, but are not limited to them.
(Appendix 1)
An image processing device (12) that processes an image input from an imaging unit (11), comprising:
a face feature amount storage unit (142) that stores a face feature amount (142a) of a specific individual and a normal face feature amount (142b) as learned face feature amounts that have been trained to detect a face from the image; and
a face detection unit (22) that detects a face area while scanning a search area over the image,
wherein the face detection unit (22) comprises:
a first feature amount extraction unit (221) that extracts a face feature amount from the search area;
a hierarchical normal face discriminator (222) that discriminates whether the search area is a face or a non-face, using the feature amount extracted from the search area and the normal face feature amount (142b); and
a specific individual face determination unit (223) that, when the search area is discriminated as the non-face at any layer of the normal face discriminator (222), determines whether the search area is the face of the specific individual or the non-face, using the feature amount extracted from the search area and the face feature amount of the specific individual.

（付記２）
撮像部（１１）から入力される画像を処理する画像処理方法であって、
前記画像に対して探索領域を走査しながら顔領域の検出を行う顔検出ステップ（Ｓ２）を含み、
該顔検出ステップ（Ｓ２）が、
前記探索領域から顔の特徴量を抽出する特徴量抽出ステップ（Ｓ２４）と、
該特徴量抽出ステップ（Ｓ２４）により抽出された前記特徴量と、顔を検出するための学習を行った学習済みの通常の顔特徴量（１４２ｂ）とを用いて、前記探索領域が顔であるか、非顔であるかを階層的に判別する通常顔判別ステップ（Ｓ４３、Ｓ４４）と、
該通常顔判別ステップ（Ｓ４３、Ｓ４４）のいずれかの階層で前記非顔であると判別された場合に、抽出された前記特徴量と、特定個人の顔を検出するための学習を行った学習済みの前記特定個人の顔特徴量（１４２ａ）とを用いて、前記探索領域が前記特定個人の顔であるか、前記非顔であるかを判定する特定個人顔判定ステップ（Ｓ４９、Ｓ５０）とを含むことを特徴とする画像処理方法。
(Appendix 2)
An image processing method for processing an image input from an imaging unit (11), comprising:
a face detection step (S2) of detecting a face area while scanning a search area over the image,
wherein the face detection step (S2) includes:
a feature amount extraction step (S24) of extracting a face feature amount from the search area;
a normal face discrimination step (S43, S44) of hierarchically discriminating whether the search area is a face or a non-face, using the feature amount extracted in the feature amount extraction step (S24) and a learned normal face feature amount (142b) that has been trained to detect a face; and
a specific individual face determination step (S49, S50) of determining, when the search area is discriminated as the non-face at any layer of the normal face discrimination step (S43, S44), whether the search area is the face of a specific individual or the non-face, using the extracted feature amount and a learned face feature amount (142a) of the specific individual that has been trained to detect the face of the specific individual.

１車載システム
２車両
３ドライバ
１０ドライバモニタリング装置
１１カメラ
１２画像処理部
１３ＣＰＵ
１４ＲＯＭ
１４１プログラム記憶部
１４２顔特徴量記憶部
１４２ａ特定個人の顔特徴量
１４２ｂ通常の顔特徴量
１５ＲＡＭ
１５１画像メモリ
１６通信部
２０画像（入力画像）
２０ａ、２０ｂ縮小画像
２１画像入力部
２２顔検出部
２１０探索領域
２２１第１特徴量抽出部
２２１ａ特徴量
２２２通常顔判別器
２２３特定個人顔判定部
２２４顔領域統合部
２２５第２特徴量抽出部
２５特定個人判定部
２６第１顔画像処理部
２７特定個人の顔向き推定部
２８特定個人の目開閉検出部
２９特定個人の視線方向推定部
３０第２顔画像処理部
３１通常の顔向き推定部
３２通常の目開閉検出部
３３通常の視線方向推定部
３４出力部
４０ＥＣＵ
４１センサ
４２アクチュエータ
４３通信バス

1 In-vehicle system
2 Vehicle
3 Driver
10 Driver monitoring device
11 Camera
12 Image processing unit
13 CPU
14 ROM
141 Program storage unit
142 Face feature amount storage unit
142a Face feature amount of specific individual
142b Normal face feature amount
15 RAM
151 Image memory
16 Communication unit
20 Image (input image)
20a, 20b Reduced image
21 Image input unit
22 Face detection unit
210 Search area
221 First feature amount extraction unit
221a Feature amount
222 Normal face discriminator
223 Specific individual face determination unit
224 Face area integration unit
225 Second feature amount extraction unit
25 Specific individual determination unit
26 First face image processing unit
27 Specific individual face direction estimation unit
28 Specific individual eye open/close detection unit
29 Specific individual gaze direction estimation unit
30 Second face image processing unit
31 Normal face direction estimation unit
32 Normal eye open/close detection unit
33 Normal gaze direction estimation unit
34 Output unit
40 ECU
41 Sensor
42 Actuator
43 Communication bus

Claims

1. An image processing device that processes an image input from an imaging unit, comprising:
a face feature amount storage unit that stores a face feature amount of a specific individual and a normal face feature amount as learned face feature amounts that have been trained to detect a face from the image; and
a face detection unit that detects a face area while scanning a search area over the image,
wherein the face detection unit comprises:
a first feature amount extraction unit that extracts a face feature amount from the search area;
a hierarchical normal face discriminator that discriminates whether the search area is a face or a non-face, using the feature amount extracted from the search area and the normal face feature amount; and
a specific individual face determination unit that, when the search area is discriminated as the non-face at any layer of the normal face discriminator, determines whether the search area is the face of the specific individual or the non-face, using the feature amount extracted from the search area and the face feature amount of the specific individual.

2. The image processing device according to claim 1, wherein the specific individual face determination unit:
calculates an index indicating a correlation between the feature amount used in the one layer of the normal face discriminator at which the search area was discriminated as the non-face and the face feature amount of the specific individual corresponding to the one layer; and
determines, based on the calculated index, whether the search area is the face of the specific individual or the non-face.

3. The image processing device according to claim 2, wherein the specific individual face determination unit:
determines that the search area is the face of the specific individual when the index is greater than a predetermined threshold; and
determines that the search area is the non-face when the index is equal to or less than the predetermined threshold.

4. The image processing device according to any one of claims 1 to 3, further comprising:
a discrimination advancing unit that advances discrimination to the next layer of the normal face discriminator when the specific individual face determination unit determines that the search area is the face of the specific individual; and
a discrimination terminating unit that terminates discrimination by the normal face discriminator when the specific individual face determination unit determines that the search area is the non-face.

5. The image processing device according to any one of claims 1 to 4, wherein the face detection unit comprises:
a face area integration unit that integrates one or more candidates for the face area that have been discriminated as the face by the normal face discriminator;
a second feature amount extraction unit that extracts a face feature amount from the integrated face area; and
a specific individual determination unit that determines whether or not the face in the face area is the face of the specific individual, using the feature amount extracted from the integrated face area and the face feature amount of the specific individual.

6. A monitoring device comprising:
the image processing device according to any one of claims 1 to 5;
an imaging unit that captures the image to be input to the image processing device; and
an output unit that outputs information based on image processing by the image processing device.

7. A control system comprising:
the monitoring device according to claim 6; and
one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.

8. The control system according to claim 7, wherein
the monitoring device is a device for monitoring a driver of a vehicle, and
the control device includes an electronic control unit mounted on the vehicle.

9. An image processing method for processing an image input from an imaging unit, comprising:
a face detection step of detecting a face area while scanning a search area over the image,
wherein the face detection step includes:
a feature amount extraction step of extracting a face feature amount from the search area;
a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face, using the feature amount extracted in the feature amount extraction step and a learned normal face feature amount that has been trained to detect a face; and
a specific individual face determination step of determining, when the search area is discriminated as the non-face at any layer of the normal face discrimination step, whether the search area is the face of a specific individual or the non-face, using the extracted feature amount and a learned face feature amount of the specific individual that has been trained to detect the face of the specific individual.

10. A program for causing at least one computer to execute processing of an image input from an imaging unit, the program causing the at least one computer to execute:
a face detection step of detecting a face area while scanning a search area over the image,
wherein the face detection step includes:
a feature amount extraction step of extracting a face feature amount from the search area;
a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face, using the feature amount extracted in the feature amount extraction step and a learned normal face feature amount that has been trained to detect a face; and
a specific individual face determination step of determining, when the search area is discriminated as the non-face at any layer of the normal face discrimination step, whether the search area is the face of a specific individual or the non-face, using the extracted feature amount and a learned face feature amount of the specific individual that has been trained to detect the face of the specific individual.