WO2021024905A1 - Image processing device, monitoring device, control system, image processing method, computer program, and recording medium - Google Patents

Image processing device, monitoring device, control system, image processing method, computer program, and recording medium

Info

Publication number
WO2021024905A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
specific individual
image processing
unit
feature amount
Prior art date
Application number
PCT/JP2020/029232
Other languages
French (fr)
Japanese (ja)
Inventor
相澤 知禎
Original Assignee
Omron Corporation (オムロン株式会社)
Priority date
Filing date
Publication date
Application filed by Omron Corporation (オムロン株式会社)
Publication of WO2021024905A1

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60KARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
    • B60K35/00Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras

Definitions

  • The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium.
  • Patent Document 1 discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of the target (person) and provide that service.
  • The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU, and the CPU includes a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, a movement vector detection unit, and the like.
  • When the service is provided to a group of people who are found to have a relationship such as communicating with each other, the robot device decides on a first information providing method based on close interaction.
  • When the service is provided to a group of people whose relationship, such as communication with each other, is unknown, the robot device decides on a second information providing method in which information is provided unilaterally without interaction. With these, it is possible to provide an appropriate service according to the situation of the service provision target.
  • The face detection unit is configured to detect a person's face using the first camera, and a known technique can be used for the face detection.
  • However, for a specific individual in whom a part of the facial organs such as the eyes, nose, or mouth is missing or significantly deformed due to an injury, or in whom a large mole, swelling, or body decoration such as a tattoo appears on the face, the accuracy of face detection and face orientation estimation with such a known technique may decrease.
  • The present invention has been made in view of the above problems, and has an object of providing an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium capable of improving the accuracy of face orientation estimation for such a specific individual.
  • The image processing device (1) is an image processing device that processes an image input from an imaging unit, and includes:
  • a facial feature amount storage unit that stores the facial feature amount of a specific individual and a normal facial feature amount;
  • a face detection unit that detects a face region while extracting a feature amount for detecting a face from the image;
  • a specific individual determination unit that determines whether or not the face in the face region is the face of the specific individual;
  • a first face image processing unit that performs face image processing for the specific individual when the specific individual determination unit determines that the face is the face of the specific individual; and
  • a second face image processing unit that performs normal face image processing when the specific individual determination unit determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing unit includes a face orientation estimation unit for the specific individual.
  • In the facial feature amount storage unit, the facial feature amount of the specific individual and the normal facial feature amount (the facial feature amount used when the person is a person other than the specific individual) are stored as learned facial feature amounts, and the specific individual determination unit uses the feature amount of the face region detected by the face detection unit and the facial feature amount of the specific individual to determine whether or not the face in the face region is the face of the specific individual.
  • By using the facial feature amount of the specific individual, it is possible to accurately determine whether or not the face is the face of the specific individual. Further, when it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed accurately by the first face image processing unit.
  • On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the second face image processing unit can perform the normal face image processing with high accuracy.
  • Since the image processing includes a face orientation estimation process and the first face image processing unit includes a face orientation estimation unit for the specific individual, the face orientation estimation process for the specific individual, premised on face detection of the specific individual, can be performed stably and accurately in real time.
  • The image processing device (2) is the above image processing device (1), wherein
  • the face orientation estimation unit for the specific individual includes a specific-part-free face orientation estimation unit that estimates the face orientation by a face model fitting process that does not use a specific part of the specific individual.
  • According to the image processing device (2), by performing the face model fitting process that is not affected by the specific part, it is possible to accurately estimate the face orientation while maintaining real-time processing.
  • The image processing device (3) is the above image processing device (2), further including
  • a score calculation unit that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit that determines whether or not the face model fitting scores for all the parts other than the specific part satisfy a predetermined condition.
  • According to the image processing device (3), it is possible to accurately determine whether or not a highly accurate face orientation estimation process can be performed using only the parts excluding the specific part.
  • The image processing device (4) is the above image processing device (3), further including
  • a complementary processing unit that, when the fitting score determination unit determines that the face model fitting scores for all the parts other than the specific part satisfy the predetermined condition, complements the feature amount of the specific part in the tracking process.
  • According to the image processing device (4), the specific part, for example a left eye, a right eye, a nose, or a mouth, can thereafter be processed as a normal part.
  • The image processing device (5) is the above image processing device (4), further including a normal face orientation estimation unit that estimates the face orientation by a normal face model fitting process after the feature amount of the specific part has been complemented.
  • According to the image processing device (5), the face orientation can be estimated by the normal face model fitting process, and stable, high-speed, and highly accurate processing becomes possible.
  • The image processing device (6) is any one of the above image processing devices (1) to (5), further including an angle correction table that corrects a deviation of the estimated face orientation angle.
  • According to the image processing device (6), even if a certain angle deviation occurs in the estimated face orientation angle after the above processing is performed, the deviation of the face orientation angle can be corrected by using the angle correction table, which makes it easy to calculate the face orientation angle with high accuracy.
  • The image processing device (7) is any one of the above image processing devices (1) to (6), wherein
  • the specific individual determination unit calculates a correlation coefficient between the feature amount extracted from the face region and the facial feature amount of the specific individual, and determines, based on the calculated correlation coefficient, whether or not the face in the face region is the face of the specific individual.
  • According to the image processing device (7), the correlation coefficient between the feature amount extracted from the face region and the facial feature amount of the specific individual is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient. This makes it possible to efficiently determine, based on the correlation coefficient, whether or not the face in the face region is the face of the specific individual.
  • The image processing device (8) is the above image processing device (7), wherein
  • the specific individual determination unit determines that the face in the face region is the face of the specific individual when the correlation coefficient is larger than a predetermined threshold value, and determines that the face in the face region is not the face of the specific individual when the correlation coefficient is equal to or less than the predetermined threshold value.
  • According to the image processing device (8), when the correlation coefficient is larger than the predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual.
  • The processing efficiency of the determination can be further improved by the process of comparing the correlation coefficient with the predetermined threshold value.
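  • As a minimal, non-authoritative sketch of the determination described for the image processing devices (7) and (8), the following Python fragment computes a Pearson correlation coefficient between two feature vectors and compares it with a threshold. The vector layout and the 0.8 threshold are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def is_specific_individual(region_features: np.ndarray,
                           specific_features: np.ndarray,
                           threshold: float = 0.8) -> bool:
    """Return True when the correlation coefficient exceeds the threshold.

    region_features   : feature amounts extracted from the detected face region
    specific_features : learned facial feature amounts of the specific individual
    The 0.8 threshold is an arbitrary illustrative value.
    """
    r = np.corrcoef(region_features, specific_features)[0, 1]
    return r > threshold

# Toy usage with random vectors
rng = np.random.default_rng(0)
x = rng.random(128)
print(is_specific_individual(x, x * 0.9 + 0.05))   # highly correlated -> True
print(is_specific_individual(x, rng.random(128)))  # uncorrelated -> likely False
```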
  • The image processing device (9) is any one of the above image processing devices (1) to (8), wherein the face image processing includes at least one of a face detection process, a line-of-sight direction estimation process, and an eye open/close detection process.
  • According to the image processing device (9), since the face image processing includes at least one of the face detection process, the line-of-sight direction estimation process, and the eye open/close detection process, processes for estimating and detecting various facial behaviors of the specific individual or of people other than the specific individual can be performed accurately.
  • The monitoring device (1) includes any one of the above image processing devices (1) to (9),
  • an imaging unit that captures the image to be input to the image processing device, and an output unit that outputs information based on the image processing by the image processing device.
  • According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be monitored accurately, and information based on the image processing can be output from the output unit, so that a monitoring system or the like that uses the information can be constructed easily.
  • The control system (1) includes the above monitoring device (1), and one or more control devices that are communicably connected to the monitoring device and execute a predetermined process based on the information output from the monitoring device.
  • According to the control system (1), the one or more control devices can execute a predetermined process based on the information output from the monitoring device, so that it is possible to construct a system that can utilize not only the monitoring result of a normal person but also the monitoring result of the specific individual.
  • The control system (2) is the above control system (1), wherein
  • the monitoring device is a device for monitoring the driver of a vehicle, and
  • the control device includes an electronic control unit mounted on the vehicle.
  • According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be monitored accurately, and the electronic control unit can be made to appropriately execute a predetermined control based on the monitoring result. This makes it possible to construct a highly safe in-vehicle system in which even the specific individual can drive with peace of mind.
  • The image processing method (1) is an image processing method for processing an image input from an imaging unit, and includes:
  • a face detection step of detecting a face region while extracting a feature amount for detecting a face from the image;
  • a specific individual determination step of determining whether or not the face in the face region is the face of a specific individual, by using the feature amount of the face region detected in the face detection step and a learned facial feature amount of the specific individual that has been trained for detecting the face of the specific individual;
  • a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and
  • a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
  • According to the image processing method (1), in the specific individual determination step, whether or not the face in the face region is the face of the specific individual is determined by using the feature amount of the face region detected in the face detection step and the facial feature amount of the specific individual. By using the facial feature amount of the specific individual, it is possible to accurately determine whether or not the face is the face of the specific individual. Further, when it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed accurately in the first face image processing step. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be performed with high accuracy in the second face image processing step.
  • Accordingly, accurate face sensing can be performed both for the specific individual and for ordinary people other than the specific individual. Since the image processing includes a face orientation estimation process and the first face image processing step includes a face orientation estimation step for the specific individual, the face orientation estimation process for the specific individual can be performed stably, accurately, and in real time.
  • The computer program (1) is a computer program for causing at least one or more computers to process an image input from an imaging unit, and causes the at least one computer to execute:
  • a face detection step of detecting a face region while extracting a feature amount for detecting a face from the image; a specific individual determination step of determining whether or not the face in the face region is the face of a specific individual, by using the feature amount of the detected face region and a learned facial feature amount of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
  • According to the computer program (1), the at least one computer can determine whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region and the facial feature amount of the specific individual, and this determination can be made accurately. Further, when it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed with high accuracy. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be performed with high accuracy.
  • The computer program may be a computer program stored in a storage medium, or a computer program that can be transferred or executed via a communication network.
  • The computer-readable storage medium (1) is a computer-readable storage medium storing a program for causing at least one or more computers to process an image input from an imaging unit, the program causing the at least one computer to execute:
  • a face detection step of detecting a face region while extracting a feature amount for detecting a face from the image;
  • a specific individual determination step of determining whether or not the face in the face region is the face of a specific individual, by using the feature amount of the face region detected in the face detection step and a learned facial feature amount of the specific individual that has been trained for detecting the face of the specific individual;
  • a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
  • According to the computer-readable storage medium (1), by having the at least one computer read the program and execute each of the above steps, the same effect as that of the computer program (1) can be obtained.
  • The image processing device according to the present disclosure can be widely applied to, for example, devices and systems that monitor an object such as a person using a camera.
  • In addition to devices and systems for monitoring the drivers (operators) of various moving objects such as vehicles, the image processing device can also be applied to devices and systems that monitor people who operate or monitor various kinds of equipment such as machines and devices in a factory, or who perform predetermined work.
  • FIG. 1 is a schematic view showing an example in which the image processing device according to the present disclosure is applied to an in-vehicle system including a driver monitoring device.
  • The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state (for example, facial behavior) of the driver 3 of the vehicle 2, one or more ECUs (Electronic Control Units) 40 that control the running, steering, or braking of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state around the vehicle, and the like, and these are connected via a communication bus 43.
  • the in-vehicle system 1 is configured as, for example, an in-vehicle network system that communicates according to a CAN (Controller Area Network) protocol.
  • Communication standards other than CAN may also be adopted as the communication standard of the in-vehicle system 1.
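  • For readers unfamiliar with CAN, the sketch below shows how a monitoring result could be placed on a CAN bus using the python-can library. The arbitration ID, payload encoding, and channel name are purely illustrative assumptions; the patent does not specify any message format.

```python
import can  # python-can library

def send_driver_state(bus: can.BusABC, state_code: int) -> None:
    """Send a one-byte driver-state code (e.g. 0 = forward gaze, 1 = inattentive)
    on the in-vehicle CAN bus. The ID 0x321 and the encoding are made up for
    illustration only."""
    msg = can.Message(arbitration_id=0x321, data=[state_code & 0xFF], is_extended_id=False)
    bus.send(msg)

if __name__ == "__main__":
    # 'socketcan'/'vcan0' assumes a Linux virtual CAN interface set up beforehand.
    with can.Bus(interface="socketcan", channel="vcan0") as bus:
        send_driver_state(bus, 1)
```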
  • the driver monitoring device 10 is an example of the "monitoring device” according to the present invention
  • the in-vehicle system 1 is an example of the "control system” according to the present invention.
  • The driver monitoring device 10 is configured to include a camera 11 for photographing the face of the driver 3, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 that performs processing such as outputting information based on the image processing by the image processing unit 12 to a predetermined ECU 40 via the communication bus 43.
  • the image processing unit 12 is an example of the "image processing apparatus” according to the present invention
  • the camera 11 is an example of the "imaging unit” according to the present invention.
  • the driver monitoring device 10 detects the face of the driver 3 from the image taken by the camera 11, and detects the behavior of the face such as the direction of the face of the detected driver 3, the direction of the line of sight, or the open / closed state of the eyes.
  • the driver monitoring device 10 can determine the state of the driver 3, such as forward gaze, inattentiveness, dozing, backward facing, and prone, based on the detection results of these facial behaviors.
  • The driver monitoring device 10 outputs a signal based on the state determination of the driver 3 to the ECU 40, and the ECU 40 is configured to, based on the signal, for example, alert or warn the driver 3, notify the outside, or execute operation control of the vehicle 2 (for example, deceleration control or guidance control to the road shoulder).
  • One of the purposes of the driver monitoring device 10 is, for example, to stably and accurately estimate the face orientation of a specific individual in real time.
  • When the driver 3 of the vehicle 2 is a specific individual in whom a part of the facial organs such as the eyes, nose, or mouth is missing or greatly deformed due to, for example, an injury, or in whom a large mole or wart appears on the face, or in whom the arrangement of the facial organs deviates significantly from the average position due to body decoration such as a tattoo or a hereditary disease, the accuracy of estimating the face orientation from the image taken by the camera may decrease.
  • the driver monitoring device 10 adopts the following configuration in order to improve the accuracy of face orientation estimation for the specific individual.
  • The image processing unit 12 stores, as learned facial feature amounts that have been trained for detecting a face from an image, the facial feature amount of a specific individual and a normal facial feature amount (the facial feature amount used when the person is a person other than the specific individual).
  • The image processing unit 12 performs a face detection process of detecting the face region while extracting the feature amount for detecting a face from the input image of the camera 11. Then, the image processing unit 12 performs a specific individual determination process of determining whether or not the face in the face region is the face of the specific individual by using the detected feature amount of the face region and the facial feature amount of the specific individual.
  • In the specific individual determination process, an index showing the relationship between the feature amount extracted from the face region and the facial feature amount of the specific individual, for example a correlation coefficient, is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient. For example, when the correlation coefficient is larger than a predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual. An index other than the correlation coefficient may also be adopted in the specific individual determination process.
  • In the specific individual determination process, whether or not the face in the face region is the face of the specific individual may be determined based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11.
  • Since the image processing unit 12 stores the learned facial feature amount of the specific individual in advance, it can accurately determine whether or not the face is the face of the specific individual by using the facial feature amount of the specific individual.
  • When it is determined that the face is the face of the specific individual, the image processing unit 12 executes the face image processing for the specific individual, so that the face image processing for the specific individual can be performed accurately.
  • When it is determined that the face is not the face of the specific individual, the image processing unit 12 executes the normal face image processing, so that the normal face image processing can be performed accurately. Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, accurate face sensing can be performed.
  • FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.
  • The in-vehicle system 1 includes a driver monitoring device 10 for monitoring the state of the driver 3 of the vehicle 2, one or more ECUs 40, and one or more sensors 41, which are connected via a communication bus 43. Further, one or more actuators 42 are connected to the ECUs 40.
  • The driver monitoring device 10 includes a camera 11, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 for exchanging data and signals with the external ECUs 40 and the like.
  • the camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat.
  • The image pickup device unit includes, for example, an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor), a filter, a microlens, and the like.
  • the image pickup device unit may be an element capable of forming a photographed image by receiving light in a visible region, or may be an element capable of forming a photographed image by receiving light in a near infrared region.
  • the light irradiation unit is configured to include a light emitting element such as an LED (Light Emitting Diode), and may include a near infrared LED or the like so that the driver's face can be imaged day or night.
  • the camera 11 shoots an image at a predetermined frame rate (for example, several tens of frames per second), and the data of the shot image is input to the image processing unit 12.
  • The camera 11 may be integrated with the driver monitoring device 10, or may be an external type.
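  • Purely as an illustration of grabbing frames at a fixed rate and handing them to an image-processing stage, here is a short OpenCV sketch; camera index 0 and the 30 fps setting are assumptions (the patent only says "several tens of frames per second"), and an actual driver monitor would typically use a near-infrared camera.

```python
import cv2

def capture_frames(process, camera_index: int = 0) -> None:
    """Read frames from a camera and pass each one to `process(frame)`.
    `process` stands in for the image processing unit 12."""
    cap = cv2.VideoCapture(camera_index)
    cap.set(cv2.CAP_PROP_FPS, 30)  # assumed frame rate
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            process(frame)
    finally:
        cap.release()

capture_frames(lambda frame: print("frame shape:", frame.shape))
```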
  • the image processing unit 12 is configured as an image processing device including one or more CPUs (Central Processing Units) 13, ROMs (Read Only Memory) 14, and RAMs (Random Access Memory) 15.
  • the ROM 14 includes a program storage unit 141 and a facial feature amount storage unit 142
  • the RAM 15 includes an image memory 151 for storing an input image from the camera 11.
  • the driver monitoring device 10 may be provided with another storage unit, and the storage unit may be used as the program storage unit 141, the facial feature amount storage unit 142, and the image memory 151.
  • the other storage unit may be a semiconductor memory or a storage medium that can be read by a disk drive or the like.
  • The CPU 13 is an example of a hardware processor. It reads, interprets, and executes data such as the computer program stored in the program storage unit 141 of the ROM 14 and the facial feature amounts stored in the facial feature amount storage unit 142, and processes the image input from the camera 11, for example by face image processing such as face detection processing and face orientation estimation processing. Further, the CPU 13 outputs the results obtained by the face image processing (for example, processed data, determination signals, and control signals) to the ECU 40 and the like via the communication unit 16.
  • The facial feature amount storage unit 142 stores, as learned facial feature amounts that have been trained for detecting a face from an image, the facial feature amount 142a of a specific individual and the normal facial feature amount 142b shown in FIG. 3.
  • As the learned facial feature amounts, various feature amounts effective for detecting a face from an image can be used. For example, a feature amount focusing on the difference in brightness in a local region of the face (the difference in average brightness between two rectangular areas of various sizes), that is, a Haar-like feature amount, may be used.
  • Alternatively, a feature amount focusing on a combination of brightness distributions in a local region of the face (an LBP (Local Binary Pattern) feature amount), or a feature amount focusing on a combination of brightness distributions in the gradient direction in a local region of the face (an HOG (Histogram of Oriented Gradients) feature amount), may be used.
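  • As a hedged illustration of two of the feature types mentioned above, the following sketch computes LBP and HOG descriptors for a face crop with scikit-image. It is not the patent's feature extractor, only a standard library equivalent; the crop size and histogram binning are assumptions.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern

def describe_face(gray_face: np.ndarray) -> np.ndarray:
    """Concatenate a HOG descriptor and a uniform-LBP histogram for a grayscale face crop."""
    hog_vec = hog(gray_face, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2), feature_vector=True)
    lbp = local_binary_pattern(gray_face, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=np.arange(0, 11), density=True)
    return np.concatenate([hog_vec, lbp_hist])

face = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in for a detected face region
print(describe_face(face).shape)
```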
  • Machine learning is a process of finding patterns inherent in data (learning data) by a computer.
  • AdaBoost may be used as an example of a statistical learning method.
  • AdaBoost is a learning algorithm that prepares a large number of discriminators with low discriminating ability (weak discriminators), selects weak discriminators with small error rates from among them, adjusts parameters such as weights, and combines them in a hierarchical structure to construct a strong discriminator.
  • The discriminator is also referred to as a classifier or a learner.
  • One feature amount effective for face detection is discriminated by one weak discriminator, a large number of weak discriminators and their combinations are selected by AdaBoost, and a strong discriminator having a hierarchical structure is constructed from them. Note that one weak discriminator outputs, for example, information such as 1 for a face and 0 for a non-face.
  • A learning method called Real AdaBoost, which can output a real number from 0 to 1 instead of 0 or 1, may also be adopted.
  • a neural network having an input layer, an intermediate layer, and an output layer may be adopted.
  • A large number of face images taken under various conditions and a large number of images that are not faces (non-face images) are given as training data to a learning device equipped with such a learning algorithm, and by repeating the learning and adjusting the weights, a strong discriminator having a hierarchical structure capable of detecting a face with high accuracy is constructed.
  • One or more feature amounts used in the weak discriminators of each layer constituting the strong discriminator are used as the learned facial feature amounts.
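  • The patent's hierarchical detector is not reproduced here, but the general idea of boosting weak discriminators (decision stumps) into a strong classifier can be sketched with scikit-learn as follows; the feature vectors and labels are random placeholders for the face / non-face training data described above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Placeholder training data: rows are feature vectors (e.g. Haar-like responses),
# labels are 1 for "face" and 0 for "non-face", as in the text above.
rng = np.random.default_rng(0)
X = rng.random((1000, 50))
y = (X[:, 0] + 0.2 * rng.random(1000) > 0.5).astype(int)

# By default AdaBoostClassifier boosts depth-1 decision trees (decision stumps),
# i.e. weak discriminators that each threshold a single feature.
strong = AdaBoostClassifier(n_estimators=200, learning_rate=0.5, random_state=0)
strong.fit(X, y)
print("training accuracy:", strong.score(X, y))
```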
  • The facial feature amount 142a of a specific individual is a set of parameters indicating the facial features of the specific individual; it is obtained by individually capturing face images of the specific individual at a predetermined place under various conditions (such as various face orientations, line-of-sight directions, and eye open/closed states), inputting a large number of these captured images to the learning device as teacher data, and adjusting the parameters through the learning process.
  • the facial feature amount 142a of the specific individual may be, for example, a combination pattern of the difference in brightness of the local region of the face obtained by the learning process.
  • The facial feature amount 142a of a specific individual stored in the facial feature amount storage unit 142 may be the facial feature amount of only one specific individual, or, when a plurality of specific individuals drive the vehicle 2, may be the facial feature amounts of the plurality of specific individuals.
  • The normal facial feature amount 142b is a set of parameters indicating normal human facial features; it is obtained by inputting, as teacher data, images of normal human faces captured under various conditions (such as various face orientations, line-of-sight directions, and eye open/closed states) to the above learning device and adjusting the parameters through the learning process.
  • the normal facial feature amount 142b may be, for example, a combination pattern of the difference in brightness of the local region of the face obtained by the learning process.
  • The learned facial feature amounts may be acquired from a server on the cloud via a communication network such as the Internet or a mobile phone network and stored in the facial feature amount storage unit 142.
  • the ECU 40 is composed of a computer device including one or more processors, a memory, a communication module, and the like. Then, the processor mounted on the ECU 40 reads, interprets, and executes the program stored in the memory, so that predetermined control for the actuator 42 and the like is executed.
  • the ECU 40 is configured to include, for example, at least one of a traveling system ECU, a driving support system ECU, a body system ECU, and an information system ECU.
  • the traveling system ECU includes, for example, a drive system ECU, a chassis system ECU, and the like.
  • the drive system ECU includes a control unit related to a "running" function such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, or transmission control.
  • the chassis-based ECU includes a control unit related to a "stop, turn” function such as brake control or steering control.
  • The driving support system ECU includes at least one control unit for functions that improve safety or realize comfortable driving in cooperation with the traveling-system ECUs (driving support functions or automatic driving functions), such as an automatic braking support function, a lane keeping support function (LKA / Lane Keep Assist), a constant speed driving / inter-vehicle distance support function (ACC / Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.
  • The driving support system ECU may be equipped with at least one of the functions of Level 1 (driver assistance), Level 2 (partially automated driving), and Level 3 (conditionally automated driving) of the driving automation levels defined by SAE (Society of Automotive Engineers). It may further be equipped with the functions of Level 4 (highly automated driving) or Level 5 (fully automated driving), or may be equipped with only the functions of Levels 1 and 2, or only Levels 2 and 3. Further, the in-vehicle system 1 may be configured as an automatic driving system.
  • the body system ECU may be configured to include at least one control unit related to the function of the vehicle body such as a door lock, a smart key, a power window, an air conditioner, a light, an instrument panel, or a blinker.
  • the information system ECU may be configured to include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems) related device.
  • the infotainment device includes, for example, an HMI (Human Machine Interface) device that functions as a user interface, a car navigation device, an audio device, and the like.
  • the telematics device includes a communication unit for communicating with the outside.
  • The ITS-related device may include an ETC (Electronic Toll Collection System) device, or a communication unit for performing road-to-vehicle communication with roadside units such as ITS spots, or vehicle-to-vehicle communication.
  • The sensors 41 may include various in-vehicle sensors that acquire the sensing data necessary for the ECU 40 to control the operation of the actuators 42.
  • For example, they may include vehicle speed sensors, shift position sensors, accelerator opening sensors, brake pedal sensors, steering sensors, and the like, as well as surroundings monitoring sensors such as cameras for outside the vehicle, radars such as millimeter-wave radar, lidars, and ultrasonic sensors.
  • The actuator 42 is a device that executes operations related to traveling, steering, braking, and the like of the vehicle 2 based on control signals from the ECU 40, and includes, for example, an engine, a motor, a transmission, and hydraulic or electric cylinders.
  • FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • the image processing unit 12 includes an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and a face feature amount storage unit 142. It is configured to include.
  • the image input unit 21 performs a process of capturing an image including the face of the driver 3 taken by the camera 11.
  • The face detection unit 22 includes a face detection unit 23 for a specific individual and a normal face detection unit 24, and performs a process of detecting a face region while extracting a feature amount for detecting a face from the input image.
  • the face detection unit 23 of the specific individual uses the face feature amount 142a of the specific individual read from the face feature amount storage unit 142 to perform a process of detecting the face region from the input image.
  • the normal face detection unit 24 uses the normal face feature amount 142b read from the face feature amount storage unit 142 to perform a process of detecting a face region from an input image.
  • the method of detecting the face area from the image is not particularly limited, but a method of detecting the face area at high speed and with high accuracy is adopted.
  • the face detection unit 22 extracts features for detecting a face in each search area while scanning a predetermined search area (search window) on the input image, for example.
  • the face detection unit 22 extracts, for example, the difference in brightness (luminance difference) of a local region of the face, the edge strength, or the relationship between these local regions as a feature amount.
  • The face detection unit 22 uses the feature amount extracted from each search area and the normal facial feature amount 142b or the facial feature amount 142a of a specific individual read from the facial feature amount storage unit 142 to determine, with a detector having a hierarchical structure (a hierarchy that first captures the face roughly and then captures the details of the face), whether the search area is a face or a non-face, and thereby detects the face region from the image.
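  • A heavily simplified sliding-window loop is shown below to make the search-window idea concrete; the classifier argument is a placeholder for the hierarchical (coarse-to-fine) detector, and the window size, stride, and toy decision rule are arbitrary assumptions.

```python
import numpy as np

def scan_for_faces(image: np.ndarray, classify_window, win: int = 64, stride: int = 16):
    """Slide a win x win search window over a grayscale image and collect
    the windows that the classifier accepts as a face."""
    detections = []
    h, w = image.shape
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            patch = image[y:y + win, x:x + win]
            if classify_window(patch):  # stand-in for the hierarchical detector
                detections.append((x, y, win, win))
    return detections

img = (np.random.rand(240, 320) * 255).astype(np.uint8)
# Toy classifier: "face" if the window is brighter than average (illustration only).
boxes = scan_for_faces(img, lambda p: p.mean() > img.mean())
print(len(boxes), "candidate windows")
```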
  • The specific individual determination unit 25 uses the feature amount of the face region detected by the face detection unit 22 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142 to determine whether or not the face in the face region is the face of the specific individual.
  • The specific individual determination unit 25 calculates an index showing the relationship between the feature amount extracted from the face region and the facial feature amount 142a of the specific individual, for example a correlation coefficient, and determines, based on the calculated correlation coefficient, whether or not the face in the face region is the face of the specific individual. For example, the correlation of feature amounts such as Haar-like features of one or more local regions in the face region can be obtained. Then, for example, when the correlation coefficient is larger than a predetermined threshold value, the face in the detected face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the detected face region is determined not to be the face of the specific individual.
  • The specific individual determination unit 25 may determine whether or not the face in the detected face region is the face of the specific individual based on the result of the determination for one frame of the input image from the camera 11, or based on the results of the determination for a plurality of frames of the input image from the camera 11.
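  • The single-frame versus multi-frame decision described above might look like the following sketch, where per-frame determination results are accumulated and a vote over the last N frames is taken. The window length and voting rule are assumptions, since the patent leaves the multi-frame criterion open.

```python
from collections import deque

class SpecificIndividualVoter:
    """Accumulate per-frame determinations and decide over the last `window` frames."""
    def __init__(self, window: int = 10, required: int = 6):
        self.history = deque(maxlen=window)
        self.required = required  # number of positive frames needed (assumed value)

    def update(self, frame_is_specific: bool) -> bool:
        self.history.append(frame_is_specific)
        return sum(self.history) >= self.required

voter = SpecificIndividualVoter()
decision = False
for result in [True, True, False, True, True, True, True]:
    decision = voter.update(result)
print("specific individual over recent frames:", decision)
```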
  • When the specific individual determination unit 25 determines that the face is the face of a specific individual, the first face image processing unit 26 performs face image processing for the specific individual using the facial feature amount 142a of the specific individual.
  • The illustrated first face image processing unit 26 includes a face orientation estimation unit 27 for the specific individual, an eye open/close detection unit 28 for the specific individual, and a line-of-sight direction estimation unit 29 for the specific individual. Other facial behavior estimation units and detection units may also be included.
  • the face orientation estimation unit 27 of the specific individual performs a process of estimating the face orientation of the specific individual.
  • The face orientation estimation unit 27 for the specific individual detects, for example, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the face detection unit 23 for the specific individual using the facial feature amount 142a of the specific individual, and estimates the orientation of the face based on the detected positions and shapes of the facial organs.
  • the method for detecting the facial organs from the facial region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy.
  • a method of creating a 3D face shape model, fitting it to a face region on a two-dimensional image, and detecting the position and shape of each organ of the face can be adopted.
  • a technique for fitting a 3D face shape model to a human face in an image for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be adopted, but the technique is not limited thereto.
  • The face orientation estimation unit 27 for the specific individual may output, as the estimation data of the face orientation of the specific individual, for example, the pitch angle around the left-right axis, the yaw angle around the up-down axis, and the roll angle around the front-rear axis, which are included in the parameters of the 3D face shape model.
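  • One standard way to obtain pitch, yaw, and roll from detected facial landmarks is to fit a small 3D point model with OpenCV's solvePnP and decompose the resulting rotation. This is a generic head-pose recipe offered only for orientation; it is not the fitting algorithm of JP 2007-249280 cited above, and the 3D model coordinates and camera intrinsics below are rough assumptions.

```python
import numpy as np
import cv2

# Approximate 3D positions (in millimetres) of a few landmarks in a generic face model.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),        # nose tip
    (0.0, -63.6, -12.5),    # chin
    (-43.3, 32.7, -26.0),   # left eye outer corner
    (43.3, 32.7, -26.0),    # right eye outer corner
    (-28.9, -28.9, -24.1),  # left mouth corner
    (28.9, -28.9, -24.1),   # right mouth corner
], dtype=np.float64)

def head_pose(image_points: np.ndarray, frame_size=(480, 640)):
    """Estimate (pitch, yaw, roll) in degrees from six 2D landmark positions."""
    h, w = frame_size
    f = w  # crude focal-length guess: focal length ~ image width
    camera_matrix = np.array([[f, 0, w / 2], [0, f, h / 2], [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, image_points, camera_matrix, dist_coeffs)
    rot, _ = cv2.Rodrigues(rvec)
    # Decompose the rotation matrix into Euler angles (x = pitch, y = yaw, z = roll).
    sy = np.sqrt(rot[0, 0] ** 2 + rot[1, 0] ** 2)
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return pitch, yaw, roll

# Toy 2D landmark positions (pixels), in the same order as MODEL_POINTS.
landmarks_2d = np.array([(320, 240), (318, 350), (250, 200), (390, 200),
                         (280, 300), (360, 300)], dtype=np.float64)
print(head_pose(landmarks_2d))
```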
  • the eye opening / closing detection unit 28 of the specific individual performs a process of detecting the opening / closing state of the eyes of the specific individual.
  • The eye open/close detection unit 28 for the specific individual detects the open/closed state of the eyes, for example whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained by the face orientation estimation unit 27 for the specific individual, in particular the positions and shapes of the feature points of the eyes (eyelids and pupils).
  • The open/closed state of the eyes may also be detected by learning in advance, with a learner, the feature amounts of eye images in various open/closed states (the position of the eyelid, the shape of the pupil, the area sizes of the white and dark parts of the eye, and the like) and evaluating the degree of similarity with the learned feature amount data.
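  • A widely used (though not patent-specific) way to turn eyelid landmark positions into an open/closed decision is the eye aspect ratio (EAR); the sketch below assumes a common six-point eye landmark convention and an arbitrary 0.2 threshold.

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """`eye` is a (6, 2) array of landmarks ordered: outer corner, two upper-lid
    points, inner corner, two lower-lid points (a common 6-point convention)."""
    v1 = np.linalg.norm(eye[1] - eye[5])   # vertical lid-to-lid distances
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal eye width
    return (v1 + v2) / (2.0 * h)

def eye_is_closed(eye: np.ndarray, threshold: float = 0.2) -> bool:
    return eye_aspect_ratio(eye) < threshold  # threshold is an assumed value

open_eye = np.array([(0, 0), (2, 3), (4, 3), (6, 0), (4, -3), (2, -3)], dtype=float)
print(eye_aspect_ratio(open_eye), eye_is_closed(open_eye))
```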
  • the line-of-sight direction estimation unit 29 of the specific individual performs a process of estimating the line-of-sight direction of the specific individual.
  • The line-of-sight direction estimation unit 29 for the specific individual estimates the direction of the line of sight based on, for example, the orientation of the face of the driver 3 and the positions and shapes of the facial organs of the driver 3, in particular the positions and shapes of the feature points of the eyes (outer corners of the eyes, inner corners of the eyes, and pupils).
  • The direction of the line of sight is the direction in which the driver 3 is looking, and is determined, for example, by a combination of the orientation of the face and the orientation of the eyes.
  • The direction of the line of sight may also be estimated by learning in advance, with a learner, the feature amounts of eye images for various combinations of face orientation and eye orientation (the relative positions of the outer corner of the eye, the inner corner of the eye, and the pupil, the relative positions of the white and dark parts of the eye, shading, texture, and the like) and evaluating the degree of similarity with the learned feature amount data.
  • Alternatively, the line-of-sight direction estimation unit 29 for the specific individual may estimate the size and center position of the eyeball from the size and orientation of the face and the positions of the eyes by using the fitting result of the 3D face shape model, detect the center position of the pupil, and estimate the vector connecting the center of the eyeball and the center of the pupil as the line-of-sight direction.
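  • The eyeball-centre approach mentioned above can be expressed as the following toy computation; all coordinates are assumed values, and the real estimation works from the fitted 3D face model rather than hand-supplied points.

```python
import numpy as np

def gaze_direction(eyeball_center: np.ndarray, pupil_center: np.ndarray) -> np.ndarray:
    """Return the unit vector from the estimated eyeball centre to the detected
    pupil centre, interpreted as the line-of-sight direction."""
    v = pupil_center - eyeball_center
    return v / np.linalg.norm(v)

# Toy 3D points (camera coordinates, arbitrary units)
eyeball = np.array([0.0, 0.0, 60.0])
pupil = np.array([2.0, -1.0, 48.5])
print(gaze_direction(eyeball, pupil))
```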
  • the second face image processing unit 30 performs normal face image processing using the normal face feature amount 142b.
  • the second face image processing unit 30 may include a normal face orientation estimation unit 31, a normal eye opening / closing detection unit 32, and a normal line-of-sight direction estimation unit 33.
  • The processes performed by the normal face orientation estimation unit 31, the normal eye open/close detection unit 32, and the normal line-of-sight direction estimation unit 33 are basically the same as those of the face orientation estimation unit 27 for the specific individual, the eye open/close detection unit 28 for the specific individual, and the line-of-sight direction estimation unit 29 for the specific individual, except that the normal facial feature amount 142b is used, so their description is omitted here.
  • the output unit 34 performs a process of outputting information based on the image processing by the image processing unit 12 to the ECU 40 or the like.
  • The information based on the image processing may be, for example, information on facial behavior such as the face orientation of the driver 3, the line-of-sight direction, or the open/closed state of the eyes, or information on the state of the driver 3 determined based on the detection results of the facial behavior (for example, forward gaze, inattentiveness, dozing, facing backward, or prone). Further, the information based on the image processing may be a predetermined control signal based on the state determination of the driver 3 (a control signal for performing an alert or warning process, a control signal for performing operation control of the vehicle 2, or the like).
  • FIG. 4 is a flowchart showing an example of a processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment.
  • The camera 11 captures images at several tens of frames per second, and this processing is performed for every frame, or for every few frames at regular intervals.
  • step S1 the CPU 13 operates as an image input unit 21, and a process of reading an image (an image including the face of the driver 3) taken by the camera 11 is performed, and then the process proceeds to step S2.
  • step S2 the CPU 13 operates as a normal face detection unit 24, performs normal face detection processing on the input image, and then proceeds to step S3.
  • In step S2, for example, while a predetermined search area (search window) is scanned over the input image, a feature amount for detecting a face in each search area is extracted. Then, using the feature amount extracted from the search area and the normal facial feature amount 142b read from the facial feature amount storage unit 142, it is determined whether the area is a face or a non-face, and the face region is detected from the image.
  • step S3 the CPU 13 operates as the face detection unit 23 of the specific individual, performs face detection processing of the specific individual on the input image, and then proceeds to step S4.
  • In step S3, for example, while a predetermined search area (search window) is scanned over the input image, a feature amount for detecting a face in each search area is extracted. Then, using the feature amount extracted from the search area and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, it is determined whether the area is a face or a non-face, and the face region is detected from the image.
  • the processes of step S2 and step S3 may be performed in parallel in one step, or may be performed in combination.
  • step S4 the CPU 13 operates as the specific individual determination unit 25, and the feature amount of the face area detected in steps S2 and S3 and the face feature amount 142a of the specific individual read from the face feature amount storage unit 142. Is used to perform a process of determining whether or not the face in the face area is the face of a specific individual, and then the process proceeds to step S5.
  • step S5 it is determined whether or not the result of the determination process in step S4 is the face of a specific individual, and if it is determined that the face is the face of a specific individual, the process proceeds to step S6 thereafter.
  • In step S6, the CPU 13 operates as the face orientation estimation unit 27 for the specific individual; for example, using the facial feature amount 142a of the specific individual, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S3, the orientation of the face is estimated based on the detected positions and shapes of the facial organs, and then the process proceeds to step S7.
  • In step S7, the CPU 13 operates as the eye open/close detection unit 28 for the specific individual, detects the open/closed state of the eyes, for example whether the eyes are open or closed, based on, for example, the positions and shapes of the facial organs obtained in step S6, in particular the positions and shapes of the eye feature points (eyelids and pupils), and then the process proceeds to step S8.
  • In step S8, the CPU 13 operates as the line-of-sight direction estimation unit 29 for the specific individual, estimates the direction of the line of sight based on, for example, the face orientation and the positions and shapes of the facial organs obtained in step S6, in particular the positions and shapes of the feature points of the eyes (outer corners of the eyes, inner corners of the eyes, and pupils), and then the process ends.
  • step S5 if it is determined that the face is not a specific individual's face, in other words, it is a normal face, the process proceeds to step S9.
  • In step S9, the CPU 13 operates as the normal face orientation estimation unit 31; for example, using the normal facial feature amount 142b, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S2, the orientation of the face is estimated based on the detected positions and shapes of the facial organs, and then the process proceeds to step S10.
  • In step S10, the CPU 13 operates as the normal eye open/close detection unit 32, detects the open/closed state of the eyes, for example whether the eyes are open or closed, based on, for example, the positions and shapes of the facial organs obtained in step S9, in particular the positions and shapes of the eye feature points (eyelids and pupils), and then the process proceeds to step S11.
  • In step S11, the CPU 13 operates as the normal line-of-sight direction estimation unit 33, estimates the direction of the line of sight based on, for example, the face orientation and the positions and shapes of the facial organs obtained in step S9, in particular the positions and shapes of the feature points of the eyes (outer corners of the eyes, inner corners of the eyes, and pupils), and then the process ends.
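  • Collapsing the flow of FIG. 4 into a Python-style sketch gives the fragment below; every callable is a placeholder for the corresponding unit (steps S1 to S11), and none of the names correspond to actual patent code.

```python
def run_one_frame(frame, specific_features, normal_features,
                  detect, is_specific, specific_pipeline, normal_pipeline):
    """One pass of the FIG. 4 flow. The callables stand in for:
    detect            -> steps S2/S3 (normal and specific-individual face detection)
    is_specific       -> steps S4/S5 (specific individual determination)
    specific_pipeline -> steps S6-S8 (face orientation, eye open/close, gaze for the specific individual)
    normal_pipeline   -> steps S9-S11 (the same three estimations for a normal face)
    """
    region = detect(frame, specific_features, normal_features)      # S2, S3
    if region is None:
        return None
    if is_specific(region, specific_features):                      # S4, S5
        return specific_pipeline(frame, region, specific_features)  # S6-S8
    return normal_pipeline(frame, region, normal_features)          # S9-S11

# Toy run with trivial stand-ins
result = run_one_frame(
    frame="image", specific_features="A", normal_features="B",
    detect=lambda f, s, n: "face region",
    is_specific=lambda r, s: True,
    specific_pipeline=lambda f, r, s: {"pipeline": "specific", "outputs": "pitch/yaw/roll, eye state, gaze"},
    normal_pipeline=lambda f, r, n: {"pipeline": "normal"},
)
print(result)
```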
  • FIG. 5 is a flowchart showing an example of a specific individual determination processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment.
  • This processing operation is an example of the specific individual determination processing operation in step S4 shown in FIG. 4, and shows an example of the processing operation when determining with one input image (1 frame).
  • In step S21, the CPU 13 reads the feature amount extracted from the face region detected by the face detection processing in steps S2 and S3 shown in FIG. 4, and then the process proceeds to step S22.
  • step S22 the learned facial feature amount 142a of the specific individual is read from the face feature amount storage unit 142 (FIG. 3), and then the process proceeds to step S23.
  • In step S23, a process of calculating the correlation coefficient between the feature amount extracted from the face region read in step S21 and the facial feature amount 142a of the specific individual read in step S22 is performed, and then the process proceeds to step S24.
  • In step S24, it is determined whether or not the calculated correlation coefficient is larger than a predetermined threshold value for determining whether or not the face belongs to the specific individual. If the correlation coefficient is larger than the predetermined threshold value, in other words, if it is determined that the feature amount extracted from the face region has a high correlation (high similarity) with the face feature amount 142a of the specific individual, the process proceeds to step S25. In step S25, it is determined that the face detected in the face region is the face of the specific individual, and then the process ends.
  • If it is determined in step S24 that the correlation coefficient is equal to or less than the predetermined threshold value, in other words, that the correlation between the feature amount extracted from the face region and the face feature amount 142a of the specific individual is low (the similarity is low), the process proceeds to step S26. In step S26, it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, and then the process ends.
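A minimal sketch of the determination in steps S23 to S26, assuming that the extracted feature amount and the learned face feature amount 142a can each be flattened into numeric vectors of equal length. The use of numpy's corrcoef and the threshold value are illustrative assumptions, not details specified by the embodiment.

```python
import numpy as np

def is_specific_individual(face_features, specific_features, threshold=0.8):
    """Steps S23-S26: correlate the extracted feature amount with the learned
    face feature amount 142a of the specific individual and compare with a threshold."""
    r = np.corrcoef(face_features, specific_features)[0, 1]   # step S23
    return r > threshold                                      # step S24

# Hypothetical 1-D feature vectors
extracted = np.array([0.12, 0.80, 0.33, 0.57])
learned   = np.array([0.10, 0.78, 0.35, 0.60])
print(is_specific_individual(extracted, learned))   # -> True (high correlation)
```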
  • FIG. 6 is a block diagram showing a more detailed functional configuration example of the face orientation estimation unit 27 of a specific individual in the image processing unit 12 of the driver monitoring device 10.
  • the face orientation estimation unit 27 of the specific individual performs a process of estimating the face orientation of the specific individual.
  • The face orientation estimation unit 27 for the specific individual detects, for example, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the face detection unit 23 for the specific individual, and performs a process of estimating the orientation of the face based on the detected positions and shapes of the facial organs.
  • the method for detecting the facial organs from the facial region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy.
  • For example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each organ of the face can be adopted.
  • The face orientation estimation unit 27 for the specific individual includes a face model fitting unit 27a that performs the face model fitting process without using the specific part of the specific individual. For example, in the first frame of an image captured at 15 to 30 frames per second, the face model fitting process is performed without letting the specific part of the specific individual contribute. By performing a face model fitting process that is not affected by the specific part, high-speed processing close to the normal face model fitting process can be realized.
  • The face orientation estimation unit 27 for the specific individual further includes a score calculation unit 27b that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit 27c that determines whether or not the face model fitting scores for all parts other than the specific part exceed a predetermined threshold value. Owing to the fitting score determination unit 27c, it can be accurately determined whether or not a highly accurate face orientation estimation process can be performed using only the parts excluding the specific part.
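The score calculation unit 27b and the fitting score determination unit 27c can be illustrated with the following sketch; the part names, score values, and threshold are assumptions for the sketch only.

```python
def fitting_ok_excluding(scores, specific_part, threshold=0.6):
    """Score calculation unit 27b / fitting score determination unit 27c:
    check whether every part other than the specific part fits well enough."""
    others = {part: s for part, s in scores.items() if part != specific_part}
    return all(s > threshold for s in others.values())

# Hypothetical per-part face model fitting scores
scores = {"left_eye": 0.82, "right_eye": 0.05, "nose": 0.77, "mouth": 0.71, "eyebrows": 0.69}
print(fitting_ok_excluding(scores, specific_part="right_eye"))  # -> True
```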
  • When the fitting score determination unit 27c determines that the face model fitting scores for all parts other than the specific part exceed the predetermined threshold value, the face orientation estimation unit 27 for the specific individual further uses a complement processing unit 27d that complements the feature amount of the specific part.
  • By providing the complement processing unit 27d that complements the feature amount of the specific part, the specific part can be treated as a normal part, for example, a left eye, a right eye, a nose, or a mouth.
  • The face orientation estimation unit 27 for the specific individual further includes a normal face orientation estimation unit 27e that performs the face orientation estimation process by the normal face model fitting process after the feature amount of the specific part has been complemented. Even if the specific part exists, as long as the face orientation estimation process can be performed by the normal face model fitting process, stable, highly accurate, real-time face orientation estimation can be realized.
  • As the estimation data of the face orientation of the specific individual, the face orientation estimation unit 27 for the specific individual may output, for example, the pitch angle around the left-right axis, the yaw angle around the up-down (vertical) axis, and the roll angle around the front-back axis, which are included in the parameters of the 3D face shape model.
  • the face orientation estimation unit 27 of the specific individual further includes an angle correction table 27f for correcting the deviation of the face orientation angle.
  • In this angle correction table 27f, angle correction data for each specific individual is stored in advance; the data is obtained, for example, by capturing images of each specific individual beforehand at a predetermined place under various conditions (conditions such as various face orientations, line-of-sight directions, or eye open/closed states), inputting them to a learning device as teacher data, and adjusting the data through the learning process. If a certain deviation occurs in the estimated face orientation angle even after performing the face model fitting process that does not use the specific part of the specific individual, the deviation of the face orientation angle is corrected using the angle correction table. This correction process makes it easy to calculate the face orientation angle with high accuracy.
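The embodiment describes the angle correction table 27f only as learned correction data prepared per specific individual; its internal structure is not specified. The following hedged sketch therefore assumes a simple table keyed by a quantized (pitch, yaw, roll) pose, with an additive offset looked up and applied in the correction step.

```python
import numpy as np

# Hypothetical per-individual correction table: quantized (pitch, yaw, roll) bin -> offset.
# The real table 27f would be produced by the learning process described above.
correction_table = {
    (0, -30, 0): np.array([ 1.5, -2.0, 0.3]),
    (0,   0, 0): np.array([ 0.8, -0.5, 0.1]),
    (0,  30, 0): np.array([-1.0,  2.5, 0.0]),
}

def correct_angles(estimated, table, bin_size=30):
    """Quantize the estimated (pitch, yaw, roll) to the nearest table entry
    and apply the learned offset."""
    key = tuple(int(round(a / bin_size)) * bin_size for a in estimated)
    offset = table.get(key, np.zeros(3))   # no correction if the pose is not in the table
    return np.asarray(estimated, dtype=float) + offset

print(correct_angles([2.0, 28.0, -1.0], correction_table))  # near the (0, 30, 0) entry
```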
  • FIG. 7 is an image conceptual diagram showing a processing operation performed by the face orientation estimation unit 27 of a specific individual in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • The face orientation estimation unit 27 for the specific individual adopts a method of, for example, creating a 3D face shape model, fitting the 3D face shape model to the face region on the two-dimensional image, performing the face orientation estimation process for the specific individual, and detecting the position and shape of each organ of the face.
  • As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but the technique is not limited thereto.
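As a loosely related illustration of recovering face orientation angles from 2D facial organ points and a generic 3D face model, the following sketch uses OpenCV's solvePnP. This is a stand-in for the idea of fitting a 3D face shape model to a 2D face region; it is not the fitting method of the cited technique, and the 3D model coordinates, 2D points, and camera matrix are assumed values.

```python
import numpy as np
import cv2

# Generic 3D facial organ points (arbitrary model units), a common 6-point layout
model_3d = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
], dtype=np.float64)

# Hypothetical detected 2D organ points in a 640x480 image (pixels)
image_2d = np.array([
    (320, 250), (318, 330), (255, 205), (385, 205), (285, 295), (355, 295)
], dtype=np.float64)

# Assumed pinhole camera: focal length = image width, principal point at image center
w, h = 640, 480
camera = np.array([[w, 0, w / 2.0],
                   [0, w, h / 2.0],
                   [0, 0, 1.0]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(model_3d, image_2d, camera, None)
R, _ = cv2.Rodrigues(rvec)

# Crude Euler-angle extraction: pitch (x axis), yaw (y axis), roll (z axis), in degrees
pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
yaw = np.degrees(np.arcsin(np.clip(-R[2, 0], -1.0, 1.0)))
roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
print(ok, round(pitch, 1), round(yaw, 1), round(roll, 1))
```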
  • In the first frame, the 3D face model fitting process adapted to the specific individual is performed without using the specific part of the specific individual (in the case of FIG. 7, the right eye part), that is, without letting that part contribute to the fitting.
  • Next, the 3D face model fitting score of each part other than the specific part is calculated, and it is determined whether or not the specific-individual 3D face model fitting scores of all parts other than the specific part exceed a predetermined threshold value.
  • If they do, the tracking process is started from the next frame, and the complement process for complementing the feature amount of the specific part is performed.
  • In the example of FIG. 7, the specific part is the right eye, and the right eye region is, for example, painted black as the complement process.
  • Thereafter, the face orientation estimation process is performed by the normal 3D face model fitting process. Here, when the fitting scores of the parts other than the specific part exceed the predetermined threshold value, it is considered that the fitting has been performed accurately including the specific part, and from the next frame the tracking process of the facial organ points is performed, taking advantage of the fact that the movement of the facial organ point positions between frames can be regarded as minute.
  • At the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye on the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
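A minimal sketch of this complement process, assuming the position of the specific part estimated from the previous frame is available as pixel coordinates. The box size and the use of a rectangular region are illustrative assumptions, since the embodiment only states that the estimated position is painted black before the feature amount is extracted.

```python
import numpy as np

def complement_specific_part(frame, prev_landmarks, part="right_eye", half_size=12):
    """Paint the region of the specific part black, using its position estimated
    from the previous frame's fitting result (cf. FIG. 7)."""
    x, y = prev_landmarks[part]                 # estimated center from the previous frame
    h, w = frame.shape[:2]
    x0, x1 = max(0, x - half_size), min(w, x + half_size)
    y0, y1 = max(0, y - half_size), min(h, y + half_size)
    out = frame.copy()
    out[y0:y1, x0:x1] = 0                       # black-paint before feature extraction
    return out

frame = np.full((480, 640), 128, dtype=np.uint8)   # dummy grayscale frame
prev = {"right_eye": (380, 210)}
painted = complement_specific_part(frame, prev)
print(painted[210, 380])                            # -> 0 (painted black)
```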
  • FIG. 8 is a flowchart showing an example of a processing operation performed by the face orientation estimation unit 27 (CPU 13) of a specific individual in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • In step S61, the flag is set to false.
  • In step S62, t is set to 1.
  • In step S63, the above-described face detection process is performed on the image at frame t.
  • In step S64, it is determined whether or not a face could be detected. If it is determined that a face could not be detected, the process proceeds to step S75 and the flag is set to false; if a face could be detected, the process proceeds to step S65.
  • In step S65, the face image is captured from the image at frame t.
  • In step S66, it is determined whether or not the flag is true. If it is determined that the flag is not true, the process proceeds to step S67; if it is determined that the flag is true, the process proceeds to step S73.
  • In step S67, for the first frame, the specific-individual 3D face model fitting process is performed without letting the specific part of the specific individual contribute. After the process of step S67 is completed, the process proceeds to step S68, and in step S68, the specific-individual 3D face model fitting score is calculated for each part other than the specific part. After the process of step S68 is completed, the process proceeds to step S69.
  • In step S69, the fitting score determination process is performed, and it is determined whether or not the specific-individual 3D face model fitting scores for all parts other than the specific part exceed a predetermined threshold value.
  • If it is determined in step S69 that the specific-individual 3D face model fitting scores for all parts other than the specific part exceed the predetermined threshold value, the process proceeds to step S70.
  • In step S70, the flag is set to true, and then the process proceeds to step S71.
  • If it is determined in step S69 that the specific-individual 3D face model fitting scores for the parts other than the specific part do not all exceed the predetermined threshold value, the process proceeds to step S72, where the frame is advanced, and then the process returns to step S63.
  • If it is determined in step S66 that the flag is true, the process proceeds to step S73, and in step S73 the complement process for complementing the feature amount of the specific part is performed.
  • In this complement process, for example, as shown in FIG. 7, when the specific part is the right eye, the right eye region is, for example, painted black.
  • In step S74, the face orientation estimation process is performed by the normal 3D face model fitting process, and then the process proceeds to step S71.
  • In step S74, the tracking process is performed after the complement process. When the fitting scores of the parts other than the specific part exceed the predetermined threshold value, it is considered that the fitting including the specific part has been performed accurately, and the tracking process of the facial organ points is performed, taking advantage of the fact that the movement of the facial organ point positions between frames can be regarded as minute.
  • At the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye on the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
  • In this way, even if the specific part exists, the face orientation estimation process can be performed by the normal 3D face model fitting process, and stable, accurate, real-time face orientation estimation can be achieved.
  • After step S74, the process proceeds to step S71. In step S71, the deviation of the face orientation angle is corrected using the angle correction table 27f, which has been learned and created in advance for each specific individual.
  • This correction process is performed when a certain deviation occurs in the estimated face orientation angle even though the specific-individual 3D face model fitting process that does not use the specific part of the specific individual has been performed. By using the angle correction table 27f created in advance for each specific individual, the deviation of the face orientation angle can be easily corrected for each specific individual, and the face orientation angle can be calculated with high accuracy.
  • When the process of correcting the deviation of the face orientation angle in step S71 is completed, the process proceeds to step S72, where the frame is advanced, and then the process returns to step S63. Likewise, when the process proceeds from step S75 to step S72, the frame is advanced in the same manner, and then the process returns to step S63.
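Putting steps S61 to S75 together, the control flow of FIG. 8 can be sketched as follows. Every helper function here is a hypothetical stand-in for the corresponding processing described above (face detection, specific-individual fitting, score calculation, complement, normal fitting, and angle correction); the sketch shows only the branching and the flag handling, not the actual image processing.

```python
# Hypothetical stand-ins; a real system would plug in the actual detector,
# model fitting, and the angle correction table 27f.
def detect_face(frame):                      # S63: here a face is always "found"
    return {"image": frame}

def fit_without_part(face, part):            # S67: specific-individual fitting, part excluded
    return {"left_eye": 0.8, "nose": 0.7, "mouth": 0.75, part: 0.0}

def scores_excluding(fit, part):             # S68: per-part fitting scores, part excluded
    return {k: v for k, v in fit.items() if k != part}

def black_paint_part(face, prev_fit, part):  # S73: complement the specific part
    return face

def fit_normal(face):                        # S74: normal 3D face model fitting (tracking)
    return {"pitch": 2.0, "yaw": -5.0, "roll": 0.5}

def apply_angle_correction(angles):          # S71: angle correction table 27f
    return angles

def monitor_loop(frames, specific_part="right_eye", threshold=0.6):
    """Skeleton of the control flow of FIG. 8 (steps S61 to S75)."""
    flag = False                                              # S61
    fit = None
    for frame in frames:                                      # S62 / S72: frame advance
        face = detect_face(frame)                             # S63
        if face is None:                                      # S64: no face found
            flag = False                                      # S75
            continue
        if not flag:                                          # S66
            fit = fit_without_part(face, specific_part)       # S67
            scores = scores_excluding(fit, specific_part)     # S68
            if all(s > threshold for s in scores.values()):   # S69
                flag = True                                   # S70
            else:
                continue                                      # S72 -> back to S63
        else:
            face = black_paint_part(face, fit, specific_part) # S73
            fit = fit_normal(face)                            # S74
        angles = (fit.get("pitch", 0.0), fit.get("yaw", 0.0), fit.get("roll", 0.0))
        yield apply_angle_correction(angles)                  # S71

print(list(monitor_loop(range(3))))
```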
  • As described above, in the driver monitoring device 10 according to the embodiment, the face feature amount 142a of the specific individual and the normal face feature amount 142b are stored as the learned face feature amounts in the face feature amount storage unit 142.
  • The specific individual determination unit 25 determines whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region detected by the face detection unit 22 and the face feature amount 142a of the specific individual. Therefore, by using the face feature amount 142a of the specific individual, it can be accurately determined whether or not the face is the face of the specific individual.
  • When it is determined that the face is the face of the specific individual, the first face image processing unit 26 can accurately perform the face image processing for the specific individual.
  • On the other hand, when the specific individual determination unit 25 determines that the face is not the face of the specific individual, in other words, that it is a normal face (the face of a person other than the specific individual), the second face image processing unit 30 can accurately perform the normal face image processing. Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, the sensing of each face can be performed with high accuracy.
  • Furthermore, in the face orientation estimation for the specific individual, by complementing the feature amount of the specific part (such as a missing facial organ) based on the face model fitting result of the previous frame, the face orientation can be estimated stably and accurately in real time. Specifically, for the first frame, fitting is performed with a so-called specific-individual face model in which the specific part does not contribute to the fitting.
  • If the fitting scores of the parts other than the specific part are equal to or higher than the predetermined threshold value, it is considered that the fitting has been performed accurately including the specific part, and from the next frame, in a moving image of, for example, 15 or 30 frames per second, the tracking process of the facial organ points is started, taking advantage of the fact that the movement of the facial organ point positions between frames can be regarded as minute.
  • That is, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, when the specific part is the right eye, the position of the right eye on the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted. By doing so, processing equivalent to the normal face model fitting can be performed, and as a result, the face orientation can be estimated stably and accurately in real time.
  • Further, the in-vehicle system 1 includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring result output from the driver monitoring device 10. Therefore, based on the monitoring result, the ECU 40 can appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system in which even a specific individual can drive with peace of mind.
  • It goes without saying that the above description is merely an example of the present invention in all respects, and that various improvements and modifications can be made without departing from the scope of the present invention.
  • In the above embodiment, the case where the image processing device according to the present invention is applied to the driver monitoring device 10 has been described, but the application example is not limited to this; the image processing device according to the present invention can be applied to various other devices and systems.
  • Further, in the above embodiment, the case where the present invention is applied to a specific individual (meaning an individual having features different from the facial features common to general people, regardless of differences in age, gender, and the like) has been described. The present invention can also be applied, treating them as specific individuals, to a person whose nose and mouth are hidden by a mask or to a person wearing an eyepatch.
  • As described above, the present invention can be widely applied to devices and systems that monitor an object such as a person using a camera.
  • In addition to devices and systems for monitoring drivers (operators) of various moving objects such as vehicles, the present invention can also be widely applied to devices and systems that monitor people who operate or monitor various equipment such as machines and devices in factories, or who perform predetermined work.
  • Embodiments of the present invention may also be described as, but are not limited to, the following appendices.
  • (Appendix 1) An image processing method for processing an image input from an imaging unit, including: face detection steps (S2, S3) of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step (S4) of determining, using the feature amount of the face region detected in the face detection steps (S2, S3) and the learned face feature amount (142a) of the specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual;
  • a first face image processing step (S6, S7, S8) of performing face image processing for the specific individual when it is determined that the face is the face of the specific individual; and
  • a second face image processing step (S9, S10, S11) of performing normal face image processing when it is determined that the face is not the face of the specific individual,
  • wherein the image processing includes a face orientation estimation process, and
  • the first face image processing step (S6, S7, S8) includes a face orientation estimation step (S6) for the specific individual.


Abstract

The purpose of the present invention is to provide an image processing device capable of stably performing a face orientation estimation process for a specific individual in real time with high accuracy. The image processing device, which processes an image input from an imaging unit, comprises: a facial feature quantity storage unit which stores, as learned facial feature quantities, a facial feature quantity of a specific individual and a normal facial feature quantity; a face detection unit which detects a face area while extracting a facial feature quantity from the image; a specific individual determination unit which determines whether or not the face in the face area is the face of the specific individual, by using the detected feature quantity of the face area and the facial feature quantity of the specific individual; a first face image processing unit which performs a face image process for the specific individual, when it is determined that the face in the face area is the face of the specific individual; and a second face image processing unit which performs a normal face image process, when it is determined that the face in the face area is not the face of the specific individual, wherein the first face image processing unit includes a specific individual face orientation estimation unit.

Description

Image processing device, monitoring device, control system, image processing method, computer program, and storage medium
 本発明は、画像処理装置、モニタリング装置、制御システム、画像処理方法、コンピュータプログラム、及び記憶媒体に関する。 The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium.
 下記の特許文献1には、サービスを提供する対象(人物)の状況に応じて、適切なサービスに切り替え可能なサービス提供装置として利用されるロボット装置が開示されている。 Patent Document 1 below discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of a target (person) to provide the service.
 前記ロボット装置には、第1カメラと、第2カメラと、CPUを含む情報処理装置とが装備され、前記CPUには、顔検出部、属性判定部、人物検出部、人物位置算出部、及び移動ベクトル検出部などが装備されている。 The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU, and the CPU includes a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, and a person position calculation unit. It is equipped with a movement vector detector and the like.
 前記ロボット装置によれば、サービスの提供対象が、互いに意思疎通を行うなどの関係が成立している人物の集合である場合は、密なやり取りに基づいた情報を提供する第1のサービスの提供を決定する。
 他方、サービスの提供対象が、互いに意思疎通を行うなどの関係が成立しているか否かが不明な人物の集合である場合は、やり取りを行わずに、一方的に情報を提供する第2のサービスの提供を決定する。これらにより、サービスの提供対象の状況に応じて、適切なサービスを提供することができるとしている。
According to the robot device, when the service is provided to a group of people who have a relationship such as communicating with each other, the first service is provided to provide information based on close communication. To determine.
On the other hand, when the service is provided to a group of people whose relationship such as communication with each other is unknown, the second method of providing information unilaterally without exchanging information. Decide to provide the service. With these, it is possible to provide appropriate services according to the situation of the service provision target.
 [発明が解決しようとする課題]
 特許文献1に開示された前記ロボット装置では、前記顔検出部が、前記第1カメラを用いて人物の顔検出を行う構成になっており、該顔検出には、公知の技術を利用することができるとしている。
 しかしながら、従来の顔検出技術では、ケガなどにより、目、鼻、口などの顔器官の一部が欠損、若しくは大きく変形している場合、顔に大きなホクロやイボ、若しくはタトゥーなどの身体装飾が施されている場合、又は遺伝的疾患により、前記顔器官の配置が平均的な配置から大きくずれている場合などの特定個人(一般的な人物の、年齢、及び性別の違いなどがあったとしても共通する顔特徴とは異なる特徴を有する個人をいうものとする)に対する顔検出を前提とした顔向き推定の精度も低下してしまうという課題があった。
[Problems to be solved by the invention]
In the robot device disclosed in Patent Document 1, the face detection unit is configured to detect a person's face using the first camera, and a known technique is used for the face detection. Can be done.
 However, with conventional face detection technology, the accuracy of face orientation estimation, which presupposes face detection, is also degraded for a specific individual (meaning an individual having features different from the facial features common to general people, regardless of differences in age, gender, and the like), for example when a part of the facial organs such as the eyes, nose, or mouth is missing or greatly deformed due to injury, when a large mole, wart, or body decoration such as a tattoo is present on the face, or when the arrangement of the facial organs deviates greatly from the average arrangement due to a genetic disease.
[Patent Document 1] Japanese Unexamined Patent Publication No. 2014-14899
Means for solving the problems and their effects
 The present invention has been made in view of the above problems, and an object thereof is to provide an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium capable of improving the accuracy of face orientation estimation for such a specific individual.
 In order to achieve the above object, an image processing device (1) according to the present disclosure is an image processing device that processes an image input from an imaging unit, and includes:
 a face feature amount storage unit that stores, as learned face feature amounts obtained through learning for detecting a face from the image, a face feature amount of a specific individual and a normal face feature amount;
 a face detection unit that detects a face region while extracting a feature amount for detecting a face from the image;
 a specific individual determination unit that determines, using the detected feature amount of the face region and the face feature amount of the specific individual, whether or not the face in the face region is the face of the specific individual;
 a first face image processing unit that performs face image processing for the specific individual when the specific individual determination unit determines that the face is the face of the specific individual; and
 a second face image processing unit that performs normal face image processing when the specific individual determination unit determines that the face is not the face of the specific individual,
 wherein the image processing includes a face orientation estimation process, and the first face image processing unit includes a face orientation estimation unit for the specific individual.
 According to the image processing device (1), the face feature amount of the specific individual and the normal face feature amount (in other words, the face feature amount used for a person other than the specific individual) are stored as the learned face feature amounts in the face feature amount storage unit, and the specific individual determination unit determines whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region detected by the face detection unit and the face feature amount of the specific individual.
 By using the face feature amount of the specific individual, it can be accurately determined whether or not the face is the face of the specific individual.
 When it is determined that the face is the face of the specific individual, the first face image processing unit can accurately perform the face image processing for the specific individual.
 On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face (the face of a person other than the specific individual), the second face image processing unit can accurately perform the normal face image processing.
 Since the image processing includes a face orientation estimation process and the first face image processing unit includes a face orientation estimation unit for the specific individual, the face orientation estimation process for the specific individual, which presupposes detection of the face of the specific individual, can be performed stably and accurately in real time.
 Further, in an image processing device (2) according to the present disclosure, in the image processing device (1), the face orientation estimation unit for the specific individual includes a specific-part-unused face orientation estimation unit that estimates the face orientation by a face model fitting process that does not use the specific part of the specific individual.
 According to the image processing device (2), by performing a face model fitting process that is not affected by the specific part, the face orientation can be estimated accurately while maintaining real-time processing.
 Further, an image processing device (3) according to the present disclosure, in the image processing device (2), further includes a score calculation unit that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit that determines whether or not the face model fitting scores for all parts other than the specific part satisfy a predetermined condition.
 According to the image processing device (3), it can be accurately determined whether or not a highly accurate face orientation estimation process can be performed using only the parts excluding the specific part.
 Further, an image processing device (4) according to the present disclosure, in the image processing device (3), further includes a complement processing unit that complements the feature amount of the specific part during the tracking process when the fitting score determination unit determines that the face model fitting scores for all parts other than the specific part satisfy the predetermined condition.
 According to the image processing device (4), by providing the complement processing unit that complements the feature amount of the specific part, the specific part can be treated as a normal part, for example, a left eye, a right eye, a nose, or a mouth.
 Further, an image processing device (5) according to the present disclosure, in the image processing device (4), further includes a normal face orientation estimation unit that estimates the face orientation by the normal face model fitting process after the feature amount of the specific part has been complemented.
 According to the image processing device (5), even if the specific part exists, the face orientation can be estimated by the normal face model fitting process, enabling stable, high-speed, and highly accurate processing.
 Further, an image processing device (6) according to the present disclosure, in any of the image processing devices (1) to (5), further includes an angle correction table for correcting the deviation of the face orientation angle.
 According to the image processing device (6), if a certain deviation occurs in the estimated face orientation angle even after the above processing is performed, the deviation of the face orientation angle can be corrected using the angle correction table, which makes it easy to calculate the face orientation angle with high accuracy.
 Further, in an image processing device (7) according to the present disclosure, in any of the image processing devices (1) to (6), the specific individual determination unit calculates a correlation coefficient between the feature amount extracted from the face region and the face feature amount of the specific individual, and determines, based on the calculated correlation coefficient, whether or not the face in the face region is the face of the specific individual.
 According to the image processing device (7), the correlation coefficient between the feature amount extracted from the face region and the face feature amount of the specific individual is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient. This makes it possible to efficiently determine, based on the correlation coefficient, whether or not the face in the face region is the face of the specific individual.
 Further, in an image processing device (8) according to the present disclosure, in the image processing device (7), the specific individual determination unit determines that the face in the face region is the face of the specific individual when the correlation coefficient is larger than a predetermined threshold value, and determines that the face in the face region is not the face of the specific individual when the correlation coefficient is equal to or less than the predetermined threshold value.
 According to the image processing device (8), when the correlation coefficient is larger than the predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual. Comparing the correlation coefficient with the predetermined threshold value further improves the processing efficiency of the determination.
 Further, in an image processing device (9) according to the present disclosure, in any of the image processing devices (1) to (8), the face image processing includes at least one of a face detection process, a line-of-sight direction estimation process, and an eye opening/closing detection process.
 According to the image processing device (9), since the face image processing includes at least one of the face detection process, the line-of-sight direction estimation process, and the eye opening/closing detection process, processing for estimating or detecting various facial behaviors of the specific individual, or of a person other than the specific individual, can be performed with high accuracy.
 Further, a monitoring device (1) according to the present disclosure includes any of the image processing devices (1) to (9), an imaging unit that captures an image to be input to the image processing device, and an output unit that outputs information based on the image processing by the image processing device.
 According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be monitored with high accuracy, and since information based on the image processing can be output from the output unit, a monitoring system or the like that uses the information can be easily constructed.
 Further, a control system (1) according to the present disclosure includes the monitoring device (1) and one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.
 According to the control system (1), the one or more control devices can execute predetermined processing based on the information output from the monitoring device. Therefore, it is possible to construct a system that can use not only the monitoring result of a normal person but also the monitoring result of the specific individual.
 Further, in a control system (2) according to the present disclosure, in the control system (1), the monitoring device is a device for monitoring a driver of a vehicle, and the control device includes an electronic control unit mounted on the vehicle.
 According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be monitored with high accuracy, and based on the monitoring result, the electronic control unit can be made to appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system in which even the specific individual can drive with peace of mind.
 Further, an image processing method (1) according to the present disclosure is an image processing method for processing an image input from an imaging unit, including: a face detection step of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step of determining, using the feature amount of the face region detected in the face detection step and a learned face feature amount of a specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual, wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
 According to the image processing method (1), the specific individual determination step determines whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region detected in the face detection step and the face feature amount of the specific individual. By using the face feature amount of the specific individual, it can be accurately determined whether or not the face is the face of the specific individual.
 When it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be accurately performed by the first face image processing step. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be accurately performed by the second face image processing step. Therefore, whether the person is the specific individual or a normal person other than the specific individual, the sensing of each face can be performed with high accuracy.
 Since the image processing includes a face orientation estimation process and the first face image processing step includes a face orientation estimation step for the specific individual, the face orientation estimation process for the specific individual can be performed stably and accurately in real time.
 Further, a computer program (1) according to the present disclosure is a computer program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute: a face detection step of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step of determining, using the feature amount of the face region detected in the face detection step and a learned face feature amount of a specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual, wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
 According to the computer program (1), the at least one computer can be made to determine whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region and the face feature amount of the specific individual, and this determination can be made with high accuracy.
 When it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed with high accuracy. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be performed with high accuracy. Therefore, it is possible to construct a device or system that can accurately sense each face, whether of the specific individual or of a normal person other than the specific individual.
 Since the image processing includes a face orientation estimation process and the first face image processing step includes a face orientation estimation step for the specific individual, the face orientation estimation process for the specific individual can be performed stably and accurately in real time.
 The computer program may be a computer program stored in a storage medium, a computer program transferable via a communication network, or a computer program executed via a communication network.
 Further, a computer-readable storage medium (1) according to the present disclosure is a computer-readable storage medium storing a program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute: a face detection step of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step of determining, using the feature amount of the face region detected in the face detection step and a learned face feature amount of a specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual, wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
 According to the computer-readable storage medium (1), by causing the at least one computer to read the program and execute each of the above steps, effects similar to those obtained by the computer program (1) can be obtained.
FIG. 1 is a schematic diagram showing an example of an in-vehicle system including a driver monitoring device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system including the driver monitoring device according to the embodiment.
FIG. 3 is a block diagram showing a functional configuration example of the image processing unit of the driver monitoring device according to the embodiment.
FIG. 4 is a flowchart showing an example of a processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 5 is a flowchart showing an example of a specific individual determination processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 6 is a block diagram showing a more detailed functional configuration example of the face orientation estimation unit for a specific individual in the image processing unit of the driver monitoring device according to the embodiment.
FIG. 7 is a conceptual image diagram showing an example of a processing operation performed by the face orientation estimation unit for a specific individual in the image processing unit of the driver monitoring device according to the embodiment.
FIG. 8 is a flowchart showing an example of a processing operation performed by the face orientation estimation unit for a specific individual in the image processing unit of the driver monitoring device according to the embodiment.
 Hereinafter, embodiments of an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium according to the present invention will be described with reference to the drawings.
[Application example]
 The image processing device according to the present invention can be widely applied to, for example, devices and systems that monitor an object such as a person using a camera.
 In addition to devices and systems for monitoring drivers (operators) of various moving objects such as vehicles, the image processing device according to the present invention can also be applied to devices and systems that monitor people who operate or monitor various equipment such as machines and devices in factories, or who perform predetermined work.
[Application example 1]
 FIG. 1 is a schematic diagram showing an example in which the image processing device according to the present disclosure is applied to an in-vehicle system including a driver monitoring device.
 The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of a driver 3 of a vehicle 2 (for example, facial behavior), one or more ECUs (Electronic Control Units) 40 that control the running, steering, or braking of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state around the vehicle, and the like, and these are connected via a communication bus 43.
 The in-vehicle system 1 is configured, for example, as an in-vehicle network system that communicates according to the CAN (Controller Area Network) protocol. As the communication standard of the in-vehicle system 1, communication standards other than CAN may also be adopted.
 The driver monitoring device 10 is an example of the "monitoring device" according to the present invention, and the in-vehicle system 1 is an example of the "control system" according to the present invention.
 ドライバモニタリング装置10は、ドライバ3の顔を撮影するためのカメラ11と、カメラ11から入力される画像を処理する画像処理部12と、画像処理部12による画像処理に基づく情報を、通信バス43を介して所定のECU40に出力する処理などを行う通信部16とを含んで構成されている。画像処理部12は、本発明に係る「画像処理装置」の一例であり、カメラ11は、本発明に係る「撮像部」の一例である。 The driver monitoring device 10 transmits information based on image processing by the camera 11 for photographing the face of the driver 3, the image processing unit 12 that processes the image input from the camera 11, and the image processing unit 12, and the communication bus 43. It is configured to include a communication unit 16 that performs processing such as output to a predetermined ECU 40 via the above. The image processing unit 12 is an example of the "image processing apparatus" according to the present invention, and the camera 11 is an example of the "imaging unit" according to the present invention.
 ドライバモニタリング装置10は、カメラ11で撮影された画像からドライバ3の顔を検出し、検出されたドライバ3の顔の向き、視線の方向、あるいは目の開閉状態などの顔の挙動を検出する。ドライバモニタリング装置10は、これら顔の挙動の検出結果に基づいて、ドライバ3の状態、例えば、前方注視、脇見、居眠り、後ろ向き、突っ伏しなどの状態を判定することが可能となる。また、ドライバモニタリング装置10が、これらドライバ3の状態判定に基づく信号をECU40に出力し、ECU40が、例えば、前記信号に基づいてドライバ3への注意や警告処理、あるいは外部への通報、又は車両2の動作制御(例えば、減速制御、又は路肩への誘導制御など)などを実行するように構成されている。 The driver monitoring device 10 detects the face of the driver 3 from the image taken by the camera 11, and detects the behavior of the face such as the direction of the face of the detected driver 3, the direction of the line of sight, or the open / closed state of the eyes. The driver monitoring device 10 can determine the state of the driver 3, such as forward gaze, inattentiveness, dozing, backward facing, and prone, based on the detection results of these facial behaviors. Further, the driver monitoring device 10 outputs a signal based on the state determination of the driver 3 to the ECU 40, and the ECU 40, for example, pays attention to or warns the driver 3 based on the signal, or notifies the outside, or the vehicle. It is configured to execute the operation control (for example, deceleration control, guidance control to the road shoulder, etc.) of 2.
One of the purposes of the driver monitoring device 10 is, for example, to estimate the face orientation of a specific individual stably, accurately, and in real time.
Conventional driver monitoring devices have a problem in that the accuracy of estimating the face orientation from images captured by the camera decreases when, for example, part of the facial organs such as the eyes, nose, or mouth of the driver 3 of the vehicle 2 is missing or greatly deformed due to an injury, when the face bears a large mole or wart or a body decoration such as a tattoo, or when the arrangement of the facial organs deviates significantly from the average position due to a hereditary disease or the like.
Further, if the accuracy of the face orientation estimation decreases, the processing that follows the face orientation estimation is not performed properly either, so states of the driver 3 such as inattentiveness or dozing cannot be determined appropriately. There is also a risk that the various controls to be executed by the ECU 40 based on that state determination cannot be performed appropriately.
To solve these problems, the driver monitoring device 10 according to the embodiment adopts the following configuration to improve the accuracy of face orientation estimation for the specific individual.
The image processing unit 12 stores, as learned facial feature amounts trained for detecting a face from an image, a facial feature amount of the specific individual and a normal facial feature amount (the facial feature amount used when the person is someone other than the specific individual).
The image processing unit 12 performs face detection processing for detecting a face region while extracting feature amounts for detecting a face from the input image of the camera 11. Then, using the feature amounts of the detected face region and the facial feature amount of the specific individual, the image processing unit 12 performs specific individual determination processing for determining whether or not the face in the face region is the face of the specific individual.
In the specific individual determination processing, an index indicating the relationship between the feature amounts extracted from the face region and the facial feature amount of the specific individual, for example a correlation coefficient, is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient.
For example, when the correlation coefficient is larger than a predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual. An index other than the correlation coefficient may also be adopted in the specific individual determination processing.
Further, in the specific individual determination processing, whether or not the face in the face region is the face of the specific individual may be determined based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11.
As described above, in the driver monitoring device 10, the learned facial feature amount of the specific individual is stored in the image processing unit 12 in advance, and by using the facial feature amount of the specific individual it is possible to accurately determine whether or not a face is the face of the specific individual.
Further, when the specific individual determination processing determines that the face is the face of the specific individual, the image processing unit 12 executes face image processing for the specific individual, so the face image processing for the specific individual can be performed accurately.
On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the image processing unit 12 executes normal face image processing, so the normal face image processing can be performed accurately. Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, the sensing of each face can be performed with high accuracy.
[Hardware configuration example]
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.
The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2, one or more ECUs 40, and one or more sensors 41, which are connected via a communication bus 43. Further, one or more actuators 42 are connected to the ECUs 40.
The driver monitoring device 10 includes a camera 11, an image processing unit 12 that processes images input from the camera 11, and a communication unit 16 for exchanging data and signals with the external ECUs 40 and the like.
The camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat, and includes, for example, a lens unit, an image sensor unit, a light irradiation unit, an interface unit, and a camera control unit that controls each of these units.
The image sensor unit includes, for example, an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor), a filter, a microlens, and the like. The image sensor unit may be an element capable of forming a captured image by receiving light in the visible region, or an element capable of forming a captured image by receiving light in the near-infrared region.
The light irradiation unit is configured to include a light emitting element such as an LED (Light Emitting Diode), and may include a near infrared LED or the like so that the driver's face can be imaged day or night.
The camera 11 captures images at a predetermined frame rate (for example, several tens of frames per second), and the data of the captured images is input to the image processing unit 12. The camera 11 may be of an integrated type or an external type.
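As a point of reference, the per-frame flow from the camera to the image processing unit can be sketched as follows. This is only a rough illustration using OpenCV's generic capture API on a generic device index; it is not the actual interface of the camera 11, and the process_image callback is a hypothetical placeholder.

```python
import cv2

def capture_frames(device_index=0, max_frames=100):
    """Grab frames at the camera's native frame rate and hand each frame
    to a downstream processing callback, mimicking the way the camera 11
    feeds images to the image processing unit 12."""
    cap = cv2.VideoCapture(device_index)
    if not cap.isOpened():
        raise RuntimeError("camera could not be opened")
    try:
        for _ in range(max_frames):
            ok, frame = cap.read()   # one frame per iteration
            if not ok:
                break
            process_image(frame)     # hypothetical downstream face image processing
    finally:
        cap.release()

def process_image(frame):
    # Placeholder for face detection / face image processing.
    pass

if __name__ == "__main__":
    capture_frames()
```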
The image processing unit 12 is configured as an image processing device including one or more CPUs (Central Processing Units) 13, ROMs (Read Only Memory) 14, and RAMs (Random Access Memory) 15. The ROM 14 includes a program storage unit 141 and a facial feature amount storage unit 142, and the RAM 15 includes an image memory 151 for storing an input image from the camera 11.
The driver monitoring device 10 may be provided with another storage unit, and the storage unit may be used as the program storage unit 141, the facial feature amount storage unit 142, and the image memory 151. The other storage unit may be a semiconductor memory or a storage medium that can be read by a disk drive or the like.
The CPU 13 is an example of a hardware processor, and reads, interprets, and executes the computer program stored in the program storage unit 141 of the ROM 14 and data such as the facial feature amounts stored in the facial feature amount storage unit 142, thereby processing the images input from the camera 11, for example, face image processing such as face detection processing and face orientation estimation processing. Further, the CPU 13 performs processing such as outputting the results obtained by the face image processing (for example, processing data, determination signals, or control signals) to the ECU 40 and the like via the communication unit 16.
The facial feature amount storage unit 142 stores a facial feature amount 142a of the specific individual and a normal facial feature amount 142b, shown in FIG. 3, as learned facial feature amounts trained for detecting a face from an image.
As the learned facial features, various features effective for detecting a face from an image can be used. For example, a feature amount (Haar-like feature amount) focusing on the difference in brightness (difference in average brightness between two rectangular areas of various sizes) in a local area of the face is used.
Alternatively, a feature amount focusing on combinations of brightness distributions in local regions of the face (LBP (Local Binary Pattern) feature amount) may be adopted, or a feature amount focusing on combinations of distributions of brightness gradient directions in local regions of the face (HOG (Histogram of Oriented Gradients) feature amount) may be used.
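As an illustration of the Haar-like feature amount mentioned above, the following minimal sketch computes the difference in average brightness between two adjacent rectangles using an integral image. The rectangle coordinates are arbitrary examples and do not correspond to the learned feature amounts 142a or 142b.

```python
import numpy as np

def integral_image(gray):
    """Cumulative sum table so any rectangle sum costs four lookups."""
    return gray.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixel values in the rectangle with top-left corner (x, y)."""
    a = ii[y + h - 1, x + w - 1]
    b = ii[y - 1, x + w - 1] if y > 0 else 0.0
    c = ii[y + h - 1, x - 1] if x > 0 else 0.0
    d = ii[y - 1, x - 1] if (x > 0 and y > 0) else 0.0
    return a - b - c + d

def haar_like_two_rect(gray, x, y, w, h):
    """Difference of mean brightness between a left and a right rectangle,
    one simple two-rectangle Haar-like feature."""
    ii = integral_image(gray)
    left = rect_sum(ii, x, y, w, h) / (w * h)
    right = rect_sum(ii, x + w, y, w, h) / (w * h)
    return left - right

if __name__ == "__main__":
    img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    print(haar_like_two_rect(img, 10, 10, 8, 16))
```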
Various machine learning methods can be used to extract feature amounts that are effective for face detection. Machine learning is the process by which a computer finds patterns inherent in data (learning data). For example, AdaBoost may be used as an example of a statistical learning method. AdaBoost is a learning algorithm that can construct a strong discriminator by selecting a large number of discriminators with low discriminating ability (weak discriminators), choosing weak discriminators with small error rates from among them, adjusting parameters such as weights, and arranging them in a hierarchical structure. A discriminator is also referred to as an identifier, a classifier, or a learner.
For example, with a configuration in which one feature amount effective for face detection is discriminated by one weak discriminator, a large number of weak discriminators and their combinations are selected by AdaBoost, and a strong discriminator having a hierarchical structure is constructed using them. Note that one weak discriminator outputs, for example, 1 for a face and 0 for a non-face.
Further, a learning method called Real AdaBoost, which can output the face likelihood as a real number from 0 to 1 instead of 0 or 1, may be adopted.
Further, as these learning methods, a neural network having an input layer, an intermediate layer, and an output layer may be adopted.
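The following is a minimal sketch of the AdaBoost idea described above: threshold-type weak discriminators are trained on individual feature values and combined into a weighted strong discriminator. The synthetic training data, the number of rounds, and the stump search are illustrative assumptions and do not reproduce the learning device or cascade structure of the embodiment.

```python
import numpy as np

def train_adaboost_stumps(X, y, n_rounds=10):
    """Train threshold stumps on single feature columns and combine them
    with AdaBoost weights (y must be +1 for face, -1 for non-face)."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)              # sample weights
    stumps = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, j] - thr) > 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign, pred)
        err, j, thr, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak discriminator
        w *= np.exp(-alpha * y * pred)          # re-weight misclassified samples
        w /= w.sum()
        stumps.append((alpha, j, thr, sign))
    return stumps

def predict(stumps, x):
    """Strong discriminator: sign of the weighted vote of all weak stumps."""
    score = sum(a * (1 if s * (x[j] - t) > 0 else -1) for a, j, t, s in stumps)
    return 1 if score > 0 else -1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)   # toy face/non-face labels
    model = train_adaboost_stumps(X, y, n_rounds=20)
    print(predict(model, X[0]), y[0])
```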
A learning device equipped with such a learning algorithm is given, as learning data, a large number of face images captured under various conditions and a large number of images other than faces (non-face images); learning is repeated and parameters such as weights are adjusted and optimized, so that a strong discriminator having a hierarchical structure capable of detecting faces with high accuracy is constructed.
Then, one or more feature amounts used in the weak discriminators of each layer constituting the strong discriminator are used as the learned facial feature amounts.
The facial feature amount 142a of the specific individual is, for example, a parameter indicating the facial features of the specific individual, obtained by individually capturing face images of the specific individual in advance at a predetermined place under various conditions (conditions such as various face orientations, line-of-sight directions, and eye open/closed states), inputting the large number of captured images to the above learning device as teacher data, and adjusting the parameter through the learning process.
The facial feature amount 142a of the specific individual may be, for example, a combination pattern of brightness differences of local regions of the face obtained by the learning process. The facial feature amount 142a stored in the facial feature amount storage unit 142 may be the facial feature amount of only one specific individual, or, when a plurality of specific individuals drive the vehicle 2, the facial feature amounts of a plurality of specific individuals.
The normal facial feature amount 142b is a parameter indicating the facial features of ordinary people, obtained by inputting, as teacher data to the above learning device, images of ordinary people's faces captured under various conditions (conditions such as various face orientations, line-of-sight directions, and eye open/closed states) and adjusting the parameter through the learning process.
The normal facial feature amount 142b may be, for example, a combination pattern of the difference in brightness of the local region of the face obtained by the learning process.
The learned facial feature amounts stored in the facial feature amount storage unit 142 may be acquired from, for example, a server on the cloud via a communication network such as the Internet or a mobile phone network and then stored in the facial feature amount storage unit 142.
The ECU 40 is composed of a computer device including one or more processors, a memory, a communication module, and the like. The processor mounted on the ECU 40 reads, interprets, and executes the program stored in the memory, whereby predetermined control of the actuator 42 and the like is executed.
The ECU 40 includes, for example, at least one of a traveling system ECU, a driving support system ECU, a body system ECU, and an information system ECU.
The traveling system ECU includes, for example, a drive system ECU, a chassis system ECU, and the like. The drive system ECU includes a control unit related to a "running" function such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, or transmission control.
The chassis system ECU includes a control unit related to "stop and turn" functions such as brake control or steering control.
The driving support system ECU includes at least one control unit related to functions that automatically improve safety or realize comfortable driving in cooperation with the traveling system ECUs and the like (driving support functions or automatic driving functions), such as an automatic braking support function, a lane keeping support function (also referred to as LKA/Lane Keep Assist), a constant-speed driving and inter-vehicle distance support function (also referred to as ACC/Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.
The driving support system ECU may be equipped with at least one of the functions of Level 1 (driver assistance), Level 2 (partial automated driving), and Level 3 (conditional automated driving) of the automated driving levels presented by SAE (Society of Automotive Engineers).
Furthermore, the functions of Level 4 (high automated driving) or Level 5 (full automated driving) of the automated driving levels may be provided, or only the functions of Levels 1 and 2 or only Levels 2 and 3 may be provided. The in-vehicle system 1 may also be configured as an automated driving system.
The body system ECU may include at least one control unit related to vehicle body functions such as door locks, smart keys, power windows, the air conditioner, lights, the instrument panel, or turn signals.
The information system ECU may be configured to include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems) related device.
The infotainment device includes, for example, an HMI (Human Machine Interface) device that functions as a user interface, a car navigation device, an audio device, and the like.
The telematics device includes a communication unit for communicating with the outside. The ITS-related device may include a communication unit for performing ETC (Electronic Toll Collection System) communication, road-to-vehicle communication with roadside units such as ITS spots, or vehicle-to-vehicle communication.
The sensors 41 may include various in-vehicle sensors that acquire the sensing data required for the ECU 40 to control the operation of the actuators 42. For example, they may include a vehicle speed sensor, a shift position sensor, an accelerator opening sensor, a brake pedal sensor, and a steering sensor, as well as surroundings monitoring sensors such as a camera for capturing the outside of the vehicle, a radar such as a millimeter-wave radar, a LIDAR, and an ultrasonic sensor.
The actuator 42 is a device that executes operations related to the traveling, steering, or braking of the vehicle 2 based on control signals from the ECU 40, and includes, for example, an engine, a motor, a transmission, and hydraulic or electric cylinders.
[Functional configuration example]
FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
The image processing unit 12 includes an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and the facial feature amount storage unit 142.
The image input unit 21 performs a process of capturing an image including the face of the driver 3 taken by the camera 11.
The face detection unit 22 includes a specific individual face detection unit 23 and a normal face detection unit 24, and performs processing for detecting a face region while extracting feature amounts for detecting a face from the input image.
The face detection unit 23 of the specific individual uses the face feature amount 142a of the specific individual read from the face feature amount storage unit 142 to perform a process of detecting the face region from the input image.
The normal face detection unit 24 uses the normal face feature amount 142b read from the face feature amount storage unit 142 to perform a process of detecting a face region from an input image.
The method of detecting the face area from the image is not particularly limited, but a method of detecting the face area at high speed and with high accuracy is adopted. The face detection unit 22 extracts features for detecting a face in each search area while scanning a predetermined search area (search window) on the input image, for example.
The face detection unit 22 extracts, for example, the brightness differences (luminance differences) of local regions of the face, edge strengths, or the relationships between these local regions as feature amounts. Then, using the feature amounts extracted from the search areas and the normal facial feature amount 142b or the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, the face detection unit 22 determines face or non-face with a detector having a hierarchical structure (from a layer that captures the face roughly down to layers that capture the details of the face), and performs processing for detecting the face region from the image.
The specific individual determination unit 25 performs processing for determining whether or not the face in the detected face region is the face of the specific individual, using the feature amounts of the face region detected by the face detection unit 22 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142.
The specific individual determination unit 25 calculates an index indicating the relationship between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual, for example a correlation coefficient, and determines whether or not the face in the face region is the face of the specific individual based on the calculated correlation coefficient. For example, the correlation of feature amounts such as Haar-like features of one or more local regions in the face region is obtained. Then, for example, when the correlation coefficient is larger than a predetermined threshold value, the face in the detected face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the detected face region is determined not to be the face of the specific individual.
Further, the specific individual determination unit 25 determines whether or not the face in the detected face region is the face of the specific individual based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11.
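A minimal sketch of the specific individual determination described above is given below, assuming the feature amounts are available as numeric vectors. The threshold value and the majority-style aggregation over a plurality of frames are illustrative assumptions.

```python
import numpy as np

def is_specific_individual(face_features, individual_features, threshold=0.8):
    """Single-frame decision: Pearson correlation between the feature vector of
    the detected face region and the stored feature amount 142a, compared
    against a threshold (the value 0.8 is illustrative)."""
    r = np.corrcoef(face_features, individual_features)[0, 1]
    return r > threshold

def is_specific_individual_multiframe(per_frame_features, individual_features,
                                      threshold=0.8, min_hits=3):
    """Multi-frame decision: require the single-frame result to hold for at
    least `min_hits` of the supplied frames (an assumed aggregation rule)."""
    hits = sum(is_specific_individual(f, individual_features, threshold)
               for f in per_frame_features)
    return hits >= min_hits

if __name__ == "__main__":
    stored = np.random.rand(128)                               # stand-in for 142a
    frames = [stored + np.random.normal(0, 0.05, 128) for _ in range(5)]
    print(is_specific_individual_multiframe(frames, stored))
```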
When the specific individual determination unit 25 determines that the face is the face of the specific individual, the first face image processing unit 26 performs face image processing for the specific individual using the facial feature amount 142a of the specific individual. The illustrated first face image processing unit 26 includes a specific individual face orientation estimation unit 27, a specific individual eye open/close detection unit 28, and a specific individual line-of-sight direction estimation unit 29, but it may further include other face behavior estimation or detection units.
The specific individual face orientation estimation unit 27 performs processing for estimating the face orientation of the specific individual. For example, using the facial feature amount 142a of the specific individual, the specific individual face orientation estimation unit 27 detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the specific individual face detection unit 23, and estimates the face orientation based on the detected positions and shapes of the facial organs.
The method for detecting facial organs from the face region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy. For example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each facial organ can be adopted. As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be adopted, but the technique is not limited thereto.
Further, the specific individual face orientation estimation unit 27 may output, as estimation data of the face orientation of the specific individual, for example, the pitch angle around the left-right axis, the yaw angle around the up-down axis, and the roll angle around the front-rear axis, which are included in the parameters of the above 3D face shape model.
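The embodiment obtains the face orientation from the fitting parameters of the 3D face shape model. As a rough stand-in for that step, the sketch below derives pitch, yaw, and roll angles from a few 2-D facial landmark points with a generic PnP solve in OpenCV; the 3-D model points, camera intrinsics, and landmark values are illustrative assumptions, not the fitting technique of the embodiment.

```python
import numpy as np
import cv2

# Generic 3-D positions (mm) of a few facial landmarks; illustrative values only.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),        # nose tip
    (0.0, -63.6, -12.5),    # chin
    (-43.3, 32.7, -26.0),   # left eye outer corner
    (43.3, 32.7, -26.0),    # right eye outer corner
    (-28.9, -28.9, -24.1),  # left mouth corner
    (28.9, -28.9, -24.1),   # right mouth corner
], dtype=np.float64)

def estimate_head_pose(image_points, image_size):
    """Return (pitch, yaw, roll) in degrees from six 2-D landmark points."""
    h, w = image_size
    focal = w  # crude focal-length assumption
    camera = np.array([[focal, 0, w / 2],
                       [0, focal, h / 2],
                       [0, 0, 1]], dtype=np.float64)
    dist = np.zeros(4)  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points, camera, dist,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None
    rot, _ = cv2.Rodrigues(rvec)
    # Decompose the rotation matrix into Euler angles (x = pitch, y = yaw, z = roll).
    sy = np.hypot(rot[0, 0], rot[1, 0])
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return pitch, yaw, roll

if __name__ == "__main__":
    pts = np.array([(320, 240), (325, 350), (250, 200), (390, 200),
                    (280, 300), (360, 300)], dtype=np.float64)
    print(estimate_head_pose(pts, (480, 640)))
```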
The specific individual eye open/close detection unit 28 performs processing for detecting the open/closed state of the eyes of the specific individual. For example, based on the positions and shapes of the facial organs obtained by the specific individual face orientation estimation unit 27, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), the specific individual eye open/close detection unit 28 detects the open/closed state of the eyes, for example, whether the eyes are open or closed.
The open/closed state of the eyes may be detected, for example, by learning in advance with a learner the feature amounts of eye images in various open/closed states (such as the position of the eyelids, the shape of the pupil (dark part of the eye), or the region sizes of the white and dark parts of the eye) and evaluating the similarity with this learned feature amount data.
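The embodiment detects the eye open/closed state from the similarity to learned eye-image feature amounts. As a much simpler stand-in, the sketch below uses the eye aspect ratio computed from eyelid feature points; the landmark ordering and the threshold value are assumptions.

```python
import numpy as np

def eye_aspect_ratio(eye_pts):
    """eye_pts: six (x, y) landmarks ordered [outer, upper1, upper2, inner,
    lower2, lower1]; the ratio drops toward 0 as the eyelid closes."""
    eye_pts = np.asarray(eye_pts, dtype=float)
    v1 = np.linalg.norm(eye_pts[1] - eye_pts[5])
    v2 = np.linalg.norm(eye_pts[2] - eye_pts[4])
    h = np.linalg.norm(eye_pts[0] - eye_pts[3])
    return (v1 + v2) / (2.0 * h)

def is_eye_open(eye_pts, threshold=0.2):
    """Threshold on the aspect ratio (the value 0.2 is an illustrative choice)."""
    return eye_aspect_ratio(eye_pts) > threshold

if __name__ == "__main__":
    open_eye = [(0, 0), (2, -2), (4, -2), (6, 0), (4, 2), (2, 2)]
    closed_eye = [(0, 0), (2, -0.2), (4, -0.2), (6, 0), (4, 0.2), (2, 0.2)]
    print(is_eye_open(open_eye), is_eye_open(closed_eye))
```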
The specific individual line-of-sight direction estimation unit 29 performs processing for estimating the line-of-sight direction of the specific individual. For example, the specific individual line-of-sight direction estimation unit 29 estimates the line-of-sight direction based on the face orientation of the driver 3 and the positions and shapes of the facial organs of the driver 3, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils). The line-of-sight direction is the direction in which the driver 3 is looking, and is obtained, for example, from the combination of the face orientation and the eye orientation.
The line-of-sight direction may also be estimated, for example, by learning in advance with a learner the feature amounts of eye images for various combinations of face orientation and eye orientation (such as the relative positions of the outer eye corner, inner eye corner, and pupil, the relative positions of the white and dark parts of the eye, shading, and texture) and evaluating the similarity with this learned feature amount data.
Alternatively, using the fitting result of the 3D face shape model and the like, the specific individual line-of-sight direction estimation unit 29 may estimate the size and center position of the eyeball from the size and orientation of the face and the positions of the eyes, detect the position of the pupil, and estimate the vector connecting the center of the eyeball and the center of the pupil as the line-of-sight direction.
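A minimal sketch of the "eyeball center to pupil center" vector idea described above is shown below. The way the eyeball center is placed behind the midpoint of the eye corners and the eyeball radius are illustrative assumptions.

```python
import numpy as np

def estimate_gaze_direction(eye_outer, eye_inner, pupil_center, eyeball_radius=12.0):
    """Place the eyeball center behind the midpoint of the eye corners by an
    assumed radius (same units as the landmark coordinates, with a depth axis),
    then return the normalized vector from it to the pupil center."""
    eye_outer = np.asarray(eye_outer, dtype=float)
    eye_inner = np.asarray(eye_inner, dtype=float)
    pupil_center = np.asarray(pupil_center, dtype=float)
    eyeball_center = (eye_outer + eye_inner) / 2.0
    eyeball_center[2] += eyeball_radius      # push back along the assumed depth axis
    gaze = pupil_center - eyeball_center
    return gaze / np.linalg.norm(gaze)

if __name__ == "__main__":
    # 3-D landmark coordinates (x, y, z); values are illustrative only.
    outer, inner = (100.0, 50.0, 0.0), (130.0, 50.0, 0.0)
    pupil = (118.0, 50.0, -2.0)              # pupil shifted toward the inner corner
    print(estimate_gaze_direction(outer, inner, pupil))
```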
When the specific individual determination unit 25 determines that the face is not the face of a specific individual, the second face image processing unit 30 performs normal face image processing using the normal face feature amount 142b. The second face image processing unit 30 may include a normal face orientation estimation unit 31, a normal eye opening / closing detection unit 32, and a normal line-of-sight direction estimation unit 33.
The processing performed by the normal face orientation estimation unit 31, the normal eye open/close detection unit 32, and the normal line-of-sight direction estimation unit 33 is basically the same as that of the specific individual face orientation estimation unit 27, the specific individual eye open/close detection unit 28, and the specific individual line-of-sight direction estimation unit 29, except that the normal facial feature amount 142b is used, so the description thereof is omitted here.
The output unit 34 performs processing for outputting information based on the image processing by the image processing unit 12 to the ECU 40 and the like. The information based on the image processing may be, for example, information on facial behavior such as the face orientation of the driver 3, the line-of-sight direction, or the open/closed state of the eyes, or information on the state of the driver 3 determined based on the facial behavior detection results (for example, forward gaze, inattentiveness, dozing, facing backward, or slumping forward). The information based on the image processing may also be a predetermined control signal based on the state determination of the driver 3 (such as a control signal for performing attention or warning processing, or a control signal for controlling the operation of the vehicle 2).
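As an illustration of how the output unit 34 might pass a driver-state judgement to an ECU over the CAN bus, the sketch below uses the python-can package (4.x API). The arbitration ID, the payload layout, and the state codes are purely illustrative assumptions and are not defined in this disclosure.

```python
import can  # python-can package

# Illustrative driver-state codes (not defined in the original text).
STATE_CODES = {"forward_gaze": 0, "inattentive": 1, "dozing": 2,
               "backward_facing": 3, "prone": 4}

def send_driver_state(bus, state, face_yaw_deg, eyes_open):
    """Pack a driver-state judgement into one CAN frame and send it.
    Arbitration ID 0x321 and the payload layout are assumptions."""
    yaw = int(max(-90, min(90, round(face_yaw_deg)))) + 90   # map to 0..180
    data = [STATE_CODES[state], yaw, 1 if eyes_open else 0]
    msg = can.Message(arbitration_id=0x321, data=data, is_extended_id=False)
    bus.send(msg)

if __name__ == "__main__":
    # Virtual bus for demonstration; a real system would use e.g. socketcan.
    with can.Bus(interface="virtual") as bus:
        send_driver_state(bus, "inattentive", face_yaw_deg=35.0, eyes_open=True)
```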
[Processing operation example]
FIG. 4 is a flowchart showing an example of the processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. The camera 11 captures images at, for example, several tens of frames per second, and this processing is performed for each frame or for every frame at regular intervals.
First, in step S1, the CPU 13 operates as the image input unit 21 and performs processing for reading an image captured by the camera 11 (an image including the face of the driver 3), and then the processing proceeds to step S2.
In step S2, the CPU 13 operates as a normal face detection unit 24, performs normal face detection processing on the input image, and then proceeds to step S3.
In step S2, for example, feature amounts for detecting a face are extracted in each search area while a predetermined search area (search window) is scanned over the input image. Then, using the feature amounts extracted from the search areas and the normal facial feature amount 142b read from the facial feature amount storage unit 142, face or non-face is determined and processing for detecting the face region from the image is performed.
In step S3, the CPU 13 operates as the face detection unit 23 of the specific individual, performs face detection processing of the specific individual on the input image, and then proceeds to step S4.
In step S3, for example, feature amounts for detecting a face are extracted in each search area while a predetermined search area (search window) is scanned over the input image. Then, using the feature amounts extracted from the search areas and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, face or non-face is determined and processing for detecting the face region from the image is performed. Note that the processing of step S2 and step S3 may be performed in parallel within one step or may be performed in combination.
In step S4, the CPU 13 operates as the specific individual determination unit 25 and performs processing for determining whether or not the face in the face region is the face of the specific individual, using the feature amounts of the face regions detected in steps S2 and S3 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, and then the processing proceeds to step S5.
In step S5, it is judged whether or not the result of the determination processing in step S4 indicates the face of the specific individual, and if it is judged to be the face of the specific individual, the processing proceeds to step S6.
In step S6, the CPU 13 operates as the specific individual face orientation estimation unit 27; for example, using the facial feature amount 142a of the specific individual, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S3, the face orientation is estimated based on the detected positions and shapes of the facial organs, and then the processing proceeds to step S7.
In step S7, the CPU 13 operates as the specific individual eye open/close detection unit 28; for example, based on the positions and shapes of the facial organs obtained in step S6, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), the open/closed state of the eyes, for example, whether the eyes are open or closed, is detected, and then the processing proceeds to step S8.
In step S8, the CPU 13 operates as the specific individual line-of-sight direction estimation unit 29; for example, the line-of-sight direction is estimated based on the face orientation and the positions and shapes of the facial organs obtained in step S6, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils), and then the processing ends.
On the other hand, in step S5, if it is determined that the face is not a specific individual's face, in other words, it is a normal face, the process proceeds to step S9.
In step S9, the CPU 13 operates as the normal face orientation estimation unit 31; for example, using the normal facial feature amount 142b, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S2, the face orientation is estimated based on the detected positions and shapes of the facial organs, and then the processing proceeds to step S10.
In step S10, the CPU 13 operates as the normal eye open/close detection unit 32; for example, based on the positions and shapes of the facial organs obtained in step S9, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), the open/closed state of the eyes, for example, whether the eyes are open or closed, is detected, and then the processing proceeds to step S11.
In step S11, the CPU 13 operates as the normal line-of-sight direction estimation unit 33; for example, the line-of-sight direction is estimated based on the face orientation and the positions and shapes of the facial organs obtained in step S9, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils), and then the processing ends.
FIG. 5 is a flowchart showing an example of the specific individual determination processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. This processing operation is an example of the specific individual determination processing operation in step S4 shown in FIG. 4, and shows an example of the processing operation when the determination is made with one input image (one frame).
First, in step S21, the CPU 13 reads the feature amount extracted from the face region detected by the face detection processing in steps S2 and S3 shown in FIG.
In the next step S22, the learned facial feature amount 142a of the specific individual is read from the face feature amount storage unit 142 (FIG. 3), and then the process proceeds to step S23.
In step S23, processing is performed to calculate the correlation coefficient between the feature amounts extracted from the face region read in step S21 and the facial feature amount 142a of the specific individual read in step S22, and then the processing proceeds to step S24.
In step S24, it is judged whether or not the calculated correlation coefficient is larger than a predetermined threshold value for determining whether or not the person is the specific individual. If the correlation coefficient is larger than the predetermined threshold value, in other words, if it is judged that the feature amounts extracted from the face region have a high correlation (high similarity) with the facial feature amount 142a of the specific individual, the processing proceeds to step S25.
In step S25, it is determined that the face detected in the face area is the face of a specific individual, and then the process ends.
On the other hand, if it is judged in step S24 that the correlation coefficient is equal to or less than the predetermined threshold value, in other words, that the correlation between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual is low (the similarity is low), the processing proceeds to step S26.
In step S26, it is determined that the face is not a specific individual's face, in other words, it is a normal face, and then the process is completed.
FIG. 6 is a block diagram showing a more detailed functional configuration example of the face orientation estimation unit 27 of a specific individual in the image processing unit 12 of the driver monitoring device 10.
The specific individual face orientation estimation unit 27 performs processing for estimating the face orientation of the specific individual. For example, the specific individual face orientation estimation unit 27 detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the specific individual face detection unit 23, and performs processing for estimating the face orientation based on the detected positions and shapes of the facial organs.
The method for detecting facial organs from the face region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy. For example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each facial organ can be adopted.
The specific individual face orientation estimation unit 27 includes a specific-part-unused face model fitting unit 27a that performs face model fitting processing without using the specific part of the specific individual. For example, of the images captured at 15 to 30 frames per second, the face model fitting processing is performed on the first frame without letting the specific part of the specific individual contribute.
By performing the face model fitting process that is not affected by the specific portion, it is possible to realize high-speed processing close to the normal face model fitting process.
The specific individual face orientation estimation unit 27 further includes a score calculation unit 27b that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit 27c that determines whether or not the face model fitting scores of all parts other than the specific part exceed a predetermined threshold value.
Due to the presence of the fitting score determination unit 27c, it is possible to accurately determine whether or not the face orientation estimation process can be performed with high accuracy only by the portion excluding the specific portion.
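A minimal sketch of the check performed by the score calculation unit 27b and the fitting score determination unit 27c is given below; the part names, score values, and threshold are illustrative assumptions.

```python
def fitting_ok_excluding(part_scores, specific_parts, threshold=0.7):
    """Return True only if every part other than the specific part(s)
    reached a fitting score above the threshold (threshold is illustrative)."""
    return all(score > threshold
               for part, score in part_scores.items()
               if part not in specific_parts)

if __name__ == "__main__":
    scores = {"left_eye": 0.85, "right_eye": 0.10,   # right eye is the specific part
              "nose": 0.90, "mouth": 0.82, "eyebrows": 0.78}
    print(fitting_ok_excluding(scores, specific_parts={"right_eye"}))
```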
The specific individual face orientation estimation unit 27 further includes a complement processing unit 27d that complements the feature amount of the specific part when the fitting score determination unit 27c determines that the face model fitting scores of all parts other than the specific part exceed the predetermined threshold value.
By providing the complementary processing unit 27d that complements the feature amount of the specific portion, the specific portion can be treated as a normal portion, for example, a left eye, a right eye, a nose, a mouth, or the like.
The face orientation estimation unit 27 of the specific individual further includes a normal face orientation estimation unit 27e that performs face orientation estimation processing by a normal face model fitting process after the feature amount of the specific portion is complemented.
Even if the specific portion exists, if the face orientation estimation process can be performed by a normal face model fitting process, a stable and highly accurate real-time face orientation estimation process can be realized.
The specific individual face orientation estimation unit 27 may output, as estimation data of the face orientation of the specific individual, for example, the pitch angle around the left-right axis, the yaw angle around the up-down axis, and the roll angle around the front-rear axis, which are included in the parameters of the above 3D face shape model.
The face orientation estimation unit 27 of the specific individual further includes an angle correction table 27f for correcting the deviation of the face orientation angle.
In the angle correction table 27f, angle correction data for each specific individual is stored in advance; this data is obtained, for example, by individually capturing face images of the specific individual in advance at a predetermined place under various conditions (conditions such as various face orientations, line-of-sight directions, and eye open/closed states), inputting the large number of captured images to the above learning device as teacher data, and adjusting the data through the learning process.
If a certain amount of deviation still occurs in the estimated face orientation angle even after performing the face model fitting processing that does not use the specific part of the specific individual, the deviation of the face orientation angle is corrected using the angle correction table. This correction processing facilitates highly accurate calculation of the face orientation angle.
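The following is a minimal sketch of applying a per-individual angle correction table to an estimated face orientation. The table contents and the nearest-entry lookup rule are illustrative assumptions and do not represent the learned correction data described above.

```python
# Illustrative correction table for one specific individual:
# key = rounded estimated yaw (deg), value = (pitch, yaw, roll) offsets to add.
ANGLE_CORRECTION_TABLE = {
    -60: (1.5, -4.0, 0.2),
    -30: (0.8, -2.5, 0.1),
      0: (0.3, -1.0, 0.0),
     30: (0.9,  2.0, -0.1),
     60: (1.6,  3.5, -0.2),
}

def correct_face_angles(pitch, yaw, roll, table=ANGLE_CORRECTION_TABLE):
    """Add the offsets of the table entry whose yaw key is closest to the
    estimated yaw (nearest-entry lookup is an assumed interpolation rule)."""
    key = min(table, key=lambda k: abs(k - yaw))
    dp, dy, dr = table[key]
    return pitch + dp, yaw + dy, roll + dr

if __name__ == "__main__":
    print(correct_face_angles(pitch=5.0, yaw=27.0, roll=-1.0))
```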
FIG. 7 is a conceptual diagram showing the processing operation performed by the specific individual face orientation estimation unit 27 in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
To perform the face orientation estimation processing of the specific individual, the specific individual face orientation estimation unit 27 adopts, for example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each facial organ.
As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but the technique is not limited thereto.
In the case of the specific individual face orientation estimation processing shown in FIG. 7, the specific-individual-adapted 3D face model fitting processing is performed without using the specific part of the specific individual. For example, of the images captured at 15 to 30 frames per second, the specific-individual-adapted 3D face model fitting processing is performed on the first frame without letting the specific part of the specific individual (the right eye part in the case of FIG. 7) contribute.
Next, processing is performed to calculate the 3D face model fitting score of each part other than the specific part, and it is determined whether or not the specific-individual-adapted 3D face model fitting scores of all parts other than the specific part exceed a predetermined threshold value.
When it is determined that the specific individual-compatible 3D face model fitting score exceeds a predetermined threshold value, the tracking process is started from the next frame, and the complementary process for complementing the feature amount of the specific portion is performed.
In the example shown in FIG. 7, the specific part is the right eye, and the right eye part is, for example, filled in (painted over) as the complement processing.
By carrying out such a complementary process, it becomes possible to treat the specific site as a normal site.
In the tracking process after the complementary processing, the face orientation estimation process is performed by the normal 3D face model fitting process. Here, when the fitting score for the parts other than the specific part exceeds the predetermined threshold, the fitting is regarded as accurate including the specific part, and from the next frame onward the tracking process of the facial organ points is performed, making use of the fact that the movement of each facial organ point position between frames can be regarded as minute.
Specifically, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye in the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
By such processing, the processing becomes equivalent to fitting with a normal 3D face model, and stable, highly accurate real-time face orientation estimation can be realized.
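One tracking step could then be sketched as below: the right-eye region is predicted from the previous frame's fitted landmarks, filled in, and a feature amount is extracted from the filled image. The fixed margins, the landmark name, and the placeholder feature extraction are illustrative assumptions, not the actual fitting computation.

```python
# Minimal sketch of one tracking step using the previous frame's fitting result.
import numpy as np

def predict_right_eye_region(prev_landmarks, half_w=30, half_h=20):
    """Derive a bounding box for the right eye from the previous frame's
    fitted landmarks (here simply a fixed margin around the eye centre)."""
    cx, cy = prev_landmarks["right_eye_center"]
    return (cy - half_h, cy + half_h, cx - half_w, cx + half_w)

def tracking_step(frame, prev_landmarks):
    """Complement the specific part from the previous fitting result,
    then extract a feature amount as in normal 3D face model fitting."""
    top, bottom, left, right = predict_right_eye_region(prev_landmarks)
    filled = frame.copy()
    filled[top:bottom, left:right] = 0        # paint the right eye black
    return filled.astype(np.float32) / 255.0  # placeholder feature extraction

# Example with a dummy frame and made-up landmark coordinates.
frame = np.full((480, 640), 128, dtype=np.uint8)
features = tracking_step(frame, {"right_eye_center": (390, 220)})
```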
FIG. 8 is a flowchart showing an example of the processing operation performed by the specific-individual face orientation estimation unit 27 (CPU 13) in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
First, in step S61, the flag is set to false. Next, in step S62, t is set to 1.
Next, in step S63, the face detection process described above is performed on the image of the t-th frame. In step S64, it is determined whether a face has been detected; if no face has been detected, the process proceeds to step S75 and the flag is set to false, whereas if a face has been detected, the process proceeds to step S65.
In step S65, the face image detected in the image of the t-th frame is captured.
Next, in step S66, it is determined whether the flag is true. If the flag is not true, the process proceeds to step S67; if the flag is true, the process proceeds to step S73.
In step S67, for example, of the images captured at 15 to 30 frames per second, the first frame undergoes the specific-individual 3D face model fitting process in which the specific part of the specific individual is not allowed to contribute. By performing the face model fitting process that is not affected by the specific part, high-speed processing close to the normal face model fitting process can be realized.
After step S67, the process proceeds to step S68, in which the specific-individual 3D face model fitting score is calculated without the contribution of the specific part.
After step S68, the process proceeds to step S69.
In step S69, the fitting score determination process is performed, and it is determined whether the specific-individual 3D face model fitting score for all parts other than the specific part exceeds the predetermined threshold.
If it is determined in step S69 that the specific-individual 3D face model fitting score for all parts other than the specific part exceeds the predetermined threshold, the process proceeds to step S70.
In step S70, the flag is set to true, and the process then proceeds to step S71.
On the other hand, if it is determined in step S69 that the specific-individual 3D face model fitting score for all parts other than the specific part does not exceed the predetermined threshold, the process proceeds to step S72, in which the frame is advanced, and then returns to step S63.
On the other hand, if the flag is determined to be true in step S66, the process proceeds to step S73, in which complementary processing is performed to complement the feature amount of the specific part. In this complementary processing, for example, as shown in FIG. 7, when the specific part is determined to be the right eye, the right-eye region is filled in black.
By performing such complementary processing, the specific part can be handled as a normal part, for example the right eye, in the subsequent processing, and the processing becomes equivalent to normal face model fitting.
After the complementary processing in step S73, the process proceeds to step S74.
In step S74, the face orientation estimation process is performed by the normal 3D face model fitting process, and the process then proceeds to step S71.
In step S74, the tracking process after the complementary processing is performed; when the fitting score for the parts other than the specific part exceeds the predetermined threshold, the fitting is regarded as accurate including the specific part, and the tracking process of the facial organ points is performed, making use of the fact that the movement of each facial organ point position between frames can be regarded as minute.
Specifically, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye in the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
By such processing, the processing becomes equivalent to fitting with a normal 3D face model, and stable, highly accurate real-time face orientation estimation can be realized.
In this way, even if the specific part exists, once the feature amount of the specific part has been complemented, the face orientation estimation process can be performed by the normal 3D face model fitting process, and stable, accurate real-time face orientation estimation can be carried out.
When the face orientation estimation process is performed in step S74, the process proceeds to step S71, in which the angle correction table 27f, learned and created in advance for each specific individual, is used to correct the deviation of the face orientation angle.
This correction process is performed when a certain deviation still arises in the estimated face orientation angle even after the specific-individual 3D face model fitting process that does not use the specific part of the specific individual. By using the angle correction table 27f created in advance for each specific individual, the deviation of the face orientation angle can easily be corrected for each specific individual, enabling highly accurate calculation of the face orientation angle.
When the process of correcting the deviation of the face orientation angle in step S71 is completed, the process proceeds to step S72, in which the frame is advanced, and then returns to step S63. The process also proceeds from step S75 described above to step S72, in which the frame is advanced in the same way, and then returns to step S63.
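For reference, the branch and flag structure of the flowchart of FIG. 8 can be summarised in code as follows; every helper passed in (face detection, fitting, scoring, complementation, angle correction) is a hypothetical stub standing in for the processing described above, and only the control flow mirrors steps S61 to S75.

```python
# Control-flow sketch of FIG. 8 (steps S61-S75); all helpers are hypothetical stubs.
def process_frames(frames, detect_face, fit_without_part, score_without_part,
                   complement_part, fit_full, correct_angle, threshold=0.8):
    flag = False                                   # S61
    for frame in frames:                           # S62 / S72: t advances per loop
        face = detect_face(frame)                  # S63
        if face is None:                           # S64
            flag = False                           # S75
            continue                               # S72: next frame
        if not flag:                               # S66 (flag not true)
            fit = fit_without_part(face)           # S67
            score = score_without_part(fit)        # S68
            if score > threshold:                  # S69
                flag = True                        # S70
                yield correct_angle(fit)           # S71
            # otherwise fall through to the next frame (S72)
        else:                                      # S66 (flag true)
            fit = fit_full(complement_part(face))  # S73, S74
            yield correct_angle(fit)               # S71

# Tiny usage with dummy stubs (all values are made up).
angles = list(process_frames(
    frames=range(3),
    detect_face=lambda f: {"frame": f},
    fit_without_part=lambda face: {"yaw": 10.0},
    score_without_part=lambda fit: 0.9,
    complement_part=lambda face: face,
    fit_full=lambda face: {"yaw": 11.0},
    correct_angle=lambda fit: fit["yaw"] - 2.0,
))
print(angles)  # [8.0, 9.0, 9.0]
```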
According to the driver monitoring device 10 of the embodiment described above, the face feature amount 142a of a specific individual and the normal face feature amount 142b are stored as learned face feature amounts in the face feature amount storage unit 142, and the specific individual determination unit 25 determines whether the face in the face region is the face of the specific individual by using the feature amount of the face region detected by the face detection unit 22 and the face feature amount 142a of the specific individual. Therefore, by using the face feature amount 142a of the specific individual, whether or not the face is that of the specific individual can be determined accurately.
Further, when the specific individual determination unit 25 determines that the face is that of the specific individual, the first face image processing unit 26 can accurately perform the face image processing for the specific individual. On the other hand, when the specific individual determination unit 25 determines that the face is not that of the specific individual, in other words a normal face (the face of a person other than the specific individual), the second face image processing unit 30 can accurately perform the normal face image processing. Therefore, whether the driver 3 is the specific individual or an ordinary person other than the specific individual, the sensing of each face can be performed accurately.
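One way this determination can be realised, as recited in claims 7 and 8 below, is to compare a correlation coefficient between the detected face region's feature amount and the stored feature amount 142a against a threshold. The sketch below uses NumPy; the feature vectors and the threshold value are made-up assumptions.

```python
# Minimal sketch of a correlation-based specific individual determination.
import numpy as np

def is_specific_individual(region_features, stored_features_142a, threshold=0.9):
    """Return True when the correlation coefficient between the detected face
    region's feature amount and the specific individual's learned feature
    amount exceeds the threshold."""
    r = np.corrcoef(region_features, stored_features_142a)[0, 1]
    return r > threshold

# Example with dummy feature vectors.
detected = np.array([0.2, 0.8, 0.5, 0.9, 0.1])
stored_142a = np.array([0.25, 0.75, 0.55, 0.85, 0.15])
print(is_specific_individual(detected, stored_142a))  # True for these values
```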
Further, regarding the face orientation estimation for a specific individual, by complementing the feature amount of the specific part (such as a missing facial organ) based on the face model fitting result in the previous frame, the face orientation can be estimated stably, accurately, and in real time.
Specifically, for the first frame, a so-called specific-individual face model is used, which is fitted without letting the specific part contribute.
Here, if the fitting score for the parts other than the specific part is equal to or higher than the predetermined threshold, the fitting is regarded as accurate including the specific part, and from the next frame onward, in a moving image of, for example, 15 or 30 frames per second, the tracking process of the facial organ points is started, making use of the fact that the movement of each facial organ point position between frames can be regarded as minute.
That is, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, when the specific part is the right eye, the position of the right eye in the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted. In this way, the processing becomes equivalent to normal face model fitting, and as a result, the face orientation can be estimated stably, accurately, and in real time.
Further, the in-vehicle system 1 includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring result output from the driver monitoring device 10. Therefore, based on the monitoring result, the ECUs 40 can appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system that allows even a specific individual to drive with peace of mind.
Although the embodiments of the present invention have been described above, the above description is merely an example of the present invention in all respects, and it goes without saying that various improvements and modifications can be made without departing from the scope of the present invention.
In the above embodiment, the case where the image processing device according to the present invention is applied to the driver monitoring device 10 has been described, but the application is not limited to this. For example, the image processing device according to the present invention is applicable to devices and systems that monitor people who operate or monitor various facilities, such as machines and equipment in a factory, or who perform predetermined work, when the monitoring targets include the specific individual described above.
Further, in the above embodiment, the present invention has been described as applied to a specific individual (an individual whose facial features differ from the facial features common to people in general, even allowing for differences in age, gender, and the like), but the present invention is also applicable, as a specific individual, to a person whose nose and mouth are hidden by a mask, or to a person wearing an eyepatch.
The present invention is widely applicable to devices and systems that monitor objects such as people using a camera. For example, in addition to devices and systems that monitor the drivers (operators) of various moving bodies such as vehicles, it can be widely used in devices and systems that monitor people who operate or monitor various facilities such as machines and equipment in a factory, or who perform predetermined work.
[Additional Notes]
Embodiments of the present invention may also be described as in the following appendix, but are not limited thereto.
(Appendix 1)
An image processing method for processing an image input from an imaging unit, comprising:
a face detection step (S2, S3) of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step (S4) of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step (S2, S3) and a learned face feature amount (142a) of the specific individual trained for detecting the face of the specific individual;
a first face image processing step (S6, S7, S8) of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step (S4); and
a second face image processing step (S9, S10, S11) of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step (S4),
wherein the image processing includes a face orientation estimation process, and the first face image processing step (S6, S7, S8) includes a face orientation estimation step (S6) for the specific individual.
1 In-vehicle system
2 Vehicle
3 Driver
10 Driver monitoring device
11 Camera
12 Image processing unit
13 CPU
14 ROM
141 Program storage unit
142 Face feature amount storage unit
142a Face feature amount of specific individual
142b Normal face feature amount
15 RAM
151 Image memory
16 Communication unit
21 Image input unit
22 Face detection unit
23 Specific individual face detection unit
24 Normal face detection unit
25 Specific individual determination unit
26 First face image processing unit
27 Specific individual face orientation estimation unit
27a Specific-part-unused face orientation estimation unit
27b Face model fitting score calculation unit
27c Fitting score determination unit
27d Complementary processing unit
27e Normal face orientation estimation unit
27f Angle correction table
28 Specific individual eye open/close detection unit
29 Specific individual line-of-sight direction estimation unit
30 Second face image processing unit
31 Normal face orientation estimation unit
32 Normal eye open/close detection unit
33 Normal line-of-sight direction estimation unit
34 Output unit
40 ECU
41 Sensor
42 Actuator
43 Communication bus

Claims (15)

1. An image processing device that processes an image input from an imaging unit, comprising:
a face feature amount storage unit that stores, as learned face feature amounts trained for detecting a face from the image, a face feature amount of a specific individual and a normal face feature amount;
a face detection unit that detects a face region while extracting a feature amount for detecting a face from the image;
a specific individual determination unit that determines whether the face in the face region is the face of the specific individual, using the detected feature amount of the face region and the face feature amount of the specific individual;
a first face image processing unit that performs face image processing for the specific individual when the specific individual determination unit determines that the face is that of the specific individual; and
a second face image processing unit that performs normal face image processing when the specific individual determination unit determines that the face is not that of the specific individual,
wherein the image processing includes a face orientation estimation process, and the first face image processing unit includes a face orientation estimation unit for the specific individual.
2. The image processing device according to claim 1, wherein the face orientation estimation unit for the specific individual includes a specific-part-unused face orientation estimation unit that performs face orientation estimation by a face model fitting process that does not use a specific part of the specific individual.
3. The image processing device according to claim 2, further comprising:
a score calculation unit that calculates a face model fitting score for each part other than the specific part; and
a fitting score determination unit that determines whether the face model fitting scores for all parts other than the specific part satisfy a predetermined condition.
4. The image processing device according to claim 3, further comprising a complementary processing unit that complements the feature amount of the specific part in a tracking process when the fitting score determination unit determines that the face model fitting scores for all parts other than the specific part satisfy the predetermined condition.
5. The image processing device according to claim 4, further comprising a normal face orientation estimation unit that performs face orientation estimation by a normal face model fitting process after the feature amount of the specific part has been complemented.
6. The image processing device according to any one of claims 1 to 5, further comprising an angle correction table that corrects a deviation of the face orientation angle.
7. The image processing device according to any one of claims 1 to 5, wherein the specific individual determination unit
calculates a correlation coefficient between the feature amount extracted from the face region and the face feature amount of the specific individual, and
determines whether the face in the face region is the face of the specific individual based on the calculated correlation coefficient.
8. The image processing device according to claim 7, wherein the specific individual determination unit
determines that the face in the face region is the face of the specific individual when the correlation coefficient is larger than a predetermined threshold, and
determines that the face in the face region is not the face of the specific individual when the correlation coefficient is equal to or less than the predetermined threshold.
9. The image processing device according to any one of claims 1 to 5, wherein the face image processing includes at least one of a face detection process, a line-of-sight direction estimation process, and an eye open/close detection process.
10. A monitoring device comprising:
the image processing device according to any one of claims 1 to 5;
an imaging unit that captures an image to be input to the image processing device; and
an output unit that outputs information based on the image processing by the image processing device.
11. A control system comprising:
the monitoring device according to claim 10; and
one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.
12. The control system according to claim 11, wherein the monitoring device is a device for monitoring a driver of a vehicle, and
the control device includes an electronic control unit mounted on the vehicle.
13. An image processing method for processing an image input from an imaging unit, comprising:
a face detection step of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step and a learned face feature amount of the specific individual trained for detecting the face of the specific individual;
a first face image processing step of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step; and
a second face image processing step of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step,
wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
14. A computer program for causing at least one computer to process an image input from an imaging unit, the computer program causing the at least one computer to execute:
a face detection step of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step and a learned face feature amount of the specific individual trained for detecting the face of the specific individual;
a first face image processing step of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step; and
a second face image processing step of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step,
wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
15. A computer-readable storage medium storing a computer program for causing at least one computer to process an image input from an imaging unit, the computer program causing the at least one computer to execute:
a face detection step of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step and a learned face feature amount of the specific individual trained for detecting the face of the specific individual;
a first face image processing step of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step; and
a second face image processing step of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step,
wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
PCT/JP2020/029232 2019-08-02 2020-07-30 Image processing device, monitoring device, control system, image processing method, computer program, and recording medium WO2021024905A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019142705A JP2021026420A (en) 2019-08-02 2019-08-02 Image processing device, monitoring device, control system, image processing method, and computer program
JP2019-142705 2019-08-02

Publications (1)

Publication Number Publication Date
WO2021024905A1 true WO2021024905A1 (en) 2021-02-11

Family

ID=74504081

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/029232 WO2021024905A1 (en) 2019-08-02 2020-07-30 Image processing device, monitoring device, control system, image processing method, computer program, and recording medium

Country Status (2)

Country Link
JP (1) JP2021026420A (en)
WO (1) WO2021024905A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010198313A (en) * 2009-02-25 2010-09-09 Denso Corp Device for specifying degree of eye opening
WO2013008305A1 (en) * 2011-07-11 2013-01-17 トヨタ自動車株式会社 Eyelid detection device
JP2015194884A (en) * 2014-03-31 2015-11-05 パナソニックIpマネジメント株式会社 driver monitoring system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAGISAWA, YASUAKI: "Angle correction of face and improvement of authentication rate with FARSHAS", section 2 "Input by correcting the face orientation, creating reference data", Lecture Proceedings 2 of the 2010 Electronics Society Conference of IEICE *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030512A (en) * 2022-08-04 2023-04-28 荣耀终端有限公司 Gaze point detection method and device
CN116030512B (en) * 2022-08-04 2023-10-31 荣耀终端有限公司 Gaze point detection method and device

Also Published As

Publication number Publication date
JP2021026420A (en) 2021-02-22

Similar Documents

Publication Publication Date Title
Bila et al. Vehicles of the future: A survey of research on safety issues
Alioua et al. Driver head pose estimation using efficient descriptor fusion
EP2860664B1 (en) Face detection apparatus
Rezaei et al. Look at the driver, look at the road: No distraction! no accident!
Trivedi et al. Looking-in and looking-out of a vehicle: Computer-vision-based enhanced vehicle safety
CN111434553B (en) Brake system, method and device, and fatigue driving model training method and device
US20160180192A1 (en) System and method for partially occluded object detection
Rezaei et al. Simultaneous analysis of driver behaviour and road condition for driver distraction detection
CN109740477A (en) Study in Driver Fatigue State Surveillance System and its fatigue detection method
WO2021024905A1 (en) Image processing device, monitoring device, control system, image processing method, computer program, and recording medium
Rani et al. Development of an Automated Tool for Driver Drowsiness Detection
JP2004334786A (en) State detection device and state detection system
CN116012822B (en) Fatigue driving identification method and device and electronic equipment
Llorca et al. Stereo-based pedestrian detection in crosswalks for pedestrian behavioural modelling assessment
WO2020261820A1 (en) Image processing device, monitoring device, control system, image processing method, and program
WO2020261832A1 (en) Image processing device, monitoring device, control system, image processing method, and program
JP2021009503A (en) Personal data acquisition system, personal data acquisition method, face sensing parameter adjustment method for image processing device and computer program
US11345354B2 (en) Vehicle control device, vehicle control method and computer-readable medium containing program
CN111267865B (en) Vision-based safe driving early warning method and system and storage medium
Nowosielski Vision-based solutions for driver assistance
WO2021262166A1 (en) Operator evaluation and vehicle control based on eyewear data
Kim et al. Driving environment assessment using fusion of in-and out-of-vehicle vision systems
Bruno et al. Advanced driver assistance system based on neurofsm applied in the detection of autonomous human faults and support to semi-autonomous control for robotic vehicles
US20230260269A1 (en) Biometric task network
US20230260328A1 (en) Biometric task network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20851092

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20851092

Country of ref document: EP

Kind code of ref document: A1