WO2021024905A1 - Image processing device, monitoring device, control system, image processing method, computer program, and recording medium - Google Patents

Image processing device, monitoring device, control system, image processing method, computer program, and recording medium

Info

Publication number
WO2021024905A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
specific individual
image processing
unit
feature amount
Prior art date
Application number
PCT/JP2020/029232
Other languages
French (fr)
Japanese (ja)
Inventor
相澤 知禎
Original Assignee
Omron Corporation (オムロン株式会社)
Priority date
Filing date
Publication date
Application filed by Omron Corporation (オムロン株式会社)
Publication of WO2021024905A1

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60KARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
    • B60K35/00Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras

Definitions

  • The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium.
  • Patent Document 1 discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of the target (person) and provide that service.
  • The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU, and the CPU includes a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, a movement vector detection unit, and the like.
  • When the service is provided to a group of people who are found to have a relationship such as communicating with each other, the robot device decides on a first information providing method based on close interaction.
  • When the service is provided to a group of people whose relationship, such as communication with each other, is unknown, the robot device decides on a second information providing method in which information is provided unilaterally without interaction. With these, it is possible to provide an appropriate service according to the situation of the service provision target.
  • The face detection unit is configured to detect a person's face using the first camera, and a known technique can be used for the face detection.
  • However, for a specific individual in whom a part of the facial organs such as the eyes, nose, or mouth is missing or significantly deformed due to an injury, or in whom a large mole, swelling, or body decoration such as a tattoo appears on the face, the accuracy of face detection and face orientation estimation with such a known technique may decrease.
  • The present invention has been made in view of the above problems, and has an object of providing an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium capable of improving the accuracy of face orientation estimation for such a specific individual.
  • The image processing device (1) is an image processing device that processes an image input from an imaging unit, and includes:
  • a facial feature amount storage unit that stores the facial feature amount of a specific individual and a normal facial feature amount;
  • a face detection unit that detects a face region while extracting a feature amount for detecting a face from the image;
  • a specific individual determination unit that determines whether or not the face in the face region is the face of the specific individual;
  • a first face image processing unit that performs face image processing for the specific individual when the specific individual determination unit determines that the face is the face of the specific individual; and
  • a second face image processing unit that performs normal face image processing when the specific individual determination unit determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing unit includes a face orientation estimation unit for the specific individual.
  • In the facial feature amount storage unit, the facial feature amount of the specific individual and the normal facial feature amount (the facial feature amount used when the person is a person other than the specific individual) are stored as learned facial feature amounts, and the specific individual determination unit uses the feature amount of the face region detected by the face detection unit and the facial feature amount of the specific individual to determine whether or not the face in the face region is the face of the specific individual.
  • By using the facial feature amount of the specific individual, it is possible to accurately determine whether or not the face is the face of the specific individual. Further, when it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed accurately by the first face image processing unit.
  • On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the second face image processing unit can perform the normal face image processing with high accuracy.
  • Since the image processing includes a face orientation estimation process and the first face image processing unit includes a face orientation estimation unit for the specific individual, the face orientation estimation process for the specific individual, premised on face detection of the specific individual, can be performed stably and accurately in real time.
  • The image processing device (2) is the above image processing device (1), wherein
  • the face orientation estimation unit for the specific individual includes a specific-part-free face orientation estimation unit that estimates the face orientation by a face model fitting process that does not use a specific part of the specific individual.
  • According to the image processing device (2), by performing the face model fitting process that is not affected by the specific part, it is possible to accurately estimate the face orientation while maintaining real-time processing.
  • The image processing device (3) is the above image processing device (2), further including
  • a score calculation unit that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit that determines whether or not the face model fitting scores for all the parts other than the specific part satisfy a predetermined condition.
  • According to the image processing device (3), it is possible to accurately determine whether or not a highly accurate face orientation estimation process can be performed using only the parts excluding the specific part.
  • The image processing device (4) is the above image processing device (3), further including
  • a complementary processing unit that, when the fitting score determination unit determines that the face model fitting scores for all the parts other than the specific part satisfy the predetermined condition, complements the feature amount of the specific part in the tracking process.
  • According to the image processing device (4), the specific part, for example a left eye, a right eye, a nose, or a mouth, can thereafter be processed as a normal part.
  • The image processing device (5) is the above image processing device (4), further including a normal face orientation estimation unit that estimates the face orientation by a normal face model fitting process after the feature amount of the specific part has been complemented.
  • According to the image processing device (5), the face orientation can be estimated by the normal face model fitting process, and stable, high-speed, and highly accurate processing becomes possible.
  • The image processing device (6) is any one of the above image processing devices (1) to (5), further including an angle correction table that corrects a deviation of the estimated face orientation angle.
  • According to the image processing device (6), even if a certain angle deviation occurs in the estimated face orientation angle after the above processing is performed, the deviation of the face orientation angle can be corrected by using the angle correction table, which makes it easy to calculate the face orientation angle with high accuracy.
  • The image processing device (7) is any one of the above image processing devices (1) to (6), wherein
  • the specific individual determination unit calculates a correlation coefficient between the feature amount extracted from the face region and the facial feature amount of the specific individual, and determines, based on the calculated correlation coefficient, whether or not the face in the face region is the face of the specific individual.
  • According to the image processing device (7), the correlation coefficient between the feature amount extracted from the face region and the facial feature amount of the specific individual is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient. This makes it possible to efficiently determine, based on the correlation coefficient, whether or not the face in the face region is the face of the specific individual.
  • The image processing device (8) is the above image processing device (7), wherein
  • the specific individual determination unit determines that the face in the face region is the face of the specific individual when the correlation coefficient is larger than a predetermined threshold value, and determines that the face in the face region is not the face of the specific individual when the correlation coefficient is equal to or less than the predetermined threshold value.
  • According to the image processing device (8), when the correlation coefficient is larger than the predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual.
  • The processing efficiency of the determination can be further improved by the process of comparing the correlation coefficient with the predetermined threshold value.
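  • As a minimal, non-authoritative sketch of the determination described for the image processing devices (7) and (8), the following Python fragment computes a Pearson correlation coefficient between two feature vectors and compares it with a threshold. The vector layout and the 0.8 threshold are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def is_specific_individual(region_features: np.ndarray,
                           specific_features: np.ndarray,
                           threshold: float = 0.8) -> bool:
    """Return True when the correlation coefficient exceeds the threshold.

    region_features   : feature amounts extracted from the detected face region
    specific_features : learned facial feature amounts of the specific individual
    The 0.8 threshold is an arbitrary illustrative value.
    """
    r = np.corrcoef(region_features, specific_features)[0, 1]
    return r > threshold

# Toy usage with random vectors
rng = np.random.default_rng(0)
x = rng.random(128)
print(is_specific_individual(x, x * 0.9 + 0.05))   # highly correlated -> True
print(is_specific_individual(x, rng.random(128)))  # uncorrelated -> likely False
```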
  • The image processing device (9) is any one of the above image processing devices (1) to (8), wherein the face image processing includes at least one of a face detection process, a line-of-sight direction estimation process, and an eye open/close detection process.
  • According to the image processing device (9), since the face image processing includes at least one of the face detection process, the line-of-sight direction estimation process, and the eye open/close detection process, processes for estimating and detecting various facial behaviors of the specific individual or of people other than the specific individual can be performed accurately.
  • The monitoring device (1) includes any one of the above image processing devices (1) to (9),
  • an imaging unit that captures the image to be input to the image processing device, and an output unit that outputs information based on the image processing by the image processing device.
  • According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be monitored accurately, and information based on the image processing can be output from the output unit, so that a monitoring system or the like that uses the information can be constructed easily.
  • The control system (1) includes the above monitoring device (1), and one or more control devices that are communicably connected to the monitoring device and execute a predetermined process based on the information output from the monitoring device.
  • According to the control system (1), the one or more control devices can execute a predetermined process based on the information output from the monitoring device, so that it is possible to construct a system that can utilize not only the monitoring result of a normal person but also the monitoring result of the specific individual.
  • The control system (2) is the above control system (1), wherein
  • the monitoring device is a device for monitoring the driver of a vehicle, and
  • the control device includes an electronic control unit mounted on the vehicle.
  • According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be monitored accurately, and the electronic control unit can be made to appropriately execute a predetermined control based on the monitoring result. This makes it possible to construct a highly safe in-vehicle system in which even the specific individual can drive with peace of mind.
  • The image processing method (1) is an image processing method for processing an image input from an imaging unit, and includes:
  • a face detection step of detecting a face region while extracting a feature amount for detecting a face from the image;
  • a specific individual determination step of determining whether or not the face in the face region is the face of a specific individual, by using the feature amount of the face region detected in the face detection step and a learned facial feature amount of the specific individual that has been trained for detecting the face of the specific individual;
  • a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and
  • a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
  • According to the image processing method (1), in the specific individual determination step, whether or not the face in the face region is the face of the specific individual is determined by using the feature amount of the face region detected in the face detection step and the facial feature amount of the specific individual. By using the facial feature amount of the specific individual, it is possible to accurately determine whether or not the face is the face of the specific individual. Further, when it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed accurately in the first face image processing step. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be performed with high accuracy in the second face image processing step.
  • Accordingly, accurate face sensing can be performed both for the specific individual and for ordinary people other than the specific individual. Since the image processing includes a face orientation estimation process and the first face image processing step includes a face orientation estimation step for the specific individual, the face orientation estimation process for the specific individual can be performed stably, accurately, and in real time.
  • The computer program (1) is a computer program for causing at least one or more computers to process an image input from an imaging unit, and causes the at least one computer to execute:
  • a face detection step of detecting a face region while extracting a feature amount for detecting a face from the image; a specific individual determination step of determining whether or not the face in the face region is the face of a specific individual, by using the feature amount of the detected face region and a learned facial feature amount of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
  • According to the computer program (1), the at least one computer can determine whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region and the facial feature amount of the specific individual, and this determination can be made accurately. Further, when it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed with high accuracy. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be performed with high accuracy.
  • The computer program may be a computer program stored in a storage medium, or a computer program that can be transferred or executed via a communication network.
  • The computer-readable storage medium (1) is a computer-readable storage medium storing a program for causing at least one or more computers to process an image input from an imaging unit, the program causing the at least one computer to execute:
  • a face detection step of detecting a face region while extracting a feature amount for detecting a face from the image;
  • a specific individual determination step of determining whether or not the face in the face region is the face of a specific individual, by using the feature amount of the face region detected in the face detection step and a learned facial feature amount of the specific individual that has been trained for detecting the face of the specific individual;
  • a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual.
  • The image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
  • According to the computer-readable storage medium (1), by having the at least one computer read the program and execute each of the above steps, the same effect as that of the computer program (1) can be obtained.
  • The image processing device according to the present disclosure can be widely applied to, for example, devices and systems that monitor an object such as a person using a camera.
  • In addition to devices and systems for monitoring the drivers (operators) of various moving objects such as vehicles, the image processing device can also be applied to devices and systems that monitor people who operate or monitor various kinds of equipment such as machines and devices in a factory, or who perform predetermined work.
  • FIG. 1 is a schematic view showing an example in which the image processing device according to the present disclosure is applied to an in-vehicle system including a driver monitoring device.
  • The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state (for example, facial behavior) of the driver 3 of the vehicle 2, one or more ECUs (Electronic Control Units) 40 that control the running, steering, or braking of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state around the vehicle, and the like, and these are connected via a communication bus 43.
  • the in-vehicle system 1 is configured as, for example, an in-vehicle network system that communicates according to a CAN (Controller Area Network) protocol.
  • Communication standards other than CAN may also be adopted as the communication standard of the in-vehicle system 1.
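  • For readers unfamiliar with CAN, the sketch below shows how a monitoring result could be placed on a CAN bus using the python-can library. The arbitration ID, payload encoding, and channel name are purely illustrative assumptions; the patent does not specify any message format.

```python
import can  # python-can library

def send_driver_state(bus: can.BusABC, state_code: int) -> None:
    """Send a one-byte driver-state code (e.g. 0 = forward gaze, 1 = inattentive)
    on the in-vehicle CAN bus. The ID 0x321 and the encoding are made up for
    illustration only."""
    msg = can.Message(arbitration_id=0x321, data=[state_code & 0xFF], is_extended_id=False)
    bus.send(msg)

if __name__ == "__main__":
    # 'socketcan'/'vcan0' assumes a Linux virtual CAN interface set up beforehand.
    with can.Bus(interface="socketcan", channel="vcan0") as bus:
        send_driver_state(bus, 1)
```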
  • the driver monitoring device 10 is an example of the "monitoring device” according to the present invention
  • the in-vehicle system 1 is an example of the "control system” according to the present invention.
  • The driver monitoring device 10 is configured to include a camera 11 for photographing the face of the driver 3, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 that performs processing such as outputting information based on the image processing by the image processing unit 12 to a predetermined ECU 40 via the communication bus 43.
  • the image processing unit 12 is an example of the "image processing apparatus” according to the present invention
  • the camera 11 is an example of the "imaging unit” according to the present invention.
  • the driver monitoring device 10 detects the face of the driver 3 from the image taken by the camera 11, and detects the behavior of the face such as the direction of the face of the detected driver 3, the direction of the line of sight, or the open / closed state of the eyes.
  • the driver monitoring device 10 can determine the state of the driver 3, such as forward gaze, inattentiveness, dozing, backward facing, and prone, based on the detection results of these facial behaviors.
  • The driver monitoring device 10 outputs a signal based on the state determination of the driver 3 to the ECU 40, and the ECU 40 is configured to, based on the signal, for example, alert or warn the driver 3, notify the outside, or execute operation control of the vehicle 2 (for example, deceleration control or guidance control to the road shoulder).
  • One of the purposes of the driver monitoring device 10 is, for example, to stably and accurately estimate the face orientation of a specific individual in real time.
  • When the driver 3 of the vehicle 2 is a specific individual in whom a part of the facial organs such as the eyes, nose, or mouth is missing or greatly deformed due to, for example, an injury, or in whom a large mole or wart appears on the face, or in whom the arrangement of the facial organs deviates significantly from the average position due to body decoration such as a tattoo or a hereditary disease, the accuracy of estimating the face orientation from the image taken by the camera may decrease.
  • the driver monitoring device 10 adopts the following configuration in order to improve the accuracy of face orientation estimation for the specific individual.
  • The image processing unit 12 stores, as learned facial feature amounts that have been trained for detecting a face from an image, the facial feature amount of a specific individual and a normal facial feature amount (the facial feature amount used when the person is a person other than the specific individual).
  • The image processing unit 12 performs a face detection process of detecting the face region while extracting the feature amount for detecting a face from the input image of the camera 11. Then, the image processing unit 12 performs a specific individual determination process of determining whether or not the face in the face region is the face of the specific individual by using the detected feature amount of the face region and the facial feature amount of the specific individual.
  • In the specific individual determination process, an index showing the relationship between the feature amount extracted from the face region and the facial feature amount of the specific individual, for example a correlation coefficient, is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient. For example, when the correlation coefficient is larger than a predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual. An index other than the correlation coefficient may also be adopted in the specific individual determination process.
  • In the specific individual determination process, whether or not the face in the face region is the face of the specific individual may be determined based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11.
  • Since the image processing unit 12 stores the learned facial feature amount of the specific individual in advance, it can accurately determine whether or not the face is the face of the specific individual by using the facial feature amount of the specific individual.
  • When it is determined that the face is the face of the specific individual, the image processing unit 12 executes the face image processing for the specific individual, so that the face image processing for the specific individual can be performed accurately.
  • When it is determined that the face is not the face of the specific individual, the image processing unit 12 executes the normal face image processing, so that the normal face image processing can be performed accurately. Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, accurate face sensing can be performed.
  • FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.
  • The in-vehicle system 1 includes a driver monitoring device 10 for monitoring the state of the driver 3 of the vehicle 2, one or more ECUs 40, and one or more sensors 41, which are connected via a communication bus 43. Further, one or more actuators 42 are connected to the ECUs 40.
  • The driver monitoring device 10 includes a camera 11, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 for exchanging data and signals with the external ECUs 40 and the like.
  • the camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat.
  • The image pickup device unit includes, for example, an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor), a filter, a microlens, and the like.
  • the image pickup device unit may be an element capable of forming a photographed image by receiving light in a visible region, or may be an element capable of forming a photographed image by receiving light in a near infrared region.
  • the light irradiation unit is configured to include a light emitting element such as an LED (Light Emitting Diode), and may include a near infrared LED or the like so that the driver's face can be imaged day or night.
  • the camera 11 shoots an image at a predetermined frame rate (for example, several tens of frames per second), and the data of the shot image is input to the image processing unit 12.
  • The camera 11 may be integrated with the driver monitoring device 10, or may be an external type.
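  • Purely as an illustration of grabbing frames at a fixed rate and handing them to an image-processing stage, here is a short OpenCV sketch; camera index 0 and the 30 fps setting are assumptions (the patent only says "several tens of frames per second"), and an actual driver monitor would typically use a near-infrared camera.

```python
import cv2

def capture_frames(process, camera_index: int = 0) -> None:
    """Read frames from a camera and pass each one to `process(frame)`.
    `process` stands in for the image processing unit 12."""
    cap = cv2.VideoCapture(camera_index)
    cap.set(cv2.CAP_PROP_FPS, 30)  # assumed frame rate
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            process(frame)
    finally:
        cap.release()

capture_frames(lambda frame: print("frame shape:", frame.shape))
```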
  • the image processing unit 12 is configured as an image processing device including one or more CPUs (Central Processing Units) 13, ROMs (Read Only Memory) 14, and RAMs (Random Access Memory) 15.
  • the ROM 14 includes a program storage unit 141 and a facial feature amount storage unit 142
  • the RAM 15 includes an image memory 151 for storing an input image from the camera 11.
  • the driver monitoring device 10 may be provided with another storage unit, and the storage unit may be used as the program storage unit 141, the facial feature amount storage unit 142, and the image memory 151.
  • the other storage unit may be a semiconductor memory or a storage medium that can be read by a disk drive or the like.
  • The CPU 13 is an example of a hardware processor. It reads, interprets, and executes data such as the computer program stored in the program storage unit 141 of the ROM 14 and the facial feature amounts stored in the facial feature amount storage unit 142, and processes the image input from the camera 11, for example by face image processing such as face detection processing and face orientation estimation processing. Further, the CPU 13 outputs the results obtained by the face image processing (for example, processed data, determination signals, and control signals) to the ECU 40 and the like via the communication unit 16.
  • The facial feature amount storage unit 142 stores, as learned facial feature amounts that have been trained for detecting a face from an image, the facial feature amount 142a of a specific individual and the normal facial feature amount 142b shown in FIG. 3.
  • As the learned facial feature amounts, various feature amounts effective for detecting a face from an image can be used. For example, a feature amount focusing on the difference in brightness in a local region of the face (the difference in average brightness between two rectangular areas of various sizes), that is, a Haar-like feature amount, may be used.
  • Alternatively, a feature amount focusing on a combination of brightness distributions in a local region of the face (an LBP (Local Binary Pattern) feature amount), or a feature amount focusing on a combination of brightness distributions in the gradient direction in a local region of the face (an HOG (Histogram of Oriented Gradients) feature amount), may be used.
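  • As a hedged illustration of two of the feature types mentioned above, the following sketch computes LBP and HOG descriptors for a face crop with scikit-image. It is not the patent's feature extractor, only a standard library equivalent; the crop size and histogram binning are assumptions.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern

def describe_face(gray_face: np.ndarray) -> np.ndarray:
    """Concatenate a HOG descriptor and a uniform-LBP histogram for a grayscale face crop."""
    hog_vec = hog(gray_face, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2), feature_vector=True)
    lbp = local_binary_pattern(gray_face, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=np.arange(0, 11), density=True)
    return np.concatenate([hog_vec, lbp_hist])

face = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in for a detected face region
print(describe_face(face).shape)
```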
  • Machine learning is a process of finding patterns inherent in data (learning data) by a computer.
  • AdaBoost may be used as an example of a statistical learning method.
  • AdaBoost is a learning algorithm that prepares a large number of discriminators with low discriminating ability (weak discriminators), selects weak discriminators with small error rates from among them, adjusts parameters such as weights, and combines them in a hierarchical structure to construct a strong discriminator.
  • The discriminator is also referred to as a classifier or a learner.
  • One feature amount effective for face detection is discriminated by one weak discriminator, a large number of weak discriminators and their combinations are selected by AdaBoost, and a strong discriminator having a hierarchical structure is constructed from them. Note that one weak discriminator outputs, for example, information such as 1 for a face and 0 for a non-face.
  • A learning method called Real AdaBoost, which can output a real number from 0 to 1 instead of 0 or 1, may also be adopted.
  • a neural network having an input layer, an intermediate layer, and an output layer may be adopted.
  • A large number of face images taken under various conditions and a large number of images that are not faces (non-face images) are given as training data to a learning device equipped with such a learning algorithm, and by repeating the learning and adjusting the weights, a strong discriminator having a hierarchical structure capable of detecting a face with high accuracy is constructed.
  • One or more feature amounts used in the weak discriminators of each layer constituting the strong discriminator are used as the learned facial feature amounts.
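  • The patent's hierarchical detector is not reproduced here, but the general idea of boosting weak discriminators (decision stumps) into a strong classifier can be sketched with scikit-learn as follows; the feature vectors and labels are random placeholders for the face / non-face training data described above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Placeholder training data: rows are feature vectors (e.g. Haar-like responses),
# labels are 1 for "face" and 0 for "non-face", as in the text above.
rng = np.random.default_rng(0)
X = rng.random((1000, 50))
y = (X[:, 0] + 0.2 * rng.random(1000) > 0.5).astype(int)

# By default AdaBoostClassifier boosts depth-1 decision trees (decision stumps),
# i.e. weak discriminators that each threshold a single feature.
strong = AdaBoostClassifier(n_estimators=200, learning_rate=0.5, random_state=0)
strong.fit(X, y)
print("training accuracy:", strong.score(X, y))
```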
  • The facial feature amount 142a of a specific individual is a set of parameters indicating the facial features of the specific individual; it is obtained by individually capturing face images of the specific individual at a predetermined place under various conditions (such as various face orientations, line-of-sight directions, and eye open/closed states), inputting a large number of these captured images to the learning device as teacher data, and adjusting the parameters through the learning process.
  • the facial feature amount 142a of the specific individual may be, for example, a combination pattern of the difference in brightness of the local region of the face obtained by the learning process.
  • The facial feature amount 142a of a specific individual stored in the facial feature amount storage unit 142 may be the facial feature amount of only one specific individual, or, when a plurality of specific individuals drive the vehicle 2, may be the facial feature amounts of the plurality of specific individuals.
  • The normal facial feature amount 142b is a set of parameters indicating normal human facial features; it is obtained by inputting, as teacher data, images of normal human faces captured under various conditions (such as various face orientations, line-of-sight directions, and eye open/closed states) to the above learning device and adjusting the parameters through the learning process.
  • the normal facial feature amount 142b may be, for example, a combination pattern of the difference in brightness of the local region of the face obtained by the learning process.
  • The learned facial feature amounts may be acquired from a server on the cloud via a communication network such as the Internet or a mobile phone network and stored in the facial feature amount storage unit 142.
  • the ECU 40 is composed of a computer device including one or more processors, a memory, a communication module, and the like. Then, the processor mounted on the ECU 40 reads, interprets, and executes the program stored in the memory, so that predetermined control for the actuator 42 and the like is executed.
  • the ECU 40 is configured to include, for example, at least one of a traveling system ECU, a driving support system ECU, a body system ECU, and an information system ECU.
  • the traveling system ECU includes, for example, a drive system ECU, a chassis system ECU, and the like.
  • the drive system ECU includes a control unit related to a "running" function such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, or transmission control.
  • the chassis-based ECU includes a control unit related to a "stop, turn” function such as brake control or steering control.
  • The driving support system ECU includes at least one control unit for functions that improve safety or realize comfortable driving in cooperation with the traveling-system ECUs (driving support functions or automatic driving functions), such as an automatic braking support function, a lane keeping support function (LKA / Lane Keep Assist), a constant speed driving / inter-vehicle distance support function (ACC / Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.
  • The driving support system ECU may be equipped with at least one of the functions of Level 1 (driver assistance), Level 2 (partially automated driving), and Level 3 (conditionally automated driving) of the driving automation levels defined by SAE (Society of Automotive Engineers). It may further be equipped with the functions of Level 4 (highly automated driving) or Level 5 (fully automated driving), or may be equipped with only the functions of Levels 1 and 2, or only Levels 2 and 3. Further, the in-vehicle system 1 may be configured as an automatic driving system.
  • the body system ECU may be configured to include at least one control unit related to the function of the vehicle body such as a door lock, a smart key, a power window, an air conditioner, a light, an instrument panel, or a blinker.
  • the information system ECU may be configured to include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems) related device.
  • the infotainment device includes, for example, an HMI (Human Machine Interface) device that functions as a user interface, a car navigation device, an audio device, and the like.
  • the telematics device includes a communication unit for communicating with the outside.
  • The ITS-related device may include an ETC (Electronic Toll Collection System) device, or a communication unit for performing road-to-vehicle communication with roadside units such as ITS spots, or vehicle-to-vehicle communication.
  • The sensors 41 may include various in-vehicle sensors that acquire the sensing data necessary for the ECU 40 to control the operation of the actuators 42.
  • For example, they may include vehicle speed sensors, shift position sensors, accelerator opening sensors, brake pedal sensors, steering sensors, and the like, as well as surroundings monitoring sensors such as cameras for outside the vehicle, radars such as millimeter-wave radar, lidars, and ultrasonic sensors.
  • The actuator 42 is a device that executes operations related to traveling, steering, braking, and the like of the vehicle 2 based on control signals from the ECU 40, and includes, for example, an engine, a motor, a transmission, and hydraulic or electric cylinders.
  • FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • the image processing unit 12 includes an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and a face feature amount storage unit 142. It is configured to include.
  • the image input unit 21 performs a process of capturing an image including the face of the driver 3 taken by the camera 11.
  • The face detection unit 22 includes a face detection unit 23 for a specific individual and a normal face detection unit 24, and performs a process of detecting a face region while extracting a feature amount for detecting a face from the input image.
  • the face detection unit 23 of the specific individual uses the face feature amount 142a of the specific individual read from the face feature amount storage unit 142 to perform a process of detecting the face region from the input image.
  • the normal face detection unit 24 uses the normal face feature amount 142b read from the face feature amount storage unit 142 to perform a process of detecting a face region from an input image.
  • the method of detecting the face area from the image is not particularly limited, but a method of detecting the face area at high speed and with high accuracy is adopted.
  • the face detection unit 22 extracts features for detecting a face in each search area while scanning a predetermined search area (search window) on the input image, for example.
  • the face detection unit 22 extracts, for example, the difference in brightness (luminance difference) of a local region of the face, the edge strength, or the relationship between these local regions as a feature amount.
  • The face detection unit 22 uses the feature amount extracted from each search area and the normal facial feature amount 142b or the facial feature amount 142a of a specific individual read from the facial feature amount storage unit 142 to determine, with a detector having a hierarchical structure (a hierarchy that first captures the face roughly and then captures the details of the face), whether the search area is a face or a non-face, and thereby detects the face region from the image.
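  • A heavily simplified sliding-window loop is shown below to make the search-window idea concrete; the classifier argument is a placeholder for the hierarchical (coarse-to-fine) detector, and the window size, stride, and toy decision rule are arbitrary assumptions.

```python
import numpy as np

def scan_for_faces(image: np.ndarray, classify_window, win: int = 64, stride: int = 16):
    """Slide a win x win search window over a grayscale image and collect
    the windows that the classifier accepts as a face."""
    detections = []
    h, w = image.shape
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            patch = image[y:y + win, x:x + win]
            if classify_window(patch):  # stand-in for the hierarchical detector
                detections.append((x, y, win, win))
    return detections

img = (np.random.rand(240, 320) * 255).astype(np.uint8)
# Toy classifier: "face" if the window is brighter than average (illustration only).
boxes = scan_for_faces(img, lambda p: p.mean() > img.mean())
print(len(boxes), "candidate windows")
```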
  • The specific individual determination unit 25 uses the feature amount of the face region detected by the face detection unit 22 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142 to determine whether or not the face in the face region is the face of the specific individual.
  • The specific individual determination unit 25 calculates an index showing the relationship between the feature amount extracted from the face region and the facial feature amount 142a of the specific individual, for example a correlation coefficient, and determines, based on the calculated correlation coefficient, whether or not the face in the face region is the face of the specific individual. For example, the correlation of feature amounts such as Haar-like features of one or more local regions in the face region can be obtained. Then, for example, when the correlation coefficient is larger than a predetermined threshold value, the face in the detected face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the detected face region is determined not to be the face of the specific individual.
  • The specific individual determination unit 25 may determine whether or not the face in the detected face region is the face of the specific individual based on the result of the determination for one frame of the input image from the camera 11, or based on the results of the determination for a plurality of frames of the input image from the camera 11.
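  • The single-frame versus multi-frame decision described above might look like the following sketch, where per-frame determination results are accumulated and a vote over the last N frames is taken. The window length and voting rule are assumptions, since the patent leaves the multi-frame criterion open.

```python
from collections import deque

class SpecificIndividualVoter:
    """Accumulate per-frame determinations and decide over the last `window` frames."""
    def __init__(self, window: int = 10, required: int = 6):
        self.history = deque(maxlen=window)
        self.required = required  # number of positive frames needed (assumed value)

    def update(self, frame_is_specific: bool) -> bool:
        self.history.append(frame_is_specific)
        return sum(self.history) >= self.required

voter = SpecificIndividualVoter()
decision = False
for result in [True, True, False, True, True, True, True]:
    decision = voter.update(result)
print("specific individual over recent frames:", decision)
```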
  • When the specific individual determination unit 25 determines that the face is the face of a specific individual, the first face image processing unit 26 performs face image processing for the specific individual using the facial feature amount 142a of the specific individual.
  • The illustrated first face image processing unit 26 includes a face orientation estimation unit 27 for the specific individual, an eye open/close detection unit 28 for the specific individual, and a line-of-sight direction estimation unit 29 for the specific individual. Other facial behavior estimation units and detection units may also be included.
  • the face orientation estimation unit 27 of the specific individual performs a process of estimating the face orientation of the specific individual.
  • The face orientation estimation unit 27 for the specific individual detects, for example, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the face detection unit 23 for the specific individual using the facial feature amount 142a of the specific individual, and estimates the orientation of the face based on the detected positions and shapes of the facial organs.
  • the method for detecting the facial organs from the facial region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy.
  • a method of creating a 3D face shape model, fitting it to a face region on a two-dimensional image, and detecting the position and shape of each organ of the face can be adopted.
  • a technique for fitting a 3D face shape model to a human face in an image for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be adopted, but the technique is not limited thereto.
  • The face orientation estimation unit 27 for the specific individual may output, as the estimation data of the face orientation of the specific individual, for example, the pitch angle around the left-right axis, the yaw angle around the up-down axis, and the roll angle around the front-rear axis, which are included in the parameters of the 3D face shape model.
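  • One standard way to obtain pitch, yaw, and roll from detected facial landmarks is to fit a small 3D point model with OpenCV's solvePnP and decompose the resulting rotation. This is a generic head-pose recipe offered only for orientation; it is not the fitting algorithm of JP 2007-249280 cited above, and the 3D model coordinates and camera intrinsics below are rough assumptions.

```python
import numpy as np
import cv2

# Approximate 3D positions (in millimetres) of a few landmarks in a generic face model.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),        # nose tip
    (0.0, -63.6, -12.5),    # chin
    (-43.3, 32.7, -26.0),   # left eye outer corner
    (43.3, 32.7, -26.0),    # right eye outer corner
    (-28.9, -28.9, -24.1),  # left mouth corner
    (28.9, -28.9, -24.1),   # right mouth corner
], dtype=np.float64)

def head_pose(image_points: np.ndarray, frame_size=(480, 640)):
    """Estimate (pitch, yaw, roll) in degrees from six 2D landmark positions."""
    h, w = frame_size
    f = w  # crude focal-length guess: focal length ~ image width
    camera_matrix = np.array([[f, 0, w / 2], [0, f, h / 2], [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, image_points, camera_matrix, dist_coeffs)
    rot, _ = cv2.Rodrigues(rvec)
    # Decompose the rotation matrix into Euler angles (x = pitch, y = yaw, z = roll).
    sy = np.sqrt(rot[0, 0] ** 2 + rot[1, 0] ** 2)
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return pitch, yaw, roll

# Toy 2D landmark positions (pixels), in the same order as MODEL_POINTS.
landmarks_2d = np.array([(320, 240), (318, 350), (250, 200), (390, 200),
                         (280, 300), (360, 300)], dtype=np.float64)
print(head_pose(landmarks_2d))
```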
  • the eye opening / closing detection unit 28 of the specific individual performs a process of detecting the opening / closing state of the eyes of the specific individual.
  • The eye open/close detection unit 28 for the specific individual detects the open/closed state of the eyes, for example whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained by the face orientation estimation unit 27 for the specific individual, in particular the positions and shapes of the feature points of the eyes (eyelids and pupils).
  • The open/closed state of the eyes may also be detected by learning in advance, with a learner, the feature amounts of eye images in various open/closed states (the position of the eyelid, the shape of the pupil, the area sizes of the white and dark parts of the eye, and the like) and evaluating the degree of similarity with the learned feature amount data.
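  • A widely used (though not patent-specific) way to turn eyelid landmark positions into an open/closed decision is the eye aspect ratio (EAR); the sketch below assumes a common six-point eye landmark convention and an arbitrary 0.2 threshold.

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """`eye` is a (6, 2) array of landmarks ordered: outer corner, two upper-lid
    points, inner corner, two lower-lid points (a common 6-point convention)."""
    v1 = np.linalg.norm(eye[1] - eye[5])   # vertical lid-to-lid distances
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal eye width
    return (v1 + v2) / (2.0 * h)

def eye_is_closed(eye: np.ndarray, threshold: float = 0.2) -> bool:
    return eye_aspect_ratio(eye) < threshold  # threshold is an assumed value

open_eye = np.array([(0, 0), (2, 3), (4, 3), (6, 0), (4, -3), (2, -3)], dtype=float)
print(eye_aspect_ratio(open_eye), eye_is_closed(open_eye))
```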
  • the line-of-sight direction estimation unit 29 of the specific individual performs a process of estimating the line-of-sight direction of the specific individual.
  • The line-of-sight direction estimation unit 29 for the specific individual estimates the direction of the line of sight based on, for example, the orientation of the face of the driver 3 and the positions and shapes of the facial organs of the driver 3, in particular the positions and shapes of the feature points of the eyes (outer corners of the eyes, inner corners of the eyes, and pupils).
  • The direction of the line of sight is the direction in which the driver 3 is looking, and is determined, for example, by a combination of the orientation of the face and the orientation of the eyes.
  • The direction of the line of sight may also be estimated by learning in advance, with a learner, the feature amounts of eye images for various combinations of face orientation and eye orientation (the relative positions of the outer corner of the eye, the inner corner of the eye, and the pupil, the relative positions of the white and dark parts of the eye, shading, texture, and the like) and evaluating the degree of similarity with the learned feature amount data.
  • Alternatively, the line-of-sight direction estimation unit 29 for the specific individual may estimate the size and center position of the eyeball from the size and orientation of the face and the positions of the eyes by using the fitting result of the 3D face shape model, detect the center position of the pupil, and estimate the vector connecting the center of the eyeball and the center of the pupil as the line-of-sight direction.
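  • The eyeball-centre approach mentioned above can be expressed as the following toy computation; all coordinates are assumed values, and the real estimation works from the fitted 3D face model rather than hand-supplied points.

```python
import numpy as np

def gaze_direction(eyeball_center: np.ndarray, pupil_center: np.ndarray) -> np.ndarray:
    """Return the unit vector from the estimated eyeball centre to the detected
    pupil centre, interpreted as the line-of-sight direction."""
    v = pupil_center - eyeball_center
    return v / np.linalg.norm(v)

# Toy 3D points (camera coordinates, arbitrary units)
eyeball = np.array([0.0, 0.0, 60.0])
pupil = np.array([2.0, -1.0, 48.5])
print(gaze_direction(eyeball, pupil))
```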
  • the second face image processing unit 30 performs normal face image processing using the normal face feature amount 142b.
  • the second face image processing unit 30 may include a normal face orientation estimation unit 31, a normal eye opening / closing detection unit 32, and a normal line-of-sight direction estimation unit 33.
  • The processes performed by the normal face orientation estimation unit 31, the normal eye open/close detection unit 32, and the normal line-of-sight direction estimation unit 33 are basically the same as those of the face orientation estimation unit 27 for the specific individual, the eye open/close detection unit 28 for the specific individual, and the line-of-sight direction estimation unit 29 for the specific individual, except that the normal facial feature amount 142b is used, so their description is omitted here.
  • the output unit 34 performs a process of outputting information based on the image processing by the image processing unit 12 to the ECU 40 or the like.
  • The information based on the image processing may be, for example, information on facial behavior such as the face orientation of the driver 3, the line-of-sight direction, or the open/closed state of the eyes, or information on the state of the driver 3 determined based on the detection results of the facial behavior (for example, forward gaze, inattentiveness, dozing, facing backward, or prone). Further, the information based on the image processing may be a predetermined control signal based on the state determination of the driver 3 (a control signal for performing an alert or warning process, a control signal for performing operation control of the vehicle 2, or the like).
  • FIG. 4 is a flowchart showing an example of a processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment.
  • The camera 11 captures images at several tens of frames per second, and this processing is performed for every frame, or for every few frames at regular intervals.
  • step S1 the CPU 13 operates as an image input unit 21, and a process of reading an image (an image including the face of the driver 3) taken by the camera 11 is performed, and then the process proceeds to step S2.
  • step S2 the CPU 13 operates as a normal face detection unit 24, performs normal face detection processing on the input image, and then proceeds to step S3.
  • In step S2, for example, while a predetermined search area (search window) is scanned over the input image, a feature amount for detecting a face in each search area is extracted. Then, using the feature amount extracted from the search area and the normal facial feature amount 142b read from the facial feature amount storage unit 142, it is determined whether the area is a face or a non-face, and the face region is detected from the image.
  • step S3 the CPU 13 operates as the face detection unit 23 of the specific individual, performs face detection processing of the specific individual on the input image, and then proceeds to step S4.
  • In step S3, for example, while a predetermined search area (search window) is scanned over the input image, a feature amount for detecting a face in each search area is extracted. Then, using the feature amount extracted from the search area and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, it is determined whether the area is a face or a non-face, and the face region is detected from the image.
  • the processes of step S2 and step S3 may be performed in parallel in one step, or may be performed in combination.
  • step S4 the CPU 13 operates as the specific individual determination unit 25, and the feature amount of the face area detected in steps S2 and S3 and the face feature amount 142a of the specific individual read from the face feature amount storage unit 142. Is used to perform a process of determining whether or not the face in the face area is the face of a specific individual, and then the process proceeds to step S5.
  • step S5 it is determined whether or not the result of the determination process in step S4 is the face of a specific individual, and if it is determined that the face is the face of a specific individual, the process proceeds to step S6 thereafter.
  • In step S6, the CPU 13 operates as the face orientation estimation unit 27 for the specific individual; for example, using the facial feature amount 142a of the specific individual, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S3, the orientation of the face is estimated based on the detected positions and shapes of the facial organs, and then the process proceeds to step S7.
  • In step S7, the CPU 13 operates as the eye open/close detection unit 28 for the specific individual, detects the open/closed state of the eyes, for example whether the eyes are open or closed, based on, for example, the positions and shapes of the facial organs obtained in step S6, in particular the positions and shapes of the eye feature points (eyelids and pupils), and then the process proceeds to step S8.
  • In step S8, the CPU 13 operates as the line-of-sight direction estimation unit 29 for the specific individual, estimates the direction of the line of sight based on, for example, the face orientation and the positions and shapes of the facial organs obtained in step S6, in particular the positions and shapes of the feature points of the eyes (outer corners of the eyes, inner corners of the eyes, and pupils), and then the process ends.
  • step S5 if it is determined that the face is not a specific individual's face, in other words, it is a normal face, the process proceeds to step S9.
  • In step S9, the CPU 13 operates as the normal face orientation estimation unit 31; for example, using the normal facial feature amount 142b, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S2, the orientation of the face is estimated based on the detected positions and shapes of the facial organs, and then the process proceeds to step S10.
  • In step S10, the CPU 13 operates as the normal eye open/close detection unit 32, detects the open/closed state of the eyes, for example whether the eyes are open or closed, based on, for example, the positions and shapes of the facial organs obtained in step S9, in particular the positions and shapes of the eye feature points (eyelids and pupils), and then the process proceeds to step S11.
  • In step S11, the CPU 13 operates as the normal line-of-sight direction estimation unit 33, estimates the direction of the line of sight based on, for example, the face orientation and the positions and shapes of the facial organs obtained in step S9, in particular the positions and shapes of the feature points of the eyes (outer corners of the eyes, inner corners of the eyes, and pupils), and then the process ends.
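  • Collapsing the flow of FIG. 4 into a Python-style sketch gives the fragment below; every callable is a placeholder for the corresponding unit (steps S1 to S11), and none of the names correspond to actual patent code.

```python
def run_one_frame(frame, specific_features, normal_features,
                  detect, is_specific, specific_pipeline, normal_pipeline):
    """One pass of the FIG. 4 flow. The callables stand in for:
    detect            -> steps S2/S3 (normal and specific-individual face detection)
    is_specific       -> steps S4/S5 (specific individual determination)
    specific_pipeline -> steps S6-S8 (face orientation, eye open/close, gaze for the specific individual)
    normal_pipeline   -> steps S9-S11 (the same three estimations for a normal face)
    """
    region = detect(frame, specific_features, normal_features)      # S2, S3
    if region is None:
        return None
    if is_specific(region, specific_features):                      # S4, S5
        return specific_pipeline(frame, region, specific_features)  # S6-S8
    return normal_pipeline(frame, region, normal_features)          # S9-S11

# Toy run with trivial stand-ins
result = run_one_frame(
    frame="image", specific_features="A", normal_features="B",
    detect=lambda f, s, n: "face region",
    is_specific=lambda r, s: True,
    specific_pipeline=lambda f, r, s: {"pipeline": "specific", "outputs": "pitch/yaw/roll, eye state, gaze"},
    normal_pipeline=lambda f, r, n: {"pipeline": "normal"},
)
print(result)
```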
  • FIG. 5 is a flowchart showing an example of a specific individual determination processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment.
  • This processing operation is an example of the specific individual determination processing operation in step S4 shown in FIG. 4, and shows an example of the processing operation when determining with one input image (1 frame).
  • In step S21, the CPU 13 reads the feature amount extracted from the face region detected by the face detection processing in steps S2 and S3 shown in FIG. 4, and then the process proceeds to step S22.
  • step S22 the learned facial feature amount 142a of the specific individual is read from the face feature amount storage unit 142 (FIG. 3), and then the process proceeds to step S23.
  • In step S23, a process of calculating the correlation coefficient between the feature amount extracted from the face region read in step S21 and the facial feature amount 142a of the specific individual read in step S22 is performed, and then the process proceeds to step S24.
  • In step S24, it is determined whether or not the calculated correlation coefficient is larger than a predetermined threshold value for determining whether or not the face belongs to the specific individual. If the correlation coefficient is larger than the predetermined threshold value, in other words, if it is determined that the feature amount extracted from the face region has a high correlation (high similarity) with the face feature amount 142a of the specific individual, the process proceeds to step S25. In step S25, it is determined that the face detected in the face region is the face of the specific individual, and then the process ends.
  • If it is determined in step S24 that the correlation coefficient is equal to or less than the predetermined threshold value, in other words, that the correlation between the feature amount extracted from the face region and the face feature amount 142a of the specific individual is low (the similarity is low), the process proceeds to step S26. In step S26, it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, and then the process ends.
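A minimal sketch of the determination in steps S23 to S26, assuming that the extracted feature amount and the learned face feature amount 142a can each be flattened into numeric vectors of equal length. The use of numpy's corrcoef and the threshold value are illustrative assumptions, not details specified by the embodiment.

```python
import numpy as np

def is_specific_individual(face_features, specific_features, threshold=0.8):
    """Steps S23-S26: correlate the extracted feature amount with the learned
    face feature amount 142a of the specific individual and compare with a threshold."""
    r = np.corrcoef(face_features, specific_features)[0, 1]   # step S23
    return r > threshold                                      # step S24

# Hypothetical 1-D feature vectors
extracted = np.array([0.12, 0.80, 0.33, 0.57])
learned   = np.array([0.10, 0.78, 0.35, 0.60])
print(is_specific_individual(extracted, learned))   # -> True (high correlation)
```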
  • FIG. 6 is a block diagram showing a more detailed functional configuration example of the face orientation estimation unit 27 of a specific individual in the image processing unit 12 of the driver monitoring device 10.
  • the face orientation estimation unit 27 of the specific individual performs a process of estimating the face orientation of the specific individual.
  • The face orientation estimation unit 27 for the specific individual detects, for example, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the face detection unit 23 for the specific individual, and performs a process of estimating the orientation of the face based on the detected positions and shapes of the facial organs.
  • the method for detecting the facial organs from the facial region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy.
  • For example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each organ of the face can be adopted.
  • The face orientation estimation unit 27 for the specific individual includes a face model fitting unit 27a that performs the face model fitting process without using the specific part of the specific individual. For example, in the first frame of an image captured at 15 to 30 frames per second, the face model fitting process is performed without letting the specific part of the specific individual contribute. By performing a face model fitting process that is not affected by the specific part, high-speed processing close to the normal face model fitting process can be realized.
  • The face orientation estimation unit 27 for the specific individual further includes a score calculation unit 27b that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit 27c that determines whether or not the face model fitting scores for all parts other than the specific part exceed a predetermined threshold value. Owing to the fitting score determination unit 27c, it can be accurately determined whether or not a highly accurate face orientation estimation process can be performed using only the parts excluding the specific part.
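The score calculation unit 27b and the fitting score determination unit 27c can be illustrated with the following sketch; the part names, score values, and threshold are assumptions for the sketch only.

```python
def fitting_ok_excluding(scores, specific_part, threshold=0.6):
    """Score calculation unit 27b / fitting score determination unit 27c:
    check whether every part other than the specific part fits well enough."""
    others = {part: s for part, s in scores.items() if part != specific_part}
    return all(s > threshold for s in others.values())

# Hypothetical per-part face model fitting scores
scores = {"left_eye": 0.82, "right_eye": 0.05, "nose": 0.77, "mouth": 0.71, "eyebrows": 0.69}
print(fitting_ok_excluding(scores, specific_part="right_eye"))  # -> True
```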
  • When the fitting score determination unit 27c determines that the face model fitting scores for all parts other than the specific part exceed the predetermined threshold value, the face orientation estimation unit 27 for the specific individual further uses a complement processing unit 27d that complements the feature amount of the specific part.
  • By providing the complement processing unit 27d that complements the feature amount of the specific part, the specific part can be treated as a normal part, for example, a left eye, a right eye, a nose, or a mouth.
  • The face orientation estimation unit 27 for the specific individual further includes a normal face orientation estimation unit 27e that performs the face orientation estimation process by the normal face model fitting process after the feature amount of the specific part has been complemented. Even if the specific part exists, as long as the face orientation estimation process can be performed by the normal face model fitting process, stable, highly accurate, real-time face orientation estimation can be realized.
  • As the estimation data of the face orientation of the specific individual, the face orientation estimation unit 27 for the specific individual may output, for example, the pitch angle around the left-right axis, the yaw angle around the up-down (vertical) axis, and the roll angle around the front-back axis, which are included in the parameters of the 3D face shape model.
  • the face orientation estimation unit 27 of the specific individual further includes an angle correction table 27f for correcting the deviation of the face orientation angle.
  • In this angle correction table 27f, angle correction data for each specific individual is stored in advance; the data is obtained, for example, by capturing images of each specific individual beforehand at a predetermined place under various conditions (conditions such as various face orientations, line-of-sight directions, or eye open/closed states), inputting them to a learning device as teacher data, and adjusting the data through the learning process. If a certain deviation occurs in the estimated face orientation angle even after performing the face model fitting process that does not use the specific part of the specific individual, the deviation of the face orientation angle is corrected using the angle correction table. This correction process makes it easy to calculate the face orientation angle with high accuracy.
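The embodiment describes the angle correction table 27f only as learned correction data prepared per specific individual; its internal structure is not specified. The following hedged sketch therefore assumes a simple table keyed by a quantized (pitch, yaw, roll) pose, with an additive offset looked up and applied in the correction step.

```python
import numpy as np

# Hypothetical per-individual correction table: quantized (pitch, yaw, roll) bin -> offset.
# The real table 27f would be produced by the learning process described above.
correction_table = {
    (0, -30, 0): np.array([ 1.5, -2.0, 0.3]),
    (0,   0, 0): np.array([ 0.8, -0.5, 0.1]),
    (0,  30, 0): np.array([-1.0,  2.5, 0.0]),
}

def correct_angles(estimated, table, bin_size=30):
    """Quantize the estimated (pitch, yaw, roll) to the nearest table entry
    and apply the learned offset."""
    key = tuple(int(round(a / bin_size)) * bin_size for a in estimated)
    offset = table.get(key, np.zeros(3))   # no correction if the pose is not in the table
    return np.asarray(estimated, dtype=float) + offset

print(correct_angles([2.0, 28.0, -1.0], correction_table))  # near the (0, 30, 0) entry
```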
  • FIG. 7 is an image conceptual diagram showing a processing operation performed by the face orientation estimation unit 27 of a specific individual in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • The face orientation estimation unit 27 for the specific individual adopts a method of, for example, creating a 3D face shape model, fitting the 3D face shape model to the face region on the two-dimensional image, performing the face orientation estimation process for the specific individual, and detecting the position and shape of each organ of the face.
  • As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but the technique is not limited thereto.
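As a loosely related illustration of recovering face orientation angles from 2D facial organ points and a generic 3D face model, the following sketch uses OpenCV's solvePnP. This is a stand-in for the idea of fitting a 3D face shape model to a 2D face region; it is not the fitting method of the cited technique, and the 3D model coordinates, 2D points, and camera matrix are assumed values.

```python
import numpy as np
import cv2

# Generic 3D facial organ points (arbitrary model units), a common 6-point layout
model_3d = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
], dtype=np.float64)

# Hypothetical detected 2D organ points in a 640x480 image (pixels)
image_2d = np.array([
    (320, 250), (318, 330), (255, 205), (385, 205), (285, 295), (355, 295)
], dtype=np.float64)

# Assumed pinhole camera: focal length = image width, principal point at image center
w, h = 640, 480
camera = np.array([[w, 0, w / 2.0],
                   [0, w, h / 2.0],
                   [0, 0, 1.0]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(model_3d, image_2d, camera, None)
R, _ = cv2.Rodrigues(rvec)

# Crude Euler-angle extraction: pitch (x axis), yaw (y axis), roll (z axis), in degrees
pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
yaw = np.degrees(np.arcsin(np.clip(-R[2, 0], -1.0, 1.0)))
roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
print(ok, round(pitch, 1), round(yaw, 1), round(roll, 1))
```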
  • In the first frame, the 3D face model fitting process adapted to the specific individual is performed without using the specific part of the specific individual (in the case of FIG. 7, the right eye part), that is, without letting that part contribute to the fitting.
  • Next, the 3D face model fitting score of each part other than the specific part is calculated, and it is determined whether or not the specific-individual 3D face model fitting scores of all parts other than the specific part exceed a predetermined threshold value.
  • If they do, the tracking process is started from the next frame, and the complement process for complementing the feature amount of the specific part is performed.
  • In the example of FIG. 7, the specific part is the right eye, and the right eye region is, for example, painted black as the complement process.
  • Thereafter, the face orientation estimation process is performed by the normal 3D face model fitting process. Here, when the fitting scores of the parts other than the specific part exceed the predetermined threshold value, it is considered that the fitting has been performed accurately including the specific part, and from the next frame the tracking process of the facial organ points is performed, taking advantage of the fact that the movement of the facial organ point positions between frames can be regarded as minute.
  • At the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye on the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
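A minimal sketch of this complement process, assuming the position of the specific part estimated from the previous frame is available as pixel coordinates. The box size and the use of a rectangular region are illustrative assumptions, since the embodiment only states that the estimated position is painted black before the feature amount is extracted.

```python
import numpy as np

def complement_specific_part(frame, prev_landmarks, part="right_eye", half_size=12):
    """Paint the region of the specific part black, using its position estimated
    from the previous frame's fitting result (cf. FIG. 7)."""
    x, y = prev_landmarks[part]                 # estimated center from the previous frame
    h, w = frame.shape[:2]
    x0, x1 = max(0, x - half_size), min(w, x + half_size)
    y0, y1 = max(0, y - half_size), min(h, y + half_size)
    out = frame.copy()
    out[y0:y1, x0:x1] = 0                       # black-paint before feature extraction
    return out

frame = np.full((480, 640), 128, dtype=np.uint8)   # dummy grayscale frame
prev = {"right_eye": (380, 210)}
painted = complement_specific_part(frame, prev)
print(painted[210, 380])                            # -> 0 (painted black)
```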
  • FIG. 8 is a flowchart showing an example of a processing operation performed by the face orientation estimation unit 27 (CPU 13) of a specific individual in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • In step S61, the flag is set to false.
  • In step S62, t is set to 1.
  • In step S63, the above-described face detection process is performed on the image at frame t.
  • In step S64, it is determined whether or not a face could be detected. If it is determined that a face could not be detected, the process proceeds to step S75 and the flag is set to false; if a face could be detected, the process proceeds to step S65.
  • In step S65, the face image is captured from the image at frame t.
  • In step S66, it is determined whether or not the flag is true. If it is determined that the flag is not true, the process proceeds to step S67; if it is determined that the flag is true, the process proceeds to step S73.
  • In step S67, for the first frame, the specific-individual 3D face model fitting process is performed without letting the specific part of the specific individual contribute. After the process of step S67 is completed, the process proceeds to step S68, and in step S68, the specific-individual 3D face model fitting score is calculated for each part other than the specific part. After the process of step S68 is completed, the process proceeds to step S69.
  • In step S69, the fitting score determination process is performed, and it is determined whether or not the specific-individual 3D face model fitting scores for all parts other than the specific part exceed a predetermined threshold value.
  • If it is determined in step S69 that the specific-individual 3D face model fitting scores for all parts other than the specific part exceed the predetermined threshold value, the process proceeds to step S70.
  • In step S70, the flag is set to true, and then the process proceeds to step S71.
  • If it is determined in step S69 that the specific-individual 3D face model fitting scores for the parts other than the specific part do not all exceed the predetermined threshold value, the process proceeds to step S72, where the frame is advanced, and then the process returns to step S63.
  • If it is determined in step S66 that the flag is true, the process proceeds to step S73, and in step S73 the complement process for complementing the feature amount of the specific part is performed.
  • In this complement process, for example, as shown in FIG. 7, when the specific part is the right eye, the right eye region is, for example, painted black.
  • In step S74, the face orientation estimation process is performed by the normal 3D face model fitting process, and then the process proceeds to step S71.
  • In step S74, the tracking process is performed after the complement process. When the fitting scores of the parts other than the specific part exceed the predetermined threshold value, it is considered that the fitting including the specific part has been performed accurately, and the tracking process of the facial organ points is performed, taking advantage of the fact that the movement of the facial organ point positions between frames can be regarded as minute.
  • At the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye on the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
  • In this way, even if the specific part exists, the face orientation estimation process can be performed by the normal 3D face model fitting process, and stable, accurate, real-time face orientation estimation can be achieved.
  • After step S74, the process proceeds to step S71. In step S71, the deviation of the face orientation angle is corrected using the angle correction table 27f, which has been learned and created in advance for each specific individual.
  • This correction process is performed when a certain deviation occurs in the estimated face orientation angle even though the specific-individual 3D face model fitting process that does not use the specific part of the specific individual has been performed. By using the angle correction table 27f created in advance for each specific individual, the deviation of the face orientation angle can be easily corrected for each specific individual, and the face orientation angle can be calculated with high accuracy.
  • When the process of correcting the deviation of the face orientation angle in step S71 is completed, the process proceeds to step S72, where the frame is advanced, and then the process returns to step S63. Likewise, when the process proceeds from step S75 to step S72, the frame is advanced in the same manner, and then the process returns to step S63.
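Putting steps S61 to S75 together, the control flow of FIG. 8 can be sketched as follows. Every helper function here is a hypothetical stand-in for the corresponding processing described above (face detection, specific-individual fitting, score calculation, complement, normal fitting, and angle correction); the sketch shows only the branching and the flag handling, not the actual image processing.

```python
# Hypothetical stand-ins; a real system would plug in the actual detector,
# model fitting, and the angle correction table 27f.
def detect_face(frame):                      # S63: here a face is always "found"
    return {"image": frame}

def fit_without_part(face, part):            # S67: specific-individual fitting, part excluded
    return {"left_eye": 0.8, "nose": 0.7, "mouth": 0.75, part: 0.0}

def scores_excluding(fit, part):             # S68: per-part fitting scores, part excluded
    return {k: v for k, v in fit.items() if k != part}

def black_paint_part(face, prev_fit, part):  # S73: complement the specific part
    return face

def fit_normal(face):                        # S74: normal 3D face model fitting (tracking)
    return {"pitch": 2.0, "yaw": -5.0, "roll": 0.5}

def apply_angle_correction(angles):          # S71: angle correction table 27f
    return angles

def monitor_loop(frames, specific_part="right_eye", threshold=0.6):
    """Skeleton of the control flow of FIG. 8 (steps S61 to S75)."""
    flag = False                                              # S61
    fit = None
    for frame in frames:                                      # S62 / S72: frame advance
        face = detect_face(frame)                             # S63
        if face is None:                                      # S64: no face found
            flag = False                                      # S75
            continue
        if not flag:                                          # S66
            fit = fit_without_part(face, specific_part)       # S67
            scores = scores_excluding(fit, specific_part)     # S68
            if all(s > threshold for s in scores.values()):   # S69
                flag = True                                   # S70
            else:
                continue                                      # S72 -> back to S63
        else:
            face = black_paint_part(face, fit, specific_part) # S73
            fit = fit_normal(face)                            # S74
        angles = (fit.get("pitch", 0.0), fit.get("yaw", 0.0), fit.get("roll", 0.0))
        yield apply_angle_correction(angles)                  # S71

print(list(monitor_loop(range(3))))
```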
  • As described above, in the driver monitoring device 10 according to the embodiment, the face feature amount 142a of the specific individual and the normal face feature amount 142b are stored as the learned face feature amounts in the face feature amount storage unit 142.
  • The specific individual determination unit 25 determines whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region detected by the face detection unit 22 and the face feature amount 142a of the specific individual. Therefore, by using the face feature amount 142a of the specific individual, it can be accurately determined whether or not the face is the face of the specific individual.
  • When it is determined that the face is the face of the specific individual, the first face image processing unit 26 can accurately perform the face image processing for the specific individual.
  • On the other hand, when the specific individual determination unit 25 determines that the face is not the face of the specific individual, in other words, that it is a normal face (the face of a person other than the specific individual), the second face image processing unit 30 can accurately perform the normal face image processing. Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, the sensing of each face can be performed with high accuracy.
  • Furthermore, in the face orientation estimation for the specific individual, by complementing the feature amount of the specific part (such as a missing facial organ) based on the face model fitting result of the previous frame, the face orientation can be estimated stably and accurately in real time. Specifically, for the first frame, fitting is performed with a so-called specific-individual face model in which the specific part does not contribute to the fitting.
  • If the fitting scores of the parts other than the specific part are equal to or higher than the predetermined threshold value, it is considered that the fitting has been performed accurately including the specific part, and from the next frame, in a moving image of, for example, 15 or 30 frames per second, the tracking process of the facial organ points is started, taking advantage of the fact that the movement of the facial organ point positions between frames can be regarded as minute.
  • That is, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, when the specific part is the right eye, the position of the right eye on the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted. By doing so, processing equivalent to the normal face model fitting can be performed, and as a result, the face orientation can be estimated stably and accurately in real time.
  • Further, the in-vehicle system 1 includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring result output from the driver monitoring device 10. Therefore, based on the monitoring result, the ECU 40 can appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system in which even a specific individual can drive with peace of mind.
  • It goes without saying that the above description is merely an example of the present invention in all respects, and that various improvements and modifications can be made without departing from the scope of the present invention.
  • In the above embodiment, the case where the image processing device according to the present invention is applied to the driver monitoring device 10 has been described, but the application example is not limited to this; the image processing device according to the present invention can be applied to various other devices and systems.
  • Further, in the above embodiment, the case where the present invention is applied to a specific individual (meaning an individual having features different from the facial features common to general people, regardless of differences in age, gender, and the like) has been described. The present invention can also be applied, treating them as specific individuals, to a person whose nose and mouth are hidden by a mask or to a person wearing an eyepatch.
  • As described above, the present invention can be widely applied to devices and systems that monitor an object such as a person using a camera.
  • In addition to devices and systems for monitoring drivers (operators) of various moving objects such as vehicles, the present invention can also be widely applied to devices and systems that monitor people who operate or monitor various equipment such as machines and devices in factories, or who perform predetermined work.
  • Embodiments of the present invention may also be described as, but are not limited to, the following appendices.
  • (Appendix 1) An image processing method for processing an image input from an imaging unit, including: face detection steps (S2, S3) of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step (S4) of determining, using the feature amount of the face region detected in the face detection steps (S2, S3) and the learned face feature amount (142a) of the specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual;
  • a first face image processing step (S6, S7, S8) of performing face image processing for the specific individual when it is determined that the face is the face of the specific individual; and
  • a second face image processing step (S9, S10, S11) of performing normal face image processing when it is determined that the face is not the face of the specific individual,
  • wherein the image processing includes a face orientation estimation process, and
  • the first face image processing step (S6, S7, S8) includes a face orientation estimation step (S6) for the specific individual.


Abstract

The purpose of the present invention is to provide an image processing device capable of stably performing a face orientation estimation process for a specific individual in real time with high accuracy. The image processing device, which processes an image input from an imaging unit, comprises: a facial feature quantity storage unit which stores, as learned facial feature quantities, a facial feature quantity of a specific individual and a normal facial feature quantity; a face detection unit which detects a face area while extracting a facial feature quantity from the image; a specific individual determination unit which determines whether or not the face in the face area is the face of the specific individual, by using the detected feature quantity of the face area and the facial feature quantity of the specific individual; a first face image processing unit which performs a face image process for the specific individual, when it is determined that the face in the face area is the face of the specific individual; and a second face image processing unit which performs a normal face image process, when it is determined that the face in the face area is not the face of the specific individual, wherein the first face image processing unit includes a specific individual face orientation estimation unit.

Description

Image processing device, monitoring device, control system, image processing method, computer program, and storage medium
 本発明は、画像処理装置、モニタリング装置、制御システム、画像処理方法、コンピュータプログラム、及び記憶媒体に関する。 The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium.
 下記の特許文献1には、サービスを提供する対象(人物)の状況に応じて、適切なサービスに切り替え可能なサービス提供装置として利用されるロボット装置が開示されている。 Patent Document 1 below discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of a target (person) to provide the service.
 前記ロボット装置には、第1カメラと、第2カメラと、CPUを含む情報処理装置とが装備され、前記CPUには、顔検出部、属性判定部、人物検出部、人物位置算出部、及び移動ベクトル検出部などが装備されている。 The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU, and the CPU includes a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, and a person position calculation unit. It is equipped with a movement vector detector and the like.
 前記ロボット装置によれば、サービスの提供対象が、互いに意思疎通を行うなどの関係が成立している人物の集合である場合は、密なやり取りに基づいた情報を提供する第1のサービスの提供を決定する。
 他方、サービスの提供対象が、互いに意思疎通を行うなどの関係が成立しているか否かが不明な人物の集合である場合は、やり取りを行わずに、一方的に情報を提供する第2のサービスの提供を決定する。これらにより、サービスの提供対象の状況に応じて、適切なサービスを提供することができるとしている。
According to the robot device, when the service is provided to a group of people who have a relationship such as communicating with each other, the first service is provided to provide information based on close communication. To determine.
On the other hand, when the service is provided to a group of people whose relationship such as communication with each other is unknown, the second method of providing information unilaterally without exchanging information. Decide to provide the service. With these, it is possible to provide appropriate services according to the situation of the service provision target.
 [発明が解決しようとする課題]
 特許文献1に開示された前記ロボット装置では、前記顔検出部が、前記第1カメラを用いて人物の顔検出を行う構成になっており、該顔検出には、公知の技術を利用することができるとしている。
 しかしながら、従来の顔検出技術では、ケガなどにより、目、鼻、口などの顔器官の一部が欠損、若しくは大きく変形している場合、顔に大きなホクロやイボ、若しくはタトゥーなどの身体装飾が施されている場合、又は遺伝的疾患により、前記顔器官の配置が平均的な配置から大きくずれている場合などの特定個人(一般的な人物の、年齢、及び性別の違いなどがあったとしても共通する顔特徴とは異なる特徴を有する個人をいうものとする)に対する顔検出を前提とした顔向き推定の精度も低下してしまうという課題があった。
[Problems to be solved by the invention]
In the robot device disclosed in Patent Document 1, the face detection unit is configured to detect a person's face using the first camera, and a known technique is used for the face detection. Can be done.
 However, with conventional face detection technology, the accuracy of face orientation estimation, which presupposes face detection, is also degraded for a specific individual (meaning an individual having features different from the facial features common to general people, regardless of differences in age, gender, and the like), for example when a part of the facial organs such as the eyes, nose, or mouth is missing or greatly deformed due to injury, when a large mole, wart, or body decoration such as a tattoo is present on the face, or when the arrangement of the facial organs deviates greatly from the average arrangement due to a genetic disease.
[Patent Document 1] Japanese Unexamined Patent Publication No. 2014-14899
Means for solving the problems and their effects
 The present invention has been made in view of the above problems, and an object thereof is to provide an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium capable of improving the accuracy of face orientation estimation for such a specific individual.
 In order to achieve the above object, an image processing device (1) according to the present disclosure is an image processing device that processes an image input from an imaging unit, and includes:
 a face feature amount storage unit that stores, as learned face feature amounts obtained through learning for detecting a face from the image, a face feature amount of a specific individual and a normal face feature amount;
 a face detection unit that detects a face region while extracting a feature amount for detecting a face from the image;
 a specific individual determination unit that determines, using the detected feature amount of the face region and the face feature amount of the specific individual, whether or not the face in the face region is the face of the specific individual;
 a first face image processing unit that performs face image processing for the specific individual when the specific individual determination unit determines that the face is the face of the specific individual; and
 a second face image processing unit that performs normal face image processing when the specific individual determination unit determines that the face is not the face of the specific individual,
 wherein the image processing includes a face orientation estimation process, and the first face image processing unit includes a face orientation estimation unit for the specific individual.
 According to the image processing device (1), the face feature amount of the specific individual and the normal face feature amount (in other words, the face feature amount used for a person other than the specific individual) are stored as the learned face feature amounts in the face feature amount storage unit, and the specific individual determination unit determines whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region detected by the face detection unit and the face feature amount of the specific individual.
 By using the face feature amount of the specific individual, it can be accurately determined whether or not the face is the face of the specific individual.
 When it is determined that the face is the face of the specific individual, the first face image processing unit can accurately perform the face image processing for the specific individual.
 On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face (the face of a person other than the specific individual), the second face image processing unit can accurately perform the normal face image processing.
 Since the image processing includes a face orientation estimation process and the first face image processing unit includes a face orientation estimation unit for the specific individual, the face orientation estimation process for the specific individual, which presupposes detection of the face of the specific individual, can be performed stably and accurately in real time.
 Further, in an image processing device (2) according to the present disclosure, in the image processing device (1), the face orientation estimation unit for the specific individual includes a specific-part-unused face orientation estimation unit that estimates the face orientation by a face model fitting process that does not use the specific part of the specific individual.
 According to the image processing device (2), by performing a face model fitting process that is not affected by the specific part, the face orientation can be estimated accurately while maintaining real-time processing.
 Further, an image processing device (3) according to the present disclosure, in the image processing device (2), further includes a score calculation unit that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit that determines whether or not the face model fitting scores for all parts other than the specific part satisfy a predetermined condition.
 According to the image processing device (3), it can be accurately determined whether or not a highly accurate face orientation estimation process can be performed using only the parts excluding the specific part.
 Further, an image processing device (4) according to the present disclosure, in the image processing device (3), further includes a complement processing unit that complements the feature amount of the specific part during the tracking process when the fitting score determination unit determines that the face model fitting scores for all parts other than the specific part satisfy the predetermined condition.
 According to the image processing device (4), by providing the complement processing unit that complements the feature amount of the specific part, the specific part can be treated as a normal part, for example, a left eye, a right eye, a nose, or a mouth.
 Further, an image processing device (5) according to the present disclosure, in the image processing device (4), further includes a normal face orientation estimation unit that estimates the face orientation by the normal face model fitting process after the feature amount of the specific part has been complemented.
 According to the image processing device (5), even if the specific part exists, the face orientation can be estimated by the normal face model fitting process, enabling stable, high-speed, and highly accurate processing.
 Further, an image processing device (6) according to the present disclosure, in any of the image processing devices (1) to (5), further includes an angle correction table for correcting the deviation of the face orientation angle.
 According to the image processing device (6), if a certain deviation occurs in the estimated face orientation angle even after the above processing is performed, the deviation of the face orientation angle can be corrected using the angle correction table, which makes it easy to calculate the face orientation angle with high accuracy.
 Further, in an image processing device (7) according to the present disclosure, in any of the image processing devices (1) to (6), the specific individual determination unit calculates a correlation coefficient between the feature amount extracted from the face region and the face feature amount of the specific individual, and determines, based on the calculated correlation coefficient, whether or not the face in the face region is the face of the specific individual.
 According to the image processing device (7), the correlation coefficient between the feature amount extracted from the face region and the face feature amount of the specific individual is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient. This makes it possible to efficiently determine, based on the correlation coefficient, whether or not the face in the face region is the face of the specific individual.
 Further, in an image processing device (8) according to the present disclosure, in the image processing device (7), the specific individual determination unit determines that the face in the face region is the face of the specific individual when the correlation coefficient is larger than a predetermined threshold value, and determines that the face in the face region is not the face of the specific individual when the correlation coefficient is equal to or less than the predetermined threshold value.
 According to the image processing device (8), when the correlation coefficient is larger than the predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual. Comparing the correlation coefficient with the predetermined threshold value further improves the processing efficiency of the determination.
 Further, in an image processing device (9) according to the present disclosure, in any of the image processing devices (1) to (8), the face image processing includes at least one of a face detection process, a line-of-sight direction estimation process, and an eye opening/closing detection process.
 According to the image processing device (9), since the face image processing includes at least one of the face detection process, the line-of-sight direction estimation process, and the eye opening/closing detection process, processing for estimating or detecting various facial behaviors of the specific individual, or of a person other than the specific individual, can be performed with high accuracy.
 Further, a monitoring device (1) according to the present disclosure includes any of the image processing devices (1) to (9), an imaging unit that captures an image to be input to the image processing device, and an output unit that outputs information based on the image processing by the image processing device.
 According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be monitored with high accuracy, and since information based on the image processing can be output from the output unit, a monitoring system or the like that uses the information can be easily constructed.
 Further, a control system (1) according to the present disclosure includes the monitoring device (1) and one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.
 According to the control system (1), the one or more control devices can execute predetermined processing based on the information output from the monitoring device. Therefore, it is possible to construct a system that can use not only the monitoring result of a normal person but also the monitoring result of the specific individual.
 Further, in a control system (2) according to the present disclosure, in the control system (1), the monitoring device is a device for monitoring a driver of a vehicle, and the control device includes an electronic control unit mounted on the vehicle.
 According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be monitored with high accuracy, and based on the monitoring result, the electronic control unit can be made to appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system in which even the specific individual can drive with peace of mind.
 Further, an image processing method (1) according to the present disclosure is an image processing method for processing an image input from an imaging unit, including: a face detection step of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step of determining, using the feature amount of the face region detected in the face detection step and a learned face feature amount of a specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual, wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
 According to the image processing method (1), the specific individual determination step determines whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region detected in the face detection step and the face feature amount of the specific individual. By using the face feature amount of the specific individual, it can be accurately determined whether or not the face is the face of the specific individual.
 When it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be accurately performed by the first face image processing step. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be accurately performed by the second face image processing step. Therefore, whether the person is the specific individual or a normal person other than the specific individual, the sensing of each face can be performed with high accuracy.
 Since the image processing includes a face orientation estimation process and the first face image processing step includes a face orientation estimation step for the specific individual, the face orientation estimation process for the specific individual can be performed stably and accurately in real time.
 Further, a computer program (1) according to the present disclosure is a computer program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute: a face detection step of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step of determining, using the feature amount of the face region detected in the face detection step and a learned face feature amount of a specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual, wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
 According to the computer program (1), the at least one computer can be made to determine whether or not the face in the face region is the face of the specific individual by using the feature amount of the face region and the face feature amount of the specific individual, and this determination can be made with high accuracy.
 When it is determined that the face is the face of the specific individual, the face image processing for the specific individual can be performed with high accuracy. On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the normal face image processing can be performed with high accuracy. Therefore, it is possible to construct a device or system that can accurately sense each face, whether of the specific individual or of a normal person other than the specific individual.
 Since the image processing includes a face orientation estimation process and the first face image processing step includes a face orientation estimation step for the specific individual, the face orientation estimation process for the specific individual can be performed stably and accurately in real time.
 The computer program may be a computer program stored in a storage medium, a computer program transferable via a communication network, or a computer program executed via a communication network.
 Further, a computer-readable storage medium (1) according to the present disclosure is a computer-readable storage medium storing a program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute: a face detection step of detecting a face region while extracting facial feature amounts from the image; a specific individual determination step of determining, using the feature amount of the face region detected in the face detection step and a learned face feature amount of a specific individual obtained through learning for detecting the face of the specific individual, whether or not the face in the face region is the face of the specific individual; a first face image processing step of performing face image processing for the specific individual when the specific individual determination step determines that the face is the face of the specific individual; and a second face image processing step of performing normal face image processing when the specific individual determination step determines that the face is not the face of the specific individual, wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
 According to the computer-readable storage medium (1), by causing the at least one computer to read the program and execute each of the above steps, effects similar to those obtained by the computer program (1) can be obtained.
FIG. 1 is a schematic diagram showing an example of an in-vehicle system including a driver monitoring device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system including the driver monitoring device according to the embodiment.
FIG. 3 is a block diagram showing a functional configuration example of the image processing unit of the driver monitoring device according to the embodiment.
FIG. 4 is a flowchart showing an example of a processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 5 is a flowchart showing an example of a specific individual determination processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 6 is a block diagram showing a more detailed functional configuration example of the face orientation estimation unit for a specific individual in the image processing unit of the driver monitoring device according to the embodiment.
FIG. 7 is a conceptual image diagram showing an example of a processing operation performed by the face orientation estimation unit for a specific individual in the image processing unit of the driver monitoring device according to the embodiment.
FIG. 8 is a flowchart showing an example of a processing operation performed by the face orientation estimation unit for a specific individual in the image processing unit of the driver monitoring device according to the embodiment.
 Hereinafter, embodiments of an image processing device, a monitoring device, a control system, an image processing method, a computer program, and a storage medium according to the present invention will be described with reference to the drawings.
[Application example]
 The image processing device according to the present invention can be widely applied to, for example, devices and systems that monitor an object such as a person using a camera.
 In addition to devices and systems for monitoring drivers (operators) of various moving objects such as vehicles, the image processing device according to the present invention can also be applied to devices and systems that monitor people who operate or monitor various equipment such as machines and devices in factories, or who perform predetermined work.
[Application example 1]
 FIG. 1 is a schematic diagram showing an example in which the image processing device according to the present disclosure is applied to an in-vehicle system including a driver monitoring device.
 The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of a driver 3 of a vehicle 2 (for example, facial behavior), one or more ECUs (Electronic Control Units) 40 that control the running, steering, or braking of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state around the vehicle, and the like, and these are connected via a communication bus 43.
 The in-vehicle system 1 is configured, for example, as an in-vehicle network system that communicates according to the CAN (Controller Area Network) protocol. As the communication standard of the in-vehicle system 1, communication standards other than CAN may also be adopted.
 The driver monitoring device 10 is an example of the "monitoring device" according to the present invention, and the in-vehicle system 1 is an example of the "control system" according to the present invention.
 ドライバモニタリング装置10は、ドライバ3の顔を撮影するためのカメラ11と、カメラ11から入力される画像を処理する画像処理部12と、画像処理部12による画像処理に基づく情報を、通信バス43を介して所定のECU40に出力する処理などを行う通信部16とを含んで構成されている。画像処理部12は、本発明に係る「画像処理装置」の一例であり、カメラ11は、本発明に係る「撮像部」の一例である。 The driver monitoring device 10 transmits information based on image processing by the camera 11 for photographing the face of the driver 3, the image processing unit 12 that processes the image input from the camera 11, and the image processing unit 12, and the communication bus 43. It is configured to include a communication unit 16 that performs processing such as output to a predetermined ECU 40 via the above. The image processing unit 12 is an example of the "image processing apparatus" according to the present invention, and the camera 11 is an example of the "imaging unit" according to the present invention.
 ドライバモニタリング装置10は、カメラ11で撮影された画像からドライバ3の顔を検出し、検出されたドライバ3の顔の向き、視線の方向、あるいは目の開閉状態などの顔の挙動を検出する。ドライバモニタリング装置10は、これら顔の挙動の検出結果に基づいて、ドライバ3の状態、例えば、前方注視、脇見、居眠り、後ろ向き、突っ伏しなどの状態を判定することが可能となる。また、ドライバモニタリング装置10が、これらドライバ3の状態判定に基づく信号をECU40に出力し、ECU40が、例えば、前記信号に基づいてドライバ3への注意や警告処理、あるいは外部への通報、又は車両2の動作制御(例えば、減速制御、又は路肩への誘導制御など)などを実行するように構成されている。 The driver monitoring device 10 detects the face of the driver 3 from the image taken by the camera 11, and detects the behavior of the face such as the direction of the face of the detected driver 3, the direction of the line of sight, or the open / closed state of the eyes. The driver monitoring device 10 can determine the state of the driver 3, such as forward gaze, inattentiveness, dozing, backward facing, and prone, based on the detection results of these facial behaviors. Further, the driver monitoring device 10 outputs a signal based on the state determination of the driver 3 to the ECU 40, and the ECU 40, for example, pays attention to or warns the driver 3 based on the signal, or notifies the outside, or the vehicle. It is configured to execute the operation control (for example, deceleration control, guidance control to the road shoulder, etc.) of 2.
One of the purposes of the driver monitoring device 10 is, for example, to estimate the face orientation of a specific individual stably, accurately, and in real time.
Conventional driver monitoring devices have a problem in that the accuracy of estimating the face orientation from images captured by the camera decreases when, for example, part of the facial organs such as the eyes, nose, or mouth of the driver 3 of the vehicle 2 is missing or greatly deformed due to an injury, when the face bears a large mole or wart or a body decoration such as a tattoo, or when the arrangement of the facial organs deviates significantly from the average position due to a hereditary disease or the like.
Further, if the accuracy of the face orientation estimation decreases, the processing that follows the face orientation estimation is not performed properly either, so states of the driver 3 such as inattentiveness or dozing cannot be determined appropriately. There is also a risk that the various controls to be executed by the ECU 40 based on that state determination cannot be performed appropriately.
To solve these problems, the driver monitoring device 10 according to the embodiment adopts the following configuration to improve the accuracy of face orientation estimation for the specific individual.
The image processing unit 12 stores, as learned facial feature amounts trained for detecting a face from an image, a facial feature amount of the specific individual and a normal facial feature amount (the facial feature amount used when the person is someone other than the specific individual).
The image processing unit 12 performs face detection processing for detecting a face region while extracting feature amounts for detecting a face from the input image of the camera 11. Then, using the feature amounts of the detected face region and the facial feature amount of the specific individual, the image processing unit 12 performs specific individual determination processing for determining whether or not the face in the face region is the face of the specific individual.
In the specific individual determination processing, an index indicating the relationship between the feature amounts extracted from the face region and the facial feature amount of the specific individual, for example a correlation coefficient, is calculated, and whether or not the face in the face region is the face of the specific individual is determined based on the calculated correlation coefficient.
For example, when the correlation coefficient is larger than a predetermined threshold value, the face in the face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the face region is determined not to be the face of the specific individual. An index other than the correlation coefficient may also be adopted in the specific individual determination processing.
Further, in the specific individual determination processing, whether or not the face in the face region is the face of the specific individual may be determined based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11.
As described above, in the driver monitoring device 10, the learned facial feature amount of the specific individual is stored in the image processing unit 12 in advance, and by using the facial feature amount of the specific individual it is possible to accurately determine whether or not a face is the face of the specific individual.
Further, when the specific individual determination processing determines that the face is the face of the specific individual, the image processing unit 12 executes face image processing for the specific individual, so the face image processing for the specific individual can be performed accurately.
On the other hand, when it is determined that the face is not the face of the specific individual, in other words, that it is a normal face, the image processing unit 12 executes normal face image processing, so the normal face image processing can be performed accurately. Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, the sensing of each face can be performed with high accuracy.
[Hardware configuration example]
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.
The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2, one or more ECUs 40, and one or more sensors 41, which are connected via a communication bus 43. Further, one or more actuators 42 are connected to the ECUs 40.
The driver monitoring device 10 includes a camera 11, an image processing unit 12 that processes images input from the camera 11, and a communication unit 16 for exchanging data and signals with the external ECUs 40 and the like.
The camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat, and includes, for example, a lens unit, an image sensor unit, a light irradiation unit, an interface unit, and a camera control unit that controls each of these units.
The image sensor unit includes, for example, an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor), a filter, a microlens, and the like. The image sensor unit may be an element capable of forming a captured image by receiving light in the visible region, or an element capable of forming a captured image by receiving light in the near-infrared region.
The light irradiation unit is configured to include a light emitting element such as an LED (Light Emitting Diode), and may include a near infrared LED or the like so that the driver's face can be imaged day or night.
The camera 11 captures images at a predetermined frame rate (for example, several tens of frames per second), and the data of the captured images is input to the image processing unit 12. The camera 11 may be of an integrated type or an external type.
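As a point of reference, the per-frame flow from the camera to the image processing unit can be sketched as follows. This is only a rough illustration using OpenCV's generic capture API on a generic device index; it is not the actual interface of the camera 11, and the process_image callback is a hypothetical placeholder.

```python
import cv2

def capture_frames(device_index=0, max_frames=100):
    """Grab frames at the camera's native frame rate and hand each frame
    to a downstream processing callback, mimicking the way the camera 11
    feeds images to the image processing unit 12."""
    cap = cv2.VideoCapture(device_index)
    if not cap.isOpened():
        raise RuntimeError("camera could not be opened")
    try:
        for _ in range(max_frames):
            ok, frame = cap.read()   # one frame per iteration
            if not ok:
                break
            process_image(frame)     # hypothetical downstream face image processing
    finally:
        cap.release()

def process_image(frame):
    # Placeholder for face detection / face image processing.
    pass

if __name__ == "__main__":
    capture_frames()
```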
The image processing unit 12 is configured as an image processing device including one or more CPUs (Central Processing Units) 13, ROMs (Read Only Memory) 14, and RAMs (Random Access Memory) 15. The ROM 14 includes a program storage unit 141 and a facial feature amount storage unit 142, and the RAM 15 includes an image memory 151 for storing an input image from the camera 11.
The driver monitoring device 10 may be provided with another storage unit, and the storage unit may be used as the program storage unit 141, the facial feature amount storage unit 142, and the image memory 151. The other storage unit may be a semiconductor memory or a storage medium that can be read by a disk drive or the like.
The CPU 13 is an example of a hardware processor, and reads, interprets, and executes the computer program stored in the program storage unit 141 of the ROM 14 and data such as the facial feature amounts stored in the facial feature amount storage unit 142, thereby processing the images input from the camera 11, for example, face image processing such as face detection processing and face orientation estimation processing. Further, the CPU 13 performs processing such as outputting the results obtained by the face image processing (for example, processing data, determination signals, or control signals) to the ECU 40 and the like via the communication unit 16.
The facial feature amount storage unit 142 stores a facial feature amount 142a of the specific individual and a normal facial feature amount 142b, shown in FIG. 3, as learned facial feature amounts trained for detecting a face from an image.
As the learned facial features, various features effective for detecting a face from an image can be used. For example, a feature amount (Haar-like feature amount) focusing on the difference in brightness (difference in average brightness between two rectangular areas of various sizes) in a local area of the face is used.
Alternatively, a feature amount focusing on combinations of brightness distributions in local regions of the face (LBP (Local Binary Pattern) feature amount) may be adopted, or a feature amount focusing on combinations of distributions of brightness gradient directions in local regions of the face (HOG (Histogram of Oriented Gradients) feature amount) may be used.
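As an illustration of the Haar-like feature amount mentioned above, the following minimal sketch computes the difference in average brightness between two adjacent rectangles using an integral image. The rectangle coordinates are arbitrary examples and do not correspond to the learned feature amounts 142a or 142b.

```python
import numpy as np

def integral_image(gray):
    """Cumulative sum table so any rectangle sum costs four lookups."""
    return gray.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixel values in the rectangle with top-left corner (x, y)."""
    a = ii[y + h - 1, x + w - 1]
    b = ii[y - 1, x + w - 1] if y > 0 else 0.0
    c = ii[y + h - 1, x - 1] if x > 0 else 0.0
    d = ii[y - 1, x - 1] if (x > 0 and y > 0) else 0.0
    return a - b - c + d

def haar_like_two_rect(gray, x, y, w, h):
    """Difference of mean brightness between a left and a right rectangle,
    one simple two-rectangle Haar-like feature."""
    ii = integral_image(gray)
    left = rect_sum(ii, x, y, w, h) / (w * h)
    right = rect_sum(ii, x + w, y, w, h) / (w * h)
    return left - right

if __name__ == "__main__":
    img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    print(haar_like_two_rect(img, 10, 10, 8, 16))
```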
Various machine learning methods can be used to extract feature amounts that are effective for face detection. Machine learning is the process by which a computer finds patterns inherent in data (learning data). For example, AdaBoost may be used as an example of a statistical learning method. AdaBoost is a learning algorithm that can construct a strong discriminator by selecting a large number of discriminators with low discriminating ability (weak discriminators), choosing weak discriminators with small error rates from among them, adjusting parameters such as weights, and arranging them in a hierarchical structure. A discriminator is also referred to as an identifier, a classifier, or a learner.
For example, with a configuration in which one feature amount effective for face detection is discriminated by one weak discriminator, a large number of weak discriminators and their combinations are selected by AdaBoost, and a strong discriminator having a hierarchical structure is constructed using them. Note that one weak discriminator outputs, for example, 1 for a face and 0 for a non-face.
Further, a learning method called Real AdaBoost, which can output the face likelihood as a real number from 0 to 1 instead of 0 or 1, may be adopted.
Further, as these learning methods, a neural network having an input layer, an intermediate layer, and an output layer may be adopted.
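The following is a minimal sketch of the AdaBoost idea described above: threshold-type weak discriminators are trained on individual feature values and combined into a weighted strong discriminator. The synthetic training data, the number of rounds, and the stump search are illustrative assumptions and do not reproduce the learning device or cascade structure of the embodiment.

```python
import numpy as np

def train_adaboost_stumps(X, y, n_rounds=10):
    """Train threshold stumps on single feature columns and combine them
    with AdaBoost weights (y must be +1 for face, -1 for non-face)."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)              # sample weights
    stumps = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, j] - thr) > 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign, pred)
        err, j, thr, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak discriminator
        w *= np.exp(-alpha * y * pred)          # re-weight misclassified samples
        w /= w.sum()
        stumps.append((alpha, j, thr, sign))
    return stumps

def predict(stumps, x):
    """Strong discriminator: sign of the weighted vote of all weak stumps."""
    score = sum(a * (1 if s * (x[j] - t) > 0 else -1) for a, j, t, s in stumps)
    return 1 if score > 0 else -1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)   # toy face/non-face labels
    model = train_adaboost_stumps(X, y, n_rounds=20)
    print(predict(model, X[0]), y[0])
```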
A learning device equipped with such a learning algorithm is given, as learning data, a large number of face images captured under various conditions and a large number of images other than faces (non-face images); learning is repeated and parameters such as weights are adjusted and optimized, so that a strong discriminator having a hierarchical structure capable of detecting faces with high accuracy is constructed.
Then, one or more feature amounts used in the weak discriminators of each layer constituting the strong discriminator are used as the learned facial feature amounts.
The facial feature amount 142a of the specific individual is, for example, a parameter indicating the facial features of the specific individual, obtained by individually capturing face images of the specific individual in advance at a predetermined place under various conditions (conditions such as various face orientations, line-of-sight directions, and eye open/closed states), inputting the large number of captured images to the above learning device as teacher data, and adjusting the parameter through the learning process.
The facial feature amount 142a of the specific individual may be, for example, a combination pattern of brightness differences of local regions of the face obtained by the learning process. The facial feature amount 142a stored in the facial feature amount storage unit 142 may be the facial feature amount of only one specific individual, or, when a plurality of specific individuals drive the vehicle 2, the facial feature amounts of a plurality of specific individuals.
The normal facial feature amount 142b is a parameter indicating the facial features of ordinary people, obtained by inputting, as teacher data to the above learning device, images of ordinary people's faces captured under various conditions (conditions such as various face orientations, line-of-sight directions, and eye open/closed states) and adjusting the parameter through the learning process.
The normal facial feature amount 142b may be, for example, a combination pattern of the difference in brightness of the local region of the face obtained by the learning process.
The learned facial feature amounts stored in the facial feature amount storage unit 142 may be acquired from, for example, a server on the cloud via a communication network such as the Internet or a mobile phone network and then stored in the facial feature amount storage unit 142.
The ECU 40 is composed of a computer device including one or more processors, a memory, a communication module, and the like. The processor mounted on the ECU 40 reads, interprets, and executes the program stored in the memory, whereby predetermined control of the actuator 42 and the like is executed.
The ECU 40 includes, for example, at least one of a traveling system ECU, a driving support system ECU, a body system ECU, and an information system ECU.
The traveling system ECU includes, for example, a drive system ECU, a chassis system ECU, and the like. The drive system ECU includes a control unit related to a "running" function such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, or transmission control.
The chassis system ECU includes a control unit related to "stop and turn" functions such as brake control or steering control.
The driving support system ECU includes at least one control unit related to functions that automatically improve safety or realize comfortable driving in cooperation with the traveling system ECUs and the like (driving support functions or automatic driving functions), such as an automatic braking support function, a lane keeping support function (also referred to as LKA/Lane Keep Assist), a constant-speed driving and inter-vehicle distance support function (also referred to as ACC/Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.
The driving support system ECU may be equipped with at least one of the functions of Level 1 (driver assistance), Level 2 (partial automated driving), and Level 3 (conditional automated driving) of the automated driving levels presented by SAE (Society of Automotive Engineers).
Furthermore, the functions of Level 4 (high automated driving) or Level 5 (full automated driving) of the automated driving levels may be provided, or only the functions of Levels 1 and 2 or only Levels 2 and 3 may be provided. The in-vehicle system 1 may also be configured as an automated driving system.
The body system ECU may include at least one control unit related to vehicle body functions such as door locks, smart keys, power windows, the air conditioner, lights, the instrument panel, or turn signals.
The information system ECU may be configured to include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems) related device.
The infotainment device includes, for example, an HMI (Human Machine Interface) device that functions as a user interface, a car navigation device, an audio device, and the like.
The telematics device includes a communication unit for communicating with the outside. The ITS-related device may include a communication unit for performing ETC (Electronic Toll Collection System) communication, road-to-vehicle communication with roadside units such as ITS spots, or vehicle-to-vehicle communication.
The sensors 41 may include various in-vehicle sensors that acquire the sensing data required for the ECU 40 to control the operation of the actuators 42. For example, they may include a vehicle speed sensor, a shift position sensor, an accelerator opening sensor, a brake pedal sensor, and a steering sensor, as well as surroundings monitoring sensors such as a camera for capturing the outside of the vehicle, a radar such as a millimeter-wave radar, a LIDAR, and an ultrasonic sensor.
The actuator 42 is a device that executes operations related to the traveling, steering, or braking of the vehicle 2 based on control signals from the ECU 40, and includes, for example, an engine, a motor, a transmission, and hydraulic or electric cylinders.
[Functional configuration example]
FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
The image processing unit 12 includes an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and the facial feature amount storage unit 142.
The image input unit 21 performs a process of capturing an image including the face of the driver 3 taken by the camera 11.
The face detection unit 22 includes a specific individual face detection unit 23 and a normal face detection unit 24, and performs processing for detecting a face region while extracting feature amounts for detecting a face from the input image.
The face detection unit 23 of the specific individual uses the face feature amount 142a of the specific individual read from the face feature amount storage unit 142 to perform a process of detecting the face region from the input image.
The normal face detection unit 24 uses the normal face feature amount 142b read from the face feature amount storage unit 142 to perform a process of detecting a face region from an input image.
The method of detecting the face area from the image is not particularly limited, but a method of detecting the face area at high speed and with high accuracy is adopted. The face detection unit 22 extracts features for detecting a face in each search area while scanning a predetermined search area (search window) on the input image, for example.
The face detection unit 22 extracts, for example, the brightness differences (luminance differences) of local regions of the face, edge strengths, or the relationships between these local regions as feature amounts. Then, using the feature amounts extracted from the search areas and the normal facial feature amount 142b or the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, the face detection unit 22 determines face or non-face with a detector having a hierarchical structure (from a layer that captures the face roughly down to layers that capture the details of the face), and performs processing for detecting the face region from the image.
The specific individual determination unit 25 performs processing for determining whether or not the face in the detected face region is the face of the specific individual, using the feature amounts of the face region detected by the face detection unit 22 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142.
The specific individual determination unit 25 calculates an index indicating the relationship between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual, for example a correlation coefficient, and determines whether or not the face in the face region is the face of the specific individual based on the calculated correlation coefficient. For example, the correlation of feature amounts such as Haar-like features of one or more local regions in the face region is obtained. Then, for example, when the correlation coefficient is larger than a predetermined threshold value, the face in the detected face region is determined to be the face of the specific individual, and when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the detected face region is determined not to be the face of the specific individual.
Further, the specific individual determination unit 25 determines whether or not the face in the detected face region is the face of the specific individual based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11.
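A minimal sketch of the specific individual determination described above is given below, assuming the feature amounts are available as numeric vectors. The threshold value and the majority-style aggregation over a plurality of frames are illustrative assumptions.

```python
import numpy as np

def is_specific_individual(face_features, individual_features, threshold=0.8):
    """Single-frame decision: Pearson correlation between the feature vector of
    the detected face region and the stored feature amount 142a, compared
    against a threshold (the value 0.8 is illustrative)."""
    r = np.corrcoef(face_features, individual_features)[0, 1]
    return r > threshold

def is_specific_individual_multiframe(per_frame_features, individual_features,
                                      threshold=0.8, min_hits=3):
    """Multi-frame decision: require the single-frame result to hold for at
    least `min_hits` of the supplied frames (an assumed aggregation rule)."""
    hits = sum(is_specific_individual(f, individual_features, threshold)
               for f in per_frame_features)
    return hits >= min_hits

if __name__ == "__main__":
    stored = np.random.rand(128)                               # stand-in for 142a
    frames = [stored + np.random.normal(0, 0.05, 128) for _ in range(5)]
    print(is_specific_individual_multiframe(frames, stored))
```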
When the specific individual determination unit 25 determines that the face is the face of the specific individual, the first face image processing unit 26 performs face image processing for the specific individual using the facial feature amount 142a of the specific individual. The illustrated first face image processing unit 26 includes a specific individual face orientation estimation unit 27, a specific individual eye open/close detection unit 28, and a specific individual line-of-sight direction estimation unit 29, but it may further include other face behavior estimation or detection units.
The specific individual face orientation estimation unit 27 performs processing for estimating the face orientation of the specific individual. For example, using the facial feature amount 142a of the specific individual, the specific individual face orientation estimation unit 27 detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the specific individual face detection unit 23, and estimates the face orientation based on the detected positions and shapes of the facial organs.
The method for detecting facial organs from the face region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy. For example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each facial organ can be adopted. As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be adopted, but the technique is not limited thereto.
Further, the specific individual face orientation estimation unit 27 may output, as estimation data of the face orientation of the specific individual, for example, the pitch angle around the left-right axis, the yaw angle around the up-down axis, and the roll angle around the front-rear axis, which are included in the parameters of the above 3D face shape model.
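The embodiment obtains the face orientation from the fitting parameters of the 3D face shape model. As a rough stand-in for that step, the sketch below derives pitch, yaw, and roll angles from a few 2-D facial landmark points with a generic PnP solve in OpenCV; the 3-D model points, camera intrinsics, and landmark values are illustrative assumptions, not the fitting technique of the embodiment.

```python
import numpy as np
import cv2

# Generic 3-D positions (mm) of a few facial landmarks; illustrative values only.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),        # nose tip
    (0.0, -63.6, -12.5),    # chin
    (-43.3, 32.7, -26.0),   # left eye outer corner
    (43.3, 32.7, -26.0),    # right eye outer corner
    (-28.9, -28.9, -24.1),  # left mouth corner
    (28.9, -28.9, -24.1),   # right mouth corner
], dtype=np.float64)

def estimate_head_pose(image_points, image_size):
    """Return (pitch, yaw, roll) in degrees from six 2-D landmark points."""
    h, w = image_size
    focal = w  # crude focal-length assumption
    camera = np.array([[focal, 0, w / 2],
                       [0, focal, h / 2],
                       [0, 0, 1]], dtype=np.float64)
    dist = np.zeros(4)  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points, camera, dist,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None
    rot, _ = cv2.Rodrigues(rvec)
    # Decompose the rotation matrix into Euler angles (x = pitch, y = yaw, z = roll).
    sy = np.hypot(rot[0, 0], rot[1, 0])
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return pitch, yaw, roll

if __name__ == "__main__":
    pts = np.array([(320, 240), (325, 350), (250, 200), (390, 200),
                    (280, 300), (360, 300)], dtype=np.float64)
    print(estimate_head_pose(pts, (480, 640)))
```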
The specific individual eye open/close detection unit 28 performs processing for detecting the open/closed state of the eyes of the specific individual. For example, based on the positions and shapes of the facial organs obtained by the specific individual face orientation estimation unit 27, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), the specific individual eye open/close detection unit 28 detects the open/closed state of the eyes, for example, whether the eyes are open or closed.
The open/closed state of the eyes may be detected, for example, by learning in advance with a learner the feature amounts of eye images in various open/closed states (such as the position of the eyelids, the shape of the pupil (dark part of the eye), or the region sizes of the white and dark parts of the eye) and evaluating the similarity with this learned feature amount data.
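The embodiment detects the eye open/closed state from the similarity to learned eye-image feature amounts. As a much simpler stand-in, the sketch below uses the eye aspect ratio computed from eyelid feature points; the landmark ordering and the threshold value are assumptions.

```python
import numpy as np

def eye_aspect_ratio(eye_pts):
    """eye_pts: six (x, y) landmarks ordered [outer, upper1, upper2, inner,
    lower2, lower1]; the ratio drops toward 0 as the eyelid closes."""
    eye_pts = np.asarray(eye_pts, dtype=float)
    v1 = np.linalg.norm(eye_pts[1] - eye_pts[5])
    v2 = np.linalg.norm(eye_pts[2] - eye_pts[4])
    h = np.linalg.norm(eye_pts[0] - eye_pts[3])
    return (v1 + v2) / (2.0 * h)

def is_eye_open(eye_pts, threshold=0.2):
    """Threshold on the aspect ratio (the value 0.2 is an illustrative choice)."""
    return eye_aspect_ratio(eye_pts) > threshold

if __name__ == "__main__":
    open_eye = [(0, 0), (2, -2), (4, -2), (6, 0), (4, 2), (2, 2)]
    closed_eye = [(0, 0), (2, -0.2), (4, -0.2), (6, 0), (4, 0.2), (2, 0.2)]
    print(is_eye_open(open_eye), is_eye_open(closed_eye))
```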
The specific individual line-of-sight direction estimation unit 29 performs processing for estimating the line-of-sight direction of the specific individual. For example, the specific individual line-of-sight direction estimation unit 29 estimates the line-of-sight direction based on the face orientation of the driver 3 and the positions and shapes of the facial organs of the driver 3, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils). The line-of-sight direction is the direction in which the driver 3 is looking, and is obtained, for example, from the combination of the face orientation and the eye orientation.
The line-of-sight direction may also be estimated, for example, by learning in advance with a learner the feature amounts of eye images for various combinations of face orientation and eye orientation (such as the relative positions of the outer eye corner, inner eye corner, and pupil, the relative positions of the white and dark parts of the eye, shading, and texture) and evaluating the similarity with this learned feature amount data.
Alternatively, using the fitting result of the 3D face shape model and the like, the specific individual line-of-sight direction estimation unit 29 may estimate the size and center position of the eyeball from the size and orientation of the face and the positions of the eyes, detect the position of the pupil, and estimate the vector connecting the center of the eyeball and the center of the pupil as the line-of-sight direction.
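A minimal sketch of the "eyeball center to pupil center" vector idea described above is shown below. The way the eyeball center is placed behind the midpoint of the eye corners and the eyeball radius are illustrative assumptions.

```python
import numpy as np

def estimate_gaze_direction(eye_outer, eye_inner, pupil_center, eyeball_radius=12.0):
    """Place the eyeball center behind the midpoint of the eye corners by an
    assumed radius (same units as the landmark coordinates, with a depth axis),
    then return the normalized vector from it to the pupil center."""
    eye_outer = np.asarray(eye_outer, dtype=float)
    eye_inner = np.asarray(eye_inner, dtype=float)
    pupil_center = np.asarray(pupil_center, dtype=float)
    eyeball_center = (eye_outer + eye_inner) / 2.0
    eyeball_center[2] += eyeball_radius      # push back along the assumed depth axis
    gaze = pupil_center - eyeball_center
    return gaze / np.linalg.norm(gaze)

if __name__ == "__main__":
    # 3-D landmark coordinates (x, y, z); values are illustrative only.
    outer, inner = (100.0, 50.0, 0.0), (130.0, 50.0, 0.0)
    pupil = (118.0, 50.0, -2.0)              # pupil shifted toward the inner corner
    print(estimate_gaze_direction(outer, inner, pupil))
```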
When the specific individual determination unit 25 determines that the face is not the face of a specific individual, the second face image processing unit 30 performs normal face image processing using the normal face feature amount 142b. The second face image processing unit 30 may include a normal face orientation estimation unit 31, a normal eye opening / closing detection unit 32, and a normal line-of-sight direction estimation unit 33.
The processing performed by the normal face orientation estimation unit 31, the normal eye open/close detection unit 32, and the normal line-of-sight direction estimation unit 33 is basically the same as that of the specific individual face orientation estimation unit 27, the specific individual eye open/close detection unit 28, and the specific individual line-of-sight direction estimation unit 29, except that the normal facial feature amount 142b is used, so the description thereof is omitted here.
The output unit 34 performs processing for outputting information based on the image processing by the image processing unit 12 to the ECU 40 and the like. The information based on the image processing may be, for example, information on facial behavior such as the face orientation of the driver 3, the line-of-sight direction, or the open/closed state of the eyes, or information on the state of the driver 3 determined based on the facial behavior detection results (for example, forward gaze, inattentiveness, dozing, facing backward, or slumping forward). The information based on the image processing may also be a predetermined control signal based on the state determination of the driver 3 (such as a control signal for performing attention or warning processing, or a control signal for controlling the operation of the vehicle 2).
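As an illustration of how the output unit 34 might pass a driver-state judgement to an ECU over the CAN bus, the sketch below uses the python-can package (4.x API). The arbitration ID, the payload layout, and the state codes are purely illustrative assumptions and are not defined in this disclosure.

```python
import can  # python-can package

# Illustrative driver-state codes (not defined in the original text).
STATE_CODES = {"forward_gaze": 0, "inattentive": 1, "dozing": 2,
               "backward_facing": 3, "prone": 4}

def send_driver_state(bus, state, face_yaw_deg, eyes_open):
    """Pack a driver-state judgement into one CAN frame and send it.
    Arbitration ID 0x321 and the payload layout are assumptions."""
    yaw = int(max(-90, min(90, round(face_yaw_deg)))) + 90   # map to 0..180
    data = [STATE_CODES[state], yaw, 1 if eyes_open else 0]
    msg = can.Message(arbitration_id=0x321, data=data, is_extended_id=False)
    bus.send(msg)

if __name__ == "__main__":
    # Virtual bus for demonstration; a real system would use e.g. socketcan.
    with can.Bus(interface="virtual") as bus:
        send_driver_state(bus, "inattentive", face_yaw_deg=35.0, eyes_open=True)
```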
[Processing operation example]
FIG. 4 is a flowchart showing an example of the processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. The camera 11 captures images at, for example, several tens of frames per second, and this processing is performed for each frame or for every frame at regular intervals.
First, in step S1, the CPU 13 operates as the image input unit 21 and performs processing for reading an image captured by the camera 11 (an image including the face of the driver 3), and then the processing proceeds to step S2.
In step S2, the CPU 13 operates as a normal face detection unit 24, performs normal face detection processing on the input image, and then proceeds to step S3.
In step S2, for example, feature amounts for detecting a face are extracted in each search area while a predetermined search area (search window) is scanned over the input image. Then, using the feature amounts extracted from the search areas and the normal facial feature amount 142b read from the facial feature amount storage unit 142, face or non-face is determined and processing for detecting the face region from the image is performed.
In step S3, the CPU 13 operates as the face detection unit 23 of the specific individual, performs face detection processing of the specific individual on the input image, and then proceeds to step S4.
In step S3, for example, feature amounts for detecting a face are extracted in each search area while a predetermined search area (search window) is scanned over the input image. Then, using the feature amounts extracted from the search areas and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, face or non-face is determined and processing for detecting the face region from the image is performed. Note that the processing of step S2 and step S3 may be performed in parallel within one step or may be performed in combination.
In step S4, the CPU 13 operates as the specific individual determination unit 25 and performs processing for determining whether or not the face in the face region is the face of the specific individual, using the feature amounts of the face regions detected in steps S2 and S3 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, and then the processing proceeds to step S5.
In step S5, it is judged whether or not the result of the determination processing in step S4 indicates the face of the specific individual, and if it is judged to be the face of the specific individual, the processing proceeds to step S6.
In step S6, the CPU 13 operates as the specific individual face orientation estimation unit 27; for example, using the facial feature amount 142a of the specific individual, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S3, the face orientation is estimated based on the detected positions and shapes of the facial organs, and then the processing proceeds to step S7.
In step S7, the CPU 13 operates as the specific individual eye open/close detection unit 28; for example, based on the positions and shapes of the facial organs obtained in step S6, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), the open/closed state of the eyes, for example, whether the eyes are open or closed, is detected, and then the processing proceeds to step S8.
In step S8, the CPU 13 operates as the specific individual line-of-sight direction estimation unit 29; for example, the line-of-sight direction is estimated based on the face orientation and the positions and shapes of the facial organs obtained in step S6, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils), and then the processing ends.
On the other hand, in step S5, if it is determined that the face is not a specific individual's face, in other words, it is a normal face, the process proceeds to step S9.
In step S9, the CPU 13 operates as the normal face orientation estimation unit 31; for example, using the normal facial feature amount 142b, the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows are detected from the face region detected in step S2, the face orientation is estimated based on the detected positions and shapes of the facial organs, and then the processing proceeds to step S10.
In step S10, the CPU 13 operates as the normal eye open/close detection unit 32; for example, based on the positions and shapes of the facial organs obtained in step S9, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), the open/closed state of the eyes, for example, whether the eyes are open or closed, is detected, and then the processing proceeds to step S11.
In step S11, the CPU 13 operates as the normal line-of-sight direction estimation unit 33; for example, the line-of-sight direction is estimated based on the face orientation and the positions and shapes of the facial organs obtained in step S9, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils), and then the processing ends.
FIG. 5 is a flowchart showing an example of the specific individual determination processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. This processing operation is an example of the specific individual determination processing operation in step S4 shown in FIG. 4, and shows an example of the processing operation when the determination is made with one input image (one frame).
First, in step S21, the CPU 13 reads the feature amount extracted from the face region detected by the face detection processing in steps S2 and S3 shown in FIG.
In the next step S22, the learned facial feature amount 142a of the specific individual is read from the face feature amount storage unit 142 (FIG. 3), and then the process proceeds to step S23.
In step S23, processing is performed to calculate the correlation coefficient between the feature amounts extracted from the face region read in step S21 and the facial feature amount 142a of the specific individual read in step S22, and then the processing proceeds to step S24.
In step S24, it is judged whether or not the calculated correlation coefficient is larger than a predetermined threshold value for determining whether or not the person is the specific individual. If the correlation coefficient is larger than the predetermined threshold value, in other words, if it is judged that the feature amounts extracted from the face region have a high correlation (high similarity) with the facial feature amount 142a of the specific individual, the processing proceeds to step S25.
In step S25, it is determined that the face detected in the face area is the face of a specific individual, and then the process ends.
On the other hand, if it is judged in step S24 that the correlation coefficient is equal to or less than the predetermined threshold value, in other words, that the correlation between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual is low (the similarity is low), the processing proceeds to step S26.
In step S26, it is determined that the face is not a specific individual's face, in other words, it is a normal face, and then the process is completed.
FIG. 6 is a block diagram showing a more detailed functional configuration example of the face orientation estimation unit 27 of a specific individual in the image processing unit 12 of the driver monitoring device 10.
The specific individual face orientation estimation unit 27 performs processing for estimating the face orientation of the specific individual. For example, the specific individual face orientation estimation unit 27 detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the specific individual face detection unit 23, and performs processing for estimating the face orientation based on the detected positions and shapes of the facial organs.
The method for detecting facial organs from the face region in the image is not particularly limited, but it is preferable to adopt a method that can detect the facial organs at high speed and with high accuracy. For example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each facial organ can be adopted.
The specific individual face orientation estimation unit 27 includes a specific-part-unused face model fitting unit 27a that performs face model fitting processing without using the specific part of the specific individual. For example, of the images captured at 15 to 30 frames per second, the face model fitting processing is performed on the first frame without letting the specific part of the specific individual contribute.
By performing the face model fitting process that is not affected by the specific portion, it is possible to realize high-speed processing close to the normal face model fitting process.
The specific individual face orientation estimation unit 27 further includes a score calculation unit 27b that calculates a face model fitting score for each part other than the specific part, and a fitting score determination unit 27c that determines whether or not the face model fitting scores of all parts other than the specific part exceed a predetermined threshold value.
Due to the presence of the fitting score determination unit 27c, it is possible to accurately determine whether or not the face orientation estimation process can be performed with high accuracy only by the portion excluding the specific portion.
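A minimal sketch of the check performed by the score calculation unit 27b and the fitting score determination unit 27c is given below; the part names, score values, and threshold are illustrative assumptions.

```python
def fitting_ok_excluding(part_scores, specific_parts, threshold=0.7):
    """Return True only if every part other than the specific part(s)
    reached a fitting score above the threshold (threshold is illustrative)."""
    return all(score > threshold
               for part, score in part_scores.items()
               if part not in specific_parts)

if __name__ == "__main__":
    scores = {"left_eye": 0.85, "right_eye": 0.10,   # right eye is the specific part
              "nose": 0.90, "mouth": 0.82, "eyebrows": 0.78}
    print(fitting_ok_excluding(scores, specific_parts={"right_eye"}))
```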
The specific individual face orientation estimation unit 27 further includes a complement processing unit 27d that complements the feature amount of the specific part when the fitting score determination unit 27c determines that the face model fitting scores of all parts other than the specific part exceed the predetermined threshold value.
By providing the complementary processing unit 27d that complements the feature amount of the specific portion, the specific portion can be treated as a normal portion, for example, a left eye, a right eye, a nose, a mouth, or the like.
The face orientation estimation unit 27 of the specific individual further includes a normal face orientation estimation unit 27e that performs face orientation estimation processing by a normal face model fitting process after the feature amount of the specific portion is complemented.
Even if the specific portion exists, if the face orientation estimation process can be performed by a normal face model fitting process, a stable and highly accurate real-time face orientation estimation process can be realized.
The specific individual face orientation estimation unit 27 may output, as estimation data of the face orientation of the specific individual, for example, the pitch angle around the left-right axis, the yaw angle around the up-down axis, and the roll angle around the front-rear axis, which are included in the parameters of the above 3D face shape model.
The face orientation estimation unit 27 of the specific individual further includes an angle correction table 27f for correcting the deviation of the face orientation angle.
In the angle correction table 27f, angle correction data for each specific individual is stored in advance; this data is obtained, for example, by individually capturing face images of the specific individual in advance at a predetermined place under various conditions (conditions such as various face orientations, line-of-sight directions, and eye open/closed states), inputting the large number of captured images to the above learning device as teacher data, and adjusting the data through the learning process.
If a certain amount of deviation still occurs in the estimated face orientation angle even after performing the face model fitting processing that does not use the specific part of the specific individual, the deviation of the face orientation angle is corrected using the angle correction table. This correction processing facilitates highly accurate calculation of the face orientation angle.
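The following is a minimal sketch of applying a per-individual angle correction table to an estimated face orientation. The table contents and the nearest-entry lookup rule are illustrative assumptions and do not represent the learned correction data described above.

```python
# Illustrative correction table for one specific individual:
# key = rounded estimated yaw (deg), value = (pitch, yaw, roll) offsets to add.
ANGLE_CORRECTION_TABLE = {
    -60: (1.5, -4.0, 0.2),
    -30: (0.8, -2.5, 0.1),
      0: (0.3, -1.0, 0.0),
     30: (0.9,  2.0, -0.1),
     60: (1.6,  3.5, -0.2),
}

def correct_face_angles(pitch, yaw, roll, table=ANGLE_CORRECTION_TABLE):
    """Add the offsets of the table entry whose yaw key is closest to the
    estimated yaw (nearest-entry lookup is an assumed interpolation rule)."""
    key = min(table, key=lambda k: abs(k - yaw))
    dp, dy, dr = table[key]
    return pitch + dp, yaw + dy, roll + dr

if __name__ == "__main__":
    print(correct_face_angles(pitch=5.0, yaw=27.0, roll=-1.0))
```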
FIG. 7 is a conceptual diagram showing the processing operation performed by the specific individual face orientation estimation unit 27 in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
To perform the face orientation estimation processing of the specific individual, the specific individual face orientation estimation unit 27 adopts, for example, a method of creating a 3D face shape model, fitting it to the face region on the two-dimensional image, and detecting the position and shape of each facial organ.
As a technique for fitting a 3D face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but the technique is not limited thereto.
In the case of the specific individual face orientation estimation processing shown in FIG. 7, the specific-individual-adapted 3D face model fitting processing is performed without using the specific part of the specific individual. For example, of the images captured at 15 to 30 frames per second, the specific-individual-adapted 3D face model fitting processing is performed on the first frame without letting the specific part of the specific individual (the right eye part in the case of FIG. 7) contribute.
Next, processing is performed to calculate the 3D face model fitting score of each part other than the specific part, and it is determined whether or not the specific-individual-adapted 3D face model fitting scores of all parts other than the specific part exceed a predetermined threshold value.
When it is determined that the specific individual-compatible 3D face model fitting score exceeds a predetermined threshold value, the tracking process is started from the next frame, and the complementary process for complementing the feature amount of the specific portion is performed.
In the example shown in FIG. 7, the specific part is the right eye, and the right eye part is, for example, filled in (painted over) as the complement processing.
By carrying out such a complementary process, it becomes possible to treat the specific site as a normal site.
In the tracking process after the complementary processing, the face orientation estimation process is performed by the normal 3D face model fitting process. Here, when the fitting score for the parts other than the specific part exceeds the predetermined threshold, the fitting is regarded as accurate including the specific part, and from the next frame onward the tracking process of the facial organ points is performed, making use of the fact that the movement of each facial organ point position between frames can be regarded as minute.
Specifically, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye in the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
By such processing, the processing becomes equivalent to fitting with a normal 3D face model, and stable, highly accurate real-time face orientation estimation can be realized.
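One tracking step could then be sketched as below: the right-eye region is predicted from the previous frame's fitted landmarks, filled in, and a feature amount is extracted from the filled image. The fixed margins, the landmark name, and the placeholder feature extraction are illustrative assumptions, not the actual fitting computation.

```python
# Minimal sketch of one tracking step using the previous frame's fitting result.
import numpy as np

def predict_right_eye_region(prev_landmarks, half_w=30, half_h=20):
    """Derive a bounding box for the right eye from the previous frame's
    fitted landmarks (here simply a fixed margin around the eye centre)."""
    cx, cy = prev_landmarks["right_eye_center"]
    return (cy - half_h, cy + half_h, cx - half_w, cx + half_w)

def tracking_step(frame, prev_landmarks):
    """Complement the specific part from the previous fitting result,
    then extract a feature amount as in normal 3D face model fitting."""
    top, bottom, left, right = predict_right_eye_region(prev_landmarks)
    filled = frame.copy()
    filled[top:bottom, left:right] = 0        # paint the right eye black
    return filled.astype(np.float32) / 255.0  # placeholder feature extraction

# Example with a dummy frame and made-up landmark coordinates.
frame = np.full((480, 640), 128, dtype=np.uint8)
features = tracking_step(frame, {"right_eye_center": (390, 220)})
```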
FIG. 8 is a flowchart showing an example of the processing operation performed by the specific-individual face orientation estimation unit 27 (CPU 13) in the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
First, in step S61, the flag is set to false. Next, in step S62, t is set to 1.
Next, in step S63, the face detection process described above is performed on the image of the t-th frame. In step S64, it is determined whether a face has been detected; if no face has been detected, the process proceeds to step S75 and the flag is set to false, whereas if a face has been detected, the process proceeds to step S65.
In step S65, the face image detected in the image of the t-th frame is captured.
Next, in step S66, it is determined whether the flag is true. If the flag is not true, the process proceeds to step S67; if the flag is true, the process proceeds to step S73.
In step S67, for example, of the images captured at 15 to 30 frames per second, the first frame undergoes the specific-individual 3D face model fitting process in which the specific part of the specific individual is not allowed to contribute. By performing the face model fitting process that is not affected by the specific part, high-speed processing close to the normal face model fitting process can be realized.
After step S67, the process proceeds to step S68, in which the specific-individual 3D face model fitting score is calculated without the contribution of the specific part.
After step S68, the process proceeds to step S69.
In step S69, the fitting score determination process is performed, and it is determined whether the specific-individual 3D face model fitting score for all parts other than the specific part exceeds the predetermined threshold.
If it is determined in step S69 that the specific-individual 3D face model fitting score for all parts other than the specific part exceeds the predetermined threshold, the process proceeds to step S70.
In step S70, the flag is set to true, and the process then proceeds to step S71.
On the other hand, if it is determined in step S69 that the specific-individual 3D face model fitting score for all parts other than the specific part does not exceed the predetermined threshold, the process proceeds to step S72, in which the frame is advanced, and then returns to step S63.
On the other hand, if the flag is determined to be true in step S66, the process proceeds to step S73, in which complementary processing is performed to complement the feature amount of the specific part. In this complementary processing, for example, as shown in FIG. 7, when the specific part is determined to be the right eye, the right-eye region is filled in black.
By performing such complementary processing, the specific part can be handled as a normal part, for example the right eye, in the subsequent processing, and the processing becomes equivalent to normal face model fitting.
After the complementary processing in step S73, the process proceeds to step S74.
In step S74, the face orientation estimation process is performed by the normal 3D face model fitting process, and the process then proceeds to step S71.
In step S74, the tracking process after the complementary processing is performed; when the fitting score for the parts other than the specific part exceeds the predetermined threshold, the fitting is regarded as accurate including the specific part, and the tracking process of the facial organ points is performed, making use of the fact that the movement of each facial organ point position between frames can be regarded as minute.
Specifically, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, as shown in FIG. 7, when the specific part is the right eye, the position of the right eye in the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted.
By such processing, the processing becomes equivalent to fitting with a normal 3D face model, and stable, highly accurate real-time face orientation estimation can be realized.
In this way, even if the specific part exists, once the feature amount of the specific part has been complemented, the face orientation estimation process can be performed by the normal 3D face model fitting process, and stable, accurate real-time face orientation estimation can be carried out.
When the face orientation estimation process is performed in step S74, the process proceeds to step S71, in which the angle correction table 27f, learned and created in advance for each specific individual, is used to correct the deviation of the face orientation angle.
This correction process is performed when a certain deviation still arises in the estimated face orientation angle even after the specific-individual 3D face model fitting process that does not use the specific part of the specific individual. By using the angle correction table 27f created in advance for each specific individual, the deviation of the face orientation angle can easily be corrected for each specific individual, enabling highly accurate calculation of the face orientation angle.
When the process of correcting the deviation of the face orientation angle in step S71 is completed, the process proceeds to step S72, in which the frame is advanced, and then returns to step S63. The process also proceeds from step S75 described above to step S72, in which the frame is advanced in the same way, and then returns to step S63.
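For reference, the branch and flag structure of the flowchart of FIG. 8 can be summarised in code as follows; every helper passed in (face detection, fitting, scoring, complementation, angle correction) is a hypothetical stub standing in for the processing described above, and only the control flow mirrors steps S61 to S75.

```python
# Control-flow sketch of FIG. 8 (steps S61-S75); all helpers are hypothetical stubs.
def process_frames(frames, detect_face, fit_without_part, score_without_part,
                   complement_part, fit_full, correct_angle, threshold=0.8):
    flag = False                                   # S61
    for frame in frames:                           # S62 / S72: t advances per loop
        face = detect_face(frame)                  # S63
        if face is None:                           # S64
            flag = False                           # S75
            continue                               # S72: next frame
        if not flag:                               # S66 (flag not true)
            fit = fit_without_part(face)           # S67
            score = score_without_part(fit)        # S68
            if score > threshold:                  # S69
                flag = True                        # S70
                yield correct_angle(fit)           # S71
            # otherwise fall through to the next frame (S72)
        else:                                      # S66 (flag true)
            fit = fit_full(complement_part(face))  # S73, S74
            yield correct_angle(fit)               # S71

# Tiny usage with dummy stubs (all values are made up).
angles = list(process_frames(
    frames=range(3),
    detect_face=lambda f: {"frame": f},
    fit_without_part=lambda face: {"yaw": 10.0},
    score_without_part=lambda fit: 0.9,
    complement_part=lambda face: face,
    fit_full=lambda face: {"yaw": 11.0},
    correct_angle=lambda fit: fit["yaw"] - 2.0,
))
print(angles)  # [8.0, 9.0, 9.0]
```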
According to the driver monitoring device 10 of the embodiment described above, the face feature amount 142a of a specific individual and the normal face feature amount 142b are stored as learned face feature amounts in the face feature amount storage unit 142, and the specific individual determination unit 25 determines whether the face in the face region is the face of the specific individual by using the feature amount of the face region detected by the face detection unit 22 and the face feature amount 142a of the specific individual. Therefore, by using the face feature amount 142a of the specific individual, whether or not the face is that of the specific individual can be determined accurately.
Further, when the specific individual determination unit 25 determines that the face is that of the specific individual, the first face image processing unit 26 can accurately perform the face image processing for the specific individual. On the other hand, when the specific individual determination unit 25 determines that the face is not that of the specific individual, in other words a normal face (the face of a person other than the specific individual), the second face image processing unit 30 can accurately perform the normal face image processing. Therefore, whether the driver 3 is the specific individual or an ordinary person other than the specific individual, the sensing of each face can be performed accurately.
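One way this determination can be realised, as recited in claims 7 and 8 below, is to compare a correlation coefficient between the detected face region's feature amount and the stored feature amount 142a against a threshold. The sketch below uses NumPy; the feature vectors and the threshold value are made-up assumptions.

```python
# Minimal sketch of a correlation-based specific individual determination.
import numpy as np

def is_specific_individual(region_features, stored_features_142a, threshold=0.9):
    """Return True when the correlation coefficient between the detected face
    region's feature amount and the specific individual's learned feature
    amount exceeds the threshold."""
    r = np.corrcoef(region_features, stored_features_142a)[0, 1]
    return r > threshold

# Example with dummy feature vectors.
detected = np.array([0.2, 0.8, 0.5, 0.9, 0.1])
stored_142a = np.array([0.25, 0.75, 0.55, 0.85, 0.15])
print(is_specific_individual(detected, stored_142a))  # True for these values
```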
Further, regarding the face orientation estimation for a specific individual, by complementing the feature amount of the specific part (such as a missing facial organ) based on the face model fitting result in the previous frame, the face orientation can be estimated stably, accurately, and in real time.
Specifically, for the first frame, a so-called specific-individual face model is used, which is fitted without letting the specific part contribute.
Here, if the fitting score for the parts other than the specific part is equal to or higher than the predetermined threshold, the fitting is regarded as accurate including the specific part, and from the next frame onward, in a moving image of, for example, 15 or 30 frames per second, the tracking process of the facial organ points is started, making use of the fact that the movement of each facial organ point position between frames can be regarded as minute.
That is, at the time of tracking, the feature amount of the specific part is complemented based on the fitting result of the previous frame. For example, when the specific part is the right eye, the position of the right eye in the image is estimated from the fitting result of the previous frame, that position is painted black in the image, and the feature amount is extracted. In this way, the processing becomes equivalent to normal face model fitting, and as a result, the face orientation can be estimated stably, accurately, and in real time.
Further, the in-vehicle system 1 includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring result output from the driver monitoring device 10. Therefore, based on the monitoring result, the ECUs 40 can appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system that allows even a specific individual to drive with peace of mind.
Although the embodiments of the present invention have been described above, the above description is merely an example of the present invention in all respects, and it goes without saying that various improvements and modifications can be made without departing from the scope of the present invention.
In the above embodiment, the case where the image processing device according to the present invention is applied to the driver monitoring device 10 has been described, but the application is not limited to this. For example, the image processing device according to the present invention is applicable to devices and systems that monitor people who operate or monitor various facilities, such as machines and equipment in a factory, or who perform predetermined work, when the monitoring targets include the specific individual described above.
Further, in the above embodiment, the present invention has been described as applied to a specific individual (an individual whose facial features differ from the facial features common to people in general, even allowing for differences in age, gender, and the like), but the present invention is also applicable, as a specific individual, to a person whose nose and mouth are hidden by a mask, or to a person wearing an eyepatch.
The present invention is widely applicable to devices and systems that monitor objects such as people using a camera. For example, in addition to devices and systems that monitor the drivers (operators) of various moving bodies such as vehicles, it can be widely used in devices and systems that monitor people who operate or monitor various facilities such as machines and equipment in a factory, or who perform predetermined work.
[Additional Notes]
Embodiments of the present invention may also be described as in the following appendix, but are not limited thereto.
(Appendix 1)
An image processing method for processing an image input from an imaging unit, comprising:
a face detection step (S2, S3) of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step (S4) of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step (S2, S3) and a learned face feature amount (142a) of the specific individual trained for detecting the face of the specific individual;
a first face image processing step (S6, S7, S8) of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step (S4); and
a second face image processing step (S9, S10, S11) of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step (S4),
wherein the image processing includes a face orientation estimation process, and the first face image processing step (S6, S7, S8) includes a face orientation estimation step (S6) for the specific individual.
1 In-vehicle system
2 Vehicle
3 Driver
10 Driver monitoring device
11 Camera
12 Image processing unit
13 CPU
14 ROM
141 Program storage unit
142 Face feature amount storage unit
142a Face feature amount of specific individual
142b Normal face feature amount
15 RAM
151 Image memory
16 Communication unit
21 Image input unit
22 Face detection unit
23 Specific individual face detection unit
24 Normal face detection unit
25 Specific individual determination unit
26 First face image processing unit
27 Specific individual face orientation estimation unit
27a Specific-part-unused face orientation estimation unit
27b Face model fitting score calculation unit
27c Fitting score determination unit
27d Complementary processing unit
27e Normal face orientation estimation unit
27f Angle correction table
28 Specific individual eye open/close detection unit
29 Specific individual line-of-sight direction estimation unit
30 Second face image processing unit
31 Normal face orientation estimation unit
32 Normal eye open/close detection unit
33 Normal line-of-sight direction estimation unit
34 Output unit
40 ECU
41 Sensor
42 Actuator
43 Communication bus

Claims (15)

1. An image processing device that processes an image input from an imaging unit, comprising:
a face feature amount storage unit that stores, as learned face feature amounts trained for detecting a face from the image, a face feature amount of a specific individual and a normal face feature amount;
a face detection unit that detects a face region while extracting a feature amount for detecting a face from the image;
a specific individual determination unit that determines whether the face in the face region is the face of the specific individual, using the detected feature amount of the face region and the face feature amount of the specific individual;
a first face image processing unit that performs face image processing for the specific individual when the specific individual determination unit determines that the face is that of the specific individual; and
a second face image processing unit that performs normal face image processing when the specific individual determination unit determines that the face is not that of the specific individual,
wherein the image processing includes a face orientation estimation process, and the first face image processing unit includes a face orientation estimation unit for the specific individual.
2. The image processing device according to claim 1, wherein the face orientation estimation unit for the specific individual includes a specific-part-unused face orientation estimation unit that performs face orientation estimation by a face model fitting process that does not use a specific part of the specific individual.
3. The image processing device according to claim 2, further comprising:
a score calculation unit that calculates a face model fitting score for each part other than the specific part; and
a fitting score determination unit that determines whether the face model fitting scores for all parts other than the specific part satisfy a predetermined condition.
4. The image processing device according to claim 3, further comprising a complementary processing unit that complements the feature amount of the specific part in a tracking process when the fitting score determination unit determines that the face model fitting scores for all parts other than the specific part satisfy the predetermined condition.
5. The image processing device according to claim 4, further comprising a normal face orientation estimation unit that performs face orientation estimation by a normal face model fitting process after the feature amount of the specific part has been complemented.
6. The image processing device according to any one of claims 1 to 5, further comprising an angle correction table that corrects a deviation of the face orientation angle.
7. The image processing device according to any one of claims 1 to 5, wherein the specific individual determination unit
calculates a correlation coefficient between the feature amount extracted from the face region and the face feature amount of the specific individual, and
determines whether the face in the face region is the face of the specific individual based on the calculated correlation coefficient.
8. The image processing device according to claim 7, wherein the specific individual determination unit
determines that the face in the face region is the face of the specific individual when the correlation coefficient is larger than a predetermined threshold, and
determines that the face in the face region is not the face of the specific individual when the correlation coefficient is equal to or less than the predetermined threshold.
9. The image processing device according to any one of claims 1 to 5, wherein the face image processing includes at least one of a face detection process, a line-of-sight direction estimation process, and an eye open/close detection process.
10. A monitoring device comprising:
the image processing device according to any one of claims 1 to 5;
an imaging unit that captures an image to be input to the image processing device; and
an output unit that outputs information based on the image processing by the image processing device.
11. A control system comprising:
the monitoring device according to claim 10; and
one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.
12. The control system according to claim 11, wherein the monitoring device is a device for monitoring a driver of a vehicle, and
the control device includes an electronic control unit mounted on the vehicle.
13. An image processing method for processing an image input from an imaging unit, comprising:
a face detection step of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step and a learned face feature amount of the specific individual trained for detecting the face of the specific individual;
a first face image processing step of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step; and
a second face image processing step of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step,
wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
14. A computer program for causing at least one computer to process an image input from an imaging unit, the computer program causing the at least one computer to execute:
a face detection step of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step and a learned face feature amount of the specific individual trained for detecting the face of the specific individual;
a first face image processing step of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step; and
a second face image processing step of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step,
wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
15. A computer-readable storage medium storing a computer program for causing at least one computer to process an image input from an imaging unit, the computer program causing the at least one computer to execute:
a face detection step of detecting a face region while extracting facial feature amounts from the image;
a specific individual determination step of determining whether the face in the face region is the face of a specific individual, using the feature amount of the face region detected by the face detection step and a learned face feature amount of the specific individual trained for detecting the face of the specific individual;
a first face image processing step of performing face image processing for the specific individual when the face is determined to be that of the specific individual in the specific individual determination step; and
a second face image processing step of performing normal face image processing when the face is determined not to be that of the specific individual in the specific individual determination step,
wherein the image processing includes a face orientation estimation process, and the first face image processing step includes a face orientation estimation step for the specific individual.
PCT/JP2020/029232 2019-08-02 2020-07-30 Image processing device, monitoring device, control system, image processing method, computer program, and recording medium WO2021024905A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019142705A JP2021026420A (en) 2019-08-02 2019-08-02 Image processing device, monitoring device, control system, image processing method, and computer program
JP2019-142705 2019-08-02

Publications (1)

Publication Number Publication Date
WO2021024905A1 true WO2021024905A1 (en) 2021-02-11

Family

ID=74504081

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/029232 WO2021024905A1 (en) 2019-08-02 2020-07-30 Image processing device, monitoring device, control system, image processing method, computer program, and recording medium

Country Status (2)

Country Link
JP (1) JP2021026420A (en)
WO (1) WO2021024905A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010198313A (en) * 2009-02-25 2010-09-09 Denso Corp Device for specifying degree of eye opening
WO2013008305A1 (en) * 2011-07-11 2013-01-17 トヨタ自動車株式会社 Eyelid detection device
JP2015194884A (en) * 2014-03-31 2015-11-05 パナソニックIpマネジメント株式会社 driver monitoring system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAGISAWA, YASUAKI: "Angle correction of face and improvement of authentication rate with FARSHAS", section 2 "Input by correcting the face orientation, creating reference data", Lecture Proceedings 2 of the 2010 Electronics Society Conference of IEICE *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030512A (en) * 2022-08-04 2023-04-28 荣耀终端有限公司 Gaze point detection method and device
CN116030512B (en) * 2022-08-04 2023-10-31 荣耀终端有限公司 Gaze point detection method and device

Also Published As

Publication number Publication date
JP2021026420A (en) 2021-02-22

Similar Documents

Publication Publication Date Title
Bila et al. Vehicles of the future: A survey of research on safety issues
Alioua et al. Driver head pose estimation using efficient descriptor fusion
EP2860664B1 (en) Face detection apparatus
Rezaei et al. Look at the driver, look at the road: No distraction! no accident!
Trivedi et al. Looking-in and looking-out of a vehicle: Computer-vision-based enhanced vehicle safety
CN111434553B (en) Brake system, method and device, and fatigue driving model training method and device
US20160180192A1 (en) System and method for partially occluded object detection
Rezaei et al. Simultaneous analysis of driver behaviour and road condition for driver distraction detection
CN109740477A (en) Study in Driver Fatigue State Surveillance System and its fatigue detection method
WO2021024905A1 (en) Image processing device, monitoring device, control system, image processing method, computer program, and recording medium
Rani et al. Development of an Automated Tool for Driver Drowsiness Detection
JP2004334786A (en) State detection device and state detection system
CN116012822B (en) Fatigue driving identification method and device and electronic equipment
Llorca et al. Stereo-based pedestrian detection in crosswalks for pedestrian behavioural modelling assessment
WO2020261820A1 (en) Image processing device, monitoring device, control system, image processing method, and program
WO2020261832A1 (en) Image processing device, monitoring device, control system, image processing method, and program
JP2021009503A (en) Personal data acquisition system, personal data acquisition method, face sensing parameter adjustment method for image processing device and computer program
US11345354B2 (en) Vehicle control device, vehicle control method and computer-readable medium containing program
CN111267865B (en) Vision-based safe driving early warning method and system and storage medium
Nowosielski Vision-based solutions for driver assistance
WO2021262166A1 (en) Operator evaluation and vehicle control based on eyewear data
Kim et al. Driving environment assessment using fusion of in-and out-of-vehicle vision systems
Bruno et al. Advanced driver assistance system based on neurofsm applied in the detection of autonomous human faults and support to semi-autonomous control for robotic vehicles
US20230260269A1 (en) Biometric task network
US20230260328A1 (en) Biometric task network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20851092

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20851092

Country of ref document: EP

Kind code of ref document: A1