WO2020261832A1 - Image processing device, monitoring device, control system, image processing method, and program - Google Patents

Image processing device, monitoring device, control system, image processing method, and program

Info

Publication number
WO2020261832A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
specific individual
feature amount
unit
search area
Prior art date
Application number
PCT/JP2020/020261
Other languages
French (fr)
Japanese (ja)
Inventor
相澤 知禎
Original Assignee
OMRON Corporation (オムロン株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OMRON Corporation (オムロン株式会社)
Publication of WO2020261832A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G - PHYSICS
    • G08 - SIGNALLING
    • G08G - TRAFFIC CONTROL SYSTEMS
    • G08G1/00 - Traffic control systems for road vehicles
    • G08G1/16 - Anti-collision systems

Definitions

  • The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, and a program.
  • Patent Document 1 discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of the target (person) to which the service is provided.
  • The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU, and the CPU includes a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, a movement vector detection unit, and the like.
  • When the service is provided to a group of people who are known to have a relationship with each other, such as communicating with one another, the robot device decides to perform a first service of providing information through close interaction. On the other hand, when the service is provided to a group of people whose mutual relationship is unknown, it decides to perform a second service of providing information unilaterally, without interactive exchange. As a result, an appropriate service can be provided according to the situation of the service provision target.
  • The face detection unit is configured to detect a person's face using the first camera, and a known technique can be used for the face detection.
  • However, there are people whose facial organs, such as the eyes, nose, or mouth, are partly missing or greatly deformed due to injury, whose face bears a large mole, a wart, or body decoration such as a tattoo, or whose facial organs are displaced from their average positions due to treatment or a disease such as a hereditary disease.
  • The faces of such specific individuals have features that differ from those of a normal person, whose features are common regardless of differences in age, gender, and the like, so known face detection techniques may not be able to detect them accurately.
  • The present invention has been made in view of the above problems, and its purpose is to provide an image processing device, a monitoring device, a control system, an image processing method, and a program capable of accurately detecting, in real time, even the face of a specific individual as described above.
  • To achieve the above object, the image processing apparatus (1) according to the present disclosure is an image processing apparatus that processes an image input from an imaging unit, and includes a facial feature amount storage unit that stores the facial feature amounts of a specific individual and normal facial feature amounts, and a face detection unit that detects a face region while scanning a search area over the image.
  • The face detection unit includes a first feature amount extraction unit that extracts facial feature amounts from the search area, a hierarchically structured normal face discriminator that discriminates whether the search area is a face or a non-face using the feature amounts extracted from the search area and the normal facial feature amounts, and a specific individual face determination unit that, when any layer of the normal face discriminator determines the search area to be a non-face, determines whether the search area is the face of the specific individual or a non-face using the feature amounts extracted from the search area and the facial feature amounts of the specific individual.
  • According to the image processing apparatus (1), the normal face discriminator hierarchically discriminates whether the search area is a face or a non-face using the normal facial feature amounts, and the face region is thereby detected. Further, even when some layer of the normal face discriminator determines the search area to be a non-face, the specific individual face determination unit determines whether the search area is the face of the specific individual or a non-face using the facial feature amounts of the specific individual, so that a face region containing the face of the specific individual is still detected. The face region can therefore be detected with high accuracy whether it contains a normal face or the face of the specific individual. Moreover, since the normal face discriminator and the specific individual face determination unit share the feature amounts extracted from the search area, real-time face region detection is maintained. A minimal sketch of this control flow is given below.
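  • The following Python sketch illustrates the control flow described above: a cascade whose per-layer non-face rejections are intercepted by a specific-individual check that reuses the same per-layer features. The class and function names are illustrative assumptions, not the patent's reference implementation; the correlation-based index and threshold rule follow the description below.

```python
import numpy as np

def pearson(a, b):
    """Correlation coefficient between two feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])

class CascadeWithSpecificIndividual:
    """Hypothetical face detector combining a hierarchical (cascaded)
    normal face discriminator with a per-layer specific-individual check.

    layers         : per-layer classifiers, fn(feats) -> True for "face"
    extractors     : per-layer feature extractors (shared by both paths)
    specific_feats : per-layer stored feature vectors of the specific individual
    threshold      : correlation threshold of the specific-individual check
    """
    def __init__(self, layers, extractors, specific_feats, threshold=0.7):
        self.layers = layers
        self.extractors = extractors
        self.specific_feats = specific_feats
        self.threshold = threshold

    def is_face(self, search_area):
        for layer, extract, spec in zip(self.layers, self.extractors,
                                        self.specific_feats):
            feats = extract(search_area)   # common feature amounts
            if layer(feats):
                continue                   # layer says "face": go deeper
            # Layer says "non-face": fall back to the specific-individual
            # check, reusing the SAME features (no extra extraction cost).
            if pearson(feats, spec) > self.threshold:
                continue                   # plausibly the specific individual
            return False                   # truly non-face: cut off early
        return True                        # survived all layers: face region
```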
  • The image processing device (2) is the above image processing device (1) in which the specific individual face determination unit calculates an index showing the correlation between the feature amounts used in the one layer of the normal face discriminator that made the non-face determination and the facial feature amounts of the specific individual corresponding to that layer, and determines, based on the calculated index, whether the search area is the face of the specific individual or a non-face.
  • According to the image processing device (2), using the feature amounts of the layer that made the non-face determination together with the specific individual's facial feature amounts for that layer makes it possible to efficiently determine whether the search area is the face of the specific individual or a non-face, so the case where the search area is in fact the specific individual's face can be determined accurately even when the normal face discriminator has rejected it.
  • The index may be an index value whose magnitude indicates the strength of the correlation, for example a correlation coefficient or the reciprocal of the squared error, or it may be an index value indicating the degree of similarity between the feature amounts used in the layer that made the non-face determination and the facial feature amounts of the specific individual corresponding to that layer.
  • In the image processing device (3), the specific individual face determination unit determines that the search area is the face of the specific individual when the index is larger than a predetermined threshold value, and determines that the search area is a non-face when the index is equal to or less than the predetermined threshold value.
  • According to the image processing device (3), the determination reduces to comparing the index with the predetermined threshold value, which improves the processing efficiency of the determination. A sketch of the index variants and the threshold rule follows.
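  • As a concrete illustration, the sketch below implements the two index variants named above together with the threshold rule; the function names and the epsilon guard are assumptions for the example.

```python
import numpy as np

def correlation_index(feat, spec_feat):
    """Pearson correlation coefficient: larger means stronger correlation."""
    return float(np.corrcoef(feat, spec_feat)[0, 1])

def inverse_square_error_index(feat, spec_feat, eps=1e-9):
    """Reciprocal of the squared error: larger means more similar.
    eps guards against division by zero for identical vectors."""
    diff = np.asarray(feat, dtype=float) - np.asarray(spec_feat, dtype=float)
    return 1.0 / (float(np.dot(diff, diff)) + eps)

def is_specific_individual(feat, spec_feat, threshold,
                           index=correlation_index):
    """Face of the specific individual if the index exceeds the threshold;
    otherwise treated as a non-face."""
    return index(feat, spec_feat) > threshold
```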
  • The image processing device (4) is any of the above image processing devices in which the specific individual face determination unit is provided with a discrimination cutoff unit that terminates the discrimination by the normal face discriminator when the search area is determined to be a non-face.
  • According to the image processing device (4), when the specific individual face determination unit determines that the search area is the face of the specific individual, the determination proceeds to the next layer of the normal face discriminator, which keeps the discrimination process fast. When the specific individual face determination unit determines that the search area is a non-face, the discrimination by the normal face discriminator is terminated. The specific-individual face determination can therefore be performed while maintaining the efficiency of the normal face discriminator.
  • The image processing device (5) is any of the image processing devices (1) to (4) in which the face detection unit includes a face region integration unit that integrates one or more face region candidates determined to be a face by the normal face discriminator, a second feature amount extraction unit that extracts facial feature amounts from the integrated face region, and a specific individual determination unit that determines whether the face in the face region is the face of the specific individual using the feature amounts extracted from the integrated face region and the facial feature amounts of the specific individual.
  • According to the image processing device (5), the one or more face region candidates determined to be a face are integrated, and whether the face in the face region is the face of the specific individual is determined using the feature amounts extracted from the integrated face region and the specific individual's facial feature amounts. It is therefore possible to accurately determine whether the face region integrated by the face region integration unit contains the face of the specific individual or the face of a normal person.
  • The monitoring device (1) includes any of the above image processing devices (1) to (5), an imaging unit that captures the image input to the image processing device, and an output unit that outputs information based on the image processing by the image processing device. According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be accurately detected and monitored, and information based on the image processing can be output from the output unit, so a monitoring system or the like that uses the information can easily be constructed.
  • The control system (1) includes the monitoring device (1) and one or more control devices that are communicably connected to it and execute predetermined processing based on the information output from the monitoring device. According to the control system (1), predetermined processing can be executed by the one or more control devices based on the information output from the monitoring device, so a system can be constructed that utilizes not only the monitoring result for a normal person but also the monitoring result for the specific individual.
  • The control system (2) is one in which the monitoring device is a device for monitoring the driver of a vehicle and the control device includes an electronic control unit mounted on the vehicle. According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be accurately monitored, and the electronic control unit can appropriately execute predetermined control based on the monitoring result. This makes it possible to construct a highly safe in-vehicle system that allows even the specific individual to drive with peace of mind.
  • The image processing method according to the present disclosure is an image processing method for processing an image input from an imaging unit, and includes a face detection step of detecting a face region while scanning a search area over the image.
  • The face detection step includes a feature amount extraction step of extracting facial feature amounts from the search area; a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face using the feature amounts extracted in the feature amount extraction step and trained normal facial feature amounts that have been learned for detecting a face; and a specific individual face determination step of determining, when any layer of the normal face discrimination step determines the search area to be a non-face, whether the search area is the face of the specific individual or a non-face using the extracted feature amounts and trained facial feature amounts of the specific individual that have been learned for detecting the face of the specific individual.
  • According to the above image processing method, the face region is detected by hierarchically discriminating, in the normal face discrimination step, whether the search area is a face or a non-face using the normal facial feature amounts. Further, even when some layer of the normal face discrimination step determines the search area to be a non-face, the specific individual face determination step determines, using the facial feature amounts of the specific individual, whether the search area is the face of the specific individual or a non-face, so that a face region containing the face of the specific individual is still detected.
  • The face region can therefore be detected with high accuracy whether it contains a normal face or the face of the specific individual. In addition, since the normal face discrimination step and the specific individual face determination step share the feature amounts extracted from the search area, real-time face region detection is maintained, and the face of the specific individual can be detected accurately in real time.
  • The program according to the present disclosure is a program for causing at least one computer to process an image input from an imaging unit, and causes the computer to execute a face detection step of detecting a face region while scanning a search area over the image.
  • The face detection step includes a feature amount extraction step of extracting facial feature amounts from the search area; a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face using the feature amounts extracted in the feature amount extraction step and trained normal facial feature amounts that have been learned for detecting a face; and a specific individual face determination step of determining, when any layer of the normal face discrimination step determines the search area to be a non-face, whether the search area is the face of the specific individual or a non-face using the extracted feature amounts and trained facial feature amounts of the specific individual that have been learned for detecting the face of the specific individual.
  • According to the above program, the at least one computer detects the face region by hierarchically discriminating, in the normal face discrimination step, whether the search area is a face or a non-face using the normal facial feature amounts. Further, even when some layer of the normal face discrimination step determines the search area to be a non-face, the specific individual face determination step determines, using the facial feature amounts of the specific individual, whether the search area is the face of the specific individual or a non-face, so that a face region containing the face of the specific individual can be detected. As a result, the face region can be accurately detected whether it contains a normal face or the face of the specific individual.
  • The above program may be stored in a storage medium, or may be transferred or executed via a communication network.
  • The image processing apparatus according to the present disclosure can be widely applied to devices and systems for monitoring an object such as a person using a camera.
  • In addition to devices and systems for monitoring the drivers (operators) of various moving objects such as vehicles, the image processing device can also be applied to devices and systems for monitoring people who operate or monitor various kinds of equipment such as machines and devices in a factory, or who perform other predetermined work.
  • FIG. 1 is a schematic view showing an example of an in-vehicle system including the driver monitoring device according to the embodiment.
  • The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2 (for example, facial behavior), one or more ECUs (Electronic Control Units) 40 that control the running, steering, or braking of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state around the vehicle, and the like; these are connected via a communication bus 43.
  • The in-vehicle system 1 is configured as, for example, an in-vehicle network system that communicates according to the CAN (Controller Area Network) protocol.
  • The driver monitoring device 10 is an example of the "monitoring device" of the present invention, and the in-vehicle system 1 is an example of the "control system" of the present invention.
  • The driver monitoring device 10 is configured to include a camera 11 for capturing the face of the driver 3, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 that performs processing such as outputting information based on the image processing by the image processing unit 12 to a predetermined ECU 40 via the communication bus 43.
  • The image processing unit 12 is an example of the "image processing device" of the present invention, and the camera 11 is an example of the "imaging unit" of the present invention.
  • The driver monitoring device 10 detects the face of the driver 3 from the image captured by the camera 11, and detects facial behaviors of the detected driver 3 such as the orientation of the face, the direction of the line of sight, and the open/closed state of the eyes.
  • The driver monitoring device 10 may determine the state of the driver 3, such as forward gaze, inattention, dozing, facing backward, or slumping forward, based on the detection results for these facial behaviors. Further, the driver monitoring device 10 may output a signal based on the state determination of the driver 3 to the ECU 40, and the ECU 40 may execute caution or warning processing for the driver 3, or operation control of the vehicle 2 (for example, deceleration control or guidance control to the road shoulder), based on the signal.
  • It is desirable that face sensing for the driver 3 can accurately detect the face in real time regardless of whether the driver 3 is a specific individual or a normal person other than the specific individual.
  • However, when part of the facial organs of the driver 3, such as the eyes, nose, or mouth, is missing or greatly deformed due to, for example, an injury, or the face bears a large mole, a wart, or body decoration such as a tattoo, or the facial organs are displaced from their average positions due to a disease such as a hereditary disease, the accuracy of detecting the face from the image captured by the camera may deteriorate.
  • Therefore, the driver monitoring device 10 adopts the following configuration so that faces can be detected accurately and in real time not only for a normal person (also called an ordinary person), whose features are common regardless of differences (individual differences) in age, gender, race, and the like, but also for a specific individual who has features different from those of a normal person.
  • The image processing unit 12 stores, as trained facial feature amounts learned for detecting a face from an image, the facial feature amounts of a specific individual and normal facial feature amounts (in other words, the facial feature amounts used for detecting the face of a normal person).
  • The image processing unit 12 performs face detection processing that detects a face region while scanning a search area of a predetermined size over the input image from the camera 11 and extracting feature amounts for face detection from the search area. The image processing unit 12 then detects the face region from the input image by normal face discrimination processing that hierarchically discriminates whether the search area is a face or a non-face using the feature amounts extracted from the search area and the normal facial feature amounts.
  • When some layer of the normal face discrimination processing determines the search area to be a non-face, the image processing unit 12 performs specific individual face determination processing that determines whether the search area is the face of the specific individual or a non-face using the feature amounts extracted from the search area and the facial feature amounts of the specific individual.
  • In the specific individual face determination processing, an index showing the relationship between the feature amounts used in the layer that made the non-face determination and the facial feature amounts of the specific individual corresponding to that layer, for example a correlation coefficient, may be calculated, and whether the search area is the face of the specific individual or a non-face may be determined based on the calculated correlation coefficient.
  • When the correlation coefficient is larger than a predetermined threshold value, the search area may be determined to be the face of the specific individual, and the discrimination may proceed to the next layer of the normal face discrimination processing. When the correlation coefficient is equal to or less than the predetermined threshold value, the search area may be determined to be a non-face, and the normal face discrimination processing for that search area may be terminated. An index other than the correlation coefficient may also be used in the specific individual face determination processing.
  • Through the normal face discrimination processing, the face region is detected by hierarchically discriminating whether the search area is a face or a non-face using the normal facial feature amounts. Further, even when some layer of the normal face discrimination processing determines the search area to be a non-face, the specific individual face determination processing determines, using the facial feature amounts of the specific individual, whether the search area is the face of the specific individual or a non-face, so that a face region containing the face of the specific individual is detected.
  • When the search area is finally determined to be a non-face (other than a face), the processing for it is terminated and the processing proceeds to the next search area.
  • FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.
  • The in-vehicle system 1 includes the driver monitoring device 10 for monitoring the state of the driver 3 of the vehicle 2, one or more ECUs 40, and one or more sensors 41, which are connected via the communication bus 43. Further, one or more actuators 42 are connected to the ECUs 40.
  • The driver monitoring device 10 includes the camera 11, the image processing unit 12 that processes images input from the camera 11, and the communication unit 16 for exchanging data and signals with the external ECUs 40 and the like.
  • The camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat.
  • The image sensor unit of the camera 11 may include an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, a filter, a microlens, and the like.
  • The image sensor may be an element capable of forming a captured image by receiving light in the visible region, or an element capable of forming a captured image by receiving light in the near-infrared region.
  • The light irradiation unit of the camera 11 includes a light emitting element such as an LED (Light Emitting Diode), and may include a near-infrared LED or the like so that the face of the driver 3 can be imaged day and night.
  • The camera 11 captures images at a predetermined frame rate (for example, several tens of frames per second), and the data of the captured images are input to the image processing unit 12.
  • The camera 11 may be integrated with the driver monitoring device 10 or may be an externally attached unit.
  • The image processing unit 12 is configured as an image processing device including one or more CPUs (Central Processing Units) 13, a ROM (Read Only Memory) 14, and a RAM (Random Access Memory) 15.
  • The ROM 14 includes a program storage unit 141 and a facial feature amount storage unit 142, and the RAM 15 includes an image memory 151 for storing input images from the camera 11.
  • The driver monitoring device 10 may further be equipped with another storage unit, and that storage unit may be used as the program storage unit 141, the facial feature amount storage unit 142, and the image memory 151.
  • The other storage unit may be a semiconductor memory or a storage medium readable by a disk drive or the like.
  • The CPU 13 is an example of a hardware processor; by reading, interpreting, and executing the program stored in the program storage unit 141 of the ROM 14 and data such as the facial feature amounts stored in the facial feature amount storage unit 142, it processes the image input from the camera 11, for example performing face image processing such as face detection processing. Further, the CPU 13 outputs the results obtained by the face image processing (for example, processed data, determination signals, and control signals) to the ECU 40 and the like via the communication unit 16.
  • The facial feature amount storage unit 142 stores the facial feature amounts 142a of a specific individual and the normal facial feature amounts 142b (see FIGS. 3 and 4) as trained facial feature amounts that have been learned (for example, by machine learning) for detecting a face from an image.
  • As the trained facial feature amounts, various feature amounts effective for detecting a face from an image can be used. For example, a feature amount focusing on the difference in brightness (difference in average brightness) between local regions of the face (a Haar-like feature amount) may be used; a feature amount focusing on combinations of brightness distributions in local regions of the face (an LBP (Local Binary Pattern) feature amount) may be used; or a feature amount focusing on combinations of brightness-gradient-direction distributions in local regions of the face (an HOG (Histogram of Oriented Gradients) feature amount) may be used. A sketch of a Haar-like feature computation is shown below.
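  • The following sketch shows, under the assumption of a grayscale NumPy image, how a two-rectangle Haar-like feature can be evaluated in constant time using an integral image (summed-area table); the function names are illustrative.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: the difference in average brightness
    between a rectangle and the rectangle directly below it (for example, a
    dark eye region above a brighter cheek region)."""
    upper = rect_sum(ii, x, y, w, h) / (w * h)
    lower = rect_sum(ii, x, y + h, w, h) / (w * h)
    return upper - lower
```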
  • The facial feature amounts stored in the facial feature amount storage unit 142 are extracted as feature amounts effective for face detection by using, for example, various machine learning methods.
  • Machine learning is a technique by which a computer finds patterns inherent in data (learning data).
  • As an example of a statistical learning method, AdaBoost may be used.
  • AdaBoost is a learning algorithm that prepares a large number of discriminators with low discriminating ability (weak discriminators), selects weak discriminators with small error rates from among them, and adjusts parameters such as weights, so that a strong discriminator with a hierarchical structure can be constructed.
  • A discriminator may also be referred to as a classifier or a learner.
  • The strong discriminator may be constructed as a hierarchical structure in which each weak discriminator discriminates one feature amount effective for face detection, with a large number of weak discriminators and their combinations selected by AdaBoost.
  • One weak discriminator may output information such as 1 for a face and 0 for a non-face.
  • Alternatively, a learning method called Real AdaBoost, which can output a real number between 0 and 1 instead of 0 or 1, may be used.
  • A neural network having an input layer, an intermediate layer, and an output layer may also be used.
  • By giving a large number of face images captured under various conditions and a large number of images that are not faces (non-face images) as training data to a learning device equipped with such a learning algorithm, and repeating learning and weight adjustment, a strong discriminator with a hierarchical structure capable of detecting faces with high accuracy can be constructed.
  • The one or more feature amounts used by the weak discriminators of each layer constituting such a strong discriminator can be used as the trained facial feature amounts; a sketch of this selection-and-weighting loop follows.
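  • To make the AdaBoost description concrete, here is a minimal sketch of discrete AdaBoost over threshold "stumps" (one weak discriminator per feature), assuming precomputed feature values; the training details (threshold choice, stopping rule) are simplified assumptions, not the patent's procedure.

```python
import numpy as np

def adaboost_train(features, labels, n_rounds):
    """Minimal discrete AdaBoost.

    features : (n_samples, n_features) array of precomputed feature values
    labels   : array of +1 (face) / -1 (non-face)
    Returns a list of weak discriminators (feature_idx, threshold,
    polarity, alpha) forming one stage of a strong discriminator.
    """
    n, m = features.shape
    w = np.full(n, 1.0 / n)                    # sample weights
    strong = []
    for _ in range(n_rounds):
        best = None
        for j in range(m):                     # pick the lowest-error stump
            thr = float(np.median(features[:, j]))
            for polarity in (1, -1):
                pred = np.where(polarity * (features[:, j] - thr) > 0, 1, -1)
                err = float(w[pred != labels].sum())
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity, pred)
        err, j, thr, polarity, pred = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)  # weight of this weak learner
        w = w * np.exp(-alpha * labels * pred)   # emphasize mistakes
        w = w / w.sum()
        strong.append((j, thr, polarity, alpha))
    return strong
```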
  • The facial feature amounts 142a of a specific individual are parameters indicating the facial features of the specific individual: face images of the specific individual are captured at a predetermined place under various conditions (such as various face orientations, line-of-sight directions, and eye open/closed states), and this large number of captured images is input to the learning device as teacher data and adjusted by the learning process.
  • The facial feature amounts 142a of the specific individual may be, for example, combination patterns of brightness differences between local regions of the face obtained by the learning process.
  • The facial feature amounts 142a stored in the facial feature amount storage unit 142 may be those of only one specific individual, or the facial feature amounts of a plurality of specific individuals may be stored so that the device can be used when a plurality of specific individuals drive the vehicle 2.
  • The normal facial feature amounts 142b are parameters indicating the features of a normal person's face: face images of normal people captured under various conditions (such as various face orientations, line-of-sight directions, and eye open/closed states) are input as teacher data to the above learning device and adjusted by the learning process.
  • The normal facial feature amounts 142b may be, for example, combination patterns of brightness differences between local regions of the face obtained by the learning process. Information registered in a predetermined facial feature amount database may also be used as the normal facial feature amounts 142b.
  • The trained facial feature amounts stored in the facial feature amount storage unit 142 may be fetched from a server on the cloud or the like via a communication network such as the Internet or a mobile phone network and then stored in the facial feature amount storage unit 142.
  • Each ECU 40 is composed of a computer device including one or more processors, a memory, a communication module, and the like; the processor mounted on the ECU 40 reads, interprets, and executes a program stored in the memory, whereby predetermined control of the actuators 42 and the like is executed.
  • The ECUs 40 include, for example, at least one of a traveling system ECU, a driving support system ECU, a body system ECU, and an information system ECU.
  • The traveling system ECUs include, for example, drive system ECUs and chassis system ECUs.
  • The drive system ECUs include control units related to the "running" function, such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, and transmission control.
  • The chassis system ECUs include control units related to the "stopping and turning" functions, such as brake control and steering control.
  • The driving support system ECU may be configured to include at least one control unit for functions that automatically improve safety or realize comfortable driving in cooperation with the traveling system ECUs (driving support functions or automated driving functions), such as an automatic braking support function, a lane keeping assist function (LKA, Lane Keep Assist), a constant-speed driving and inter-vehicle distance control function (ACC, Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.
  • The driving support system ECU may be equipped with at least one of the functions of, for example, Level 1 (driver assistance), Level 2 (partial automation), and Level 3 (conditional automation) of the automated driving levels presented by SAE (Society of Automotive Engineers) International. It may further be equipped with functions of Level 4 (high automation) or Level 5 (full automation), or it may be equipped with only the Level 1 and 2 functions or only the Level 2 and 3 functions. The in-vehicle system 1 may also be configured as an automated driving system.
  • The body system ECU may be configured to include at least one control unit related to vehicle body functions such as door locks, smart keys, power windows, air conditioning, lights, the instrument panel, and turn signals.
  • The information system ECU may be configured to include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems) related device.
  • The infotainment device may include, for example, an HMI (Human Machine Interface) device that functions as a user interface, a car navigation device, an audio device, and the like.
  • The telematics device may include a communication unit or the like for communicating with the outside.
  • The ITS-related device may include an ETC (Electronic Toll Collection System) unit, or a communication unit for performing road-to-vehicle communication with roadside units such as ITS spots, or vehicle-to-vehicle communication.
  • The sensors 41 may include various in-vehicle sensors that acquire the sensing data necessary for the ECUs 40 to control the operation of the actuators 42.
  • The sensors 41 may include, for example, vehicle state sensors such as vehicle speed sensors, shift position sensors, accelerator opening sensors, brake pedal sensors, and steering sensors, as well as surroundings monitoring sensors such as external imaging cameras, millimeter-wave radar, LiDAR, and ultrasonic sensors.
  • The actuators 42 are devices that execute operations related to the traveling, steering, braking, and so on of the vehicle 2 based on control signals from the ECUs 40, and include, for example, the engine, motors, the transmission, hydraulic cylinders, and electric cylinders.
  • FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
  • The image processing unit 12 is configured to include an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and the facial feature amount storage unit 142.
  • The image input unit 21 performs processing for capturing the image, including the face of the driver 3, taken by the camera 11.
  • The face detection unit 22 performs processing for detecting the face region while scanning a search area of a predetermined size over the input image and extracting facial feature amounts from the search area.
  • The face detection unit 22 includes a first feature amount extraction unit 221, a normal face discriminator 222, and a specific individual face determination unit 223, and may further include a face region integration unit 224 and a second feature amount extraction unit 225.
  • FIG. 4 is a block diagram showing a functional configuration example of the face detection unit 22.
  • FIGS. 5 and 6 are schematic views for explaining examples of the processing operations performed by the face detection unit 22.
  • The face detection unit 22 uses, for example, Haar-like features as image features, and is configured to use a hierarchically structured normal face discriminator 222 constructed with the AdaBoost learning algorithm, together with the specific individual face determination unit 223 added to the normal face discriminator 222.
  • A Haar-like feature is also called a rectangular feature; the feature amount is, for example, the difference in average brightness between two rectangular regions. It exploits, for example, the fact that the eye region in an image has low brightness while the surrounding regions (below and beside the eyes) have high brightness.
  • Rectangular features that combine two, three, or four rectangles may be used.
  • Features that are effective (highly important) for face detection, and combinations of them, are selected using the learning algorithm and stored in the facial feature amount storage unit 142.
  • The facial feature amount storage unit 142 stores the normal facial feature amounts 142b used for processing by the normal face discriminator 222 and the facial feature amounts 142a of the specific individual used for processing by the specific individual face determination unit 223.
  • The normal face discriminator 222 includes a first discriminator 222a through an Nth discriminator 222n, and has a hierarchical structure (also referred to as a cascade structure) in which these are connected in series.
  • Each of these discriminators uses one or more feature amounts 221a that are effective for face detection and are extracted from a search area 210 of a predetermined size cut out of the image, and discriminates whether the search area 210 is a face or a non-face (other than a face).
  • In the early layers, feature amounts 221a that capture the face coarsely, such as whether eyes are present, are used; in the later layers, feature amounts 221a that capture the details of the face are used.
  • The first feature amount extraction unit 221 extracts from the search area 210 the one or more feature amounts 221a that each discriminator constituting the normal face discriminator 222 is set to discriminate.
  • The normal face discriminator 222 uses the feature amounts 221a extracted from the search area 210 by the first feature amount extraction unit 221 and the normal facial feature amounts 142b to discriminate hierarchically, in order from the first discriminator 222a to the Nth discriminator 222n, whether the search area 210 is a face or a non-face.
  • The specific individual face determination unit 223 includes a first determination unit 223a through an Nth determination unit 223n. When some layer of the normal face discriminator 222 determines the search area 210 to be a non-face, the determination unit corresponding to that layer uses the feature amounts 221a extracted from the search area 210 and the facial feature amounts 142a of the specific individual to determine whether the search area 210 is the face of the specific individual or a non-face.
  • For example, when the second discriminator 222b determines the search area 210 to be a non-face, the second determination unit 223b uses the feature amounts 221a extracted from the search area 210 (those used by the second discriminator 222b) and the facial feature amounts 142a of the specific individual corresponding to that layer to determine whether the search area 210 is the face of the specific individual or a non-face.
  • When the search area 210 is determined to be the face of the specific individual, the process proceeds to the discrimination by the third discriminator 222c; when it is determined to be a non-face, the discrimination is cut off at the second discriminator 222b and the process proceeds to face detection for the next search area.
  • A search area 210 that passes through all the layers is stored as a candidate for the face region.
  • In order to detect faces of various sizes, the face detection unit 22 generates reduced images 20a and 20b obtained by reducing the input image 20 at a plurality of magnifications, as shown in FIG. 5, for example.
  • A search area 210 of a predetermined size may be cut out of the reduced images 20a and 20b as well, and the normal face discriminator 222 may discriminate whether that search area 210 is a face or a non-face.
  • By scanning the search area 210 over the input image 20 and the reduced images 20a and 20b, faces of various sizes in the image 20 and their positions can be detected.
  • The shape of the search area 210 is not limited to a rectangle. An image-pyramid scan of this kind is sketched below.
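  • The sketch below shows one way such a multi-scale scan can be organized: a fixed-size window slides over successively reduced copies of the image, so that faces of different sizes map onto the same window size. The window size, step, scale factor, and nearest-neighbour downscaling are assumptions for the example.

```python
import numpy as np

def pyramid_scan(image, classify, win=24, step=4, scale=1.25):
    """Slide a win-by-win search window over an image pyramid.

    classify : fn(window) -> bool, e.g. a cascade discriminator
    Yields (x, y, size) boxes in original-image coordinates.
    """
    factor = 1.0
    img = np.asarray(image)
    while min(img.shape[0], img.shape[1]) >= win:
        h, w = img.shape[0], img.shape[1]
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                if classify(img[y:y + win, x:x + win]):
                    yield (int(x * factor), int(y * factor), int(win * factor))
        # reduce the image for the next pyramid level (nearest neighbour)
        new_h, new_w = int(h / scale), int(w / scale)
        if new_h < 1 or new_w < 1:
            break
        ys = (np.arange(new_h) * scale).astype(int)
        xs = (np.arange(new_w) * scale).astype(int)
        img = img[ys][:, xs]
        factor *= scale
```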
  • The face detection unit 22 may also be configured to detect faces turned (rotated) in various directions and faces tilted at various angles.
  • In that case, the first feature amount extraction unit 221 extracts from the search area 210 feature amounts 221a effective for discriminating the face orientation and the face inclination, and the normal face discriminator 222 may be configured to discriminate whether the search area 210 is a face or a non-face by using discriminators trained on the feature amounts for each face orientation and face inclination.
  • For example, in order to detect frontal faces, oblique faces, and profile faces, the normal face discriminator 222 may be provided with discriminators trained on the feature amounts for the front (0 degrees), the left oblique (45 degrees), and the left side (90 degrees). In this case, learning may be performed so that one discriminator can cover a predetermined angular range (for example, 22.5 degrees) or more. Further, the right oblique (45 degrees) may be discriminated by flipping the left oblique (45 degrees) horizontally, and the right side (90 degrees) by flipping the left side (90 degrees) horizontally. As shown in FIG. 6, the normal face discriminator 222 may also be provided with discriminators trained on the feature amounts for each predetermined inclination so that the tilt of the face can be detected.
  • The face region integration unit 224 performs processing for integrating the one or more face region candidates determined to be a face by the normal face discriminator 222.
  • The method of integrating the one or more face region candidates is not particularly limited; for example, they may be integrated based on the average of the region centers and the average of the region sizes of the one or more face region candidates, as sketched below.
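  • A minimal sketch of this averaging-based integration, assuming square candidate regions represented as (x, y, size) with (x, y) the top-left corner; the representation is an assumption for the example.

```python
import numpy as np

def integrate_candidates(boxes):
    """Merge overlapping face-region candidates for the same face into one
    region by averaging their centers and sizes.

    boxes : list of (x, y, size) candidates
    Returns one (x, y, size) region.
    """
    b = np.asarray(boxes, dtype=float)
    cx = float(np.mean(b[:, 0] + b[:, 2] / 2))   # mean center x
    cy = float(np.mean(b[:, 1] + b[:, 2] / 2))   # mean center y
    s = float(np.mean(b[:, 2]))                  # mean region size
    return (cx - s / 2, cy - s / 2, s)
```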
  • The second feature amount extraction unit 225 performs processing for extracting facial feature amounts from the face region integrated by the face region integration unit 224.
  • The specific individual determination unit 25 uses the feature amounts of the face region detected by the face detection unit 22 and the facial feature amounts 142a of the specific individual read from the facial feature amount storage unit 142 to determine whether the face in the face region is the face of the specific individual or the face of a normal person other than the specific individual.
  • The specific individual determination unit 25 may calculate an index showing the relationship between the feature amounts extracted from the face region and the facial feature amounts 142a of the specific individual, for example an index showing the correlation such as a correlation coefficient, and may determine whether the face in the face region is the face of the specific individual based on the calculated correlation coefficient.
  • When the correlation coefficient is larger than a predetermined threshold value, the face in the detected face region may be determined to be the face of the specific individual; when the correlation coefficient is equal to or less than the predetermined threshold value, the face in the detected face region may be determined not to be the face of the specific individual.
  • The specific individual determination unit 25 may determine whether the face in the detected face region is the face of the specific individual based on the determination result for one frame of the input image from the camera 11, or based on the determination results for a plurality of frames of the input image from the camera 11, as in the sketch below.
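  • One simple way to realize the multi-frame option is a majority vote over the most recent per-frame determinations; the class below is an illustrative assumption, not a mechanism specified in the patent.

```python
from collections import deque

class MultiFrameDecision:
    """Stabilize the specific-individual determination across frames:
    report 'specific individual' only when a majority of the last
    n_frames per-frame determinations agree."""

    def __init__(self, n_frames=10):
        self.history = deque(maxlen=n_frames)

    def update(self, frame_is_specific):
        """Record one per-frame determination and return the vote result."""
        self.history.append(bool(frame_is_specific))
        return sum(self.history) > len(self.history) / 2
```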
  • When the specific individual determination unit 25 determines that the face is the face of the specific individual, the first face image processing unit 26 performs face image processing for the specific individual.
  • The first face image processing unit 26 includes a face orientation estimation unit 27 for the specific individual, an eye opening/closing detection unit 28 for the specific individual, and a line-of-sight direction estimation unit 29 for the specific individual, and may further include a configuration for estimating or detecting other facial behaviors.
  • The first face image processing unit 26 may perform any of the face image processing for the specific individual using the facial feature amounts 142a of the specific individual.
  • Alternatively, the facial feature amount storage unit 142 may store trained feature amounts learned for performing the face image processing for the specific individual, and any of the face image processing for the specific individual may be performed using those trained feature amounts.
  • The face orientation estimation unit 27 for the specific individual performs processing for estimating the face orientation of the specific individual.
  • For example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the face detection unit 22, and estimates the orientation of the face based on the detected positions and shapes of the facial organs.
  • The method for detecting the facial organs from the face region in the image is not particularly limited, but a method capable of detecting the facial organs at high speed and with high accuracy is preferably adopted.
  • For example, a method can be adopted in which a three-dimensional face shape model is created, fitted to the face region in the two-dimensional image, and the position and shape of each facial organ are detected.
  • As a technique for fitting a three-dimensional face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but the technique is not limited thereto.
  • The face orientation estimation unit 27 for the specific individual may output, as estimation data of the face orientation of the specific individual, for example the pitch angle of vertical rotation (around the X axis), the yaw angle of horizontal rotation (around the Y axis), and the roll angle of full rotation (around the Z axis) included in the parameters of the three-dimensional face shape model.
  • The eye opening/closing detection unit 28 for the specific individual performs processing for detecting the open/closed state of the eyes of the specific individual.
  • For example, it detects the open/closed state of the eyes, such as whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained by the face orientation estimation unit 27 for the specific individual, in particular the positions and shapes of the feature points of the eyes (eyelids and pupils). The open/closed state of the eyes may be detected by learning in advance, with a learning device, feature amounts of eye images in various open/closed states (such as the position of the eyelid, the shape of the pupil, and the area sizes of the white and dark parts of the eye), and evaluating the degree of similarity with the learned feature amount data, as in the sketch below.
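  • As a hedged illustration of such template-similarity scoring, the sketch below classifies an eye feature vector by its highest cosine similarity to learned open-eye and closed-eye templates; the similarity measure and the function names are assumptions for the example.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def eye_state(eye_feat, open_templates, closed_templates):
    """Return 'open' or 'closed' by the nearest learned template."""
    open_score = max(cosine_similarity(eye_feat, t) for t in open_templates)
    closed_score = max(cosine_similarity(eye_feat, t) for t in closed_templates)
    return "open" if open_score >= closed_score else "closed"
```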
  • The line-of-sight direction estimation unit 29 for the specific individual performs processing for estimating the line-of-sight direction of the specific individual.
  • For example, it estimates the direction of the line of sight based on the orientation of the face of the driver 3 and the positions and shapes of the facial organs of the driver 3, in particular the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, and pupils).
  • The direction of the line of sight is the direction in which the driver 3 is looking, and is determined by, for example, the combination of the orientation of the face and the orientation of the eyes.
  • The direction of the line of sight may be detected by learning in advance, with a learning device, feature amounts of eye images under various combinations of face orientation and eye orientation (such as the relative positions of the outer eye corner, inner eye corner, and pupil, and the relative positions and shading (texture) of the white and dark parts of the eye), and evaluating the degree of similarity with the learned feature amount data.
  • Alternatively, the line-of-sight direction estimation unit 29 for the specific individual may use the fitting result of the three-dimensional face shape model to estimate the size and center position of the eyeball from the size and orientation of the face, the positions of the eyes, and so on, detect the position of the pupil, and detect the vector connecting the center of the eyeball and the center of the pupil as the line-of-sight direction, as in the sketch below.
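  • A minimal sketch of that last, geometric variant: the gaze is the unit vector from the estimated 3-D eyeball center through the pupil center. The coordinate convention (x right, y down, z from the camera toward the subject) and the yaw/pitch decomposition are assumptions for the example.

```python
import numpy as np

def gaze_direction(eyeball_center, pupil_center):
    """Unit gaze vector from the eyeball center through the pupil center,
    plus yaw (left/right) and pitch (up/down) angles in degrees. In the
    assumed convention, a gaze toward the camera has a negative z component."""
    v = np.asarray(pupil_center, dtype=float) - np.asarray(eyeball_center,
                                                           dtype=float)
    v = v / np.linalg.norm(v)
    yaw = float(np.degrees(np.arctan2(v[0], -v[2])))     # around the Y axis
    pitch = float(np.degrees(np.arctan2(-v[1], -v[2])))  # around the X axis
    return v, yaw, pitch
```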
  • When the specific individual determination unit 25 determines that the face is not the face of the specific individual, the second face image processing unit 30 performs normal face image processing.
  • The second face image processing unit 30 includes a normal face orientation estimation unit 31, a normal eye opening/closing detection unit 32, and a normal line-of-sight direction estimation unit 33, and may further include a configuration for estimating or detecting other facial behaviors.
  • The second face image processing unit 30 may perform any of the normal face image processing using the normal facial feature amounts 142b.
  • Alternatively, the facial feature amount storage unit 142 may store trained feature amounts learned for performing the normal face image processing, and any of the normal face image processing may be performed using those trained feature amounts.
  • The processing performed by the normal face orientation estimation unit 31, the normal eye opening/closing detection unit 32, and the normal line-of-sight direction estimation unit 33 is basically the same as that of the face orientation estimation unit 27, the eye opening/closing detection unit 28, and the line-of-sight direction estimation unit 29 for the specific individual, so its description is omitted here.
  • The output unit 34 performs processing for outputting information based on the image processing by the image processing unit 12 to the ECU 40 and the like.
  • The information based on the image processing may be, for example, information on facial behaviors such as the orientation of the face of the driver 3, the direction of the line of sight, or the open/closed state of the eyes, or information on the state of the driver 3 determined based on the detection results for the facial behaviors (for example, forward gaze, inattention, dozing, facing backward, or slumping forward). The information based on the image processing may also be a predetermined control signal based on the state determination of the driver 3 (such as a control signal for performing caution or warning processing, or a control signal for performing operation control of the vehicle 2).
  • FIG. 7 is a flowchart showing an example of a processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment.
  • The camera 11 captures images at several tens of frames per second, and this processing is performed for each frame or at regular frame intervals.
  • First, in step S1, the CPU 13 operates as the image input unit 21, performs processing for reading the image captured by the camera 11 (an image including the face of the driver 3), and proceeds to step S2.
  • In step S2, the CPU 13 operates as the face detection unit 22, performs face detection processing for detecting the face region while scanning the search area over the input image, and proceeds to step S3.
  • A specific example of the face detection processing in step S2 will be described later.
  • In step S3, the CPU 13 operates as the specific individual determination unit 25, performs processing for determining whether the face in the detected face region is the face of a specific individual using the feature amounts of the face region detected in step S2 and the facial feature amounts 142a of the specific individual read from the facial feature amount storage unit 142, and proceeds to step S4.
  • In step S4, the CPU 13 determines whether the result of the determination processing in step S3 indicates the face of a specific individual; if so, the processing proceeds to step S5.
  • In step S5, the CPU 13 operates as the face orientation estimation unit 27 for the specific individual; for example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected in step S2, estimates the orientation of the face based on the detected positions and shapes of the facial organs, and proceeds to step S6.
  • In step S6, the CPU 13 operates as the eye opening/closing detection unit 28 for the specific individual; for example, it detects the open/closed state of the eyes, such as whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained in step S5, in particular the positions and shapes of the feature points of the eyes (eyelids and pupils), and proceeds to step S7.
  • In step S7, the CPU 13 operates as the line-of-sight direction estimation unit 29 for the specific individual; for example, it estimates the direction of the line of sight based on the orientation of the face and the positions and shapes of the facial organs obtained in step S5, in particular the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, and pupils), and then ends the processing.
  • If the CPU 13 determines in step S4 that the face is not the face of a specific individual, in other words that it is a normal face, the processing proceeds to step S8.
  • In step S8, the CPU 13 operates as the normal face orientation estimation unit 31; for example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected in step S2, estimates the orientation of the face based on the detected positions and shapes of the facial organs, and proceeds to step S9.
  • In step S9, the CPU 13 operates as the normal eye opening/closing detection unit 32; for example, it detects the open/closed state of the eyes, such as whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained in step S8, in particular the positions and shapes of the feature points of the eyes (eyelids and pupils), and proceeds to step S10.
  • In step S10, the CPU 13 operates as the normal line-of-sight direction estimation unit 33; for example, it estimates the direction of the line of sight based on the orientation of the face and the positions and shapes of the facial organs obtained in step S8, in particular the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, and pupils), and then ends the processing.
  • FIG. 8 is a flowchart showing an example of the processing operations performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. It is an example of the face detection processing of step S2 and the specific individual determination processing of step S3 shown in FIG. 7, for one input image (one frame).
• In step S21, the CPU 13 starts the loop process L1 for detecting the size of the face; in the next step S22, it starts the loop process L2 for detecting the rotation angle (direction and inclination) of the face; then it starts the loop process L3 for detecting the position of the face, and the process proceeds to step S24.
• The loop process L1 is repeated according to the number of reduced images (for example, 20a and 20b shown in FIG. 5) generated to detect faces of various sizes.
• The loop process L2 is repeated according to the settings of the discriminator for discriminating the rotation angle of the face (for example, the front face, oblique face, profile, and inclination shown in FIG. 6).
• The loop process L3 is repeated for the number of positions at which the search area 210 is scanned to detect the position of the face. A schematic sketch of this nested scan is given below.
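The following minimal Python sketch enumerates the search areas produced by loops L1 to L3. It is an illustration only, not the patented implementation: the function name, window size, scan step, and scale list are assumptions, and OpenCV's resize stands in for the reduced-image generation.

    import cv2

    def iter_search_areas(image, window=24, step=4, scales=(1.0, 0.8, 0.64)):
        """Yield (patch, rotation_setting, scale) for every scanned search area."""
        rotation_settings = ("front", "oblique", "profile", "tilted")  # cf. FIG. 6
        for scale in scales:                                  # loop L1: face size
            reduced = cv2.resize(image, None, fx=scale, fy=scale)
            h, w = reduced.shape[:2]
            for rotation in rotation_settings:                # loop L2: rotation angle
                for y in range(0, h - window + 1, step):      # loop L3: window position
                    for x in range(0, w - window + 1, step):
                        yield reduced[y:y + window, x:x + window], rotation, scale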
• In step S24, the CPU 13 operates as the first feature amount extraction unit 221 and performs a process of extracting the facial feature amount 221a from the search area 210 under each condition of the loop processes L1, L2, and L3.
• In step S25, the CPU 13 operates as the normal face discriminator 222 and the specific individual face determination unit 223: it hierarchically discriminates whether the search area 210 is a face or a non-face (something other than a face) and, when the search area 210 is determined to be a non-face at any layer, performs a process of determining whether the search area 210 is the face of a specific individual or a non-face.
• The CPU 13 scans the search area 210 over all the reduced images; when face detection (in other words, detection of face area candidates) in all the search areas 210 is completed, the loop process L1 is ended in step S26, the loop process L2 is ended in step S27, and the loop process L3 is ended in step S28, after which the process proceeds to step S29.
• In step S29, the CPU 13 operates as the face area integration unit 224, performs a process of integrating the one or more face area candidates determined to be faces in the processes of steps S21 to S28, and proceeds to step S30.
• The method of integrating the one or more face area candidates is not particularly limited; for example, they may be integrated based on the average of the candidates' region centers and the average of their region sizes, as in the sketch below.
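A minimal sketch of this averaging rule follows. The (cx, cy, size) tuple format and the function name are illustrative assumptions, not details from the specification.

    def integrate_candidates(candidates):
        """Merge (cx, cy, size) face-region candidates by averaging centers and sizes."""
        n = len(candidates)
        cx = sum(c[0] for c in candidates) / n    # average of region centers (x)
        cy = sum(c[1] for c in candidates) / n    # average of region centers (y)
        size = sum(c[2] for c in candidates) / n  # average of region sizes
        return cx, cy, size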
• In step S30, the CPU 13 operates as the second feature amount extraction unit 225, performs a process of extracting the facial feature amount from the face region integrated in step S29, and proceeds to step S31.
• In step S31, the CPU 13 operates as the specific individual determination unit 25, performs a process of calculating the correlation coefficient between the feature amount extracted from the integrated face area in step S30 and the face feature amount 142a of the specific individual read from the face feature amount storage unit 142, and proceeds to step S32.
• In step S32, the CPU 13 determines whether or not the calculated correlation coefficient is larger than a predetermined threshold value for determining whether or not the face is that of a specific individual. If the correlation coefficient is larger than the predetermined threshold value, in other words, if the feature amount extracted from the face area is determined to have a high correlation (high similarity) with the face feature amount 142a of the specific individual, the process proceeds to step S33, in which the CPU 13 determines that the face detected in the face area is the face of a specific individual, and then ends the process.
• In step S32, if the correlation coefficient is equal to or less than the predetermined threshold value, in other words, if the correlation (similarity) between the feature amount extracted from the face region and the face feature amount 142a of the specific individual is low, the process proceeds to step S34.
• In step S34, the CPU 13 determines that the face is not that of a specific individual, in other words, that it is a normal face, and then ends the process. A sketch of this correlation-based decision follows.
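The following minimal sketch illustrates the decision of steps S31 to S34, assuming the feature amounts are one-dimensional numeric vectors; the threshold value 0.7 and the function name are illustrative, as the specification does not fix them.

    import numpy as np

    def is_specific_individual(region_features, stored_features, threshold=0.7):
        """Steps S31-S33: correlate the integrated region's features with the
        stored feature amount 142a and compare against the threshold."""
        r = np.corrcoef(region_features, stored_features)[0, 1]  # step S31
        return r > threshold  # step S32: high correlation -> specific individual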
• FIG. 9 is a flowchart showing an example of the discrimination processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment.
• This processing operation is an example of the discrimination processing of step S25 shown in FIG. 8.
• First, the CPU 13 reads the feature amount 221a extracted from the search area 210 in step S24 of FIG. 8, and then proceeds to step S42.
• In step S42, the CPU 13 sets the discriminator counter (i), which counts the layers of the normal face discriminator 222, to 0, sets n, which indicates the current layer in the hierarchical structure of the normal face discriminator 222, to an initial value of 1, and proceeds to step S43.
• In step S43, the CPU 13 operates as the normal face discriminator 222 and performs a process of discriminating, with the nth discriminator, whether the search area 210 is a face or a non-face.
• In step S44, the CPU 13 determines whether the search area 210 was discriminated as a face or a non-face; if it was discriminated as a face, the process proceeds to step S45.
• In step S45, the CPU 13 adds 1 to the discriminator counter (i) and proceeds to step S46.
• In step S46, the CPU 13 determines whether or not the discriminator counter (i) is less than N, the number of discriminators; if it is less than N, the process proceeds to step S47.
• In step S47, 1 is added to n in order to proceed to the discriminator of the next layer, and then the process returns to step S43 and is repeated.
• In step S46, if the CPU 13 determines that the discriminator counter (i) is not less than N, in other words, that the discriminator counter (i) has reached the number of discriminators N, the process proceeds to step S48.
• In step S48, the CPU 13 determines that the search area 210 is a face area candidate, stores the information of the search area as a face area candidate, ends the face detection process for that search area, and repeats the face detection process for the next search area.
• In step S44, if the search area 210 was discriminated as a non-face, the process proceeds to step S49. In step S49, the CPU 13 operates as the specific individual face determination unit 223, performs a process of calculating the correlation coefficient between the feature amount used in the nth discriminator (the feature amount extracted from the search area 210) and the face feature amount 142a of the specific individual corresponding to the layer of the nth discriminator, and proceeds to step S50.
• In step S50, the CPU 13 determines whether or not the calculated correlation coefficient is larger than a predetermined threshold value for determining whether or not the face is that of a specific individual. If the correlation coefficient is determined to be larger than the predetermined threshold value (that is, the correlation is high; in other words, the search area 210 is determined to be the face of a specific individual), the process proceeds to step S45 and the discrimination by the normal face discriminator 222 continues. On the other hand, if in step S50 the CPU 13 determines that the correlation coefficient is equal to or less than the predetermined threshold value (that is, the correlation is low; in other words, the search area 210 is determined to be a non-face), the process proceeds to step S51.
• In step S51, the CPU 13 determines that the search area is a non-face (something other than a face); in the next step S52, the discrimination at and after the nth discriminator by the normal face discriminator 222 is terminated, the face detection process for that search area is ended, and the face detection process for the next search area is repeated. A sketch of this per-layer flow is given below.
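The per-layer flow of FIG. 9 can be sketched as follows. This is a schematic reading of steps S43 to S52, not the patented implementation: layer_classifiers, features_per_layer, specific_features_per_layer, and the threshold are hypothetical stand-ins for the trained discriminator layers, the per-layer feature amounts 221a, and the stored feature amounts 142a.

    import numpy as np

    def discriminate_search_area(features_per_layer, layer_classifiers,
                                 specific_features_per_layer, threshold=0.7):
        """Return 'face_candidate' or 'non_face' for one search area 210."""
        for n, classify in enumerate(layer_classifiers):        # steps S43-S47
            if classify(features_per_layer[n]):                 # layer n says face
                continue                                        # advance (S45-S47)
            # Layer n says non-face: fall back to the specific-individual
            # check of steps S49-S50 using the same feature amount.
            r = np.corrcoef(features_per_layer[n],
                            specific_features_per_layer[n])[0, 1]
            if r > threshold:                                   # S50: resume cascade
                continue
            return "non_face"                                   # S51-S52: abort
        return "face_candidate"                                 # step S48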
• As described above, in the driver monitoring device 10, the face feature amount 142a of a specific individual and the normal face feature amount 142b are stored in the face feature amount storage unit 142 as learned face feature amounts.
• The face region is detected by the normal face discriminator 222 hierarchically discriminating, using the normal face feature amount 142b, whether the search region 210 cut out from the image is a face or a non-face.
• Even when a search area is discriminated as a non-face at some layer, the specific individual face determination unit 223 uses the face feature amount 142a of the specific individual to determine whether the search area 210 is the face of the specific individual or a non-face, so that a face region including the face of a specific individual is detected. As a result, the face region can be accurately detected from the image regardless of whether it contains a normal face or the face of a specific individual. Further, since the normal face discriminator 222 and the specific individual face determination unit 223 use a common feature amount extracted from the search area 210, no separate feature extraction process is needed, and the real-time performance of face area detection can be maintained.
• Therefore, whether the driver 3 is a specific individual or a normal person other than a specific individual, each face can be detected accurately in real time (in other words, by high-speed processing). Further, based on the correlation between the feature amount used in the one layer of the normal face discriminator 222 that produced a non-face verdict and the face feature amount 142a of the specific individual corresponding to that layer, it can be efficiently determined whether the search area is the face of a specific individual or a non-face. Therefore, even when the normal face discriminator 222 discriminates a non-face, the face of a specific individual can be accurately determined.
• When the specific individual face determination unit 223 determines that the search area is the face of a specific individual, the discrimination proceeds to the next layer of the normal face discriminator 222, and the discrimination process is promptly continued.
• On the other hand, when the specific individual face determination unit 223 determines that the search area is a non-face, the discrimination by the normal face discriminator 222 is terminated. Therefore, the process of determining the face of a specific individual can be performed while maintaining the efficiency of the normal face discriminator 222.
• In the driver monitoring device 10, one or more face region candidates discriminated as faces by the normal face discriminator 222 are integrated by the face region integration unit 224, and the specific individual determination unit 25 determines whether or not the face in the integrated face region is the face of the specific individual by using the feature amount extracted from the integrated face region and the face feature amount 142a of the specific individual. Therefore, based on the face region integrated by the face region integration unit 224, it can be accurately determined whether the face is that of a specific individual or that of a normal person.
• The in-vehicle system 1 includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring result output from the driver monitoring device 10. Therefore, based on the monitoring result, the ECU 40 can appropriately execute predetermined control. This makes it possible to construct a highly safe in-vehicle system in which even a specific individual can drive with peace of mind.
• Embodiments of the present invention may also be described as, but are not limited to, the following appendices.
• Appendix 1: An image processing device (12) that processes an image input from an image pickup unit (11), including: a face feature amount storage unit (142) that stores, as learned face feature amounts trained for detecting a face from the image, a face feature amount (142a) of a specific individual and a normal face feature amount (142b); and a face detection unit (22) that detects a face area while scanning a search area over the image, the face detection unit (22) including: a first feature amount extraction unit (221) that extracts a facial feature amount from the search area; a hierarchically structured normal face discriminator (222) that discriminates whether the search area is a face or a non-face using the feature amount extracted from the search area and the normal face feature amount (142b); and a specific individual face determination unit (223) that, when any layer of the normal face discriminator (222) determines the search area to be a non-face, determines whether the search area is the face of the specific individual or a non-face using the feature amount extracted from the search area and the face feature amount (142a) of the specific individual.
• Appendix 2: An image processing method for processing an image input from an imaging unit, including: a face detection step (S2) of detecting a face area while scanning a search area over the image, the face detection step (S2) including: a feature amount extraction step (S24) of extracting a facial feature amount from the search area; a normal face discrimination step (S43, S44) of hierarchically discriminating whether the search area is a face or a non-face using the feature amount extracted in the feature amount extraction step (S24) and a learned normal face feature amount (142b) trained for detecting a face; and a specific individual face determination step (S49, S50) of, when the non-face is determined in any layer of the normal face discrimination step (S43, S44), determining whether the search area is the face of the specific individual or a non-face using the extracted feature amount and a learned face feature amount (142a) of the specific individual trained for detecting the face of the specific individual.

Abstract

An image processing device that processes an image input from an imaging unit, said image processing device being equipped with a face detection unit for detecting a face region while scanning a search region of the image. The face detection unit is equipped with: a hierarchically structured ordinary face classifier that uses a feature amount extracted from the search region by a first feature amount extraction unit and an ordinary face feature amount stored in a face feature amount storage unit to determine whether the search region is the face of an ordinary person or is a non-face; and a specific individual face determination unit that, when the ordinary face classifier determines at any level that the search region is a non-face, uses the feature amount extracted from the search region and a face feature amount of the specific individual stored in the face feature amount storage unit to determine whether the search region is the face of the specific individual, or is a non-face.

Description

Image processing device, monitoring device, control system, image processing method, and program
The present invention relates to an image processing device, a monitoring device, a control system, an image processing method, and a program.
Patent Document 1 below discloses a robot device used as a service providing device that can switch to an appropriate service according to the situation of the target (person) to whom the service is provided.
The robot device is equipped with a first camera, a second camera, and an information processing device including a CPU, and the CPU is provided with a face detection unit, an attribute determination unit, a person detection unit, a person position calculation unit, a movement vector detection unit, and the like.
According to the robot device, when the service is provided to a group of people for whom a relationship such as mutual communication has been established, it decides to perform a first service of providing information based on close interaction. On the other hand, when the service is provided to a group of people for whom it is unknown whether such a relationship has been established, it decides to perform a second service of providing information unilaterally, without interaction. This makes it possible to provide an appropriate service according to the situation of the service recipients.
Japanese Unexamined Patent Publication No. 2014-14899
In the robot device, the face detection unit is configured to detect a person's face using the first camera, and a known technique can be used for the face detection.
However, conventional face detection techniques have a problem in that detection accuracy drops for specific individuals, in other words, specific persons whose features differ from the general facial features that people share regardless of differences in age, gender, race, and the like: for example, when part of the facial organs such as the eyes, nose, or mouth is missing or greatly deformed due to injury, when the face bears a large mole, a wart, or body decoration such as a tattoo, or when the arrangement of the facial organs deviates from the average position due to a disease such as a hereditary disorder.
The present invention has been made in view of the above problem, and an object thereof is to provide an image processing device, a monitoring device, a control system, an image processing method, and a program capable of accurately detecting, in real time, even the face of a specific individual as described above.
To achieve the above object, an image processing device (1) according to the present disclosure is an image processing device that processes an image input from an imaging unit, including:
a face feature amount storage unit that stores, as learned face feature amounts trained for detecting a face from the image, a face feature amount of a specific individual and a normal face feature amount; and
a face detection unit that detects a face area while scanning a search area over the image,
the face detection unit including:
a first feature amount extraction unit that extracts a facial feature amount from the search area;
a hierarchically structured normal face discriminator that discriminates whether the search area is a face or a non-face using the feature amount extracted from the search area and the normal face feature amount; and
a specific individual face determination unit that, when any layer of the normal face discriminator determines the search area to be a non-face, determines whether the search area is the face of the specific individual or a non-face using the feature amount extracted from the search area and the face feature amount of the specific individual.
According to the image processing device (1), the normal face discriminator hierarchically discriminates whether the search area is a face or a non-face using the normal face feature amount, whereby the face area is detected. Further, even when the search area is determined to be a non-face at some layer of the normal face discriminator, the specific individual face determination unit determines, using the face feature amount of the specific individual, whether the search area is the face of the specific individual or a non-face, whereby a face area including the face of the specific individual is detected. As a result, the face area can be accurately detected whether it contains a normal face or the face of the specific individual. Furthermore, since the normal face discriminator and the specific individual face determination unit use a common feature amount extracted from the search area, the real-time performance of face area detection can be maintained.
An image processing device (2) according to the present disclosure is the image processing device (1) in which the specific individual face determination unit calculates an index indicating the correlation between the feature amount used in the one layer of the normal face discriminator that determined the non-face and the face feature amount of the specific individual corresponding to that layer, and determines, based on the calculated index, whether the search area is the face of the specific individual or a non-face.
According to the image processing device (2), based on the index indicating the correlation between the feature amount used in the one layer of the normal face discriminator that determined the non-face and the face feature amount of the specific individual corresponding to that layer, it can be efficiently determined whether the search area is the face of the specific individual or a non-face, and even when the normal face discriminator determines a non-face, the case where the search area is the face of the specific individual can be accurately determined. The index may be an index value that indicates a stronger relationship the larger its value is, for example, a correlation coefficient or the reciprocal of the squared error, or any other index value indicating the degree of similarity between the feature amount used in the one layer of the normal face discriminator that determined the non-face and the face feature amount of the specific individual corresponding to that layer.
An image processing device (3) according to the present disclosure is the image processing device (2) in which the specific individual face determination unit determines that the search area is the face of the specific individual when the index is larger than a predetermined threshold value, and determines that the search area is a non-face when the index is equal to or less than the predetermined threshold value.
According to the image processing device (3), when the index is larger than the predetermined threshold value, the search area is determined to be the face of the specific individual, and when the index is equal to or less than the predetermined threshold value, the search area is determined to be a non-face. Comparing the index with the predetermined threshold value improves the processing efficiency of the determination.
An image processing device (4) according to the present disclosure is any of the image processing devices (1) to (3), further including: a discrimination advancing unit that, when the specific individual face determination unit determines the face of the specific individual, advances the discrimination to the next layer of the normal face discriminator; and
a discrimination aborting unit that, when the specific individual face determination unit determines the non-face, aborts the discrimination by the normal face discriminator.
According to the image processing device (4), when the specific individual face determination unit determines the face of the specific individual, the discrimination advances to the next layer of the normal face discriminator, and the discrimination process is promptly continued. On the other hand, when the specific individual face determination unit determines the non-face, the discrimination by the normal face discriminator is aborted. Therefore, the process of determining the face of the specific individual can be performed while maintaining the efficiency of the normal face discriminator.
An image processing device (5) according to the present disclosure is any of the image processing devices (1) to (4) in which the face detection unit includes a face region integration unit that integrates one or more face region candidates discriminated as faces by the normal face discriminator, and a second feature amount extraction unit that extracts a facial feature amount from the integrated face region, the device further including a specific individual determination unit that determines whether or not the face in the face region is the face of the specific individual using the feature amount extracted from the integrated face region and the face feature amount of the specific individual.
According to the image processing device (5), the one or more face region candidates discriminated as faces are integrated, and whether or not the face in the face region is the face of the specific individual is determined using the feature amount extracted from the integrated face region and the face feature amount of the specific individual. Therefore, it can be accurately determined whether the face region integrated by the face region integration unit is the face of the specific individual or the face of a normal person.
A monitoring device (1) according to the present disclosure includes any of the image processing devices (1) to (5), an imaging unit that captures an image to be input to the image processing device, and an output unit that outputs information based on the image processing by the image processing device.
According to the monitoring device (1), not only the face of a normal person but also the face of the specific individual can be accurately detected and monitored, and since the information based on the image processing can be output from the output unit, a monitoring system or the like that uses the information can be easily constructed.
A control system (1) according to the present disclosure includes the monitoring device (1) and one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.
According to the control system (1), the one or more control devices can execute predetermined processing based on the information output from the monitoring device. Therefore, a system can be constructed that can use not only the monitoring results for normal persons but also the monitoring results for the specific individual.
A control system (2) according to the present disclosure is the control system (1) in which the monitoring device (1) is a device for monitoring the driver of a vehicle and the control device includes an electronic control unit mounted on the vehicle.
According to the control system (2), even when the driver of the vehicle is the specific individual, the face of the specific individual can be accurately monitored, and the electronic control unit can be made to appropriately execute predetermined control based on the monitoring result. This makes it possible to construct a highly safe in-vehicle system in which even the specific individual can drive with peace of mind.
An image processing method according to the present disclosure is an image processing method for processing an image input from an imaging unit, including:
a face detection step of detecting a face area while scanning a search area over the image,
the face detection step including:
a feature amount extraction step of extracting a facial feature amount from the search area;
a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face using the feature amount extracted in the feature amount extraction step and a learned normal face feature amount trained for detecting a face; and
a specific individual face determination step of, when the non-face is determined in any layer of the normal face discrimination step, determining whether the search area is the face of the specific individual or a non-face using the extracted feature amount and a learned face feature amount of the specific individual trained for detecting the face of the specific individual.
According to the image processing method, the face area is detected by the normal face discrimination step hierarchically discriminating, using the normal face feature amount, whether the search area is a face or a non-face. Further, even when the non-face is determined in some layer of the normal face discrimination step, the specific individual face determination step determines, using the face feature amount of the specific individual, whether the search area is the face of the specific individual or a non-face, whereby a face area including the face of the specific individual is detected. As a result, the face area can be accurately detected whether it contains a normal face or the face of the specific individual. Furthermore, since the normal face discrimination step and the specific individual face determination step use a common feature amount extracted from the search area, the real-time performance of face area detection can be maintained. Therefore, the face of the specific individual can be detected accurately in real time.
A program according to the present disclosure is a program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute:
a face detection step of detecting a face area while scanning a search area over the image,
the face detection step including:
a feature amount extraction step of extracting a facial feature amount from the search area;
a normal face discrimination step of hierarchically discriminating whether the search area is a face or a non-face using the feature amount extracted in the feature amount extraction step and a learned normal face feature amount trained for detecting a face; and
a specific individual face determination step of, when the non-face is determined in any layer of the normal face discrimination step, determining whether the search area is the face of the specific individual or a non-face using the extracted feature amount and a learned face feature amount of the specific individual trained for detecting the face of the specific individual.
According to the program, the at least one computer can be made to detect the face area by hierarchically discriminating, in the normal face discrimination step and using the normal face feature amount, whether the search area is a face or a non-face. Further, even when the non-face is determined in some layer of the normal face discrimination step, the computer can be made to detect a face area including the face of the specific individual by determining, in the specific individual face determination step and using the face feature amount of the specific individual, whether the search area is the face of the specific individual or a non-face. As a result, the face area can be accurately detected whether it contains a normal face or the face of the specific individual. Furthermore, since the normal face discrimination step and the specific individual face determination step use a common feature amount extracted from the search area, the real-time performance of face area detection can be maintained. Therefore, it is possible to construct a device or system that can accurately sense the face of the specific individual as well as that of a normal person other than the specific individual. The program may be a program stored in a storage medium, a program transferable via a communication network, or a program executed via a communication network.
FIG. 1 is a schematic diagram showing an example of an in-vehicle system including a driver monitoring device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system including the driver monitoring device according to the embodiment.
FIG. 3 is a block diagram showing an example of the functional configuration of the image processing unit of the driver monitoring device according to the embodiment.
FIG. 4 is a block diagram showing an example of the functional configuration of the face detection unit included in the image processing unit.
FIG. 5 is a schematic diagram for explaining an example of processing performed by the face detection unit.
FIG. 6 is a schematic diagram for explaining an example of processing performed by the face detection unit.
FIG. 7 is a flowchart showing an example of a processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 8 is a flowchart showing an example of the face detection processing operation and the specific individual determination processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
FIG. 9 is a flowchart showing an example of the discrimination processing operation performed by the image processing unit of the driver monitoring device according to the embodiment.
Hereinafter, embodiments of an image processing device, a monitoring device, a control system, an image processing method, and a program according to the present invention will be described with reference to the drawings.
The image processing device according to the present invention is widely applicable to, for example, devices and systems that monitor an object such as a person using a camera. In addition to devices and systems that monitor the drivers (operators) of various moving bodies such as vehicles, it is also applicable to devices and systems that monitor people who operate or supervise various pieces of equipment such as machines and devices in a factory, or who perform predetermined work.
[Application example]
FIG. 1 is a schematic diagram showing an example of an in-vehicle system including the driver monitoring device according to the embodiment. In this application example, an example in which the image processing device according to the present invention is applied to the driver monitoring device 10 will be described.
The in-vehicle system 1 includes a driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2 (for example, facial behavior), one or more ECUs (Electronic Control Units) 40 that control the running, steering, or braking of the vehicle 2, and one or more sensors 41 that detect the state of each part of the vehicle, the state around the vehicle, and the like, and these are connected via a communication bus 43. The in-vehicle system 1 is configured, for example, as an in-vehicle network system that communicates according to the CAN (Controller Area Network) protocol. A communication standard other than CAN may also be adopted for the in-vehicle system 1. The driver monitoring device 10 is an example of the "monitoring device" of the present invention, and the in-vehicle system 1 is an example of the "control system" of the present invention.
The driver monitoring device 10 includes a camera 11 for capturing an image of the face of the driver 3, an image processing unit 12 that processes the image input from the camera 11, and a communication unit 16 that performs processing such as outputting information based on the image processing by the image processing unit 12 to a predetermined ECU 40 via the communication bus 43. The image processing unit 12 is an example of the "image processing device" of the present invention. The camera 11 is an example of the "imaging unit" of the present invention.
The driver monitoring device 10 detects the face of the driver 3 from the image captured by the camera 11, and detects facial behavior such as the orientation of the detected face of the driver 3, the direction of the line of sight, or the open/closed state of the eyes. Based on the detection results of these facial behaviors, the driver monitoring device 10 may determine the state of the driver 3, for example, forward gaze, inattentiveness, dozing, facing backward, or slumping forward. Further, the driver monitoring device 10 may output a signal based on the determination of the state of the driver 3 to the ECU 40, and the ECU 40 may be configured to execute, based on the signal, attention or warning processing for the driver 3, or operation control of the vehicle 2 (for example, deceleration control or guidance control to the road shoulder).
One purpose of the driver monitoring device 10 is to perform face sensing for the driver 3, in particular to detect the face of the subject accurately and in real time whether the driver 3 is a specific individual or a normal person other than a specific individual.
In conventional driver monitoring devices, there has been a problem that the accuracy of detecting the face from the image captured by the camera drops when, for example, part of the facial organs of the driver 3 of the vehicle 2, such as the eyes, nose, or mouth, is missing or greatly deformed due to injury, when the face bears a large mole, a wart, or body decoration such as a tattoo, or when the arrangement of the facial organs deviates from the average position due to a disease such as a hereditary disorder.
Further, if the face detection accuracy drops, post-detection processing such as face orientation estimation is also not performed properly, so there is a risk that state determinations for the driver 3, such as inattentiveness or dozing, cannot be made properly, and that the various controls the ECU 40 should execute based on those state determinations also cannot be performed properly.
To solve this problem, the driver monitoring device 10 according to the embodiment adopts the following configuration so that face detection can be performed accurately and in real time even for a specific individual, in other words, a specific person having features different from the facial features common to general people (also called normal people) regardless of differences (individual differences) such as age, gender, and race.
The image processing unit 12 stores, as learned face feature amounts trained for detecting a face from an image, a face feature amount of a specific individual and a normal face feature amount (in other words, the face feature amount used for detecting the face of a normal person).
The image processing unit 12 performs face detection processing that detects a face area while scanning a search area of a predetermined size over the input image from the camera 11 and extracting, from the search area, a feature amount for detecting a face. The image processing unit 12 then detects the face area from the input image by normal face discrimination processing that hierarchically discriminates, using the feature amount extracted from the search area and the normal face feature amount, whether the search area is a face or a non-face.
Further, when the search area is determined to be a non-face at any layer of the normal face discrimination processing, the image processing unit 12 performs specific individual face determination processing that determines, using the feature amount extracted from the search area and the face feature amount of the specific individual, whether the search area is the face of the specific individual or a non-face.
In the specific individual face determination processing, an index indicating the relationship between the feature amount used in the one layer that determined the non-face in the normal face discrimination processing and the face feature amount of the specific individual corresponding to that layer, for example, a correlation coefficient, may be calculated, and whether the search area is the face of the specific individual or a non-face may be determined based on the calculated correlation coefficient.
For example, when the correlation coefficient is larger than a predetermined threshold value, the search area may be determined to be the face of the specific individual, and the discrimination may proceed to the next layer of the normal face discrimination processing. When the correlation coefficient is equal to or less than the predetermined threshold value, the search area may be determined to be a non-face, and the normal face discrimination processing for that search area may be aborted. An index other than the correlation coefficient may also be used in the specific individual face determination processing.
In this way, in the driver monitoring device 10, the normal face discrimination processing hierarchically discriminates, using the normal face feature amount, whether the search area is a face or a non-face, whereby the face area is detected. Further, even when the search area is determined to be a non-face at some layer of the normal face discrimination processing, the specific individual face determination processing determines, using the face feature amount of the specific individual, whether the search area is the face of the specific individual or a non-face, whereby a face area including the face of the specific individual is detected.
Further, when the search area is determined to be a non-face at some layer of the normal face discrimination processing and the specific individual face determination processing also determines it to be a non-face, the search area is treated as a non-face (something other than a face), the processing for it is aborted, and the processing proceeds to the next search area.
With these configurations, the face area can be accurately detected whether it contains a normal face or the face of the specific individual. Moreover, since the normal face discrimination processing and the specific individual face determination processing use a common feature amount extracted from the search area, the real-time performance of face area detection can be maintained.
Therefore, whether the driver 3 is a specific individual or a normal person other than a specific individual, each face can be detected accurately in real time (in other words, by high-speed processing).
[Hardware configuration example]
FIG. 2 is a block diagram showing an example of the hardware configuration of the in-vehicle system 1 including the driver monitoring device 10 according to the embodiment.
The in-vehicle system 1 includes the driver monitoring device 10 that monitors the state of the driver 3 of the vehicle 2, one or more ECUs 40, and one or more sensors 41, which are connected via the communication bus 43. One or more actuators 42 are also connected to the ECU 40.
The driver monitoring device 10 includes the camera 11, the image processing unit 12 that processes the image input from the camera 11, and the communication unit 16 for exchanging data and signals with the external ECU 40 and the like.
The camera 11 is a device that captures an image including the face of the driver 3 seated in the driver's seat, and may include, for example, a lens unit, an image sensor unit, a light irradiation unit, an interface unit, and a camera control unit that controls these units. The image sensor unit may include an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, a filter, a microlens, and the like. The image sensor may be an element that can form a captured image by receiving light in the visible region, or an element that can form a captured image by receiving light in the near-infrared region. The light irradiation unit includes a light-emitting element such as an LED (Light Emitting Diode), and may include a near-infrared LED or the like so that the face of the driver 3 can be imaged day or night. The camera 11 captures images at a predetermined frame rate (for example, several tens of frames per second), and the data of the captured images is input to the image processing unit 12. The camera 11 may be integrated or externally attached.
The image processing unit 12 is configured as an image processing device including one or more CPUs (Central Processing Units) 13, a ROM (Read Only Memory) 14, and a RAM (Random Access Memory) 15. The ROM 14 includes a program storage unit 141 and a face feature amount storage unit 142, and the RAM 15 includes an image memory 151 that stores the input image from the camera 11. The driver monitoring device 10 may be equipped with a further storage unit, and that storage unit may be used as the program storage unit 141, the face feature amount storage unit 142, and the image memory 151. The further storage unit may be a semiconductor memory or a storage medium readable by a disk drive or the like.
The CPU 13 is an example of a hardware processor; it reads, interprets, and executes the program stored in the program storage unit 141 of the ROM 14 and data such as the face feature amounts stored in the face feature amount storage unit 142, thereby processing the image input from the camera 11, for example, performing face image processing such as face detection processing. The CPU 13 also performs processing such as outputting the results obtained by the face image processing (for example, processed data, determination signals, or control signals) to the ECU 40 and the like via the communication unit 16.
The face feature amount storage unit 142 contains the face feature amount 142a of a specific individual and the normal face feature amount 142b as learned face feature amounts that have been learned (for example, machine learning) to detect a face from an image. And (see FIGS. 3 and 4) are stored.
As the learned facial features, various feature quantities effective for detecting a face from an image can be used. For example, a feature amount (Haar-like feature amount) focusing on the difference in brightness (difference in average brightness) of a local region of the face may be used. Alternatively, a feature amount (LBP (Local Binary Pattern) feature amount) focusing on a combination of brightness distributions in the local region of the face may be used, or the distribution of the brightness in the local region of the face in the gradient direction may be used. Features (HOG (Histogram of Oriented Gradients) features) focusing on the combination may be used.
 The facial feature amounts stored in the facial feature amount storage unit 142 are extracted as feature amounts effective for face detection using, for example, various machine learning methods. Machine learning is a process in which a computer finds patterns inherent in data (learning data). For example, AdaBoost may be used as an example of a statistical learning method. AdaBoost is a learning algorithm that can construct a strong discriminator by selecting a large number of discriminators with low discriminating ability (weak discriminators), choosing from among them weak discriminators with small error rates, adjusting parameters such as weights, and arranging them in a hierarchical structure. A discriminator may also be referred to as an identifier, a classifier, or a learner.
 The strong discriminator may be configured, for example, so that one weak discriminator judges one feature amount effective for face detection, with a large number of weak discriminators and their combinations selected by AdaBoost and built into a hierarchical structure. One weak discriminator may output, for example, 1 for a face and 0 for a non-face. As the learning method, Real AdaBoost, which can output face likelihood as a real number from 0 to 1 rather than as 0 or 1, may also be used. A neural network having an input layer, intermediate layers, and an output layer may also be used for these learning methods.
 By giving a learning device equipped with such a learning algorithm a large number of face images captured under various conditions and a large number of non-face images as learning data, repeating the learning, and adjusting and optimizing parameters such as weights, it is possible to construct a strong discriminator having a hierarchical structure capable of detecting faces with high accuracy. One or more feature amounts used by the weak discriminators in each layer constituting such a strong discriminator can then be used as the learned facial feature amounts.
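 As an informal sketch of this structure (not the claimed method itself), a strong discriminator built from weighted weak discriminators, each thresholding a single feature amount, might look as follows in Python; the feature functions, thresholds, polarities, and weights are hypothetical placeholders that AdaBoost would select during learning.

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class WeakDiscriminator:
        feature: Callable[[object], float]  # extracts one feature amount from a search area
        threshold: float                    # decision threshold chosen during learning
        polarity: int                       # +1 or -1, direction of the inequality
        alpha: float                        # weight derived from the weak discriminator's error rate

        def predict(self, window) -> int:
            # outputs 1 (face) or 0 (non-face) based on this single feature amount
            return 1 if self.polarity * self.feature(window) < self.polarity * self.threshold else 0

    def strong_discriminator(window, weak: List[WeakDiscriminator]) -> bool:
        # weighted vote of the selected weak discriminators; the
        # 0.5 * sum(alpha) cutoff is the standard AdaBoost decision rule
        score = sum(w.alpha * w.predict(window) for w in weak)
        return score >= 0.5 * sum(w.alpha for w in weak)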
 The facial feature amount 142a of a specific individual is a parameter indicating the facial features of the specific individual, adjusted through learning processing by, for example, capturing face images of the specific individual individually in advance at a predetermined place under various conditions (conditions such as various face orientations, gaze directions, or open/closed eye states) and inputting the large number of captured images to the above learning device as teacher data. The facial feature amount 142a of the specific individual may be, for example, a combination pattern of brightness differences between local regions of the face obtained by the learning processing. The facial feature amount 142a stored in the facial feature amount storage unit 142 may be the facial feature amount of only one specific individual, or the facial feature amounts of a plurality of specific individuals may be stored, for example, to handle the case where a plurality of specific individuals drive the vehicle 2.
 The normal facial feature amount 142b is a parameter indicating the features of a normal human face, adjusted through learning processing by inputting to the above learning device, as teacher data, images of normal human faces captured under various conditions (conditions such as various face orientations, gaze directions, or open/closed eye states). The normal facial feature amount 142b may be, for example, a combination pattern of brightness differences between local regions of the face obtained by the learning processing. Information registered in a predetermined facial feature amount database may also be used as the normal facial feature amount 142b.
 The learned facial feature amounts stored in the facial feature amount storage unit 142 may be fetched from, for example, a server on the cloud via a communication network such as the Internet or a mobile phone network and then stored in the facial feature amount storage unit 142.
 The ECU 40 is composed of a computer device including one or more processors, memory, a communication module, and the like. The processor mounted on the ECU 40 reads, interprets, and executes a program stored in the memory, whereby predetermined control of the actuator 42 and the like is executed.
 The ECU 40 includes, for example, at least one of a traveling-system ECU, a driving-support-system ECU, a body-system ECU, and an information-system ECU.
 The traveling-system ECU includes, for example, a drive-system ECU and a chassis-system ECU. The drive-system ECU includes control units related to "running" functions such as engine control, motor control, fuel cell control, EV (Electric Vehicle) control, or transmission control. The chassis-system ECU includes control units related to "stopping and turning" functions such as brake control or steering control.
 The driving-support-system ECU may be configured to include at least one control unit related to functions that automatically improve safety or realize comfortable driving in cooperation with the traveling-system ECUs and the like (driving support functions or automated driving functions), such as an automatic brake support function, a lane keeping support function (also called LKA / Lane Keep Assist), a constant-speed traveling / inter-vehicle distance support function (also called ACC / Adaptive Cruise Control), a forward collision warning function, a lane departure warning function, a blind spot monitoring function, and a traffic sign recognition function.
 The driving-support-system ECU may be equipped with at least one of the functions of Level 1 (driver assistance), Level 2 (partial driving automation), and Level 3 (conditional driving automation) among the driving automation levels presented by SAE (the Society of Automotive Engineers, in the United States). It may further be equipped with the functions of Level 4 (high driving automation) or Level 5 (full driving automation), or with only the functions of Levels 1 and 2 or only Levels 2 and 3. The in-vehicle system 1 may also be configured as an automated driving system.
 The body-system ECU may be configured to include at least one control unit related to vehicle body functions such as door locks, smart keys, power windows, air conditioning, lights, the instrument panel, or turn signals.
 The information-system ECU may be configured to include, for example, an infotainment device, a telematics device, or an ITS (Intelligent Transport Systems) related device. The infotainment device may include, for example, an HMI (Human Machine Interface) device functioning as a user interface, as well as a car navigation device, audio equipment, and the like. The telematics device may include a communication unit for communicating with the outside. The ITS-related device may include an ETC (Electronic Toll Collection System) unit, or a communication unit for performing road-to-vehicle communication with roadside units such as ITS spots, or vehicle-to-vehicle communication.
 The sensors 41 may include various in-vehicle sensors that acquire the sensing data necessary for the ECU 40 to control the operation of the actuators 42. For example, in addition to a vehicle speed sensor, a shift position sensor, an accelerator opening sensor, a brake pedal sensor, a steering sensor, and the like, surroundings monitoring sensors such as a camera for imaging the outside of the vehicle, radar such as millimeter-wave radar, LiDAR, and ultrasonic sensors may be included.
 The actuator 42 is a device that executes operations related to traveling, steering, braking, and the like of the vehicle 2 based on control signals from the ECU 40, and includes, for example, an engine, a motor, a transmission, and hydraulic or electric cylinders.
[Functional configuration example]
 FIG. 3 is a block diagram showing a functional configuration example of the image processing unit 12 of the driver monitoring device 10 according to the embodiment.
 The image processing unit 12 includes an image input unit 21, a face detection unit 22, a specific individual determination unit 25, a first face image processing unit 26, a second face image processing unit 30, an output unit 34, and the facial feature amount storage unit 142.
 The image input unit 21 performs processing to capture images including the face of the driver 3 captured by the camera 11.
 The face detection unit 22 performs processing to detect a face region while scanning a search area of a predetermined size over the input image and extracting facial feature amounts from the search area. The face detection unit 22 includes a first feature amount extraction unit 221, a normal face discriminator 222, and a specific individual face determination unit 223. The face detection unit 22 may further include a face region integration unit 224 and a second feature amount extraction unit 225.
 FIG. 4 is a block diagram showing a functional configuration example of the face detection unit 22.
 FIGS. 5 and 6 are schematic diagrams for explaining examples of the processing operations performed by the face detection unit 22.
 In the present embodiment, the face detection unit 22 is configured to use, for example, Haar-like features as image features, together with a normal face discriminator 222 having a hierarchical structure constructed using the AdaBoost learning algorithm and a specific individual face determination unit 223 appended to the normal face discriminator 222.
 The Haar-like feature, also called a rectangular feature, takes as its feature amount, for example, the difference in average luminance between two rectangular regions. It exploits, for example, the characteristic that the eye region in an image has low luminance while the surroundings of the eye (below and beside the eye) have high luminance. A rectangular feature combining two, three, or four rectangles may be used as the Haar-like feature. Feature amounts effective (highly important) for face detection and their combinations are selected using a learning algorithm and stored in the facial feature amount storage unit 142. The facial feature amount storage unit 142 stores the normal facial feature amount 142b used in the processing of the normal face discriminator 222 and the facial feature amount 142a of the specific individual used in the processing of the specific individual face determination unit 223.
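 As an informal illustration, the following sketch computes a two-rectangle Haar-like feature using an integral image, which allows the sum over any rectangle to be obtained with four table lookups; the rectangle coordinates are hypothetical and not taken from this embodiment.

    import numpy as np

    def integral_image(gray: np.ndarray) -> np.ndarray:
        # cumulative sum with a zero row/column so rect_sum needs no bounds checks
        ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
        ii[1:, 1:] = gray.cumsum(axis=0).cumsum(axis=1)
        return ii

    def rect_sum(ii: np.ndarray, x: int, y: int, w: int, h: int) -> int:
        # sum of pixels in the rectangle [x, x+w) x [y, y+h) via four lookups
        return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

    def two_rect_feature(ii: np.ndarray, x: int, y: int, w: int, h: int) -> float:
        # difference in average luminance between the upper and lower halves,
        # e.g. a dark eye region versus the brighter region below the eye
        upper = rect_sum(ii, x, y, w, h // 2) / (w * (h // 2))
        lower = rect_sum(ii, x, y + h // 2, w, h - h // 2) / (w * (h - h // 2))
        return lower - upper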
 As shown in FIG. 4, the normal face discriminator 222 includes a first discriminator 222a through an Nth discriminator 222n, connected in a hierarchical structure (also called a cascade structure). Each of these discriminators uses one or more feature amounts 221a effective for face detection, extracted from a search area 210 of a predetermined size cut out of the image, to discriminate whether the search area 210 is a face or a non-face (something other than a face). The discriminators at the beginning of the hierarchy, such as the first discriminator 222a, use feature amounts 221a that capture the face roughly, for example, whether there are eyes. The discriminators deeper in the hierarchy, such as the Nth discriminator 222n, use feature amounts 221a that capture facial details, for example, whether there are eyes, a nose, and a mouth, and whether the face is frontal, oblique, or in profile.
 The first feature amount extraction unit 221 extracts from the search area 210 one or more feature amounts 221a that the discriminators constituting the normal face discriminator 222 are set to judge.
 The normal face discriminator 222 uses the feature amounts 221a extracted from the search area 210 by the first feature amount extraction unit 221 and the normal facial feature amount 142b to discriminate hierarchically, in order from the first discriminator 222a to the Nth discriminator 222n, whether the search area 210 is a face or a non-face.
 As shown in FIG. 4, the specific individual face determination unit 223 includes a first determination unit 223a through an Nth determination unit 223n. When the search area 210 is discriminated as a non-face at any layer of the normal face discriminator 222, the specific individual face determination unit 223 uses the feature amounts 221a extracted from the search area 210 and the facial feature amount 142a of the specific individual to determine whether the search area 210 is the face of the specific individual or a non-face.
 For example, when the second discriminator 222b discriminates the search area 210 as a non-face, the second determination unit 223b uses the feature amounts 221a extracted from the search area 210 (those used by the second discriminator 222b) and the facial feature amount 142a of the specific individual corresponding to that layer to determine whether the search area 210 is the face of the specific individual or a non-face.
 When the second determination unit 223b determines that the search area 210 is the face of the specific individual, processing proceeds to the discrimination by the third discriminator 222c; when it determines that the search area 210 is a non-face, the discrimination processing is aborted at the second discriminator 222b, and processing proceeds to the face detection processing for the next search area.
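 A minimal sketch of this control flow might look as follows, assuming each stage exposes an is_face() test against the normal feature amounts and a similarity() score against the specific individual's feature amounts for the same layer; the names and the threshold are hypothetical.

    def detect_in_search_area(window, stages, specific_threshold: float) -> bool:
        """Cascade evaluation with the specific-individual fallback branch.

        stages: list of objects with
          is_face(window)    -> bool  (normal face discriminator for this layer)
          similarity(window) -> float (correlation with the specific individual's
                                       feature amounts for the same layer)
        Returns True when the window survives all layers as a face candidate.
        """
        for stage in stages:
            if stage.is_face(window):
                continue  # normal face so far: move on to the next, finer layer
            # rejected as a normal face: consult the specific individual's features
            if stage.similarity(window) > specific_threshold:
                continue  # resembles the specific individual: keep going
            return False  # non-face for both: abort this search area early
        return True  # face candidate (normal face or the specific individual)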
 When the discrimination processing proceeds and the Nth discriminator 222n discriminates the search area 210 as a face, or the Nth determination unit 223n determines that the search area 210 is the face of the specific individual, the search area 210 is stored as a face region candidate.
 To be able to detect faces of various sizes, the face detection unit 22 may, for example, as shown in FIG. 5, generate reduced images 20a and 20b obtained by reducing the input image 20 at a plurality of scale factors, cut out a search area 210 of a predetermined size from each of the reduced images 20a and 20b, and use the normal face discriminator 222 to discriminate whether the search area 210 is a face or a non-face. By scanning the search area 210 over the input image 20 and the reduced images 20a and 20b, faces of various sizes in the image 20 and their positions may be detected. The search area 210 may have an arbitrary shape other than a rectangle.
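 The multi-scale scan described above can be pictured with a short sketch like the following, where the scale factors, window size, and scan step are illustrative assumptions and OpenCV is assumed only for resizing.

    import numpy as np
    import cv2  # assumed for resizing; any resampling routine would do

    def scan_pyramid(image: np.ndarray, win: int = 24, step: int = 4,
                     scales=(1.0, 0.75, 0.5)):
        """Yield (x, y, scale, window) for every search-area position at every scale."""
        for scale in scales:
            reduced = cv2.resize(image, None, fx=scale, fy=scale)
            h, w = reduced.shape[:2]
            for y in range(0, h - win + 1, step):
                for x in range(0, w - win + 1, step):
                    # coordinates map back to the input image as (x/scale, y/scale)
                    yield x, y, scale, reduced[y:y + win, x:x + win]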
 The face detection unit 22 may also be configured to detect faces oriented (rotated) in various directions and faces tilted at various angles. For example, the first feature amount extraction unit 221 may extract from the search area 210 feature amounts 221a effective for discriminating face orientation and face tilt, and the normal face discriminator 222 may use trained discriminators that have learned the feature amounts for each face orientation and face tilt so that it can discriminate whether the search area 210 is a face or a non-face.
 For example, as shown in FIG. 6, to detect frontal, oblique, and profile faces respectively, the normal face discriminator 222 may be provided with discriminators that have learned the feature amounts for the front (0 degrees), the left oblique (45 degrees), and the left profile (90 degrees). In this case, learning may be performed so that one discriminator covers a predetermined angular range (for example, 22.5 degrees) or more. Discrimination of the right oblique (45 degrees) may be handled by horizontally flipping the left oblique (45 degrees), and discrimination of the right profile (90 degrees) by horizontally flipping the left profile (90 degrees). Further, as shown in FIG. 6, the normal face discriminator 222 may be provided with discriminators that have learned the feature amounts for each predetermined tilt so that the tilt of the face can be detected.
 The face region integration unit 224 performs processing to integrate one or more face region candidates discriminated as faces by the normal face discriminator 222. The method of integrating the face region candidates is not particularly limited. For example, they may be integrated based on the average of the region centers and the average of the region sizes of the candidates.
 The second feature amount extraction unit 225 performs processing to extract facial feature amounts from the face region integrated by the face region integration unit 224.
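 One simple reading of this averaging-based integration, with hypothetical candidate tuples of (center x, center y, size), is sketched below.

    def integrate_candidates(candidates):
        """Merge face-region candidates by averaging centers and sizes.

        candidates: list of (cx, cy, size) tuples for regions judged to be a face.
        Returns a single (cx, cy, size) region, or None if there are no candidates.
        """
        if not candidates:
            return None
        n = len(candidates)
        cx = sum(c[0] for c in candidates) / n
        cy = sum(c[1] for c in candidates) / n
        size = sum(c[2] for c in candidates) / n
        return cx, cy, size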
 The specific individual determination unit 25 performs processing to determine whether the face in the detected face region is the face of the specific individual or the face of a normal person other than the specific individual, using the feature amounts of the face region detected by the face detection unit 22 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142.
 The specific individual determination unit 25 may calculate a correlation coefficient as an index showing the relationship, for example the correlation, between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual, and may determine, based on the calculated correlation coefficient, whether the face in the face region is the face of the specific individual. When the correlation coefficient is larger than a predetermined threshold, it may determine that the face in the detected face region is the face of the specific individual; when the correlation coefficient is equal to or less than the predetermined threshold, it may determine that the face in the detected face region is not the face of the specific individual.
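 Treating the two feature sets as equal-length vectors, the correlation test could be sketched as below; the threshold value of 0.8 is an assumed placeholder, as the embodiment does not fix a specific value.

    import numpy as np

    def is_specific_individual(features: np.ndarray,
                               stored_features: np.ndarray,
                               threshold: float = 0.8) -> bool:
        # Pearson correlation coefficient between the extracted feature vector
        # and the specific individual's learned feature vector
        r = np.corrcoef(features, stored_features)[0, 1]
        return r > threshold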
 The specific individual determination unit 25 may determine whether the face in the detected face region is the face of the specific individual based on the determination result for one frame of the input images from the camera 11, or based on the determination results for a plurality of frames of the input images from the camera 11.
 When the specific individual determination unit 25 determines that the face is the face of the specific individual, the first face image processing unit 26 performs face image processing for the specific individual. The first face image processing unit 26 includes a face orientation estimation unit 27 for the specific individual, an eye open/closed detection unit 28 for the specific individual, and a gaze direction estimation unit 29 for the specific individual, but may also include configurations for estimating or detecting other facial behaviors. The first face image processing unit 26 may perform any of the face image processing for the specific individual using the facial feature amount 142a of the specific individual. Alternatively, learned feature amounts obtained by learning for performing the face image processing for the specific individual may be stored in the facial feature amount storage unit 142, and any of the face image processing for the specific individual may be performed using those learned feature amounts.
 The face orientation estimation unit 27 for the specific individual performs processing to estimate the orientation of the specific individual's face. For example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected by the face detection unit 22, and estimates the orientation of the face based on the detected positions and shapes of the facial organs.
 The method of detecting facial organs from the face region in the image is not particularly limited, but it is preferable to adopt a method capable of detecting facial organs at high speed and with high accuracy. For example, a method may be adopted in which a three-dimensional face shape model is created, fitted to the face region on the two-dimensional image, and the position and shape of each facial organ are detected. As a technique for fitting a three-dimensional face shape model to a human face in an image, for example, the technique described in Japanese Patent Application Laid-Open No. 2007-249280 can be applied, but the technique is not limited to this.
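 While the embodiment leaves the fitting method open, one common way to recover head pose from 2D facial-organ positions and a 3D face shape model is a perspective-n-point solve, sketched here with OpenCV; the model points, camera intrinsics, and landmark input are illustrative assumptions, not the method of the cited publication.

    import numpy as np
    import cv2

    # hypothetical 3D face shape model points (mm) for a few facial organs
    MODEL_POINTS = np.array([
        (0.0, 0.0, 0.0),        # nose tip
        (0.0, -63.6, -12.5),    # chin
        (-43.3, 32.7, -26.0),   # left eye outer corner
        (43.3, 32.7, -26.0),    # right eye outer corner
        (-28.9, -28.9, -24.1),  # left mouth corner
        (28.9, -28.9, -24.1),   # right mouth corner
    ], dtype=np.float64)

    def estimate_head_pose(landmarks_2d: np.ndarray, frame_size):
        # simple pinhole camera intrinsics assumed from the image size
        h, w = frame_size
        cam = np.array([[w, 0, w / 2], [0, w, h / 2], [0, 0, 1]], dtype=np.float64)
        ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, landmarks_2d, cam,
                                      np.zeros((4, 1)))
        if not ok:
            return None
        rot, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
        # decompose into pitch (X axis), yaw (Y axis), and roll (Z axis)
        pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
        yaw = np.degrees(np.arcsin(-rot[2, 0]))
        roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
        return pitch, yaw, roll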
 The face orientation estimation unit 27 for the specific individual may output, as the estimation data for the orientation of the specific individual's face, for example, the pitch angle of vertical rotation (around the X axis), the yaw angle of horizontal rotation (around the Y axis), and the roll angle of overall rotation (around the Z axis) included in the parameters of the three-dimensional face shape model.
 The eye open/closed detection unit 28 for the specific individual performs processing to detect the open/closed state of the specific individual's eyes. For example, it detects the open/closed state of the eyes, that is, whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained by the face orientation estimation unit 27 for the specific individual, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils). The open/closed state of the eyes may be detected, for example, by learning in advance with a learning device the feature amounts of eye images in various open/closed states (the position of the eyelid, the shape of the pupil (the dark part of the eye), the region sizes of the white and dark parts of the eye, and so on) and evaluating the similarity to the learned feature amount data.
 The gaze direction estimation unit 29 for the specific individual performs processing to estimate the direction of the specific individual's gaze. For example, it estimates the gaze direction based on the orientation of the face of the driver 3 and the positions and shapes of the facial organs of the driver 3, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils). The gaze direction is the direction in which the driver 3 is looking, and is obtained, for example, from the combination of the orientation of the face and the orientation of the eyes.
 The gaze direction may also be detected, for example, by learning in advance with a learning device the feature amounts of eye images for various combinations of face orientation and eye orientation (the relative positions of the outer eye corners, inner eye corners, and pupils, or the relative positions of the white and dark parts of the eye, shading, texture, and so on) and evaluating the similarity to the learned feature amount data. The gaze direction estimation unit 29 for the specific individual may also use the fitting result of the three-dimensional face shape model or the like to estimate the size and center position of the eyeball from the size and orientation of the face and the positions of the eyes, detect the position of the pupil, and detect the vector connecting the center of the eyeball and the center of the pupil as the gaze direction.
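 The eyeball-model variant reduces to simple vector arithmetic once the eyeball center and pupil center have been estimated; a minimal sketch, assuming both points are given in the same 3D camera coordinates:

    import numpy as np

    def gaze_vector(eyeball_center: np.ndarray, pupil_center: np.ndarray) -> np.ndarray:
        # unit vector from the estimated eyeball center through the pupil center,
        # taken as the gaze direction
        v = pupil_center - eyeball_center
        return v / np.linalg.norm(v)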
 When the specific individual determination unit 25 determines that the face is not the face of the specific individual, the second face image processing unit 30 performs normal face image processing. The second face image processing unit 30 includes a normal face orientation estimation unit 31, a normal eye open/closed detection unit 32, and a normal gaze direction estimation unit 33, but may also include configurations for estimating or detecting other facial behaviors. The second face image processing unit 30 may perform any of the normal face image processing using the normal facial feature amount 142b. Alternatively, learned feature amounts obtained by learning for performing the normal face image processing may be stored in the facial feature amount storage unit 142, and any of the normal face image processing may be performed using those learned feature amounts. The processing performed by the normal face orientation estimation unit 31, the normal eye open/closed detection unit 32, and the normal gaze direction estimation unit 33 is basically the same as that of the face orientation estimation unit 27, the eye open/closed detection unit 28, and the gaze direction estimation unit 29 for the specific individual, so its description is omitted here.
 The output unit 34 performs processing to output information based on the image processing by the image processing unit 12 to the ECU 40 and the like. The information based on the image processing may be, for example, information on facial behavior such as the orientation of the face of the driver 3, the gaze direction, or the open/closed state of the eyes, or information on the state of the driver 3 determined based on the detection results of the facial behavior (for example, states such as looking ahead, looking aside, dozing, facing backward, or slumping forward). The information based on the image processing may also be a predetermined control signal based on the determination of the state of the driver 3 (such as a control signal for performing caution or warning processing, or a control signal for controlling the operation of the vehicle 2).
[Processing operation example]
 FIG. 7 is a flowchart showing an example of the processing operations performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. The camera 11 captures, for example, several tens of frames of images per second, and this processing is performed for each frame or for frames at regular intervals.
 First, in step S1, the CPU 13 operates as the image input unit 21, performs processing to read an image captured by the camera 11 (an image including the face of the driver 3), and proceeds to step S2.
 In step S2, the CPU 13 operates as the face detection unit 22, performs face detection processing to detect a face region while scanning the search area over the input image, and proceeds to step S3. A specific example of the face detection processing in step S2 will be described later.
 In step S3, the CPU 13 operates as the specific individual determination unit 25, performs processing to determine whether the face in the detected face region is the face of the specific individual using the feature amounts of the face region detected in step S2 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, and proceeds to step S4.
 In step S4, the CPU 13 judges whether the result of the determination processing in step S3 indicates the face of the specific individual, and if it judges that it is the face of the specific individual, proceeds to step S5.
 In step S5, the CPU 13 operates as the face orientation estimation unit 27 for the specific individual; for example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected in step S2, estimates the orientation of the face based on the detected positions and shapes of the facial organs, and proceeds to step S6.
 In step S6, the CPU 13 operates as the eye open/closed detection unit 28 for the specific individual; for example, it detects the open/closed state of the eyes, that is, whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained in step S5, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), and proceeds to step S7.
 In step S7, the CPU 13 operates as the gaze direction estimation unit 29 for the specific individual; for example, it estimates the gaze direction based on the face orientation and the positions and shapes of the facial organs obtained in step S5, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils), and then ends the processing.
 On the other hand, if in step S4 the CPU 13 judges that the face is not the face of the specific individual, in other words, that it is a normal face, it proceeds to step S8.
 In step S8, the CPU 13 operates as the normal face orientation estimation unit 31; for example, it detects the positions and shapes of facial organs such as the eyes, nose, mouth, and eyebrows from the face region detected in step S2, estimates the orientation of the face based on the detected positions and shapes of the facial organs, and proceeds to step S9.
 In step S9, the CPU 13 operates as the normal eye open/closed detection unit 32; for example, it detects the open/closed state of the eyes, that is, whether the eyes are open or closed, based on the positions and shapes of the facial organs obtained in step S8, particularly the positions and shapes of the feature points of the eyes (eyelids, pupils), and proceeds to step S10.
 In step S10, the CPU 13 operates as the normal gaze direction estimation unit 33; for example, it estimates the gaze direction based on the face orientation and the positions and shapes of the facial organs obtained in step S8, particularly the positions and shapes of the feature points of the eyes (outer eye corners, inner eye corners, pupils), and then ends the processing.
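 Condensed into code, the per-frame flow of FIG. 7 might be sketched as follows, assuming the functional units above are available as callables on a container object; the names are illustrative only.

    def process_frame(frame, units):
        """One pass of the FIG. 7 flow for a single camera frame (steps S1 to S10)."""
        face_region = units.face_detection(frame)                  # S2
        if face_region is None:
            return None
        if units.is_specific_individual(face_region):              # S3, S4
            pose = units.specific_face_orientation(face_region)    # S5
            eyes = units.specific_eye_open_closed(face_region)     # S6
            gaze = units.specific_gaze_direction(face_region)      # S7
        else:
            pose = units.normal_face_orientation(face_region)      # S8
            eyes = units.normal_eye_open_closed(face_region)       # S9
            gaze = units.normal_gaze_direction(face_region)        # S10
        return pose, eyes, gaze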
 FIG. 8 is a flowchart showing an example of the processing operations performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. This processing operation is an example of the face detection processing operation of step S2 and the specific individual determination processing operation of step S3 shown in FIG. 7, and is an example of the processing operations for one input image (one frame).
 First, in step S21, the CPU 13 starts loop processing L1 for detecting the size of the face; in the next step S22, it starts loop processing L2 for detecting the rotation angle (orientation and tilt) of the face; and in the next step S23, it starts loop processing L3 for detecting the position of the face, then proceeds to step S24.
 Loop processing L1 is repeated according to the number of reduced images (for example, 20a and 20b shown in FIG. 5) generated to detect faces of various sizes. Loop processing L2 is repeated according to the settings of the discriminators that discriminate the rotation angle of the face (for example, the frontal, oblique, and profile faces and the tilts shown in FIG. 6). Loop processing L3 is repeated for the number of positions at which the search area 210 is scanned to detect the position of the face.
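 The three nested loops correspond to iterating over scales, rotation settings, and window positions; a schematic rendering, reusing the scan_pyramid, detect_in_search_area, and integrate_candidates sketches above (the nesting order and discriminator table are illustrative assumptions):

    def detect_faces(image, discriminators_by_rotation, specific_threshold):
        candidates = []
        for rotation, stages in discriminators_by_rotation.items():  # loop L2
            # loop L1 (scales) and loop L3 (positions) via the pyramid scan
            for x, y, scale, window in scan_pyramid(image):
                if detect_in_search_area(window, stages, specific_threshold):  # S24, S25
                    size = window.shape[0] / scale
                    candidates.append((x / scale, y / scale, size))
        return integrate_candidates(candidates)  # S29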
 In step S24, the CPU 13 operates as the first feature amount extraction unit 221 and performs processing to extract the facial feature amounts 221a from the search area 210 under each condition of loop processing L1, L2, and L3.
 In step S25, the CPU 13 operates as the normal face discriminator 222 and the specific individual face determination unit 223, hierarchically discriminates whether the search area 210 is a face or a non-face (something other than a face), and, when it is discriminated as a non-face at any layer, performs processing to determine whether the search area 210 is the face of the specific individual or a non-face.
 When the CPU 13 has, for example, scanned the search area 210 over all the reduced images and finished the face detection (in other words, the detection of face region candidates) for all the search areas 210, it ends loop processing L1 in step S26, ends loop processing L2 in step S27, ends loop processing L3 in step S28, and then proceeds to step S29.
 In step S29, the CPU 13 operates as the face region integration unit 224, performs processing to integrate the one or more face region candidates discriminated as faces in the processing of steps S21 to S28, and proceeds to step S30. The method of integrating the face region candidates is not particularly limited; for example, they may be integrated based on the average of the region centers and the average of the region sizes of the candidates.
 In step S30, the CPU 13 operates as the second feature amount extraction unit 225, performs processing to extract facial feature amounts from the face region integrated in step S29, and proceeds to step S31.
 In step S31, the CPU 13 operates as the specific individual determination unit 25, performs processing to calculate the correlation coefficient between the feature amounts extracted from the integrated face region in step S30 and the facial feature amount 142a of the specific individual read from the facial feature amount storage unit 142, and proceeds to step S32.
 In step S32, the CPU 13 judges whether the calculated correlation coefficient is larger than a predetermined threshold for determining whether the face is that of the specific individual. If it judges that the correlation coefficient is larger than the predetermined threshold, in other words, that the correlation between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual is high (in other words, the similarity is high), it proceeds to step S33.
 In step S33, the CPU 13 determines that the face detected in the face region is the face of the specific individual, and then ends the processing.
 On the other hand, if in step S32 it judges that the correlation coefficient is equal to or less than the predetermined threshold, in other words, that the correlation between the feature amounts extracted from the face region and the facial feature amount 142a of the specific individual is low (in other words, the similarity is low), it proceeds to step S34.
 In step S34, the CPU 13 determines that the face is not the face of the specific individual, in other words, that it is a normal face, and then ends the processing.
 FIG. 9 is a flowchart showing an example of the face detection processing operation performed by the CPU 13 of the image processing unit 12 in the driver monitoring device 10 according to the embodiment. This processing operation is an example of the discrimination processing operation of step S25 shown in FIG. 8.
 First, in step S41, the CPU 13 reads the feature amounts 221a extracted from the search area 210 in step S24 of FIG. 8, and proceeds to step S42.
 In step S42, the CPU 13 sets 0 in a discriminator counter (i), which counts which layer of the normal face discriminator 222 is being processed, sets 1 as the initial value of n, which indicates the layer number in the hierarchical structure of the normal face discriminator 222, and proceeds to step S43.
 In step S43, the CPU 13 operates as the normal face discriminator 222 and performs processing to discriminate, with the nth discriminator, whether the search area 210 is a face or a non-face.
 In step S44, the CPU 13 discriminates whether the search area 210 is a face or a non-face, and if it discriminates that it is a face, proceeds to step S45.
 In step S45, the CPU 13 adds 1 to the discriminator counter (i) and proceeds to step S46.
 In step S46, the CPU 13 judges whether the discriminator counter (i) is less than N, which indicates the number of discriminators. If it judges that the counter is less than the number of discriminators N, it proceeds to step S47, and in step S47 adds 1 to n in order to proceed to the processing by the discriminator of the next layer, then returns to step S43 and repeats the processing.
 On the other hand, if in step S46 the CPU 13 judges that the discriminator counter (i) is not less than the number of discriminators N, in other words, that the discriminator counter (i) has reached N, it proceeds to step S48.
 In step S48, the CPU 13 determines that the search area 210 is a face region candidate, stores the information of the search area as a face region candidate, then ends the face detection processing for this search area and repeats the face detection processing for the next search area.
 On the other hand, if in step S44 the CPU 13 discriminates that the search area 210 is a non-face, it proceeds to step S49.
 In step S49, the CPU 13 operates as the specific individual face determination unit 223, performs processing to calculate the correlation coefficient between the feature amounts used by the nth discriminator (the feature amounts extracted from the search area 210) and the facial feature amount 142a of the specific individual corresponding to the layer of the nth discriminator, and proceeds to step S50.
 In step S50, the CPU 13 judges whether the calculated correlation coefficient is larger than a predetermined threshold for determining whether the face is that of the specific individual. If it judges that the correlation coefficient is larger than the predetermined threshold (that is, the correlation is high), in other words, that the search area 210 is the face of the specific individual, it proceeds to step S45 and continues the discrimination processing by the normal face discriminator 222.
 On the other hand, if in step S50 the CPU 13 judges that the correlation coefficient is equal to or less than the predetermined threshold (that is, the correlation is low), in other words, that the search area 210 is a non-face, it proceeds to step S51.
 In step S51, the CPU 13 determines that the search area is a non-face (something other than a face); in the next step S52, it aborts the discrimination processing by the normal face discriminator 222 after the nth discriminator, then ends the face detection processing for this search area and repeats the face detection processing for the next search area.
[Operations and effects]
 According to the driver monitoring device 10 of the embodiment described above, the facial feature amount 142a of the specific individual and the normal facial feature amount 142b are stored as learned facial feature amounts in the facial feature amount storage unit 142, and the normal face discriminator 222 uses the normal facial feature amount 142b to hierarchically discriminate whether the search area 210 cut out of the image is a face or a non-face, whereby the face region is detected.
 Even when the search area is discriminated as a non-face at one of the layers of the normal face discriminator 222, the specific individual face determination unit 223 determines, using the facial feature amount 142a of the specific individual, whether the search area 210 is the face of the specific individual or a non-face, whereby a face region including the face of the specific individual is detected.
 As a result, the face region can be accurately detected from the image regardless of whether it is a normal face or the face of the specific individual. In addition, since the normal face discriminator 222 and the specific individual face determination unit 223 use common feature amounts extracted from the search area 210, no separate feature extraction processing is required, so the real-time performance of the face region detection can be maintained.
 Therefore, whether the driver 3 is the specific individual or a normal person other than the specific individual, each face can be detected accurately in real time (in other words, with high-speed processing).
 Furthermore, based on the correlation between the feature amounts used at the layer of the normal face discriminator 222 where the search area was discriminated as a non-face and the facial feature amount 142a of the specific individual corresponding to that layer, the specific individual face determination unit 223 can efficiently determine whether the search area is the face of the specific individual or a non-face. Therefore, even when the normal face discriminator 222 discriminates the search area as a non-face, the case where it is the face of the specific individual can be accurately determined.
 Further, according to the driver monitoring device 10, when the specific individual face determination unit 223 determines that the search area is the face of the specific individual, the discrimination proceeds to the next layer of the normal face discriminator 222 and the discrimination processing continues promptly. On the other hand, when the specific individual face determination unit 223 determines that the search area is a non-face, the discrimination by the normal face discriminator 222 is aborted. Therefore, the processing to determine the face of the specific individual can be performed while maintaining the efficiency of the normal face discriminator 222.
Further, according to the driver monitoring device 10, the face region integration unit 224 integrates one or more face region candidates determined to be faces by the normal face discriminator 222, and the specific individual determination unit 25 determines whether the face in the face region is the face of the specific individual, using the feature amount extracted from the integrated face region and the face feature amount 142a of the specific individual. Therefore, based on the face region integrated by the face region integration unit 224, it can be determined accurately whether the face is that of a specific individual or that of a normal person.
Further, the in-vehicle system 1 includes the driver monitoring device 10 and one or more ECUs 40 that execute predetermined processing based on the monitoring results output from the driver monitoring device 10. Therefore, the ECUs 40 can be made to appropriately execute predetermined control based on the monitoring results, which makes it possible to construct a highly safe in-vehicle system in which even a specific individual can drive with peace of mind.
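As one plausible reading of the integration step, not a disclosed algorithm, the candidate rectangles that survive the discriminator could be grouped by overlap and averaged before the final specific-individual determination runs on the merged region; the sketch below uses intersection-over-union for the grouping:

    def iou(a, b):
        # Intersection-over-union of two (x, y, w, h) rectangles.
        ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
        iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = a[2] * a[3] + b[2] * b[3] - inter
        return inter / union if union else 0.0

    def integrate_candidates(boxes, overlap=0.5):
        # Greedily group candidate face regions that overlap strongly and
        # replace each group by its element-wise average rectangle.
        groups = []
        for box in boxes:
            for group in groups:
                if iou(box, group[0]) > overlap:
                    group.append(box)
                    break
            else:
                groups.append([box])
        return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]

    print(integrate_candidates([(10, 10, 40, 40), (12, 11, 40, 40), (200, 50, 30, 30)]))
    # -> [(11.0, 10.5, 40.0, 40.0), (200.0, 50.0, 30.0, 30.0)]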
[Modification example]
Although the embodiments of the present invention have been described in detail above, the foregoing description is in all respects merely illustrative of the present invention. Needless to say, various improvements and modifications can be made without departing from the scope of the present invention.
In the above embodiment, the image processing device according to the present invention is applied to the driver monitoring device 10, but applications are not limited to this example. The image processing device according to the present invention is also applicable, for example, to devices and systems that monitor people who operate, supervise, or perform predetermined work on various kinds of equipment, such as machines and apparatuses in a factory, whenever the persons being monitored may include specific individuals as described above.
[Additional Notes]
Embodiments of the present invention may also be described as, but are not limited to, the following appendices.
(Appendix 1)
An image processing device (12) for processing an image input from an imaging unit (11), the device comprising:
    a face feature amount storage unit (142) in which a face feature amount (142a) of a specific individual and a normal face feature amount (142b) are stored as learned face feature amounts trained for detecting a face from the image; and
    a face detection unit (22) that detects a face region while scanning a search area over the image,
    the face detection unit (22) comprising:
    a first feature amount extraction unit (221) that extracts a facial feature amount from the search area;
    a hierarchically structured normal face discriminator (222) that determines, using the feature amount extracted from the search area and the normal face feature amount (142b), whether the search area is a face or a non-face; and
    a specific individual face determination unit (223) that, when the search area is determined to be a non-face at any layer of the normal face discriminator (222), determines, using the feature amount extracted from the search area and the face feature amount (142a) of the specific individual, whether the search area is the face of the specific individual or a non-face.
(Appendix 2)
An image processing method for processing an image input from an imaging unit (11), the method including a face detection step (S2) of detecting a face region while scanning a search area over the image, the face detection step (S2) including:
    a feature amount extraction step (S24) of extracting a facial feature amount from the search area;
    a normal face determination step (S43, S44) of hierarchically determining whether the search area is a face or a non-face, using the feature amount extracted in the feature amount extraction step (S24) and a learned normal face feature amount (142b) trained for detecting a face; and
    a specific individual face determination step (S49, S50) of determining, when the search area is determined to be a non-face at any layer of the normal face determination step (S43, S44), whether the search area is the face of the specific individual or a non-face, using the extracted feature amount and a learned face feature amount (142a) of the specific individual trained for detecting the face of the specific individual.
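Read as pseudocode, the steps of Appendix 2 chain together roughly as follows. Only the step labels (S24, S43/S44, S49/S50) come from the publication; the stub functions and their toy feature values are invented for illustration:

    # Hypothetical stand-ins for the learned models and the steps of Appendix 2.
    extract_features = lambda area: [0.5, 0.9]                                  # S24
    normal_face_check = lambda f: "face" if min(f) > 0.6 else "rejected"        # S43, S44
    specific_face_check = lambda f: "specific" if max(f) > 0.8 else "non-face"  # S49, S50

    def face_detection_step(search_areas):
        results = []
        for area in search_areas:
            features = extract_features(area)        # extract once, reuse below
            verdict = normal_face_check(features)    # hierarchical normal-face check
            if verdict == "rejected":
                verdict = specific_face_check(features)  # fallback on rejection
            results.append((area, verdict))
        return results

    print(face_detection_step(["area-1"]))   # -> [('area-1', 'specific')]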
1 In-vehicle system
2 Vehicle
3 Driver
10 Driver monitoring device
11 Camera
12 Image processing unit
13 CPU
14 ROM
141 Program storage unit
142 Face feature amount storage unit
142a Face feature amount of a specific individual
142b Normal face feature amount
15 RAM
151 Image memory
16 Communication unit
20 Image (input image)
20a, 20b Reduced images
21 Image input unit
22 Face detection unit
210 Search area
221 First feature amount extraction unit
221a Feature amount
222 Normal face discriminator
223 Specific individual face determination unit
224 Face region integration unit
225 Second feature amount extraction unit
25 Specific individual determination unit
26 First face image processing unit
27 Face orientation estimation unit for a specific individual
28 Eye opening/closing detection unit for a specific individual
29 Line-of-sight direction estimation unit for a specific individual
30 Second face image processing unit
31 Normal face orientation estimation unit
32 Normal eye opening/closing detection unit
33 Normal line-of-sight direction estimation unit
34 Output unit
40 ECU
41 Sensor
42 Actuator
43 Communication bus

Claims (10)

  1.  An image processing device that processes an image input from an imaging unit, the device comprising:
      a face feature amount storage unit in which a face feature amount of a specific individual and a normal face feature amount are stored as learned face feature amounts trained for detecting a face from the image; and
      a face detection unit that detects a face region while scanning a search area over the image,
      the face detection unit comprising:
      a first feature amount extraction unit that extracts a facial feature amount from the search area;
      a hierarchically structured normal face discriminator that determines, using the feature amount extracted from the search area and the normal face feature amount, whether the search area is a face or a non-face; and
      a specific individual face determination unit that, when the search area is determined to be a non-face at any layer of the normal face discriminator, determines, using the feature amount extracted from the search area and the face feature amount of the specific individual, whether the search area is the face of the specific individual or a non-face.
  2.  The image processing device according to claim 1, wherein the specific individual face determination unit:
      calculates an index indicating the correlation between the feature amount used at the layer of the normal face discriminator that determined the search area to be a non-face and the face feature amount of the specific individual corresponding to that layer; and
      determines, based on the calculated index, whether the search area is the face of the specific individual or a non-face.
  3.  The image processing device according to claim 2, wherein the specific individual face determination unit:
      determines that the search area is the face of the specific individual when the index is larger than a predetermined threshold value; and
      determines that the search area is a non-face when the index is equal to or less than the predetermined threshold value.
  4.  The image processing device according to any one of claims 1 to 3, further comprising:
      a determination progression unit that, when the specific individual face determination unit determines the search area to be the face of the specific individual, advances the determination to the next layer of the normal face discriminator; and
      a determination termination unit that, when the specific individual face determination unit determines the search area to be a non-face, terminates the determination by the normal face discriminator.
  5.  The image processing device according to any one of claims 1 to 4, wherein the face detection unit comprises:
      a face region integration unit that integrates one or more face region candidates determined to be faces by the normal face discriminator; and
      a second feature amount extraction unit that extracts a facial feature amount from the integrated face region,
      and wherein the image processing device further comprises a specific individual determination unit that determines, using the feature amount extracted from the integrated face region and the face feature amount of the specific individual, whether or not the face in the face region is the face of the specific individual.
  6.  A monitoring device comprising:
      the image processing device according to any one of claims 1 to 5;
      an imaging unit that captures the image to be input to the image processing device; and
      an output unit that outputs information based on the image processing by the image processing device.
  7.  A control system comprising:
      the monitoring device according to claim 6; and
      one or more control devices that are communicably connected to the monitoring device and execute predetermined processing based on the information output from the monitoring device.
  8.  The control system according to claim 7, wherein the monitoring device is a device for monitoring a driver of a vehicle, and
      the control device includes an electronic control unit mounted on the vehicle.
  9.  An image processing method for processing an image input from an imaging unit, the method including a face detection step of detecting a face region while scanning a search area over the image, the face detection step including:
      a feature amount extraction step of extracting a facial feature amount from the search area;
      a normal face determination step of hierarchically determining whether the search area is a face or a non-face, using the feature amount extracted in the feature amount extraction step and a learned normal face feature amount trained for detecting a face; and
      a specific individual face determination step of determining, when the search area is determined to be a non-face at any layer of the normal face determination step, whether the search area is the face of the specific individual or a non-face, using the extracted feature amount and a learned face feature amount of the specific individual trained for detecting the face of the specific individual.
  10.  A program for causing at least one computer to process an image input from an imaging unit, the program causing the at least one computer to execute a face detection step of detecting a face region while scanning a search area over the image, the face detection step including:
      a feature amount extraction step of extracting a facial feature amount from the search area;
      a normal face determination step of hierarchically determining whether the search area is a face or a non-face, using the feature amount extracted in the feature amount extraction step and a learned normal face feature amount trained for detecting a face; and
      a specific individual face determination step of determining, when the search area is determined to be a non-face at any layer of the normal face determination step, whether the search area is the face of the specific individual or a non-face, using the extracted feature amount and a learned face feature amount of the specific individual trained for detecting the face of the specific individual.
PCT/JP2020/020261 2019-06-28 2020-05-22 Image processing device, monitoring device, control system, image processing method, and program WO2020261832A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-121136 2019-06-28
JP2019121136A JP7230710B2 (en) 2019-06-28 2019-06-28 Image processing device, monitoring device, control system, image processing method, and program

Publications (1)

Publication Number Publication Date
WO2020261832A1 (en)

Family

ID=74061205

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/020261 WO2020261832A1 (en) 2019-06-28 2020-05-22 Image processing device, monitoring device, control system, image processing method, and program

Country Status (2)

Country Link
JP (1) JP7230710B2 (en)
WO (1) WO2020261832A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005056387A (en) * 2003-07-18 2005-03-03 Canon Inc Image processor, imaging apparatus and image processing method
JP2010198313A (en) * 2009-02-25 2010-09-09 Denso Corp Device for specifying degree of eye opening

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IWAI, YOSHIO: "A Survey on Face Detection and Face Recognition", IPSJ SIG TECHNICAL REPORT, no. 38, 13 May 2005 (2005-05-13), pages 343 - 368 *

Also Published As

Publication number Publication date
JP2021006972A (en) 2021-01-21
JP7230710B2 (en) 2023-03-01

Similar Documents

Publication Publication Date Title
Alioua et al. Driver head pose estimation using efficient descriptor fusion
Trivedi et al. Looking-in and looking-out of a vehicle: Computer-vision-based enhanced vehicle safety
EP2860664B1 (en) Face detection apparatus
CN105769120A (en) Fatigue driving detection method and device
Rezaei et al. Simultaneous analysis of driver behaviour and road condition for driver distraction detection
CN114973215A (en) Fatigue driving determination method and device and electronic equipment
CN114821697A (en) Material spectrum
CN114821695A (en) Material spectrometry
CN114821694A (en) Material spectrometry
CN114821696A (en) Material spectrometry
WO2021024905A1 (en) Image processing device, monitoring device, control system, image processing method, computer program, and recording medium
Rani et al. Development of an Automated Tool for Driver Drowsiness Detection
JP2004334786A (en) State detection device and state detection system
Haselhoff et al. Radar-vision fusion for vehicle detection by means of improved haar-like feature and adaboost approach
CN113128540B (en) Method and device for detecting vehicle theft behavior of non-motor vehicle and electronic equipment
CN116012822B (en) Fatigue driving identification method and device and electronic equipment
Llorca et al. Stereo-based pedestrian detection in crosswalks for pedestrian behavioural modelling assessment
WO2020261832A1 (en) Image processing device, monitoring device, control system, image processing method, and program
CN117953333A (en) Biometric service assessment architecture for vehicles
WO2020261820A1 (en) Image processing device, monitoring device, control system, image processing method, and program
JP2021009503A (en) Personal data acquisition system, personal data acquisition method, face sensing parameter adjustment method for image processing device and computer program
CN112348718B (en) Intelligent auxiliary driving guiding method, intelligent auxiliary driving guiding device and computer storage medium
TW202326624A (en) Embedded deep learning multi-scale object detection model using real-time distant region locating device and method thereof
Sivaraman Learning, modeling, and understanding vehicle surround using multi-modal sensing
Kim et al. Driving environment assessment using fusion of in-and out-of-vehicle vision systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20831560

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20831560

Country of ref document: EP

Kind code of ref document: A1