WO2019145578A1 - Neural network image processing apparatus - Google Patents

Neural network image processing apparatus

Info

Publication number
WO2019145578A1
Authority
WO
WIPO (PCT)
Prior art keywords
eye
face region
region
landmarks
gaze
Application number
PCT/EP2019/060596
Other languages
French (fr)
Inventor
Liviu DUTU
Stefan MATHE
Madalin DUMITRU-GUZU
Joseph LEMLEY
Original Assignee
Fotonation Limited
Priority date
2018-06-11
Filing date
2019-04-25
Publication date
2019-08-01
Application filed by Fotonation Limited
Priority to EP19721236.8A (EP3539054B1)
Publication of WO2019145578A1
Priority to US16/780,775 (US11314324B2)
Priority to US17/677,320 (US11699293B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A neural network image processing apparatus arranged to acquire images from an image sensor and to: identify a ROI containing a face region in an image; determine a plurality of facial landmarks in the face region; use the facial landmarks to transform the face region within the ROI into a face region having a given pose; and use transformed landmarks within the transformed face region to identify a pair of eye regions within the transformed face region. Each identified eye region is fed to a respective first and second convolutional neural network, each network configured to produce a respective feature vector. Each feature vector is fed to respective eyelid opening level neural networks to obtain respective measures of eyelid opening for each eye region. The feature vectors are combined and fed to a gaze angle neural network to generate gaze yaw and pitch values substantially simultaneously with the eyelid opening values.

Description

Neural Network Image Processing Apparatus
Field
The present invention relates to a neural network image processing apparatus.
Background
There is a need for eye gaze tracking applications and gaze-based human computer interactions for dynamic platforms such as driver monitoring systems and handheld devices. For an automobile driver, eye based cues such as levels of gaze variation, speed of eyelid movements and eye closure can be indicative of a driver's cognitive state. These can be useful inputs for intelligent vehicles to understand driver attentiveness levels, lane change intent, and vehicle control in the presence of obstacles to avoid accidents. Handheld devices like smartphones and tablets may also employ gaze tracking applications wherein gaze may be used as an input modality for device control, activating safety features and controlling user interfaces.
The most challenging aspect of such gaze applications includes operation under dynamic user conditions and unconstrained environments. Further requirements for implementing a consumer-grade gaze tracking system include real-time high-accuracy operation, minimal or no calibration, and robustness to user head movements and varied lighting conditions.
Traditionally, gaze estimation has been done using architectures based on screen light reflection on the eye where corneal reflections from light can be used to estimate point-of-gaze.
Neural networks have also been applied to the problem and S. Baluja and D. Pomerleau, "Non-intrusive gaze tracking using artificial neural networks," Pittsburgh, PA, USA, Tech. Rep., 1994 discloses using a neural network to map low quality cropped eye images to gaze coordinates.
Kyle Krafka, Aditya Khosla, Petr Kellnhofer, Harini Kannan, Suchendra Bhandarkar, Wojciech Matusik, Antonio Torralba, "Eye Tracking for Everyone" discloses an appearance based convolutional neural network (CNN) based model that uses face landmarks to crop an image into left and right regions. The eye regions and face are then passed to distinct neural networks which output into shared fully connected layers to provide a gaze prediction.
Similarly, M. Kim, O. Wang and N. Ng "Convolutional Neural Network Architectures for Gaze Estimation on Mobile Devices", Stanford Reports, 2017, (http://cs231n.stanford.edu/reports/2017/pdfs/229.pdf) referring to Krafka also uses separate eye regions extracted from a face region as well as a histogram of gradients map to provide a gaze prediction.
Rizwan Ali Naqvi, Muhammad Arsalan, Ganbayar Batchuluun, Hyo Sik Yoon and Kang Ryoung Park, "Deep Learning-Based Gaze Detection System for Automobile Drivers Using a NIR Camera Sensor", Sensors 2018, 18, 456 discloses capturing a driver's frontal image, detecting face landmarks using a facial feature tracker, obtaining face, left and right eye images, calculating three distances based on three sets of feature vectors and classifying a gaze zone based on the three distances.
X. Zhang, Y. Sugano, M. Fritz, and A. Bulling in both "Appearance-based gaze estimation in the wild," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015, pp. 4511-4520 and "MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation" IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, disclose using face detection and facial landmark detection methods to locate landmarks in an input image obtained from a calibrated monocular RGB camera. A generic 3D facial shape model is fitted to estimate a 3D pose of a detected face and to crop and warp the head pose and eye images to a normalised training space. A CNN is used to learn the mapping from the head poses and eye images to gaze directions in the camera coordinate system.
Summary
According to the present invention there is provided a neural network image processing apparatus as claimed in claim 1.
Embodiments substantially simultaneously provide gaze and eyelid opening estimates from both eyes of a detected face within an image. Embodiments comprise an integrated network where the weights for the various layers are determined once in the same training process to provide eyelid and gaze estimation values - this training can mean that each component (opening, gaze) of the network boosts the other as eyelid opening information can help the system learn more efficiently how to predict gaze, and vice-versa.
There is no need to manually weight gaze angles calculated for separate eye regions and so this reduces human intervention and favours a pure machine learning approach.
Brief Description of the Drawings
An embodiment of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:
Figure 1 shows a neural network image processing apparatus according to an embodiment of the present invention;
Figure 2 illustrates a normalised face region derived from an image acquired by the system according to Figure 1 along with facial landmarks and an eye region identified for the face region;
Figure 3 illustrates the identification of individual eye regions based on the landmarks of Figure 2;
Figure 4 illustrates the configuration of a convolutional neural network implemented with the system of Figure 1 for identifying a gaze angle and eyelid opening for eye regions provided to the system; and
Figure 5 shows the configuration of the convolutional neural network in more detail.
Description of the Embodiment
Referring now to Figure 1, there is shown a neural network image processing apparatus 10 according to an embodiment of the present invention. The apparatus 10 comprises an image sensor 12 for acquiring images 13-1...13-N which are subsequently stored in memory 14. Although not shown, the image sensor 12 can include or cooperate with an image processing pipeline for performing initial processing of a raw image such as colour balancing, distortion correction etc. Details of such pre-processing and distortion correction systems are disclosed in PCT Application WO2017/032468 (Ref: FN-469-PCT), European Patent No. EP3101622 (Ref: FN-384-EP2) and US Patent Application No. 15/879,310 (Ref: FN-622-US).
Note that the image sensor need not be immediately connected to the remainder of the apparatus 10 and for example, the sensor 12 can provide images for processing by the remainder of the apparatus across any of a local area network, a personal area network, a wide area network and/or any combination of a wired or wireless network.
The image sensor 12 can provide acquired images 13-1...13-N directly to memory 14 across a system bus 20 or the images 13-1...13-N can be provided directly to a face detector module 16. Face detection within acquired images is well-known since at least US 2002/0102024, Viola-Jones with many optimisations and improvements made in such systems since then. Thus, the face detector module 16 can be a dedicated hardware module such as the engine disclosed in PCT Application WO 2017/108222 (Ref: FN-470-PCT), or the face detector can be implemented in general purpose software executing on a system CPU 18, or indeed the face detector 16 could be implemented using one or more convolutional neural networks (CNN) and executed on a dedicated CNN engine 26 such as described in PCT Application WO 2017/129325 (Ref: FN-481-PCT), and US Application No. 62/592,665 (Ref: FN-618-US). Indeed, US application No. 62/592,665 (Ref: FN-618-US) discloses a system including multiple neural network processing cores which can be configured to process multiple neural networks performing different tasks on the same or different images or image portions in parallel.
In any case, once the face detector module 16 has processed an image, any region of interest (ROI) 17 bounding a portion of the image containing a face is identified and this information can be stored in memory as meta data associated with the image 13-1...13-N. This may simply comprise bounding box information for the ROI containing the face or, as explained below, further information may be included in the meta data for the ROI 17. It will be appreciated that any given image may include a number of detected face regions - in the example, image 13-1 includes 3 ROI 17, and information relating to each of these may be stored as meta data associated with the image 13-1 and processed as and if required.
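As a rough illustration of the meta data described above, the following sketch stores one bounding box per detected face against an image identifier. The class and field names (FaceROI, ImageMeta, box, landmarks) and the box coordinates are hypothetical and are not taken from the apparatus; they only show one way such per-image information might be organised.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FaceROI:
    # Bounding box (x, y, width, height) in pixels of the acquired image.
    box: Tuple[int, int, int, int]
    # Further information may be attached later, e.g. landmarks (hypothetical field).
    landmarks: Optional[List[Tuple[float, float]]] = None


@dataclass
class ImageMeta:
    image_id: str
    rois: List[FaceROI] = field(default_factory=list)


# An image with three detected face regions, as in the example of image 13-1.
meta = ImageMeta("13-1")
for box in [(40, 30, 64, 64), (200, 50, 80, 80), (350, 120, 48, 48)]:
    meta.rois.append(FaceROI(box))
```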
It will be appreciated that face regions may be detected within an image at one of a number of different scales and at one of a number of different orientations and it may be desirable to transform these detected face regions into a rectangular image crop with a given orientation and scale using techniques such as disclosed in PCT Application WO2017/032468 (Ref: FN-469-PCT). In this case, an image crop can be included in the ROI information 17 stored in association with the image 13-1...13-N in memory 14. Nonetheless, even with an image crop of a given orientation and scale, the detected face may be in a number of different poses within the crop, e.g. forward facing, looking up, down, left, right etc.
US Application No. 15/912,946 filed 6 March 2018 (Ref: I0002-0613-US-01) discloses tracking units for facial features with advanced training for natural rendering of human faces in real-time. A device receives an image of a face from a frame of a video stream, and based on the image, selects a head orientation class from a comprehensive set of head orientation classes. Each head orientation class includes a respective 3D model. The device determines modifications to the selected 3D model to describe the face in the image, then projects a model of tracking points (landmarks) of facial features in an image plane based on the 3D model. The device can switch among a comprehensive set of 35 different head orientation classes, for example, for each video frame based on suggestions computed from a previous video frame or from yaw and pitch angles of the visual head orientation. Each class of the comprehensive set is trained separately based on a respective collection of automatically marked images for that head orientation class. Alternatively, libraries such as dlib (http://dlib.net/face_landmark_detection.py.html) are available for face landmark detection.
Such tools can be employed within a landmark detector 22 which produces a set of landmarks 19 for a given image crop containing a face region. Figure 2 shows an image containing a face region where a number of landmarks have been identified. In this case, the landmarks indicate the jawline 19A, mouth region 19C, nose 19B, eyes 19D and eyebrows 19E of the subject. Note that the present apparatus is typically processing images and image crops of relatively low resolution so that it is typically not possible to explicitly locate or identify features on the eye of the user, for example, a pupil outline which in other applications can be useful for determining gaze angle.
Again, the landmark detector 22 can be implemented as a dedicated module, or the detector 22 can be implemented in general purpose software executing on a system CPU 18. Again, landmark information 19 can be stored in association with the ROI information 17 within the meta-data associated with a given image 13.
Now using the landmarks 19 identified in the original ROI 17, a pose normalisation module 24 can transform the ROI 17 into a face crop in a normalised pose - in this case front facing, such as the face region 17' shown in Figure 2 - and store this as ROI-Norm 17' in association with the original ROI 17. It will be appreciated that this morphing process may result in an incomplete front facing face image, for example, where the original detected face was a side profile, but this does not necessarily prevent the remainder of the system from performing properly.
Now, with the landmarks 19 of the transformed front-facing image region, an eye region 21 can be defined. In the example, the eye region extends from the highest eyebrow landmark 19E to a margin beyond the lowest eye landmark 19D and from a margin beyond the left-most eye or eyebrow landmark 19D, 19E to a margin beyond the right-most eye or eyebrow landmark 19D, 19E.
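The extent of the eye region described above can be sketched as a simple bounding-box computation. This is a minimal sketch, assuming image coordinates with y increasing downwards (so the "highest" landmark has the smallest y) and a single margin value in pixels; the function name eye_region and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), with y increasing downwards (assumption)


def eye_region(eye_pts: List[Point], brow_pts: List[Point],
               margin: float) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of the combined eye region."""
    pts = eye_pts + brow_pts
    left = min(x for x, _ in pts) - margin          # margin beyond left-most landmark
    right = max(x for x, _ in pts) + margin         # margin beyond right-most landmark
    top = min(y for _, y in brow_pts)               # highest eyebrow landmark, no margin
    bottom = max(y for _, y in eye_pts) + margin    # margin beyond lowest eye landmark
    return left, top, right, bottom


# Illustrative landmark positions only.
eyes = [(30, 50), (45, 48), (70, 49), (85, 52)]
brows = [(28, 40), (50, 35), (65, 36), (88, 41)]
region = eye_region(eyes, brows, margin=4)
```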
Referring now to Figure 3, within the eye region 21, for each eye (only detail for one eye is shown), the corners of the eye 19DA, 19DB are identified and then shifted by a margin away from the centre of the eye to indicate respective left and right boundary points 21A, 21B for each eye. The margin can be a percentage of the distance between corners 19DA and 19DB.
The distance between points 21A, 21B is multiplied by a fixed aspect ratio to determine a height 23 for an eye bounding box. The upper and lower boundary lines of the eye bounding box are centred about a line 25 extending between points 21A, 21B. (If there is a difference in height between eye corner locations 19DA, 19DB, the line 25 can be centred height-wise between these locations.)
Now the defined eye regions (only the left region 27L is shown in Figure 3) can be fed to a neural network as shown in Figure 4 to simultaneously determine an eyelid opening value for each eye as well as the gaze angle (both pitch and yaw) for the pair of eyes.
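The construction of an individual eye bounding box from the eye corners, as described with reference to Figure 3, can be sketched as follows. The sketch assumes a near-horizontal eye in the normalised pose, so the corner distance is approximated by the horizontal distance; the function name eye_box, the margin fraction of 0.25 and the aspect ratio of 0.5 are illustrative assumptions, not values given in the description.

```python
def eye_box(corner_a, corner_b, margin_frac, aspect):
    """corner_a, corner_b: (x, y) eye corners 19DA, 19DB, with corner_a on the left.

    Returns (left, top, right, bottom) of the eye bounding box.
    """
    (xa, ya), (xb, yb) = corner_a, corner_b
    corner_dist = xb - xa                      # horizontal approximation (assumption)
    # Shift each corner away from the eye centre by a fraction of the corner distance.
    x_left = xa - margin_frac * corner_dist    # boundary point 21A
    x_right = xb + margin_frac * corner_dist   # boundary point 21B
    height = (x_right - x_left) * aspect       # height 23 from the fixed aspect ratio
    y_mid = (ya + yb) / 2.0                    # line 25, centred between the corner heights
    return x_left, y_mid - height / 2.0, x_right, y_mid + height / 2.0


box = eye_box((100, 60), (140, 64), margin_frac=0.25, aspect=0.5)
```

With these inputs the corners are 40 pixels apart, each boundary point moves out by 10 pixels, and the 60 pixel wide box is 30 pixels high, centred on y = 62.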
Figure 4 shows each eye region 27L, 27R fed to a respective CNN 40L, 40R. More specific details of the layers comprising each CNN 40L, 40R will be provided below; for the moment it is sufficient to note that the output layer 42L, 42R of each CNN 40L, 40R comprises a feature vector comprising a plurality of values, typically of the order of between 128 and 256 floating point values. Nonetheless, it will be appreciated that in alternative implementations, the feature vector may comprise fewer or more than these exemplary values. Note that the format of these values can be as disclosed in PCT Application WO 2017/129325 (Ref: FN-481-PCT) and US Application No. 15/955,426 (Ref: FN-626-US).
While each of the left and right feature vectors 42L, 42R can be fed to separate respective networks 44L, 44R, each for providing a measure of eyelid opening, the feature vectors are concatenated into a single feature vector 42 which is provided as an input layer to a gaze angle network 46. It will be appreciated that other mechanisms for combining the feature vectors 42L, 42R before or as they are fed to the gaze angle network 46 may be employed, for example, they could be supplied as separate input maps to an input layer of the network 46. In this regard, it should also be appreciated that the feature vectors 42L, 42R are not confined to comprising 1xM values and instead could comprise feature maps with AxB=M values.
The output layer of each network 44L, 44R comprises an integer corresponding to a number of pixels indicating a level of opening of a given eye.
The output layer of network 46 comprises a pair of numbers indicating a gaze horizontal angle (yaw) and gaze vertical angle (pitch) for the pair of eye regions 27L, 27R.
Referring now to Figure 5, each of the CNNs 40L, 40R can comprise a number of convolutional layers (Conv) interleaved with pooling layers (Pool). In one example, each convolutional layer includes an activation function, for example the ReLU function, as described in US Application No. 15/955,426 (Ref: FN-626-US); however, it will be appreciated that other activation functions such as PReLU could also be employed. The pooling layers can, for example, comprise any of average or max pooling or alternatively functions such as peak as described in US Application No. 15/955,426 (Ref: FN-626-US). As an alternative or in addition to pooling layers, convolution layers with strides (steps) greater than one can be employed. A final fully connected layer (FC), again including a ReLU activation function, produces the output feature vectors 42L, 42R.
Each of the networks 44L, 44R and 46 need only comprise an input fully connected layer (whose nodes correspond with the values of the input feature vectors 42L, 42R and 42), again including a ReLU activation function and either a 2 node (in the case of network 46) or single node (in the case of networks 44L, 44R) output layer, again comprising an activation function such as ReLU.
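The wiring of the three head networks can be sketched in plain Python as below. This is only a sketch under stated assumptions: the feature vectors are random stand-ins for the CNN outputs 42L, 42R, the weights are random and untrained, and the hidden width H of the input fully connected layer is hypothetical. The helper names (relu, dense, make_layer, head) are likewise illustrative. It shows the data flow only: each eyelid network receives one feature vector, while the gaze network receives the concatenation of both.

```python
import random
from typing import List


def relu(v: List[float]) -> List[float]:
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    """Fully connected layer: one weight row per output node."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def make_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases (placeholders only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def head(v, hidden, out):
    """Input FC layer with ReLU followed by an output layer with ReLU."""
    return relu(dense(relu(dense(v, *hidden)), *out))


rng = random.Random(0)
M, H = 128, 16  # M from the text's typical range; H is a hypothetical hidden width
feat_l = [rng.random() for _ in range(M)]  # stands in for feature vector 42L
feat_r = [rng.random() for _ in range(M)]  # stands in for feature vector 42R

eyelid_l = head(feat_l, make_layer(M, H, rng), make_layer(H, 1, rng))  # network 44L
eyelid_r = head(feat_r, make_layer(M, H, rng), make_layer(H, 1, rng))  # network 44R
feat = feat_l + feat_r                                                 # concatenated vector 42
gaze = head(feat, make_layer(2 * M, H, rng), make_layer(H, 2, rng))    # network 46: yaw, pitch
```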
Note that the values produced by the gaze network 46 need to be mapped back through the transformation based on the landmarks 19 and indeed any original rotation and/or scaling of the original ROI within the acquired image 13 to provide a meaningful gaze location in the coordinate space of the apparatus 10. This can be done either mathematically or using appropriate look-up tables.
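One part of such a mapping, undoing an in-plane rotation applied to the original ROI, might be done mathematically as sketched below. This is an illustration under assumptions not stated in the description: gaze is treated as a unit vector with x and y in the image plane and z along the optical axis, and the rotation is a pure roll about z by a known angle. The function name undo_roll is hypothetical, and the sketch does not cover the pose normalisation or scaling.

```python
import math


def undo_roll(yaw, pitch, roll):
    """Rotate a gaze direction by -roll about the optical (z) axis. Angles in radians."""
    # Gaze angles to a unit vector (axis convention is an assumption).
    x = math.sin(yaw) * math.cos(pitch)
    y = math.sin(pitch)
    z = math.cos(yaw) * math.cos(pitch)
    # Apply the inverse in-plane rotation to the x and y components.
    c, s = math.cos(-roll), math.sin(-roll)
    x, y = c * x - s * y, s * x + c * y
    # Back to yaw and pitch; clamp guards against rounding just outside [-1, 1].
    return math.atan2(x, z), math.asin(max(-1.0, min(1.0, y)))


same_yaw, same_pitch = undo_roll(0.3, -0.2, 0.0)                   # no rotation to undo
new_yaw, new_pitch = undo_roll(math.radians(10), 0.0, math.radians(90))
```

In the second call, a purely horizontal gaze of 10 degrees in a crop that was rolled by 90 degrees becomes a purely vertical gaze of the same magnitude under this convention.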
It will be appreciated that with an appropriate common training set comprising labelled images including face regions with eyes at a variety of gaze angles and opening levels, the network illustrated in Figure 5 can be trained jointly so that each network can boost the other.
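One common way to train such branches jointly is to sum the per-task errors into a single loss, so that gradients from both the eyelid and gaze outputs reach the shared CNNs 40L, 40R. The description does not specify the loss; the squared-error form, the weights w_eyelid and w_gaze, and the function name joint_loss below are assumptions made purely for illustration.

```python
def joint_loss(pred, target, w_eyelid=1.0, w_gaze=1.0):
    """pred, target: dicts with 'eyelid_l', 'eyelid_r' (numbers) and 'gaze' ((yaw, pitch))."""
    eyelid = ((pred["eyelid_l"] - target["eyelid_l"]) ** 2
              + (pred["eyelid_r"] - target["eyelid_r"]) ** 2)
    gaze = sum((p - t) ** 2 for p, t in zip(pred["gaze"], target["gaze"]))
    # A single scalar: minimising it updates all branches in the same training process.
    return w_eyelid * eyelid + w_gaze * gaze


pred = {"eyelid_l": 5, "eyelid_r": 6, "gaze": (1.0, 2.0)}
target = {"eyelid_l": 4, "eyelid_r": 6, "gaze": (1.0, 0.0)}
loss = joint_loss(pred, target)
```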
The networks 44L, 44R and 46 can substantially simultaneously provide eyelid opening and gaze values for any region of interest 17 detected within an image and especially when implemented on a multi-processor core such as disclosed in US application No. 62/592,665 (Ref: FN-618-US), results can readily be provided in real-time.
Nonetheless, it should be appreciated that it is not necessary to execute the gaze network 46 at the same frequency as the eyelid networks 44L, 44R and one may update more frequently than the other as required.
Variations of the above described embodiment are possible, so for example, it is not desirable to have either the processor 18 or 26 execute the gaze network 46 on images which do not contain eyes.
Thus, in variants of the described embodiments, a determination is made of the probability that a given image patch 27L, 27R is an eye patch. While any number of conventional approaches to doing so can be employed, in one variant, each of the left and right eyelid networks 40-44L, 40-44R is extended to provide an additional "eyeness" output indicative of the probabilities for the left and right candidate eye patches 27L, 27R including an eye. The branches of the networks 40-44L, 40-44R producing this "eyeness" output can be trained at the same time as the remainder of the eyelid-opening networks 40-44L, 40-44R and the gaze network 46.
Now the execution of the gaze network 46 can be made conditional on the probabilities for the left and right candidate eye patches 27L, 27R including an eye. So, for example, if both left and right candidate eye patches 27L, 27R are non-eyes (possibly because of an anomalous output from the landmark detector 22), the gaze network 46 need not be executed.
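The conditional execution described above can be sketched as a simple gate on the two "eyeness" probabilities. The threshold of 0.5 and the function name should_run_gaze are assumptions; the description only states that the gaze network need not run when both candidate patches are non-eyes.

```python
def should_run_gaze(p_left, p_right, threshold=0.5):
    """Skip the gaze network only when both candidate patches look like non-eyes."""
    return p_left >= threshold or p_right >= threshold


run_one_eye = should_run_gaze(0.9, 0.1)   # at least one patch is likely an eye
run_no_eyes = should_run_gaze(0.2, 0.1)   # both patches are likely non-eyes
```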

Claims

1. A neural network image processing apparatus arranged to acquire one or more images from an image sensor, the apparatus being configured to: identify at least one region of interest containing a face region in at least one of said one or more images; determine a plurality of facial landmarks in a face region within a region of interest; use said plurality of said facial landmarks to transform said face region within said region of interest into a face region having a given pose; use transformed landmarks within said transformed face region to identify a pair of eye regions within said transformed face region; feed each identified eye region of said pair of eye regions to a respective first and second convolutional neural network, each network configured to produce a respective feature vector comprising a plurality of numerical values; feed each feature vector to respective eyelid opening level neural networks to obtain respective measures of eyelid opening for each eye region; combine the feature vectors; and feed the concatenated vector to a gaze angle neural network to generate gaze yaw and pitch values substantially simultaneously with the eyelid opening values, wherein said first and second eyelid opening and gaze angle neural networks are jointly trained based on a common training set.
2. The apparatus according to claim 1 further comprising a multi-processor neural network core which substantially simultaneously processes layers from said first and second convolutional networks and which substantially simultaneously processes layers from said first and second eyelid opening and gaze angle neural networks.
3. The apparatus according to claim 1 wherein each of said feature vectors comprises a plurality of floating point numbers.
4. The apparatus according to claim 3 wherein each of said feature vectors comprises at least 128 32-bit floating point numbers.
5. The apparatus according to claim 1 wherein said given pose is a front facing pose.
6. The apparatus according to claim 1 wherein said pair of eye regions include an area including any eye related or eyebrow related landmarks.
7. The apparatus according to claim 1 wherein said pair of eye regions are rectangular.
8. The apparatus according to claim 7 wherein said pair of eye regions have a fixed aspect ratio.
9. The apparatus according to claim 1 wherein each of said first and second convolutional neural networks comprises respective combinations of convolutional and pooling layers.
10. The apparatus according to claim 9 wherein each convolution layer includes an activation function.
11. The apparatus according to claim 1 wherein said gaze angle network comprises only fully connected layers.
12. The apparatus according to claim 1 wherein said eyelid opening network comprises only fully connected layers.
13. The apparatus according to claim 1 wherein said landmarks delineate any combination of a face region: jaw, mouth, nose, eyes or eyebrows.
14. The apparatus according to claim 1 wherein said landmarks do not delineate any feature on an eye.
15. The apparatus according to claim 1 wherein said apparatus is configured to combine said feature vectors by concatenating the feature vectors.
16. The apparatus according to claim 1 wherein said feature vectors comprise either 1xM or AxB=M values.
PCT/EP2019/060596 2018-06-11 2019-04-25 Neural network image processing apparatus WO2019145578A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP19721236.8A EP3539054B1 (en) 2018-06-11 2019-04-25 Neural network image processing apparatus
US16/780,775 US11314324B2 (en) 2018-06-11 2020-02-03 Neural network image processing apparatus
US17/677,320 US11699293B2 (en) 2018-06-11 2022-02-22 Neural network image processing apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/005,610 2018-06-11
US16/005,610 US10684681B2 (en) 2018-06-11 2018-06-11 Neural network image processing apparatus

Related Child Applications (1)

Application Number Priority Date Filing Date Title
US16/780,775 Continuation-In-Part US11314324B2 (en) 2018-06-11 2020-02-03 Neural network image processing apparatus

Publications (1)

Publication Number Publication Date
WO2019145578A1 (en) 2019-08-01

Family

ID=66379889

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2019/060596 WO2019145578A1 (en) 2018-06-11 2019-04-25 Neural network image processing apparatus

Country Status (3)

Country Link
US (3) US10684681B2 (en)
EP (1) EP3539054B1 (en)
WO (1) WO2019145578A1 (en)

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259713A (en) * 2019-09-16 2020-06-09 浙江工业大学 Sight tracking method based on self-adaptive weighting
WO2021092541A1 (en) * 2019-11-07 2021-05-14 Facebook Technologies, Llc Image sensing and processing using a neural network to track regions of interest
US11164019B1 (en) 2020-06-17 2021-11-02 Fotonation Limited Object detection for event cameras
WO2021254735A1 (en) 2020-06-17 2021-12-23 Fotonation Limited Method and system to determine the location and/or orientation of a head
WO2021254673A2 (en) 2020-06-17 2021-12-23 Fotonation Limited Object detection for event cameras
CN113850210A (en) * 2021-09-29 2021-12-28 支付宝(杭州)信息技术有限公司 Face image processing method and device and electronic equipment
WO2022053192A1 (en) 2020-09-09 2022-03-17 Fotonation Limited Producing an image frame using data from an event camera
US11301702B2 (en) 2020-06-17 2022-04-12 Fotonation Limited Object detection for event cameras
WO2022111909A1 (en) 2021-01-13 2022-06-02 Fotonation Limited An image processing system
CN114898447A (en) * 2022-07-13 2022-08-12 北京科技大学 Personalized fixation point detection method and device based on self-attention mechanism
US11463636B2 (en) 2018-06-27 2022-10-04 Facebook Technologies, Llc Pixel sensor having multiple photodiodes
US11595598B2 (en) 2018-06-28 2023-02-28 Meta Platforms Technologies, Llc Global shutter image sensor
US11595602B2 (en) 2018-11-05 2023-02-28 Meta Platforms Technologies, Llc Image sensor post processing
US11825228B2 (en) 2020-05-20 2023-11-21 Meta Platforms Technologies, Llc Programmable pixel array having multiple power domains
US11877080B2 (en) 2019-03-26 2024-01-16 Meta Platforms Technologies, Llc Pixel sensor having shared readout structure
US11888002B2 (en) 2018-12-17 2024-01-30 Meta Platforms Technologies, Llc Dynamically programmable image sensor
US11902685B1 (en) 2020-04-28 2024-02-13 Meta Platforms Technologies, Llc Pixel sensor having hierarchical memory
US11910119B2 (en) 2017-06-26 2024-02-20 Meta Platforms Technologies, Llc Digital pixel with extended dynamic range
US11910114B2 (en) 2020-07-17 2024-02-20 Meta Platforms Technologies, Llc Multi-mode image sensor
US11906353B2 (en) 2018-06-11 2024-02-20 Meta Platforms Technologies, Llc Digital pixel with extended dynamic range
US11927475B2 (en) 2017-08-17 2024-03-12 Meta Platforms Technologies, Llc Detecting high intensity light in photo sensor
US11936998B1 (en) 2019-10-17 2024-03-19 Meta Platforms Technologies, Llc Digital pixel sensor having extended dynamic range
US11935291B2 (en) 2019-10-30 2024-03-19 Meta Platforms Technologies, Llc Distributed sensor system
US11935575B1 (en) 2020-12-23 2024-03-19 Meta Platforms Technologies, Llc Heterogeneous memory system
US11943561B2 (en) 2019-06-13 2024-03-26 Meta Platforms Technologies, Llc Non-linear quantization at pixel sensor
US11956560B2 (en) 2020-10-09 2024-04-09 Meta Platforms Technologies, Llc Digital pixel sensor having reduced quantization operation
US11956413B2 (en) 2018-08-27 2024-04-09 Meta Platforms Technologies, Llc Pixel sensor having multiple photodiodes and shared comparator
US11962928B2 (en) 2018-12-17 2024-04-16 Meta Platforms Technologies, Llc Programmable pixel array
US11974044B2 (en) 2018-08-20 2024-04-30 Meta Platforms Technologies, Llc Pixel sensor having adaptive exposure time
US12022218B2 (en) 2020-12-29 2024-06-25 Meta Platforms Technologies, Llc Digital image sensor using a single-input comparator based quantizer
US12034015B2 (en) 2018-05-25 2024-07-09 Meta Platforms Technologies, Llc Programmable pixel array
US12075175B1 (en) 2020-09-08 2024-08-27 Meta Platforms Technologies, Llc Programmable smart sensor with adaptive readout
US12108141B2 (en) 2019-08-05 2024-10-01 Meta Platforms Technologies, Llc Dynamically programmable image sensor

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10684681B2 (en) * 2018-06-11 2020-06-16 Fotonation Limited Neural network image processing apparatus
CN110598601A (en) * 2019-08-30 2019-12-20 电子科技大学 Face 3D key point detection method and system based on distributed thermodynamic diagram
CN111091539B (en) * 2019-12-09 2024-03-26 上海联影智能医疗科技有限公司 Network model training and medical image processing methods, devices, mediums and equipment
CN111476151B (en) * 2020-04-03 2023-02-03 广州市百果园信息技术有限公司 Eyeball detection method, device, equipment and storage medium
US11776319B2 (en) * 2020-07-14 2023-10-03 Fotonation Limited Methods and systems to predict activity in a sequence of images
US11659193B2 (en) 2021-01-06 2023-05-23 Tencent America LLC Framework for video conferencing based on face restoration
US11854579B2 (en) 2021-06-03 2023-12-26 Spree3D Corporation Video reenactment taking into account temporal information
US11836905B2 (en) * 2021-06-03 2023-12-05 Spree3D Corporation Image reenactment with illumination disentanglement
WO2022264269A1 (en) * 2021-06-15 2022-12-22 日本電信電話株式会社 Training device, estimation device, methods therefor, and program
TWI779815B (en) * 2021-09-03 2022-10-01 瑞昱半導體股份有限公司 Face recognition network model with face alignment based on knowledge distillation
US11722329B2 (en) 2021-10-06 2023-08-08 Fotonation Limited Gaze repositioning during a video conference
WO2024108250A1 (en) * 2022-11-21 2024-05-30 Z Ware Development Pty Ltd Animal identification
US20240223813A1 (en) * 2023-01-01 2024-07-04 Alibaba (China) Co., Ltd. Method and apparatuses for using face video generative compression sei message
CN116974370B (en) * 2023-07-18 2024-04-16 深圳市本顿科技有限公司 Anti-addiction child learning tablet computer control method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020102024A1 (en) 2000-11-29 2002-08-01 Compaq Information Technologies Group, L.P. Method and system for object detection in digital images
EP2743117A1 (en) * 2012-12-17 2014-06-18 State Farm Mutual Automobile Insurance Company System and method to monitor and reduce vehicle operator impairment
EP3101622A2 (en) 2015-06-02 2016-12-07 FotoNation Limited An image acquisition system
WO2017032468A1 (en) 2015-08-26 2017-03-02 Fotonation Limited Image processing apparatus
WO2017108222A1 (en) 2015-12-23 2017-06-29 Fotonation Limited Image processing system
WO2017129325A1 (en) 2016-01-29 2017-08-03 Fotonation Limited A convolutional neural network

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6128398A (en) * 1995-01-31 2000-10-03 Miros Inc. System, method and application for the recognition, verification and similarity ranking of facial or other object patterns
JP2003315875A (en) * 2002-04-26 2003-11-06 Konica Minolta Holdings Inc Apparatus and method for photography
US8682097B2 (en) * 2006-02-14 2014-03-25 DigitalOptics Corporation Europe Limited Digital image enhancement with reference images
US8488023B2 (en) * 2009-05-20 2013-07-16 DigitalOptics Corporation Europe Limited Identifying facial expressions in acquired digital images
US7610250B2 (en) * 2006-09-27 2009-10-27 Delphi Technologies, Inc. Real-time method of determining eye closure state using off-line adaboost-over-genetic programming
JP5055166B2 (en) * 2008-02-29 2012-10-24 キヤノン株式会社 Eye open / closed degree determination device, method and program, and imaging device
JP5208711B2 (en) * 2008-12-17 2013-06-12 アイシン精機株式会社 Eye open / close discrimination device and program
CN106961621A (en) * 2011-12-29 2017-07-18 英特尔公司 Use the communication of incarnation
US8952819B2 (en) * 2013-01-31 2015-02-10 Lytx, Inc. Direct observation event triggering of drowsiness
US9100540B1 (en) * 2013-03-14 2015-08-04 Ca, Inc. Multi-person video conference with focus detection
US9549118B2 (en) * 2014-03-10 2017-01-17 Qualcomm Incorporated Blink and averted gaze avoidance in photographic images
US10682038B1 (en) * 2014-09-19 2020-06-16 Colorado School Of Mines Autonomous robotic laparoscope based on eye tracking
US10048749B2 (en) * 2015-01-09 2018-08-14 Microsoft Technology Licensing, Llc Gaze detection offset for gaze tracking models
US10521683B2 (en) * 2015-02-20 2019-12-31 Seeing Machines Limited Glare reduction
US10489043B2 (en) * 2015-12-15 2019-11-26 International Business Machines Corporation Cognitive graphical control element
WO2017177188A1 (en) * 2016-04-08 2017-10-12 Vizzario, Inc. Methods and systems for obtaining, aggregating, and analyzing vision data to assess a person's vision performance
US10127680B2 (en) * 2016-06-28 2018-11-13 Google Llc Eye gaze tracking using neural networks
US10447972B2 (en) * 2016-07-28 2019-10-15 Chigru Innovations (OPC) Private Limited Infant monitoring system
US10467488B2 (en) * 2016-11-21 2019-11-05 TeleLingo Method to analyze attention margin and to prevent inattentive and unsafe driving
US20190370580A1 (en) * 2017-03-14 2019-12-05 Omron Corporation Driver monitoring apparatus, driver monitoring method, learning apparatus, and learning method
US11042729B2 (en) * 2017-05-01 2021-06-22 Google Llc Classifying facial expressions using eye-tracking cameras
JP6946831B2 (en) * 2017-08-01 2021-10-13 オムロン株式会社 Information processing device and estimation method for estimating the line-of-sight direction of a person, and learning device and learning method
CN113128449A (en) * 2017-08-09 2021-07-16 北京市商汤科技开发有限公司 Neural network training method and device for face image processing, and face image processing method and device
US20190246036A1 (en) * 2018-02-02 2019-08-08 Futurewei Technologies, Inc. Gesture- and gaze-based visual data acquisition system
US10664999B2 (en) * 2018-02-15 2020-05-26 Adobe Inc. Saliency prediction for a mobile user interface
US10713813B2 (en) * 2018-02-22 2020-07-14 Innodem Neurosciences Eye tracking method and system
US10706577B2 (en) * 2018-03-06 2020-07-07 Fotonation Limited Facial features tracker with advanced training for natural rendering of human faces in real-time
TWI666941B (en) * 2018-03-27 2019-07-21 緯創資通股份有限公司 Multi-level state detecting system and method
US10534982B2 (en) * 2018-03-30 2020-01-14 Tobii Ab Neural network training for three dimensional (3D) gaze prediction with calibration parameters
US10915769B2 (en) * 2018-06-04 2021-02-09 Shanghai Sensetime Intelligent Technology Co., Ltd Driving management methods and systems, vehicle-mounted intelligent systems, electronic devices, and medium
US10684681B2 (en) * 2018-06-11 2020-06-16 Fotonation Limited Neural network image processing apparatus
CN113785258A (en) * 2019-03-22 2021-12-10 惠普发展公司,有限责任合伙企业 Detecting eye measurements
KR20190084912A (en) * 2019-06-28 2019-07-17 엘지전자 주식회사 Artificial intelligence device that can be controlled according to user action

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
"MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017
AMER AL-RAHAYFEH ET AL: "Eye Tracking and Head Movement Detection: A State-of-Art Survey", IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE, vol. 1, 2013, pages 1 - 12, XP055590704, DOI: 10.1109/JTEHM.2013.2289879 *
JOSEPH LEMLEY ET AL: "Efficient CNN Implementation for Eye-Gaze Estimation on Low-Power/Low-Quality Consumer Imaging Systems", 28 June 2018 (2018-06-28), XP055590709, Retrieved from the Internet <URL:https://arxiv.org/pdf/1806.10890.pdf> [retrieved on 20190521] *
M. KIM; O. WANG; N. NG: "Convolutional Neural Network Architectures for Gaze Estimation on Mobile Devices", STANFORD REPORTS, 2017, Retrieved from the Internet <URL:http://cs231n.stanford.edu/reports/2017/pdfs/229.pdf>
RIZWAN ALI NAQVI; MUHAMMAD ARSALAN; GANBAYAR BATCHULUUN; HYO SIK YOON; KANG RYOUNG PARK: "Deep Learning-Based Gaze Detection System for Automobile Drivers Using a NIR Camera Sensor", SENSORS, vol. 18, 2018, pages 456
S. BALUJA; D. POMERLEAU: "Non-intrusive gaze tracking using artificial neural networks", TECH. REP., 1994
SUGANO YUSUKE ET AL: "Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation", 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, IEEE, 23 June 2014 (2014-06-23), pages 1821 - 1828, XP032649336, DOI: 10.1109/CVPR.2014.235 *
X. ZHANG; Y. SUGANO; M. FRITZ; A. BULLING: "Appearance-based gaze estimation in the wild", IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), June 2015 (2015-06-01), pages 4511 - 4520, XP032793907, DOI: 10.1109/CVPR.2015.7299081
XUCONG ZHANG ET AL: "MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 24 November 2017 (2017-11-24), pages 1 - 14, XP055590368, Retrieved from the Internet <URL:https://arxiv.org/pdf/1711.09017.pdf> [retrieved on 20190520] *

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11910119B2 (en) 2017-06-26 2024-02-20 Meta Platforms Technologies, Llc Digital pixel with extended dynamic range
US11927475B2 (en) 2017-08-17 2024-03-12 Meta Platforms Technologies, Llc Detecting high intensity light in photo sensor
US12034015B2 (en) 2018-05-25 2024-07-09 Meta Platforms Technologies, Llc Programmable pixel array
US11906353B2 (en) 2018-06-11 2024-02-20 Meta Platforms Technologies, Llc Digital pixel with extended dynamic range
US11863886B2 (en) 2018-06-27 2024-01-02 Meta Platforms Technologies, Llc Pixel sensor having multiple photodiodes
US11463636B2 (en) 2018-06-27 2022-10-04 Facebook Technologies, Llc Pixel sensor having multiple photodiodes
US11595598B2 (en) 2018-06-28 2023-02-28 Meta Platforms Technologies, Llc Global shutter image sensor
US11974044B2 (en) 2018-08-20 2024-04-30 Meta Platforms Technologies, Llc Pixel sensor having adaptive exposure time
US11956413B2 (en) 2018-08-27 2024-04-09 Meta Platforms Technologies, Llc Pixel sensor having multiple photodiodes and shared comparator
US11595602B2 (en) 2018-11-05 2023-02-28 Meta Platforms Technologies, Llc Image sensor post processing
US11962928B2 (en) 2018-12-17 2024-04-16 Meta Platforms Technologies, Llc Programmable pixel array
US11888002B2 (en) 2018-12-17 2024-01-30 Meta Platforms Technologies, Llc Dynamically programmable image sensor
US11877080B2 (en) 2019-03-26 2024-01-16 Meta Platforms Technologies, Llc Pixel sensor having shared readout structure
US11943561B2 (en) 2019-06-13 2024-03-26 Meta Platforms Technologies, Llc Non-linear quantization at pixel sensor
US12108141B2 (en) 2019-08-05 2024-10-01 Meta Platforms Technologies, Llc Dynamically programmable image sensor
CN111259713A (en) * 2019-09-16 2020-06-09 浙江工业大学 Sight tracking method based on self-adaptive weighting
CN111259713B (en) * 2019-09-16 2023-07-21 浙江工业大学 Sight tracking method based on self-adaptive weighting
US11936998B1 (en) 2019-10-17 2024-03-19 Meta Platforms Technologies, Llc Digital pixel sensor having extended dynamic range
US11935291B2 (en) 2019-10-30 2024-03-19 Meta Platforms Technologies, Llc Distributed sensor system
US11960638B2 (en) 2019-10-30 2024-04-16 Meta Platforms Technologies, Llc Distributed sensor system
US11948089B2 (en) 2019-11-07 2024-04-02 Meta Platforms Technologies, Llc Sparse image sensing and processing
WO2021092541A1 (en) * 2019-11-07 2021-05-14 Facebook Technologies, Llc Image sensing and processing using a neural network to track regions of interest
US11902685B1 (en) 2020-04-28 2024-02-13 Meta Platforms Technologies, Llc Pixel sensor having hierarchical memory
US11825228B2 (en) 2020-05-20 2023-11-21 Meta Platforms Technologies, Llc Programmable pixel array having multiple power domains
US11423567B2 (en) 2020-06-17 2022-08-23 Fotonation Limited Method and system to determine the location and/or orientation of a head
WO2021254673A2 (en) 2020-06-17 2021-12-23 Fotonation Limited Object detection for event cameras
US11164019B1 (en) 2020-06-17 2021-11-02 Fotonation Limited Object detection for event cameras
WO2021254735A1 (en) 2020-06-17 2021-12-23 Fotonation Limited Method and system to determine the location and/or orientation of a head
US11749004B2 (en) 2020-06-17 2023-09-05 Fotonation Limited Event detector and method of generating textural image based on event count decay factor and net polarity
US11301702B2 (en) 2020-06-17 2022-04-12 Fotonation Limited Object detection for event cameras
US11270137B2 (en) 2020-06-17 2022-03-08 Fotonation Limited Event detector and method of generating textural image based on event count decay factor and net polarity
US11910114B2 (en) 2020-07-17 2024-02-20 Meta Platforms Technologies, Llc Multi-mode image sensor
US12075175B1 (en) 2020-09-08 2024-08-27 Meta Platforms Technologies, Llc Programmable smart sensor with adaptive readout
US11405580B2 (en) 2020-09-09 2022-08-02 Fotonation Limited Event camera hardware
WO2022053192A1 (en) 2020-09-09 2022-03-17 Fotonation Limited Producing an image frame using data from an event camera
US11956560B2 (en) 2020-10-09 2024-04-09 Meta Platforms Technologies, Llc Digital pixel sensor having reduced quantization operation
US11935575B1 (en) 2020-12-23 2024-03-19 Meta Platforms Technologies, Llc Heterogeneous memory system
US12022218B2 (en) 2020-12-29 2024-06-25 Meta Platforms Technologies, Llc Digital image sensor using a single-input comparator based quantizer
WO2022111909A1 (en) 2021-01-13 2022-06-02 Fotonation Limited An image processing system
US11768919B2 (en) 2021-01-13 2023-09-26 Fotonation Limited Image processing system
CN113850210A (en) * 2021-09-29 2021-12-28 支付宝(杭州)信息技术有限公司 Face image processing method and device and electronic equipment
CN113850210B (en) * 2021-09-29 2024-05-17 支付宝(杭州)信息技术有限公司 Face image processing method and device and electronic equipment
CN114898447A (en) * 2022-07-13 2022-08-12 北京科技大学 Personalized fixation point detection method and device based on self-attention mechanism
CN114898447B (en) * 2022-07-13 2022-10-11 北京科技大学 Personalized fixation point detection method and device based on self-attention mechanism

Also Published As

Publication number Publication date
US11314324B2 (en) 2022-04-26
EP3539054B1 (en) 2020-07-01
US20190377409A1 (en) 2019-12-12
US11699293B2 (en) 2023-07-11
EP3539054A1 (en) 2019-09-18
US20220171458A1 (en) 2022-06-02
US10684681B2 (en) 2020-06-16
US20200342212A1 (en) 2020-10-29

Similar Documents

Publication Publication Date Title
EP3539054B1 (en) Neural network image processing apparatus
CN104200192B (en) Driver's gaze detection system
US7912253B2 (en) Object recognition method and apparatus therefor
US7950802B2 (en) Method and circuit arrangement for recognising and tracking eyes of several observers in real time
US8055016B2 (en) Apparatus and method for normalizing face image used for detecting drowsy driving
EP3893090B1 (en) Method for eye gaze tracking
JP2019527448A (en) Method and system for monitoring the status of a vehicle driver
US20070189584A1 (en) Specific expression face detection method, and imaging control method, apparatus and program
EP2590111A2 (en) Face recognition apparatus and method for controlling the same
US20100074529A1 (en) Image recognition apparatus
JP6351243B2 (en) Image processing apparatus and image processing method
EP3506149A1 (en) Method, system and computer program product for eye gaze direction estimation
WO2018078857A1 (en) Line-of-sight estimation device, line-of-sight estimation method, and program recording medium
CN110647782A (en) Three-dimensional face reconstruction and multi-pose face recognition method and device
US11887331B2 (en) Information processing apparatus, control method, and non-transitory storage medium
US8630483B2 (en) Complex-object detection using a cascade of classifiers
CN112541394A (en) Black eye and rhinitis identification method, system and computer medium
CN112115790A (en) Face recognition method and device, readable storage medium and electronic equipment
JP7107380B2 (en) Estimation device, estimation method, and program
CN115171189A (en) Fatigue detection method, device, equipment and storage medium
KR102074977B1 (en) Electronic devices and methods thereof
CN112183215A (en) Human eye positioning method and system combining multi-feature cascade SVM and human eye template
KR101706674B1 (en) Method and computing device for gender recognition based on long distance visible light image and thermal image
US20230177861A1 (en) Apparatus, method, and computer program for detecting hand region
CN117351557A (en) Vehicle-mounted gesture recognition method for deep learning

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2019721236

Country of ref document: EP

Effective date: 20190612

NENP Non-entry into the national phase

Ref country code: DE