WO2015122789A1 - Facial recognition and user authentication method - Google Patents

Facial recognition and user authentication method

Info

Publication number
WO2015122789A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
facial
processors
user
feature vector
Prior art date
Application number
PCT/RU2014/000089
Other languages
French (fr)
Inventor
Aleksander Sergeevich SHUSHARIN
Konstantin Vasilevich CHERENKOV
Andrey Vladimirovich VALIK
Original Assignee
3Divi Company
Priority date
Filing date
Publication date
Application filed by 3Divi Company filed Critical 3Divi Company
Priority to PCT/RU2014/000089 priority Critical patent/WO2015122789A1/en
Publication of WO2015122789A1 publication Critical patent/WO2015122789A1/en

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/117 Identification of persons
    • A61B5/1171 Identification of persons based on the shapes or appearances of their bodies or parts thereof
    • A61B5/1176 Recognition of faces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G06V20/647 Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/755 Deformable models or variational models, e.g. snakes or active contours
    • G06V10/7553 Deformable models or variational models, e.g. snakes or active contours based on shape, e.g. active shape models [ASM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures
    • G06V30/19013 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V30/1902 Shifting or otherwise transforming the patterns to accommodate for positional errors
    • G06V30/1904 Shifting or otherwise transforming the patterns to accommodate for positional errors involving a deformation of the sample or reference pattern; Elastic matching
    • G06V30/19053 Shifting or otherwise transforming the patterns to accommodate for positional errors involving a deformation of the sample or reference pattern; Elastic matching based on shape statistics, e.g. active shape models of the pattern to be recognised
    • G06V30/1906 Shifting or otherwise transforming the patterns to accommodate for positional errors involving a deformation of the sample or reference pattern; Elastic matching based on shape statistics, e.g. active shape models of the pattern to be recognised based also on statistics of image patches, e.g. active appearance models of the pattern to be recognised

Definitions

  • This disclosure relates generally to object recognition and more particularly to facial recognition and user authentication technologies.
  • Biometrics refers to identification of people by their characteristics and traits.
  • Biometrics is widely used in computer-based user authentication and access control systems, computer vision systems, gaming platforms, and so forth.
  • For example, an individual may activate or otherwise gain access to functionalities or data controlled by a computing device by acquiring and verifying user biometric data.
  • Biometric identifiers are the distinctive and measurable characteristics which can be used to label and describe an individual.
  • Some examples of biometric identifiers may include a retina image, an iris image, an image or shape of individual's face, fingerprints, a handprint, a voice, keystroke dynamics, behavioral characteristics, and so forth.
  • Facial recognition methods or algorithms are typically implemented by retrieving features of samples (e.g. images of human faces), comparing the features with stored reference feature records, and determining whether they match.
  • The reliability of facial recognition methods greatly depends on the quality of the samples, in other words, the images of the individual's face to be identified, as well as the conditions under which the samples are taken.
  • One way to reduce vulnerability to these factors is to maintain a great number of samples for a given individual, where the samples are taken under different conditions as outlined above.
  • However, in most real-life scenarios, security systems are limited to one sample image per individual.
  • Various embodiments of the present disclosure generally provide for significantly improving the quality and reliability of facial recognition and of the corresponding user authentication methods based upon the improved facial recognition techniques, as well as for decreasing the FAR and FRR values.
  • The present technology involves intelligent creation of multiple facial angle shots of an individual based on his or her single facial image. These facial angle shots may optionally be used to train a machine-learning system handling facial recognition and/or user authentication processing.
  • the principles of the present technology can be integrated not only in facial recognition methods, but also in user authentication methods, which are based on facial recognition.
  • a method for user authentication and/or identification comprises a step of acquiring, by one or more processors, an input image associated with a user (the input image shows at least the user face). Further, the method includes determining, by one or more processors, a position of the user face, and locating, by the one or more processors, multiple facial landmarks associated with the user. The method further includes transforming, by the one or more processors and based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user face.
  • The transformation may include spatial rotation, scaling, cropping, color and/or brightness adjustments, or a combination thereof.
  • The method further includes retrieving, by the one or more processors, a feature vector based upon the uniform image of the user face, comparing the feature vector with at least one stored reference feature vector (using, for example, a machine-learning algorithm), and making an authentication and/or identification decision(s) with respect to the user based on the result of the comparison.
  • A method for registering users includes acquiring one registration image associated with the authorized user and determining a facial position of the authorized user based on the registration image.
  • the method includes locating, based on the position, multiple facial landmarks associated with the authorized user, transforming at least a portion of the one registration image into a texture image of the authorized user face (e.g., a full-face image of the authorized user). Further, the method includes generating multiple angle shots of the authorized user face based upon the texture image. The method also includes creating a biometric pattern associated with the multiple angle shots of the user face and storing the biometric pattern associated with the authorized user in a database.
  • the generation process of the angle shots comprises the steps of finding similarity between the texture image and one of a plurality of reference two-dimensional (2D) facial images, wherein each one of the plurality of reference 2D facial images is associated with a reference three-dimensional (3D) facial image. Further, based on the similarity, the method selects the 3D facial image related to the reference 2D image being the most similar to the texture image. The generation method further includes superposing the texture image of the authorized user with multiple points associated with the pre-selected 3D facial image. Further, the 3D facial image and the texture image can be rotated to generate images of the multiple angle shots based upon the rotated 3D facial image and the texture image.
  • a method for facial recognition comprises the steps of acquiring a facial image of an individual, determining a facial position based on the facial image, locating multiple facial landmarks associated with the individual based on the position, transforming at least a portion of the facial image into a uniform facial image based on the multiple facial landmarks (the uniform facial image represents a full-face image of the individual). Further the method comprises creating a feature vector based upon the uniform facial image, comparing the feature vector with at least one stored reference feature vector and, based on the comparison, determining the identity of the individual.
  • FIG. 1 is a high-level block diagram illustrating an example computing device suitable for implementing methods for facial recognition and user authentication as disclosed herein.
  • FIG. 2 shows a high-level process flow diagram of a method for user registration according to one exemplary embodiment.
  • FIG. 3 shows a high-level process flow diagram of a method for facial recognition according to one exemplary embodiment.
  • FIG. 4 shows a high-level process flow diagram of a method for user authentication according to one exemplary embodiment of the present disclosure.
  • FIG. 5 shows an exemplary face image and several facial landmarks, which can be used in methods for facial recognition and user authentication as described herein.
  • FIG. 6 shows an input image of a user for use in a method for facial recognition, according to one example embodiment, which image may also serve as a registration image in a method for user registration.
  • FIG. 7 shows an exemplary area of interest created by rotating and cropping of the input image shown in FIG. 6.
  • FIG. 8 shows an exemplary uniform image of a user face suitable for use in a method for facial recognition, which image may also be used as a texture image in a method for user registration.
  • FIG. 9 shows an exemplary three-dimensional image (depth map) corresponding to the selected image that is most similar to the texture image shown in FIG. 8.
  • FIG. 10 shows a graphical representation of the result of applying the texture image, as shown in FIG. 8, to a point cloud.
  • FIGs. 11A-11D illustrate exemplary angle shots based upon the registration image shown in FIG. 6.
  • FIG. 12 shows a graphical representation of forty Gabor filters suitable for implementing the methods described herein.
  • the techniques of the embodiments disclosed herein may be implemented using a variety of technologies.
  • the methods described herein may be implemented in software executing on a computer system or in hardware utilizing either a combination of microprocessors, controllers or other specially designed application-specific integrated circuits (ASICs), programmable logic devices, or various combinations thereof.
  • the methods described herein may be implemented by a series of computer-executable instructions residing on a storage medium such as a disk drive, solid-state drive or on a computer-readable medium.
  • facial recognition can be used by a computing device in various scenarios.
  • a computing device may use facial recognition to authenticate or authorize a user who attempts to gain access to one or more functionalities or data of the computing device or functionalities otherwise controlled by the computing device.
  • The computing device may store facial images of one or more pre-authorized users. These facial images are referred to herein as "registration images."
  • The computing device may capture an image of the user's face for authentication purposes. The computing device may then use facial recognition applications to compare the captured facial image to the registration images associated with authorized users.
  • If a match or an acceptable level of similarity is found, the computing device may authenticate the user and grant access to the requested functionalities or data.
  • this approach can be used in security systems to recognize people attempting to enter designated premises or land, crime detection systems, computer vision systems, gesture recognition and control systems, and so forth.
  • Facial recognition is not always accurate and may falsely accept or falsely reject a user when the similarity between the user's image and one or more registration images is determined incorrectly.
  • Unauthorized users may leverage vulnerabilities of facial recognition to cause erroneous authentication.
  • In contrast, an authorized user who should be authenticated may be denied access because the facial recognition system could not find similarity between the present user's image and a registration image captured some time ago.
  • Finding similarities between present user images and registration images greatly depends on the quality and number of registration images. In most scenarios, facial recognition systems maintain only one registration image per authorized user.
  • Accordingly, when lighting conditions and the camera position at recognition time differ significantly from those of the registration image, FAR and FRR values may be significantly high.
  • The present technology decreases FAR and FRR values and improves the quality and reliability of facial recognition systems by, among other things, artificially synthesizing facial foreshortening images with respect to an available registration image of an authorized user, as well as by the intelligent use of Gabor filters and linear discriminant analysis.
  • FIG. 1 is a high-level block diagram illustrating an example computing device 100 suitable for implementing the present technology.
  • the computing device 100 may be used for facial recognition, user authentication, user authorization, and/or user registration as described herein.
  • the computing device 100 may include, be, or be a part of one or more of a variety of types of devices, such as a general purpose computer, desktop computer, laptop computer, tablet computer, netbook, server, mobile phone, a smartphone, personal digital assistant, set-top box, television, door lock, watch, vehicle computer, electronic kiosk, automated teller machine, infotainment system, presence verification device, security device, surveillance device, among others.
  • the computing device 100 may be an integrated part of another multi-component system such as a video surveillance system, access control system, among others.
  • The computing device 100 includes one or more processors 102, memory 104, one or more storage devices 106, one or more input devices 108, one or more output devices 110, network interface 112, and an image sensor 114 (e.g. a camera or charge-coupled device (CCD)).
  • Processors 102 are, in some examples, configured to implement functionality and/or process instructions for execution within the computing device 100.
  • the processors 102 may process instructions stored in memory 104 and/or instructions stored on storage devices 106.
  • Such instructions may include components of an operating system 118, a facial recognition module 120 and/or a user authentication module 122.
  • Computing device 100 may also include one or more additional components not shown in FIG. 1, such as a power supply, a battery, a fan, a global positioning system (GPS) receiver, among others.
  • Memory 104 is configured to store information within the computing device 100 during operation.
  • Memory 104 may refer to a non-transitory computer- readable storage medium or a computer-readable storage device.
  • memory 104 is a temporary memory, meaning that a primary purpose of memory 104 may not be long-term storage.
  • Memory 104 may also refer to a volatile memory, meaning that memory 104 does not maintain stored contents when memory 104 is not receiving power. Examples of volatile memories include random access memories (RAM), dynamic random access memories (DRAM), static random access memories (SRAM), and other forms of volatile memories known in the art.
  • memory 104 is used to store program instructions for execution by the processors 102.
  • Memory 104, in one example, is used by software (e.g., operating system 118) or applications, such as facial recognition 120 and/or user authentication 122, executing on computing device 100 to temporarily store information during program execution.
  • One or more storage devices 106 can also include one or more computer-readable storage media and/or computer-readable storage devices. In some embodiments, storage devices 106 may be configured to store greater amounts of information than memory 104. Storage devices 106 may further be configured for long-term storage of information. In some examples, the storage devices 106 include non-volatile storage elements.
  • the computing device 100 may also include one or more input devices 108.
  • the input devices 108 may be configured to receive input from a user through tactile, audio, video, or biometric channels.
  • Examples of input devices 108 may include a keyboard, mouse, touchscreen, microphone, one or more video cameras, or any other device capable of detecting an input from a user or other source and relaying the input to computing device 100, or components thereof. Though shown separately in FIG. 1, the image sensor 114 may, in some instances, be a part of input devices 108. It should also be noted that the image sensor 114 (e.g., a digital still and/or video camera) may be a peripheral device operatively connected to the computing device 100 via the network interface 112.
  • the output devices 110 may be configured to provide output to a user through visual or auditory channels.
  • Output devices 110 may include a video graphics adapter card, a liquid crystal display (LCD) monitor, a light emitting diode (LED) monitor, a sound card, a speaker, or any other device capable of generating output that may be intelligible to a user.
  • Output devices 110 may also include a touchscreen, presence-sensitive display, or other input/output capable displays known in the art.
  • the computing device 100 also includes network interface 112.
  • the network interface 112 can be utilized to communicate with external devices via one or more networks such as one or more wired, wireless or optical networks including, for example, the Internet, intranet, local area network (LAN), wide area network (WAN), cellular phone networks, Bluetooth radio, an IEEE 802.11-based radio frequency network, among others.
  • the network interface 112 may be a network interface card, such as an Ethernet card, an optical transceiver, a radio frequency transceiver, or any other type of device that can send and receive information.
  • Other examples of such network interfaces may include Bluetooth®, 3G, 4G, and WiFi® radios in mobile computing devices, as well as USB.
  • the operating system 118 may control one or more functionalities of computing device 100 and/or components thereof.
  • the operating system 118 may interact with applications 124, including facial recognition 120 and user authentication 122, and may facilitate one or more interactions between applications 124 and one or more of processors 102, memory 104, storage devices 106, input devices 108, and output devices 110.
  • the operating system 118 may interact with or be otherwise coupled to facial recognition module 120, user authentication module 122, applications 124, and components thereof.
  • facial recognition module 120 and user authentication module 122 may be included in operating system 118.
  • In other embodiments, facial recognition module 120 and user authentication module 122 may be part of applications 124, or may be implemented externally to computing device 100, such as at a network location. In some such instances, computing device 100 may use the network interface 112 to access and implement functionalities provided by facial recognition module 120 and user authentication module 122 through methods commonly known as "cloud computing."
  • FIG. 2 shows a high-level process flow diagram of a method 200 for user registration (in other words, user enrolment) according to one exemplary embodiment.
  • the method 200 may be performed by processing logic that may comprise hardware (e.g., one or more processors, controllers, dedicated logic, programmable logic, and microcode), software (such as software run on a general-purpose computer system or a dedicated machine, firmware), or a combination of both.
  • The method 200 is implemented by the computing device 100 shown in FIG. 1; however, it should be appreciated that the method 200 is just one example operation of the computing device 100.
  • the method 200 commences at step 210 with the computing device 100 acquiring at least one registration image associated with an authorized user.
  • the image may be a 2D image of the authorized user captured by the image sensor 114.
  • the computing device 100 determines a facial position of the authorized user based on the at least one registration image. The determination of facial position may be needed for isolating (e.g. cropping) the authorized user face, which would simplify further processing steps.
  • the computing device 100 locates multiple facial landmarks associated with the authorized user.
  • the computing device 100 transforms at least a portion of the at least one registration image into a texture image of the authorized user face.
  • the computing device 100 generates multiple angle shots of the authorized user face based upon the texture image of the authorized user face.
  • the computing device 100 creates a biometric pattern associated with the multiple angle shots of the user face.
  • The biometric pattern includes a feature vector whose components are associated with the angle shots.
  • the biometric pattern can be utilized for training one or more machine-learning algorithms handling face recognition and/or user authentication based on facial recognition.
  • the computing device 100 stores the biometric pattern associated with the authorized user in the memory 104 and/or storage device 106.
  • The computing device 100 maintains user profiles, which comprise feature vectors and/or biometric patterns of authorized users. The user profiles can later be recalled when a new image is acquired to identify and/or authorize a user.
  • the feature vectors may be based on various parameters such as pixel data, distances between certain facial landmarks, facial angle shots, among other things.
  • FIG. 3 shows a high-level process flow diagram of a method 300 for facial recognition according to one exemplary embodiment of the present disclosure.
  • the method 300 may be performed by processing logic that may comprise hardware (e.g., one or more processors, controllers, dedicated logic, programmable logic, and microcode), software (such as software run on a general-purpose computer system or a dedicated machine, firmware), or a combination of both.
  • The method 300 is implemented by the computing device 100 shown in FIG. 1; however, it should be appreciated that the method 300 is just one example operation of the computing device 100.
  • The method 300 starts at step 310, when the computing device 100 acquires a facial image of an individual from the image sensor 114 or an analogous device.
  • the computing device 100 determines a facial position based on the facial image, which may be needed for cropping or isolating the individual's face.
  • the computing device 100 locates multiple facial landmarks associated with the individual based on the position determined.
  • the computing device 100 transforms, based on the multiple facial landmarks, at least a portion of the facial image into a uniform facial image.
  • The computing device 100 then creates a feature vector based upon the uniform facial image, compares it with at least one stored reference feature vector and, based on the comparison, determines the identity of the individual.
  • FIG. 4 shows a high- level process flow diagram of a method 400 for user authentication according to one exemplary embodiment of the present disclosure.
  • the method 400 may be performed by processing logic that may comprise hardware (e.g., one or more processors, controllers, dedicated logic, programmable logic, and microcode), software (such as software run on a general-purpose computer system or a dedicated machine, firmware), or a combination of both.
  • The method 400 is implemented by the computing device 100 shown in FIG. 1; however, it should be appreciated that the method 400 is just one example operation of the computing device 100.
  • the method 400 starts at step 410, when the computing device 100 acquires an input image associated with a user (whereas the input image shows at least the user face) from the image sensor 114 or similar device.
  • the computing device 100 optionally determines a position of the user face.
  • the computing device 100 locates multiple facial landmarks associated with the user.
  • the computing device 100 transforms, based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user face.
  • the computing device 100 retrieves a feature vector based upon the uniform image of the user face.
  • the computing device 100 compares the feature vector with a stored reference feature vector associated with the authorized user. The comparison process may be based on the use of one or more machine-learning algorithms, although it is not required.
  • Based on the comparison, the computing device 100 makes an authentication decision with respect to the user. The authentication decision can further be used in providing access for the authorized user to computing device functionalities, data, resources, or to dedicated premises or land.
  • The authentication decision can also be used for generating one or more control commands for the computing device 100, its elements, or any other suitable peripheral device. For example, a control command may "unlock" the computing device.
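  • By way of illustration only, the following is a minimal sketch of such an authentication decision, assuming the feature vectors have already been retrieved as described further below; the cosine similarity measure and the threshold value are illustrative assumptions rather than values prescribed by this disclosure.

```python
from typing import Optional

import numpy as np


def authenticate(feature_vector: np.ndarray,
                 reference_vectors: dict,
                 threshold: float = 0.85) -> Optional[str]:
    """Return the id of the matching authorized user, or None to deny access."""
    best_user, best_score = None, -1.0
    for user_id, reference in reference_vectors.items():
        # Cosine similarity between the probe vector and a stored reference vector.
        score = float(np.dot(feature_vector, reference) /
                      (np.linalg.norm(feature_vector) * np.linalg.norm(reference) + 1e-12))
        if score > best_score:
            best_user, best_score = user_id, score
    # A positive decision could then trigger a control command, e.g. "unlock".
    return best_user if best_score >= threshold else None
```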
  • Facial position is determined at least in steps 220, 320 and 420 based on an input image of a user/individual as received from the image sensor 114.
  • facial position can be located and/or determined by a machine-learning algorithm, pattern recognition algorithms, and/or statistical analysis configured to search objects of a predetermined class (i.e., faces) in input images.
  • machine-learning algorithms include neural networks, heuristic methods, support vector machines, or a combination thereof.
  • Example methods suitable for facial position detection include Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA), also known as Fisher's linear discriminant method. LDA may be used to find a linear combination of features which characterize or distinguish two or more classes, or species of one class.
  • Another example method suitable for facial position detection is the Viola-Jones object detection framework. This method adapted the idea of using Haar wavelets and developed so-called Haar-like features.
  • a Haar-like feature considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in these regions and calculates the difference between them. This difference is then used to categorize subsections of an image. For example, consider an image database with human faces. It is a common observation that among all faces the region of the eyes is darker than the region of the cheeks. Therefore, a common Haar feature for face detection is a set of two adjacent rectangles that lie above the eye and the cheek region.
  • the position of these rectangles is defined relative to a detection window that acts like a bounding box to the target object (the face in this case).
  • a window of the target size is moved over the input image, and for each subsection of the image the Haar-like feature is calculated. This difference is then compared to a learned threshold that separates non-objects from objects. Because such a Haar-like feature is only a weak learner or classifier (its detection quality is slightly better than random guessing) a large number of Haar-like features is necessary to describe an object with sufficient accuracy.
  • the Haar-like features are therefore organized in something called a classifier cascade to form a strong learner or classifier.
  • The Viola-Jones object detection framework may serve as one of the preferred algorithms for the facial position detection performed in the steps 220, 320 and 420 of the methods 200-400.
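  • As an illustration only, a minimal sketch of this detection step using the Viola-Jones cascade classifier bundled with OpenCV is shown below; the stock OpenCV frontal-face cascade file and the detection parameters are assumptions of this sketch, not a model provided by the present disclosure.

```python
import cv2


def detect_face(image_bgr):
    """Return the isolated user face as a grayscale crop, or None if no face is found."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    # A cascade of Haar-like features is evaluated over a sliding detection window.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest detection
    return gray[y:y + h, x:x + w]
```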
  • facial landmark points refer to various face elements such as inner/outer corner of eyes, eye centers (e.g., pupils), left/right corner of mouth, nose center, nose corners, ears, chin, eyebrows, among others.
  • FIG. 5 shows an exemplary face image and some facial landmarks, which can be used by the methods for facial recognition and user authentication as described herein.
  • Facial landmarks are located using at least one of an Active Shape Model (ASM) searching process and an Active Appearance Model (AAM) searching process.
  • The ASM searching process uses a statistical model of the shape of various objects, which is iteratively deformed to fit an example of the object in a new image.
  • ASM searching process consists in the use of statistical relations between mutual arrangements of landmarks associated with a new image and at least one reference image.
  • The process generally includes two steps: (a) locating an area associated with initial landmark point coordinates, and (b) iteratively adjusting the statistical model to define the landmark coordinates more precisely. This process is described below in greater detail.
  • The matrix $\Phi = (\phi_1, \ldots, \phi_P)$ (Equation No. 6) includes $P$ principal components, i.e. eigenvectors $\phi_j$, $j = 1, \ldots, P$, which correspond to the $P$ largest eigenvalues, and $b$ is a vector having $P$ coefficients (also known as the parameters of the model).
  • The vector $b$ is defined as follows: $b = \Phi^{T}(s - \bar{s})$ (Equation No. 7).
  • The ASM model is defined by the matrix $\Phi$ and the vector $\bar{s}$. Any image/shape can be approximately described by means of the ASM model and the parameters obtained from the equation $s \approx \bar{s} + \Phi b$ (Equation No. 8).
  • the vector s can represent a common pattern of landmarks arrangement and individual features of specific face shape.
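  • For illustration, a minimal sketch of Equations 7 and 8 is given below: projecting a shape onto the model parameters and reconstructing it from them. The mean shape, the eigenvector matrix and the eigenvalues are assumed to have been obtained beforehand from a training set of aligned shapes; clamping each parameter to roughly three standard deviations is a common ASM conformity check and is an assumption of this sketch.

```python
import numpy as np


def shape_to_params(s, s_mean, phi):
    """Equation No. 7: b = Phi^T (s - s_mean)."""
    return phi.T @ (s - s_mean)


def params_to_shape(b, s_mean, phi, eigenvalues=None, k=3.0):
    """Equation No. 8: s ~ s_mean + Phi b, optionally clamping b for model conformity."""
    if eigenvalues is not None:
        limit = k * np.sqrt(eigenvalues)  # keep each parameter within +/- k standard deviations
        b = np.clip(b, -limit, limit)
    return s_mean + phi @ b
```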
  • The localization of landmarks in a new face image is carried out as follows. First, a face position is determined as described above. For example, a Viola-Jones classifier is utilized, which returns an image with the isolated user face.
  • The average shape defined by the vector $\bar{s}$ is aligned with the center of the isolated user face.
  • The coordinates of the average shape defined by $\bar{s}$ can be scaled, if required.
  • The average shape defines an initial approximation to the landmark point coordinates, which can be considered as the initial iteration.
  • For each landmark with number i, a Viola-Jones cascade classifier is trained; applied to an image, it generates a set of points classified as the landmark with number i.
  • Positive training examples are areas of the image centered at a landmark point, and negative examples are areas that overlap the positive examples.
  • At each iteration, the corresponding cascade classifier is applied to a small area of the image centered at the landmark with coordinates $(s_i, s_{i+N})$.
  • If the classifier generates several points classified as anthropometrical landmarks, the located landmark point is considered to be the one nearest to the landmark with the coordinates $(s_i, s_{i+N})$.
  • Here, c is the shape consisting of the landmark points found by the cascade classifiers at the i-th iteration, with the coordinates of this shape centered and divided by a scale coefficient.
  • This shape may be checked for conformity with the statistical ASM model. The result of the conformity check may define the shape used for the next iteration.
  • The procedure of landmark localization is repeated and terminates when a predetermined number of iterations has been performed (for example, three iterations). Accordingly, landmark points may be represented by three-dimensional (3D) coordinates and associated with a facial image.
  • The face image transformation also refers to image pre-processing required to bring the face image to a uniform image that is in a better condition for further processing.
  • the image transformation may include the following operations. First, a color face image or its part can be transformed into a half-tone image or monotone (single-colored) image.
  • Further, an area of interest can be isolated from the image.
  • This process can include the following steps: (a) rotating at least a portion of the face image until landmarks associated with user pupils are horizontally oriented; (b) scaling the face image (for example, until landmarks associated with user pupils are at a predetermined distance from each other, e.g. 60 pixels between the pupils); and/or (c) cropping the face image or its part to create an image of a predetermined pixel size.
  • FIGs. 6 and 7 show example face images illustrating the above processes. Namely, FIG. 6 shows an exemplary input image of a user. FIG. 7 shows the same image as in FIG. 6, but subjected to the process of rotation and cropping as outlined above to isolate the area of interest, i.e. a user face.
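  • A minimal sketch of this geometric normalization is given below, assuming the pupil landmarks have already been located; the output crop size and the vertical placement of the eyes inside the crop are illustrative assumptions.

```python
import numpy as np
import cv2


def align_face(gray, left_pupil, right_pupil, out_size=(100, 120), eye_dist=60):
    """Rotate until the pupils are level, scale to a fixed inter-pupil distance, and crop."""
    (lx, ly), (rx, ry) = left_pupil, right_pupil
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))           # rotation that levels the pupils
    scale = eye_dist / max(np.hypot(rx - lx, ry - ly), 1e-6)   # e.g. 60 pixels between pupils
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    m = cv2.getRotationMatrix2D(center, angle, scale)
    # Shift so the midpoint between the pupils lands at a fixed location in the output crop.
    m[0, 2] += out_size[0] / 2.0 - center[0]
    m[1, 2] += out_size[1] * 0.4 - center[1]
    return cv2.warpAffine(gray, m, out_size)
```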
  • the image transformation may include adjusting a brightness of at least a portion of the face image.
  • the brightness adjustment may include a number of processes including, but not limited to, Single Scale Retinex (SSR) filtering, homomorphic filtering based normalization technique, Discrete Cosine Transform (DCT) based normalization technique, wavelet based normalization technique, among others.
  • the brightness adjustment may include histogram correcting of the face image.
  • the brightness adjustment may include contrast enhancing.
  • FIG. 8 shows the face image of FIG. 7 after being subjected to the brightness adjustment outlined above.
  • In other words, FIG. 8 shows a uniform image of the user face. Accordingly, the image transformation process described above is used in the steps 240, 340 and 440 of the methods 200-400.
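  • As one illustrative option among those listed above, a minimal sketch of the brightness adjustment using histogram correction (with contrast-limited adaptive equalization as a contrast enhancer) is shown below; SSR, homomorphic or DCT-based normalization could be substituted.

```python
import cv2


def normalize_brightness(gray_face, use_clahe=False):
    """Histogram correction of the face image, optionally with contrast enhancement."""
    if use_clahe:
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        return clahe.apply(gray_face)
    return cv2.equalizeHist(gray_face)
```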
  • Angle shots are artificial images taken or created by a virtual camera at an angle to the horizontal or vertical lines.
  • Angle shots may be created by rotating a 3D representation of a user face to different angles with a fixed virtual camera.
  • The rotation may involve any of a yaw angle, pitch angle, and/or roll angle.
  • Each angle shot may be characterized by yaw, pitch and roll angles.
  • The synthesis (generation) of the angle shots includes the following operations performed by the computing device 100: superposing a texture image of an authorized user, such as the one shown in FIG. 8, with multiple points associated with a pre-selected reference 3D facial image stored in a database such as the memory 104 or storage device 106.
  • the reference 3D facial images are associated with their corresponding reference 2D half-tone or color images, which are also stored in the database such as the memory 104 or storage device 106.
  • the reference 2D facial images are uniform face images as discussed above.
  • Reference 3D images can be represented by depth maps, where each pixel of these images includes information related to the distance between the image sensor 114 and certain parts of the user face or other objects.
  • The association of reference 2D images with the depth maps means that each pixel of a reference 3D image also includes data related to brightness and/or color.
  • the reference 3D facial images associated with reference 2D images may be pre-processed/transformed as described above with reference to the steps 240, 340 and 440.
  • the synthesis process of angle shots includes the following operations.
  • the computing device 100 finds similarity between the texture image and one or more of the plurality of reference 3D and 2D images, and then, based on the similarity, the computing device 100 selects the most similar reference 3D and 2D image to the texture image.
  • This process of finding similarity can be implemented by a machine-learning algorithm or a statistical analysis.
  • Some examples of methods suitable for finding similarity include Principal Components Analysis (PCA) or a discriminant analysis such as Linear Discriminant Analysis (LDA), also known as Fisher's linear discriminant analysis.
  • FIG. 9 shows an exemplary 3D image, which is selected by the computing device 100 as the most similar to the texture image such as the one shown in FIG. 8 (the corresponding 2D image of the selected 3D image is not shown).
  • A homography-based process is utilized to match landmarks associated with the selected 2D image and landmarks associated with the texture image to find conformity therebetween.
  • The homography-based process refers to a perspective transformation of one plane into another. Therefore, having a first set of landmarks associated with one image and a second set of landmarks corresponding to the first set but associated with another image, it is possible to find conformity between these two images in the form of a homographic matrix.
  • One example of homography based process for finding conformity is a Random Sample Consensus (RANSAC) method.
  • RANSAC is an iterative process for estimating parameters from a set of observed data.
  • a basic assumption of this process is that the observed data consists of "inliers," i.e., data whose distribution can be explained by a certain model, though may be subject to noise, and "outliers" which are data that do not fit said model.
  • the outliers can come, for example, from extreme values of noise or from erroneous measurements or incorrect hypotheses about the interpretation of data.
  • RANSAC process also assumes that, given a set of inliers, there exists a procedure which can estimate the parameters of a model that optimally explains or fits this data. In this technology, RANSAC makes an iterative estimation of model parameters for randomly selected landmarks.
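  • A minimal sketch of this landmark matching using OpenCV's RANSAC-based homography estimator is given below; both inputs are assumed to be N×2 arrays of corresponding landmark coordinates listed in the same order.

```python
import numpy as np
import cv2


def landmark_homography(reference_landmarks, texture_landmarks):
    """Estimate the homography relating two corresponding landmark sets with RANSAC."""
    src = np.asarray(reference_landmarks, dtype=np.float32).reshape(-1, 1, 2)
    dst = np.asarray(texture_landmarks, dtype=np.float32).reshape(-1, 1, 2)
    # RANSAC repeatedly estimates the model from random landmark subsets and
    # separates inliers (points explained by the model) from outliers.
    h, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=3.0)
    return h, inlier_mask
```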
  • Further, the depth map of the most similar 3D image is transformed into a point cloud, or in other words, a vertex set in a 3D coordinate system.
  • the texture image is taken as a texture and applied to the point cloud.
  • FIG. 10 shows the result of applying the texture image, as shown in FIG. 8, to a point cloud.
  • the point cloud along with the "attached" texture is rotated and multiple shots are taken at different angles which constitute angle shots.
  • FIGs. 11A-11D illustrate exemplary angle shots created with respect to the registration image (such as the one shown in FIG. 6) based on the technology described herein.
  • Specifically, FIGs. 11A-11D show angle shots rotated at different angles relative to the horizontal and vertical axes (i.e., yaw and pitch angles): FIG. 11A illustrates an angle shot taken at a yaw angle of +15 degrees, FIG. 11B illustrates an angle shot taken at a yaw angle of -15 degrees, FIG. 11C illustrates an angle shot taken at a pitch angle of +15 degrees, and FIG. 11D illustrates an angle shot taken at a pitch angle of -15 degrees.
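  • A minimal sketch of this synthesis step is given below: the textured point cloud is rotated by a yaw and/or pitch angle and re-projected with a fixed virtual camera. The orthographic projection and the output resolution are simplifying assumptions of this sketch; `points` is an N×3 vertex array and `colors` holds the grey value sampled from the texture image for each vertex.

```python
import numpy as np


def rotation(yaw_deg=0.0, pitch_deg=0.0):
    """Rotation matrix for the given yaw and pitch angles (in degrees)."""
    y, p = np.radians(yaw_deg), np.radians(pitch_deg)
    ry = np.array([[np.cos(y), 0, np.sin(y)], [0, 1, 0], [-np.sin(y), 0, np.cos(y)]])
    rx = np.array([[1, 0, 0], [0, np.cos(p), -np.sin(p)], [0, np.sin(p), np.cos(p)]])
    return ry @ rx


def render_angle_shot(points, colors, yaw_deg, pitch_deg, size=(120, 100)):
    """Rotate the textured point cloud and project it onto the image plane."""
    rotated = points @ rotation(yaw_deg, pitch_deg).T
    h, w = size
    shot = np.zeros((h, w), dtype=np.uint8)
    xs = np.clip(((rotated[:, 0] - rotated[:, 0].min()) /
                  (np.ptp(rotated[:, 0]) + 1e-6) * (w - 1)).astype(int), 0, w - 1)
    ys = np.clip(((rotated[:, 1] - rotated[:, 1].min()) /
                  (np.ptp(rotated[:, 1]) + 1e-6) * (h - 1)).astype(int), 0, h - 1)
    order = np.argsort(-rotated[:, 2])  # draw farther vertices first (larger z assumed farther)
    shot[ys[order], xs[order]] = colors[order]
    return shot


# The four shots of FIGs. 11A-11D would then correspond to, for example:
# render_angle_shot(points, colors, +15, 0), render_angle_shot(points, colors, -15, 0),
# render_angle_shot(points, colors, 0, +15), render_angle_shot(points, colors, 0, -15).
```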
  • features extracted from facial images may relate to pixel data such as coordinates, brightness, color, a depth value, among other things.
  • Responses of a linear filter, such as responses of two-dimensional Gabor filters applied to the transformed (pre-processed) face images, may be used as the features.
  • The responses can be calculated as follows.
  • The impulse response of a Gabor filter is defined as the product of a Gaussian function and a complex sinusoid. The Gabor filter can be defined, for example, as follows:
  • $g(x, y) = \exp\left(-\frac{x_t^2}{2\sigma_x^2} - \frac{y_t^2}{2\sigma_y^2}\right)\exp\left(j\,2\pi f x_t\right)$, where $x_t = x\cos\theta + y\sin\theta$ and $y_t = -x\sin\theta + y\cos\theta$ (Equation No. 9),
  • where x and y are pixel coordinates, f is the frequency of the complex sine curve, $\theta$ is the filter orientation, $\sigma_x$ is the spatial width of the filter along the sinusoidal wave, and $\sigma_y$ is the spatial width of the filter perpendicular to the wave.
  • Each transformed image is convolved with a set of 2D Gabor filters, which may include, but is not limited to, forty different filters.
  • FIG. 12 shows a graphical representation of forty Gabor filters suitable for implementing the methods described herein.
  • The convolution in the spatial domain can be replaced with multiplication, in the frequency domain, of the Fourier images of the input image and of the filter impulse response, followed by an inverse Fourier transform of the product.
  • As a result of the convolution of the transformed face image with forty Gabor filters, forty images are generated, which are then converted into a single vector.
  • This vector is further considered as the feature vector analyzed by the image recognition method.
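  • A minimal sketch of this feature extraction with a bank of forty Gabor filters (five spatial widths by eight orientations, mirroring FIG. 12) is given below; the kernel size, wavelength and the particular parameter grid are illustrative assumptions.

```python
import numpy as np
import cv2


def gabor_feature_vector(uniform_face, ksize=21):
    """Convolve the uniform face image with forty Gabor filters and flatten the responses."""
    face = uniform_face.astype(np.float32)
    responses = []
    for sigma in (2.0, 3.0, 4.0, 5.0, 6.0):              # five spatial widths
        for theta in np.arange(0.0, np.pi, np.pi / 8):    # eight orientations
            kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, 10.0, 0.5, 0)
            responses.append(cv2.filter2D(face, cv2.CV_32F, kernel))
    vector = np.concatenate([np.abs(r).ravel() for r in responses])
    return vector / (np.linalg.norm(vector) + 1e-12)     # L2-normalized feature vector
```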
  • The process of retrieving a feature vector as described above can be used in the steps 260, 350, and 450 of the methods 200-400.
  • Comparing feature vectors with reference feature vectors, as required for the steps 360 and 460 of the methods 300 and 400, can be implemented in different ways.
  • statistical algorithms can be utilized; in other examples, machine-learning algorithms can be utilized; and in yet more examples a combination of the foregoing can be utilized.
  • LDA or Fisher's linear discriminant method can be used to find similarity between feature vectors.
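  • A minimal sketch of the machine-learning option using Linear Discriminant Analysis from scikit-learn is given below; scikit-learn, the probability threshold and the classifier settings are assumptions of this sketch rather than requirements of this disclosure. The training set would typically contain feature vectors of the registration images and of the synthesized angle shots, labelled per authorized user.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def train_comparator(reference_vectors, labels):
    """Fit an LDA classifier on reference feature vectors labelled by user."""
    lda = LinearDiscriminantAnalysis()
    lda.fit(np.asarray(reference_vectors), np.asarray(labels))
    return lda


def identify(lda, probe_vector, min_confidence=0.9):
    """Return the most likely user label, or None if the match is not confident enough."""
    probabilities = lda.predict_proba(probe_vector.reshape(1, -1))[0]
    best = int(np.argmax(probabilities))
    return lda.classes_[best] if probabilities[best] >= min_confidence else None
```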

Abstract

The technology described herein provides methods and systems for facial recognition and user authentication. The methods comprise the steps of acquiring an input image, determining a position of a user face on the input image, locating multiple facial landmarks of the user face, transforming at least a portion of the input image into a uniform image of the user face, retrieving a feature vector based upon the uniform image of the user face, comparing the feature vector with at least one reference feature vector, and identifying the user or making an authentication decision with respect to the user based on the result of the comparison. The technology allows for the creation of multiple facial angle shots based on a single facial image of an individual. These facial angle shots are further used to train a machine-learning algorithm handling facial recognition and/or user authentication processing.

Description

FACIAL RECOGNITION AND USER AUTHENTICATION METHOD
TECHNICAL FIELD
[0001] This disclosure relates generally to object recognition and more particularly to facial recognition and user authentication technologies.
DESCRIPTION OF RELATED ART
[0002] The approaches described in this section could be pursued, but are not necessarily approaches that have previously been conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
[0003] Traditionally, biometrics refers to identification of people by their characteristics and traits. Biometrics is widely used in computer-based user authentication and access control systems, computer vision systems, gaming platforms, and so forth. For example, an individual may activate or otherwise gain access to functionalities or data controlled by a computing device by acquiring and verifying user biometric data. Generally, biometric identifiers are the distinctive and measurable characteristics which can be used to label and describe an individual. Some examples of biometric identifiers may include a retina image, an iris image, an image or shape of an individual's face, fingerprints, a handprint, a voice, keystroke dynamics, behavioral characteristics, and so forth.
[0004] Conventionally, facial recognition methods or algorithms are implemented by retrieving features of samples (e.g. human faces), comparing the features with stored reference feature records, and determining whether they match. The reliability of facial recognition methods greatly depends on the quality of the samples, in other words, the images of the individual's face to be identified, as well as the conditions under which the samples are taken. In particular, the level and angle of illumination, the position of a camera relative to an individual's face, facial expression, luminance and color characteristics, among other things, significantly influence the quality and reliability of facial recognition. One way to reduce vulnerability to these factors is to maintain a great number of samples for a given individual, where the samples are taken under different conditions as outlined above. However, in most real-life scenarios, it is not possible to maintain a great number of an individual's samples, and, typically, security systems are limited to one sample image per individual. Thus, facial recognition is often not accurate enough for security and computer vision systems that employ only one or a few sample images of individuals.
[0005] In view of at least the foregoing problems, there is still a need in the art for improvement of facial recognition methods and for eliminating the influence of the above listed parameters on the quality and reliability of these methods. There is a further need to decrease the false acceptance rate (FAR) and false rejection rate (FRR), especially in those circumstances when just one sample image of an individual is available.
SUMMARY
[0006] This summary is provided to introduce a selection of concepts in a simplified form that are further described in the Detailed Description below. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
[0007] Various embodiments of the present disclosure generally provide for significantly improving the quality and reliability of facial recognition and of the corresponding user authentication methods based upon the improved facial recognition techniques, as well as for decreasing the FAR and FRR values. Specifically, the present technology involves intelligent creation of multiple facial angle shots of an individual based on his or her single facial image. These facial angle shots may optionally be used to train a machine-learning system handling facial recognition and/or user authentication processing. As will be explained below, the principles of the present technology can be integrated not only in facial recognition methods, but also in user authentication methods which are based on facial recognition.
[0008] According to embodiments of the present disclosure, provided are a method for user authentication and/or identification, as well as a system and a non-transitory processor-readable medium configured to implement the method for user authentication and/or identification. The method comprises a step of acquiring, by one or more processors, an input image associated with a user (the input image shows at least the user face). Further, the method includes determining, by the one or more processors, a position of the user face, and locating, by the one or more processors, multiple facial landmarks associated with the user. The method further includes transforming, by the one or more processors and based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user face. The transformation may include spatial rotation, scaling, cropping, color and/or brightness adjustments, or a combination thereof. The method further includes retrieving, by the one or more processors, a feature vector based upon the uniform image of the user face, comparing the feature vector with at least one stored reference feature vector (using, for example, a machine-learning algorithm), and making an authentication and/or identification decision(s) with respect to the user based on the result of the comparison.
[0009] According to embodiments of the present disclosure, provided is also a method for registering users, or in other words, creating user profiles to be used in the method for user authentication and/or identification. The method for registering users includes acquiring one registration image associated with the authorized user and determining a facial position of the authorized user based on the registration image. Further, the method includes locating, based on the position, multiple facial landmarks associated with the authorized user, and transforming at least a portion of the one registration image into a texture image of the authorized user face (e.g., a full-face image of the authorized user). Further, the method includes generating multiple angle shots of the authorized user face based upon the texture image. The method also includes creating a biometric pattern associated with the multiple angle shots of the user face and storing the biometric pattern associated with the authorized user in a database.
[0010] In one example embodiment, the generation process of the angle shots comprises the steps of finding similarity between the texture image and one of a plurality of reference two-dimensional (2D) facial images, wherein each one of the plurality of reference 2D facial images is associated with a reference three-dimensional (3D) facial image. Further, based on the similarity, the method selects the 3D facial image related to the reference 2D image being the most similar to the texture image. The generation method further includes superposing the texture image of the authorized user with multiple points associated with the pre-selected 3D facial image. Further, the 3D facial image and the texture image can be rotated to generate images of the multiple angle shots based upon the rotated 3D facial image and the texture image.
[0011] According to yet more embodiments of the present disclosure, provided is a method for facial recognition, a facial recognition system and a non-transitory processor-readable medium configured to implement the method for facial recognition. The method for facial recognition comprises the steps of acquiring a facial image of an individual, determining a facial position based on the facial image, locating multiple facial landmarks associated with the individual based on the position, transforming at least a portion of the facial image into a uniform facial image based on the multiple facial landmarks (the uniform facial image represents a full-face image of the individual). Further the method comprises creating a feature vector based upon the uniform facial image, comparing the feature vector with at least one stored reference feature vector and, based on the comparison, determining the identity of the individual. Other features, aspects, examples, and embodiments are described below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] Embodiments are illustrated by way of example, and not by limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
[0013] FIG. 1 is a high-level block diagram illustrating an example computing device suitable for implementing methods for facial recognition and user authentication as disclosed herein.
[0014] FIG. 2 shows a high-level process flow diagram of a method for user registration according to one exemplary embodiment.
[0015] FIG. 3 shows a high-level process flow diagram of a method for facial recognition according to one exemplary embodiment.
[0016] FIG. 4 shows a high-level process flow diagram of a method for user authentication according to one exemplary embodiment of the present disclosure.
[0017] FIG. 5 shows an exemplary face image and several facial landmarks, which can be used in methods for facial recognition and user authentication as described herein.
[0018] FIG. 6 shows an input image of a user for use in a method for facial recognition, according to one example embodiment, which image may also serve as a registration image in a method for user registration.
[0019] FIG. 7 shows an exemplary area of interest created by rotating and cropping the input image shown in FIG. 6.
[0020] FIG. 8 shows an exemplary uniform image of a user face suitable for use in a method for facial recognition, which image may also be used as a texture image in a method for user registration.
[0021] FIG. 9 shows an exemplary three-dimensional image (depth map) corresponding to the selected image that is most similar to the texture image shown in FIG. 8.
[0022] FIG. 10 shows a graphical representation of the result of applying the texture image, as shown in FIG. 8, to a point cloud.
[0023] FIGs. 11A-11D illustrate exemplary angle shots based upon the registration image shown in FIG. 6.
[0024] FIG. 12 shows a graphical representation of forty Gabor filters suitable for implementing the methods described herein.
DETAILED DESCRIPTION
[0025] The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show illustrations in accordance with example embodiments. These example embodiments, which are also referred to herein as "examples," are described in enough detail to enable those skilled in the art to practice the present subject matter. The embodiments can be combined, other
embodiments can be utilized, or structural, logical, and electrical changes can be made without departing from the scope of what is claimed. The following detailed description is therefore not to be taken in a limiting sense, and the scope is defined by the appended claims and their equivalents. In this document, the terms "a" and "an" are used, as is common in patent documents, to include one or more than one. In this document, the term "or" is used to refer to a nonexclusive "or," such that "A or B" includes "A but not B," "B but not A," and "A and B," unless otherwise indicated.
[0026] The techniques of the embodiments disclosed herein may be implemented using a variety of technologies. For example, the methods described herein may be implemented in software executing on a computer system or in hardware utilizing either a combination of microprocessors, controllers or other specially designed application-specific integrated circuits (ASICs), programmable logic devices, or various combinations thereof. In particular, the methods described herein may be implemented by a series of computer-executable instructions residing on a storage medium such as a disk drive, solid-state drive or on a computer-readable medium.
[0027] Generally speaking, facial recognition can be used by a computing device in various scenarios. For example, a computing device may use facial recognition to authenticate or authorize a user who attempts to gain access to one or more functionalities or data of the computing device, or to functionalities otherwise controlled by the computing device. In some common scenarios, the computing device may store facial images of one or more pre-authorized users. These facial images are referred to herein as "registration images." When a user attempts to gain access to functionalities or data of the computing device, the computing device may capture an image of the user's face for authentication purposes. The computing device may then use facial recognition applications to compare the captured facial image to the registration images associated with authorized users. If the facial recognition applications determine a match or an acceptable level of similarity between the captured facial image and at least one of the registration images, the computing device may authenticate the user and grant access to the requested functionalities or data. This approach can similarly be used in security systems to recognize people attempting to enter designated premises or land, in crime detection systems, computer vision systems, gesture recognition and control systems, and so forth.
[0028] As outlined above, facial recognition systems are not always accurate and may falsely accept or falsely reject a user when the similarity between the user's image and one or more registration images is determined incorrectly. Unauthorized users may leverage vulnerabilities of facial recognition to cause erroneous authentication. Conversely, an authorized user who should be authenticated may be denied access because the facial recognition system could not find sufficient similarity between the present user's image and a registration image captured some time ago. Finding similarities between present user images and registration images greatly depends on the quality and number of registration images. In most scenarios, facial recognition systems maintain only one registration image per authorized user. Accordingly, when the facial recognition system takes an image of a user under lighting conditions or at a camera position that differ significantly from those of the registration image, FAR and FRR values may be significantly high. The present technology decreases FAR and FRR values and improves the quality and reliability of facial recognition systems by, but not limited to, artificially synthesizing facial foreshortening images with respect to an available registration image of an authorized user, as well as by the intelligent use of Gabor filters and linear discriminant analysis. A detailed description of various embodiments of the present technology is provided below with reference to the drawings.
[0029] FIG. 1 is a high-level block diagram illustrating an example computing device 100 suitable for implementing the present technology. In particular, the computing device 100 may be used for facial recognition, user authentication, user authorization, and/or user registration as described herein. The computing device 100 may include, be, or be a part of one or more of a variety of types of devices, such as a general purpose computer, desktop computer, laptop computer, tablet computer, netbook, server, mobile phone, a smartphone, personal digital assistant, set-top box, television, door lock, watch, vehicle computer, electronic kiosk, automated teller machine, infotainment system, presence verification device, security device, surveillance device, among others. Furthermore, the computing device 100 may be an integrated part of another multi-component system such as a video surveillance system, access control system, among others.
[0030] As shown in FIG. 1, the computing device 100 includes one or more processors 102, memory 104, one or more storage devices 106, one or more input devices 108, one or more output devices 110, network interface 112, and an image sensor 114 (e.g., a camera or charge-coupled device (CCD)). One or more processors 102 are, in some examples, configured to implement functionality and/or process instructions for execution within the computing device 100. For example, the processors 102 may process instructions stored in memory 104 and/or instructions stored on storage devices 106. Such instructions may include components of an operating system 118, a facial recognition module 120, and/or a user authentication module 122. Computing device 100 may also include one or more additional components not shown in FIG. 1, such as a power supply, a battery, a fan, a global positioning system (GPS) receiver, among others.
[0031] Memory 104, according to one example, is configured to store information within the computing device 100 during operation. Memory 104, in some example embodiments, may refer to a non-transitory computer-readable storage medium or a computer-readable storage device. In some examples, memory 104 is a temporary memory, meaning that a primary purpose of memory 104 may not be long-term storage. Memory 104 may also refer to a volatile memory, meaning that memory 104 does not maintain stored contents when memory 104 is not receiving power. Examples of volatile memories include random access memories (RAM), dynamic random access memories (DRAM), static random access memories (SRAM), and other forms of volatile memories known in the art. In some examples, memory 104 is used to store program instructions for execution by the processors 102. Memory 104, in one example, is used by software (e.g., operating system 118) or applications, such as facial recognition 120 and/or user authentication 122, executing on computing device 100 to temporarily store information during program execution. One or more storage devices 106 can also include one or more computer-readable storage media and/or computer-readable storage devices. In some embodiments, storage devices 106 may be configured to store greater amounts of information than memory 104. Storage devices 106 may further be configured for long-term storage of information. In some examples, the storage devices 106 include non-volatile storage elements. Examples of such non-volatile storage elements include magnetic hard discs, optical discs, solid-state discs, flash memories, forms of electrically programmable memories (EPROM) or electrically erasable and programmable memories, and other forms of non-volatile memories known in the art.
[0032] Still referring to FIG. 1, the computing device 100 may also include one or more input devices 108. The input devices 108 may be configured to receive input from a user through tactile, audio, video, or biometric channels. Examples of input devices 108 may include a keyboard, mouse, touchscreen, microphone, one or more video cameras, or any other device capable of detecting an input from a user or other source, and relaying the input to computing device 100, or components thereof. Though shown separately in FIG. 1, the image sensor 114 may, in some instances, be a part of input devices 108. It should also be noted that the image sensor 114 (e.g., a digital still and/or video camera) may be a peripheral device operatively connected to the computing device 100 via the network interface 112.
[0033] The output devices 110, in some examples, may be configured to provide output to a user through visual or auditory channels. Output devices 110 may include a video graphics adapter card, a liquid crystal display (LCD) monitor, a light emitting diode (LED) monitor, a sound card, a speaker, or any other device capable of generating output that may be intelligible to a user. Output devices 110 may also include a touchscreen, presence-sensitive display, or other input/output capable displays known in the art.
[0034] The computing device 100, in some example embodiments, also includes network interface 112. The network interface 112 can be utilized to communicate with external devices via one or more networks such as one or more wired, wireless or optical networks including, for example, the Internet, intranet, local area network (LAN), wide area network (WAN), cellular phone networks, Bluetooth radio, an IEEE 802.11-based radio frequency network, among others. The network interface 112 may be a network interface card, such as an Ethernet card, an optical transceiver, a radio frequency transceiver, or any other type of device that can send and receive information. Other examples of such network interfaces may include Bluetooth®, 3G, 4G, and WiFi® radios in mobile computing devices as well as USB.
[0035] The operating system 118 may control one or more functionalities of computing device 100 and/or components thereof. For example, the operating system 118 may interact with applications 124, including facial recognition 120 and user authentication 122, and may facilitate one or more interactions between applications 124 and one or more of processors 102, memory 104, storage devices 106, input devices 108, and output devices 110. As shown in FIG. 1, the operating system 118 may interact with or be otherwise coupled to facial recognition module 120, user authentication module 122, applications 124, and components thereof. In some embodiments, facial recognition module 120 and user authentication module 122 may be included in operating system 118. In these and other examples, facial recognition module 120 and user authentication module 122 may be part of applications 124. In still other examples, facial recognition module 120 and user authentication module 122 may be implemented externally to computing device 100, such as at a network location. In some such instances, computing device 100 may use the network interface 112 to access and implement functionalities provided by facial recognition module 120 and user authentication module 122, through methods commonly known as "cloud computing."
[0036] FIG. 2 shows a high-level process flow diagram of a method 200 for user registration (in other words, user enrolment) according to one exemplary embodiment. The method 200 may be performed by processing logic that may comprise hardware (e.g., one or more processors, controllers, dedicated logic, programmable logic, and microcode), software (such as software run on a general-purpose computer system or a dedicated machine, firmware), or a combination of both. In some example embodiments, the method 200 is implemented by the computing device 100 shown in FIG. 1, however, it should be appreciated that the method 200 is just one example operation of the computing device 100.
[0037] The method 200 commences at step 210 with the computing device 100 acquiring at least one registration image associated with an authorized user. The image may be a 2D image of the authorized user captured by the image sensor 114. Further, at step 220, the computing device 100 determines a facial position of the authorized user based on the at least one registration image. The determination of facial position may be needed for isolating (e.g., cropping) the authorized user face, which simplifies further processing steps. At step 230, the computing device 100 locates multiple facial landmarks associated with the authorized user. At step 240, the computing device 100 transforms at least a portion of the at least one registration image into a texture image of the authorized user face. At step 250, the computing device 100 generates multiple angle shots of the authorized user face based upon the texture image of the authorized user face. At step 260, the computing device 100 creates a biometric pattern associated with the multiple angle shots of the user face. Notably, the biometric pattern includes a feature vector whose components are associated with the angle shots. As will be explained in detail below, the biometric pattern can be utilized for training one or more machine-learning algorithms handling face recognition and/or user authentication based on facial recognition. At step 270, the computing device 100 stores the biometric pattern associated with the authorized user in the memory 104 and/or storage device 106. In some example embodiments, the computing device 100 maintains user profiles, which comprise feature vectors and/or biometric patterns of authorized users. The user profiles can later be recalled when a new image is acquired to identify and/or authorize a user. The feature vectors may be based on various parameters such as pixel data, distances between certain facial landmarks, and facial angle shots, among other things.
[0038] Once one or more authorized users are registered, the computing device 100 may operate to recognize faces, identify (authorize) users, and optionally provide access to computing device functionalities, data, resources, premises, land, among others. FIG. 3 shows a high-level process flow diagram of a method 300 for facial recognition according to one exemplary embodiment of the present disclosure. The method 300 may be performed by processing logic that may comprise hardware (e.g., one or more processors, controllers, dedicated logic, programmable logic, and microcode), software (such as software run on a general-purpose computer system or a dedicated machine, firmware), or a combination of both. In some example embodiments, the method 300 is implemented by the computing device 100 shown in FIG. 1; however, it should be appreciated that the method 300 is just one example operation of the computing device 100.
[0039] The method 300 of FIG. 3 starts at step 310, when the computing device 100 acquires a facial image of an individual from the image sensor 114 or an analogous device. At step 320, the computing device 100 determines a facial position based on the facial image, which may be needed for cropping or isolating the individual's face. At step 330, the computing device 100 locates multiple facial landmarks associated with the individual based on the position determined. At step 340, the computing device 100 transforms, based on the multiple facial landmarks, at least a portion of the facial image into a uniform facial image. At step 350, the computing device 100 creates a feature vector based upon the uniform facial image. At step 360, the computing device 100 compares the feature vector with at least one stored reference feature vector associated with the authorized users (i.e., those users that registered using the method 200). At step 370, the computing device 100, based on the result of the comparison, determines the identity of the individual.
[0040] Similar to the facial recognition method 300, FIG. 4 shows a high-level process flow diagram of a method 400 for user authentication according to one exemplary embodiment of the present disclosure. The method 400 may be performed by processing logic that may comprise hardware (e.g., one or more processors, controllers, dedicated logic, programmable logic, and microcode), software (such as software run on a general-purpose computer system or a dedicated machine, firmware), or a combination of both. In some example embodiments, the method 400 is implemented by the computing device 100 shown in FIG. 1; however, it should be appreciated that the method 400 is just one example operation of the computing device 100. The method 400 starts at step 410, when the computing device 100 acquires an input image associated with a user (where the input image shows at least the user face) from the image sensor 114 or a similar device. At step 420, the computing device 100 optionally determines a position of the user face. At step 430, the computing device 100 locates multiple facial landmarks associated with the user. At step 440, the computing device 100 transforms, based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user face. At step 450, the computing device 100 retrieves a feature vector based upon the uniform image of the user face. At step 460, the computing device 100 compares the feature vector with a stored reference feature vector associated with the authorized user. The comparison process may be based on the use of one or more machine-learning algorithms, although this is not required. At step 470, the computing device 100, based on the comparison, makes an authentication decision with respect to the user. The authentication decision can further be used in providing access for the authorized user to computing device functionalities, data, resources, or to dedicated premises or land. The authentication decision can also be used for generating one or more control commands for the computing device 100, its elements, or any other suitable peripheral device. For example, a control command may control one or more devices, such as a command to "unlock" the computing device.
[0041] As shown in FIGs. 2-4, the majority of the operation steps replicate or are similar to one another. Accordingly, the steps 220, 230, 240, 250, 260, 320, 330, 340, 350, 360, 420, 430, 440, 450, and 460 are combined together and described in detail below.
Determining Facial Position
[0042] Facial position is determined at least in steps 220, 320 and 420 based on an input image of a user/individual as received from the image sensor 114. In various example embodiments, facial position can be located and/or determined by machine-learning algorithms, pattern recognition algorithms, and/or statistical analysis configured to search for objects of a predetermined class (i.e., faces) in input images. Some examples of machine-learning algorithms include neural networks, heuristic methods, support vector machines, or a combination thereof.
[0043] One example of statistical analysis suitable for facial position detection includes Principal Components Analysis (PCA), which is a procedure that uses orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. This transformation can be defined in such a way that the first principal component has the largest possible variance, and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to the preceding components.
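As a minimal sketch only (not part of the original disclosure), PCA of flattened image patches can be computed with scikit-learn as follows; the patch size and the number of retained components are illustrative assumptions:

    # Minimal PCA sketch: project flattened image patches onto their principal components.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    patches = rng.random((100, 24 * 24))      # 100 flattened 24x24 image patches (synthetic)

    pca = PCA(n_components=10)                # keep the 10 components with the largest variance
    scores = pca.fit_transform(patches)       # each row: a patch described by 10 coefficients

    # The first component has the largest possible variance; each succeeding
    # component is orthogonal to the preceding ones.
    print(pca.explained_variance_ratio_)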
[0044] Another example method suitable for facial position detection is Linear Discriminant Analysis (LDA), also known as Fisher's linear discriminant method. LDA may be used to find a linear combination of features that characterizes or distinguishes two or more classes, or subclasses of one class.
[0045] Another example method suitable for facial position detection is the Viola-Jones object detection framework. This method adapted the idea of using Haar wavelets and developed the so-called Haar-like features. A Haar-like feature considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in these regions, and calculates the difference between them. This difference is then used to categorize subsections of an image. For example, consider an image database with human faces. It is a common observation that among all faces the region of the eyes is darker than the region of the cheeks. Therefore, a common Haar feature for face detection is a set of two adjacent rectangles that lie above the eye and the cheek region. The position of these rectangles is defined relative to a detection window that acts like a bounding box for the target object (the face in this case). In the detection phase of the Viola-Jones object detection framework, a window of the target size is moved over the input image, and for each subsection of the image the Haar-like feature is calculated. This difference is then compared to a learned threshold that separates non-objects from objects. Because such a Haar-like feature is only a weak learner or classifier (its detection quality is slightly better than random guessing), a large number of Haar-like features is necessary to describe an object with sufficient accuracy. In the Viola-Jones object detection framework, the Haar-like features are therefore organized in what is called a classifier cascade to form a strong learner or classifier. One advantage of the Haar-like feature over most other features is its calculation speed. Due to the use of integral images, a Haar-like feature of any size may be calculated in constant time. Accordingly, the Viola-Jones object detection framework may serve as one of the preferred algorithms for facial position detection performed in the steps 220, 320 and 420 of the methods 200-400.
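For illustration only, the sketch below runs the stock OpenCV Haar cascade (a Viola-Jones detector) on an image file; the file name "input.jpg" and the detector parameters are assumptions and not part of the original disclosure:

    # Illustrative Viola-Jones face detection using OpenCV's bundled Haar cascade.
    import cv2

    image = cv2.imread("input.jpg")                      # hypothetical input image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    detector = cv2.CascadeClassifier(cascade_path)

    # A detection window is moved over the image at several scales; each
    # subsection is evaluated by the cascade of Haar-like features.
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        face_area = gray[y:y + h, x:x + w]               # isolated face area
        print("face at", x, y, w, h)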
Locating Facial Landmarks
[0046] Generally, facial landmark points refer to various face elements such as inner/outer corner of eyes, eye centers (e.g., pupils), left/right corner of mouth, nose center, nose corners, ears, chin, eyebrows, among others. FIG. 5 shows an exemplary face image and some facial landmarks, which can be used by the methods for facial recognition and user authentication as described herein.
[0047] In example embodiments, facial landmarks are located using at least one of an Active Shape Model (ASM) searching process and an Active Appearance Model (AAM) searching process. In an example, the ASM searching process uses a statistical model of the shape of an object, which is iteratively deformed to fit an example of the object in a new image. In general, the principle of the ASM searching process consists in the use of statistical relations between the mutual arrangements of landmarks associated with a new image and at least one reference image. The process generally includes two steps: (a) locating an area associated with initial landmark point coordinates, and (b) iteratively adjusting the statistical model to define the landmark coordinates more precisely. This process is described below in greater detail.
[0048] Assume that there is a training sample based on L images of faces shot full-face, with N marked landmarks for each face, wherein all landmarks are numbered in the same order. The following equation defines coordinates of landmarks (in the system of coordinates of the image):
$(x_{ij}, y_{ij}), \quad j = 1, \dots, N, \quad i = 1, \dots, L$ (Equation No. 1)
[0049] To bring the point coordinates from all images to a uniform system, Generalized Procrustes Analysis (GPA) is carried out. In another example embodiment, when all face images have an identical scale, it is possible to limit this process to centering. In the following, the coordinates are considered to be centered. Now, L vectors of height 2N describing the "form" or "contour" of the arrangement of landmarks are defined as follows:
$s_i = (x_{i1}, \dots, x_{iN}, y_{i1}, \dots, y_{iN})^T$ (Equation No. 2)
[0050] Assume also that the average of the above vectors, and the corresponding centered vectors, are defined as follows:
$\bar{s} = \frac{1}{L}\sum_{i=1}^{L} s_i$ (Equation No. 3)
$\hat{s}_i = s_i - \bar{s}, \quad i = 1, \dots, L$ (Equation No. 4)
[0051] Further, based on the set of vectors $\hat{s}_i$, the covariance matrix of their coordinates is calculated as:
$K = \frac{1}{L}\sum_{i=1}^{L} \hat{s}_i \hat{s}_i^T$ (Equation No. 5)
[0052] Let $\lambda_j$ denote the eigenvalues of the matrix $K$, ordered in decreasing order, with corresponding eigenvectors $u_j$, $j = 1, 2, \dots, 2N$. The set of vectors $u_j$ forms a basis of the $2N$-dimensional vector space, and thus any vector $\hat{s}_i$ can be represented as a linear combination of them. Based on statistical relations between the landmark point coordinates, the representation can be replaced with the following approximation:
$s_i \approx \bar{s} + b_{i,1} u_1 + b_{i,2} u_2 + \dots + b_{i,P} u_P = \bar{s} + \Phi b_i$ (Equation No. 6)
where the matrix $\Phi$ includes $P$ principal components, i.e. the eigenvectors $u_j$, $j = 1, \dots, P$, corresponding to the $P$ largest eigenvalues, and $b_i$ is a vector of $P$ coefficients (also known as the parameters of the model). The value $P$ is selected based on the largest eigenvalues of the matrix $K$ (Equation No. 7).
[0053] The ASM model is defined by the matrix $\Phi$ and the vector $\bar{s}$. Any image/shape can approximately be described by means of the ASM model and the parameters obtained from the equation:
$b_i = \Phi^T (s_i - \bar{s})$ (Equation No. 8)
[0054] The vector $s$ can represent both a common pattern of landmark arrangement and the individual features of a specific face shape.
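As an illustrative sketch only (not part of the original disclosure), the shape model of Equations No. 2-8 can be built with NumPy as follows; the numbers of training shapes, landmarks, and retained components are assumptions:

    # Illustrative ASM shape-model construction (Equations No. 2-8)
    # for a synthetic training set of shape vectors of height 2N.
    import numpy as np

    rng = np.random.default_rng(0)
    L, N = 50, 68                                    # assumed: 50 training faces, 68 landmarks each
    shapes = rng.random((L, 2 * N))                  # rows are shape vectors s_i

    mean_shape = shapes.mean(axis=0)                 # Equation No. 3
    centered = shapes - mean_shape                   # Equation No. 4

    K = (centered.T @ centered) / L                  # Equation No. 5 (covariance matrix)
    eigvals, eigvecs = np.linalg.eigh(K)             # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]                # reorder to decreasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    P = 10                                           # assumed number of principal components
    Phi = eigvecs[:, :P]                             # matrix of the P principal components

    b = Phi.T @ (shapes[0] - mean_shape)             # model parameters (Equation No. 8)
    reconstructed = mean_shape + Phi @ b             # approximation per Equation No. 6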
[0055] According to example embodiments, the localization of landmarks in a new face image is carried out as follows. First, a face position is determined as described above. For example, a Viola-Jones classifier is utilized, which returns an image with the user face isolated.
[0056] The average shape defined by the vector $\bar{s}$ is aligned with the center of the isolated user face. In some embodiments, the coordinates of the average shape defined by $\bar{s}$ can be scaled, if required. The average shape defines an initial approximation to the landmark point coordinates, which can be considered as the initial iteration $s^{(0)}$.
[0057] Further, for each landmark point with number i, a Viola-Jones cascade classifier is trained. For the entire image, it generates a set of points classified as landmarks with number i. When the classifier is trained, positive examples are those areas of the image centered at a landmark point, and negative examples are areas that overlap with the positive examples.
[0058] At the $t$-th iteration of the algorithm and for the $i$-th landmark, the corresponding cascade classifier is applied to a small area of the image centered at the landmark with coordinates $(s^{(t)}_i, s^{(t)}_{i+N})$. As the classifier generates several points classified as anthropometrical landmarks, the located landmark point is taken as the one nearest to the landmark with the coordinates $(s^{(t)}_i, s^{(t)}_{i+N})$.
[0059] Assume $c^{(t)}$ is the shape consisting of the landmark points found by the cascade classifiers at the $t$-th iteration, and that the coordinates of this shape are centered and divided by a scale coefficient. This shape may be checked for conformity with the statistical ASM model. The result of the conformity check may define the shape for the next iteration. The procedure of landmark localization is repeated and terminates when a predetermined number of iterations has been performed (for example, three iterations). Accordingly, landmark points may be represented by three-dimensional (3D) coordinates and associated with a facial image. The process of locating landmarks as described above can be used in the steps 230, 330 and 430 of the methods 200-400.
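A minimal sketch of the conformity check with the statistical model (an assumption-based illustration, not the original procedure; the +/- 3*sqrt(lambda) limits on the parameters are a common ASM convention rather than something stated above) is given below:

    # Illustrative conformity check of a candidate shape against the ASM model:
    # project onto the model subspace and constrain the parameters b.
    import numpy as np

    rng = np.random.default_rng(1)
    N, P = 68, 10                                      # assumed landmark and component counts
    mean_shape = rng.random(2 * N)                     # stand-ins for a trained shape model
    Phi = np.linalg.qr(rng.random((2 * N, P)))[0]      # orthonormal columns (principal components)
    eigvals = np.sort(rng.random(P))[::-1]

    candidate = mean_shape + rng.normal(scale=0.05, size=2 * N)   # shape found by the classifiers

    b = Phi.T @ (candidate - mean_shape)               # model parameters (Equation No. 8)
    limit = 3.0 * np.sqrt(eigvals)                     # assumed +/- 3*sqrt(lambda) constraint
    b = np.clip(b, -limit, limit)

    next_shape = mean_shape + Phi @ b                  # shape used for the next iteration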
Face Image Transformation
[0060] The face image transformation also refers to image pre-processing required to bring the face image to a uniform image that is in a better condition for further processing. For example, the image transformation may include the following operations. First, a color face image or a part thereof can be transformed into a half-tone image or a monotone (single-colored) image.
[0061] Second, an area of interest can be isolated from the image. This process can include the following steps: (a) rotating at least a portion of the face image until landmarks associated with user pupils are horizontally oriented; (b) scaling the face image (for example, until landmarks associated with user pupils are at a predetermined distance from each other, e.g. 60 pixels between the pupils); and/or (c) cropping the face image or its part to create an image of a predetermined pixel size. FIGs. 6 and 7 show example face images illustrating the above processes. Namely, FIG. 6 shows an exemplary input image of a user. FIG. 7 shows the same image as in FIG. 6, but subjected to the process of rotation and cropping as outlined above to isolate the area of interest, i.e. a user face.
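A minimal sketch of this rotate-scale-crop normalization is given below (illustrative only; the pupil coordinates, the target inter-pupil distance of 60 pixels, and the crop size are assumptions based on the example above):

    # Illustrative face alignment: rotate so the pupils are horizontal,
    # scale to a fixed inter-pupil distance, and crop a fixed-size area of interest.
    import numpy as np
    import cv2

    face = np.full((480, 640), 128, dtype=np.uint8)             # stand-in grayscale face image
    left_pupil, right_pupil = (260.0, 240.0), (330.0, 260.0)    # assumed landmark coordinates

    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    angle = np.degrees(np.arctan2(dy, dx))                      # rotation making the pupils horizontal
    scale = 60.0 / np.hypot(dx, dy)                             # scale to 60 px between the pupils

    center = ((left_pupil[0] + right_pupil[0]) / 2.0,
              (left_pupil[1] + right_pupil[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    aligned = cv2.warpAffine(face, M, (face.shape[1], face.shape[0]))

    # Crop a fixed-size area of interest around the midpoint between the pupils.
    x, y = int(center[0]) - 64, int(center[1]) - 48
    area_of_interest = aligned[y:y + 128, x:x + 128]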
[0062] Third, the image transformation may include adjusting the brightness of at least a portion of the face image. The brightness adjustment may include a number of processes including, but not limited to, Single Scale Retinex (SSR) filtering, a homomorphic filtering based normalization technique, a Discrete Cosine Transform (DCT) based normalization technique, and a wavelet based normalization technique, among others. In addition, the brightness adjustment may include histogram correction of the face image. Furthermore, the brightness adjustment may include contrast enhancement. It should be noted that the above listed procedures are just examples that can be used for brightness adjustment, that their order can differ from the one listed above, and that these example procedures can be used interchangeably and their parameters may vary in order to achieve better results in image processing. FIG. 8 shows the example face image of FIG. 7 subjected to the brightness adjustment as outlined above; in other words, FIG. 8 shows a uniform image of the user face. Accordingly, the image transformation process as described above is used in the steps 240, 340 and 440 of the methods 200-400.
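For illustration only (a sketch under assumed parameters, not the original implementation), histogram correction and contrast enhancement of the area of interest could look as follows:

    # Illustrative brightness adjustment: global histogram equalization
    # followed by CLAHE-based contrast enhancement.
    import numpy as np
    import cv2

    rng = np.random.default_rng(0)
    area_of_interest = rng.integers(40, 120, size=(128, 128), dtype=np.uint8)  # stand-in face crop

    equalized = cv2.equalizeHist(area_of_interest)           # histogram correction
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    uniform_image = clahe.apply(equalized)                   # contrast enhancement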
Synthesizing Angle Shots
[0063] Generally, angle shots are artificial images taken or created by a virtual camera at an angle from the horizontal or vertical lines. In the present technology, angle shots may be created by rotation of a 3D representation of a user face at different angles with a fixed virtual camera. The rotation may create any of a yaw angle, pitch angle, and/or roll angle. Accordingly, each angle shot may be characterized by yaw, pitch and roll angles. One example of the angle shot may be characterized as {yaw=15; pitch=-15; roll=10}. The synthesis (generation) of the angle shots includes the following operations performed by the computing device 100: superposing a texture image of an authorized user, such as one shown in FIG. 7, with multiple points associated with a reference three-dimensional (3D) facial image of the same or another user; rotating the reference 3D facial image with the texture image in concert with each other; and generating new images corresponding to multiple angle shots based upon the rotated reference 3D facial image with the texture image at different angles. This process is used in the step 250 of method 200 for user registration as described above.
[0064] Notably, for the synthesis of angle shots, it is necessary to maintain a database of 3D images of a group of people (preferably of different sex, age and ethnicity), wherein the reference 3D facial images are associated with their corresponding reference 2D half-tone or color images, which are also stored in the database, such as in the memory 104 or storage device 106. In some examples, the reference 2D facial images are uniform face images as discussed above.
[0065] Reference 3D images can be represented by depth maps, wherein each pixel of these images includes information related to a distance between the image sensor 114 and certain parts of the user face or other objects. The association of reference 2D images with the depth maps means that each pixel of a reference 3D image also includes data related to brightness and/or color. In some example embodiments, the reference 3D facial images associated with reference 2D images may be pre-processed/transformed as described above with reference to the steps 240, 340 and 440.
[0066] More particularly, the synthesis process of angle shots includes the following operations. The computing device 100 finds similarity between the texture image and one or more of the plurality of reference 3D and 2D images, and then, based on the similarity, the computing device 100 selects the reference 3D and 2D image that is most similar to the texture image. This process of finding similarity can be implemented by a machine-learning algorithm or a statistical analysis. Some examples of the methods suitable for finding similarity include Principal Components Analysis (PCA) or a discriminant analysis such as Linear Discriminant Analysis (LDA) or Fisher's linear discriminant analysis. Referring back to the drawings, FIG. 9 shows an exemplary 3D image, which is selected by the computing device 100 as the most similar to a texture image such as the one shown in FIG. 8 (the corresponding 2D image of the selected 3D image is not shown).
[0067] Further, a homography-based process is utilized to match landmarks associated with the selected 2D image and landmarks associated with the texture image to find conformity therebetween. The homography-based process refers to a perspective transformation of one plane into another. Therefore, having a first set of landmarks associated with one image and a second set of landmarks, corresponding to the first set but associated with another image, it is possible to find conformity between these two images in the form of a homographic matrix. One example of a homography-based process for finding conformity is the Random Sample Consensus (RANSAC) method. RANSAC is an iterative process for estimating parameters from a set of observed data. A basic assumption of this process is that the observed data consists of "inliers," i.e., data whose distribution can be explained by a certain model, though possibly subject to noise, and "outliers," which are data that do not fit said model. The outliers can come, for example, from extreme values of noise or from erroneous measurements or incorrect hypotheses about the interpretation of data. The RANSAC process also assumes that, given a set of inliers, there exists a procedure which can estimate the parameters of a model that optimally explains or fits this data. In this technology, RANSAC makes an iterative estimation of model parameters for randomly selected landmarks.
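A hedged illustration (not the original code) of estimating such a homographic matrix from matched landmark points with RANSAC in OpenCV is sketched below; the point coordinates and the warp are synthetic assumptions:

    # Illustrative RANSAC homography estimation between two landmark sets.
    import numpy as np
    import cv2

    rng = np.random.default_rng(0)
    texture_pts = (rng.random((20, 1, 2)) * 100.0).astype(np.float32)   # landmarks in the texture image

    # Synthetic "reference 2D image" landmarks: a known projective warp plus a few outliers.
    H_true = np.array([[1.05, 0.02, 3.0],
                       [0.01, 0.98, -2.0],
                       [0.0005, 0.0002, 1.0]])
    reference_pts = cv2.perspectiveTransform(texture_pts.astype(np.float64), H_true).astype(np.float32)
    reference_pts[:3] += rng.normal(scale=15.0, size=(3, 1, 2)).astype(np.float32)   # outliers

    H, inlier_mask = cv2.findHomography(texture_pts, reference_pts, cv2.RANSAC, 3.0)
    print("inliers:", int(inlier_mask.sum()), "of", len(texture_pts))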
[0068] Further, the depth map of the most similar 3D image located, such as the one shown in FIG. 9, is transformed into a point cloud, or in other words, a vertex set in a 3D coordinate system. Further, the texture image is taken as a texture and applied to the point cloud. FIG. 10 shows the result of applying the texture image, as shown in FIG. 8, to a point cloud. Next, the point cloud along with the "attached" texture is rotated and multiple shots are taken at different angles, which constitute the angle shots. FIGs. 11A-11D illustrate exemplary angle shots created with respect to the registration image (such as the one shown in FIG. 6) and based on the technology described herein. Specifically, FIGs. 11A-11D show angle shots rotated at different angles relative to the horizontal and vertical axes (i.e., yaw and pitch angles): FIG. 11A illustrates an angle shot taken at the yaw angle of +15 degrees, FIG. 11B illustrates an angle shot taken at the yaw angle of -15 degrees, FIG. 11C illustrates an angle shot taken at the pitch angle of +15 degrees, and FIG. 11D illustrates an angle shot taken at the pitch angle of -15 degrees.
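As a simplified sketch under stated assumptions (synthetic depth map, orthographic re-projection, no occlusion handling; not the original renderer), synthesizing one angle shot from a textured point cloud could look as follows:

    # Illustrative angle-shot synthesis: depth map -> point cloud, yaw rotation,
    # orthographic re-projection of the "attached" texture.
    import numpy as np

    H, W = 128, 128
    depth = np.full((H, W), 500.0)                      # stand-in depth map
    texture = np.full((H, W), 128, dtype=np.uint8)      # stand-in texture (uniform face image)

    ys, xs = np.mgrid[0:H, 0:W]
    points = np.stack([xs - W / 2.0, ys - H / 2.0, depth - depth.mean()], axis=-1).reshape(-1, 3)
    colors = texture.reshape(-1)

    yaw = np.radians(15.0)                              # e.g. {yaw=+15; pitch=0; roll=0}
    R = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(yaw), 0.0, np.cos(yaw)]])
    rotated = points @ R.T

    # Orthographic projection back to the image plane (simplified: no z-buffer).
    shot = np.zeros((H, W), dtype=np.uint8)
    u = np.clip((rotated[:, 0] + W / 2.0).astype(int), 0, W - 1)
    v = np.clip((rotated[:, 1] + H / 2.0).astype(int), 0, H - 1)
    shot[v, u] = colors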
Feature Vector Extraction
[0069] In general, features extracted from facial images may relate to pixel data such as coordinates, brightness, color, a depth value, among other things.
In the present technology, it is preferable to use, as the features, the responses of a linear filter, such as the responses of two-dimensional Gabor filters applied to the transformed (pre-processed) face images.
[0070] In some example embodiments, responses can be calculated as follows. The impulse response of Gabor filter is defined as the product of
Gaussian function by a harmonic function. Accordingly, the Gabor filter can be defined, for example, as follows:
$\psi(x, y) = \frac{f^2}{\pi \gamma \eta} \exp\left(-\frac{f^2}{\gamma^2} x_t^2 - \frac{f^2}{\eta^2} y_t^2\right) \exp\left(i 2 \pi f x_t\right),$
$x_t = x \cos\theta + y \sin\theta,$
$y_t = -x \sin\theta + y \cos\theta$ (Equation No. 9)
where $x$ and $y$ are pixel coordinates, $f$ is the frequency of the complex sine curve, $\theta$ is the filter orientation, $\gamma$ is the spatial width of the filter along the sinusoidal wave, and $\eta$ is the spatial width of the filter perpendicular to the wave.
[0071] In one example, each transformed image is convolved with a set of 2D Gabor filters, which may include, but is not limited to, forty various filters. FIG. 12 shows a graphical representation of forty Gabor filters suitable for implementing the methods described herein.
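For illustration only, the sketch below builds a bank of forty OpenCV Gabor kernels (eight orientations by five scales, an assumed split) and convolves an image with them to form a feature vector; all parameter values are assumptions rather than the original filter settings:

    # Illustrative Gabor filter bank (8 orientations x 5 scales = 40 filters)
    # and conversion of the filter responses into a single feature vector.
    import numpy as np
    import cv2

    rng = np.random.default_rng(0)
    uniform_image = rng.integers(0, 255, size=(128, 128), dtype=np.uint8).astype(np.float32)

    kernels = []
    for scale in range(5):
        wavelength = 4.0 * (2 ** scale)                  # assumed wavelengths
        for k in range(8):
            theta = k * np.pi / 8.0                      # assumed orientations
            kernels.append(cv2.getGaborKernel((31, 31), sigma=wavelength / 2.0,
                                              theta=theta, lambd=wavelength,
                                              gamma=0.5, psi=0.0))

    responses = [cv2.filter2D(uniform_image, cv2.CV_32F, k) for k in kernels]
    # Downsample each response and concatenate all of them into one feature vector.
    feature_vector = np.concatenate([r[::4, ::4].ravel() for r in responses])
    print(feature_vector.shape)                          # 40 * 32 * 32 components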
[0072] In other example embodiments, the convolution in the spatial domain can be replaced with multiplication, in the frequency domain, of the Fourier images of the input image and the filter impulse response, followed by an inverse Fourier transformation of the multiplication product.
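A brief illustrative check of this frequency-domain equivalence (not from the original disclosure; circular convolution of same-size arrays is assumed) with NumPy:

    # Spatial (circular) convolution equals the inverse FFT of the product
    # of the Fourier images of the input image and the filter impulse response.
    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    impulse_response = rng.random((5, 5))              # stand-in filter impulse response

    padded = np.zeros_like(image)
    padded[:5, :5] = impulse_response                  # pad the filter to the image size

    via_fft = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

    direct = np.zeros_like(image)                      # direct circular convolution, for comparison
    for di in range(5):
        for dj in range(5):
            direct += impulse_response[di, dj] * np.roll(image, shift=(di, dj), axis=(0, 1))

    print(np.allclose(via_fft, direct))                # True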
[0073] In one example, as a result of convolving the transformed face image with forty Gabor filters, forty response images are generated, which are then converted into a single vector. This vector is further considered as the feature vector analyzed by the image recognition method. The process of retrieving a feature vector as described above can be used in the steps 260, 350, and 450 of the methods 200-400.
Comparing Feature Vectors
[0074] Comparing feature vectors with reference feature vectors, as required in the steps 360 and 460 of the methods 300 and 400, can be implemented in different ways. In some examples, statistical algorithms can be utilized; in other examples, machine-learning algorithms can be utilized; and in yet other examples, a combination of the foregoing can be utilized. In one embodiment, LDA or Fisher's linear discriminant method can be used to find the similarity between feature vectors.
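An illustrative, assumption-based sketch of such a comparison using scikit-learn's LDA (the class labels, sample counts, feature dimensionality, and acceptance threshold are not from the original disclosure) is shown below:

    # Illustrative comparison of a probe feature vector with enrolled users
    # using Linear Discriminant Analysis (Fisher's method).
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    n_users, shots_per_user, dim = 3, 5, 200          # e.g. a registration image plus 4 angle shots per user
    reference_vectors = rng.random((n_users * shots_per_user, dim))
    labels = np.repeat(np.arange(n_users), shots_per_user)

    lda = LinearDiscriminantAnalysis()
    lda.fit(reference_vectors, labels)                # trained on stored reference feature vectors

    probe = rng.random(dim)                           # feature vector of the captured image
    probabilities = lda.predict_proba(probe.reshape(1, -1))[0]
    best = int(np.argmax(probabilities))

    threshold = 0.9                                   # assumed acceptance threshold
    print("matched user:", best, "accepted:", probabilities[best] >= threshold)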
Conclusion
[0075] Thus, methods and systems for facial recognition and user authentication have been described. The technology described herein has significantly improved the quality and reliability of facial recognition, especially in cases when just a single user registration image is available. In particular, tests performed with respect to 100 individuals using a traditional facial recognition method, which utilized a single full-face image of each user for training a machine-learning algorithm, showed that at a FAR value of 0.1%, the probability of identification was up to 85.60%. In similar tests performed utilizing the present technology, which used not only a single full-face image of each user but also four additional synthesized angle shots for training the machine-learning algorithm, the probability of identification was increased up to 98.80% at a FAR value of 0.1%. These results illustrate the performance of the present technology, enabling its use when user images are captured in conditions significantly different from those under which the registration image was taken.
[0076] Although embodiments have been described with reference to specific example embodiments, it will be evident that various modifications and changes can be made to these example embodiments without departing from the broader spirit and scope of the present application. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

Claims

1. A method for user authentication, the method comprising:
acquiring, by one or more processors, an input image associated with a user, the input image shows at least the user face;
locating, by the one or more processors, multiple facial landmarks associated with the user;
transforming, by the one or more processors and based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user face;
retrieving, by the one or more processors, a feature vector based upon the uniform image of the user face;
comparing, by the one or more processors, the feature vector with at least one stored reference feature vector; and
based on the comparison, making, by the one or more processors, an authentication decision with respect to the user.
2. The method of claim 1, further comprising creating, by the one or more processors, at least one user profile, wherein the at least one user profile comprises at least a reference feature vector associated with an authorized user.
3. The method of claim 2, wherein the creating of the at least one user profile comprising:
acquiring, by the one or more processors, at least one registration image associated with the authorized user;
determining, by the one or more processors, a facial position of the authorized user based on the at least one registration image;
locating, by the one or more processors, multiple facial landmarks associated with the authorized user;
transforming, by the one or more processors, at least a portion of the at least one registration image into a texture image of the authorized user face;
generating, by the one or more processors, multiple angle shots of the authorized user face based upon the texture image of the authorized user face;
creating, by the one or more processors, a biometric pattern associated with the multiple angle shots of the user face; and
storing, by the one or more processors, the biometric pattern associated with the authorized user in a database.
4. The method of claim 3, wherein the biometric pattern comprises at least one feature vector associated with the multiple angle shots of the authorized user face.
5. The method of claim 4, wherein the feature vector includes pixel data of the texture image.
6. The method of claim 4, wherein the feature vector includes one or more impulse responses of a linear filter or Gabor filter.
7. The method of claim 3, wherein the generation of the multiple angle shots comprising:
superposing, by the one or more processors, the texture image of the authorized user with multiple points associated with a stored three-dimensional (3D) facial image;
rotating, by the one or more processors, the 3D facial image with the texture image in concert with each other; and
generating, by the one or more processors, images of the multiple angle shots based upon the rotated 3D facial image with the texture image.
8. The method of claim 7, wherein the texture image includes a full-face image of the authorized user.
9. The method of claim 7, further comprising:
finding similarity, by the one or more processors, between the texture image and one of a plurality of reference two-dimensional (2D) facial images, wherein each one of the plurality of reference 2D facial images is associated with a reference 3D facial image; and
based on the similarity, selecting, by the one or more processors, the 3D facial image related to the reference 2D image being the most similar to the texture image.
10. The method of claim 9, wherein the finding similarity is based on a discriminant analysis.
11. The method of claim 10, wherein the finding similarity is based on a Linear Discriminant Analysis (LDA) or Gabor filter analysis.
12. The method of claim 9, wherein the finding similarity comprising matching, by the one or more processors, first facial landmarks associated with the texture image and second facial landmarks associated with the plurality of reference 2D facial images.
13. The method of claim 1, further comprising:
determining, by the one or more processors, a position of the user face; and
isolating, by the one or more processors, an image area associated with the user face from the input image, wherein the transforming is performed with respect to the image area.
14. The method of claim 1, wherein the machine-learning algorithm comprises one or more heuristic algorithms, one or more support vector machines, one or more neural network algorithms, or a combination thereof.
15. The method of claim 1, wherein the determination of the position of the user face comprising a facial object recognition processing.
16. The method of claim 15, wherein the facial object recognition processing includes Viola-Jones facial recognition processing.
17. The method of claim 1, wherein the multiple facial landmarks relate to at least one of the following: a pupil, an eye corner, a nose, a nose corner, and a mouth corner.
18. The method of claim 1, wherein the locating of the multiple facial landmarks comprising at least one of: an Active Shape Model (ASM) searching and an Active Appearance Model (AAM) searching.
19. The method of claim 1, wherein the transforming of the at least a portion of the input image to generate the uniform image of the user face comprising: transforming the at least a portion of the input image into a half-tone image, wherein the input image is a color picture.
20. The method of claim 1, wherein the transforming of the at least a portion of the input image to generate the uniform image of the user face comprising: rotating of the at least a portion of the input image until facial landmarks associated with user pupils are horizontally oriented.
21. The method of claim 1, wherein the transforming of the at least a portion of the input image to generate the uniform image of the user face comprising: scaling of the at least a portion of the input image.
22. The method of claim 21, wherein the scaling is performed until a distance between facial landmarks related to user pupils is of a predetermined value.
23. The method of claim 1, wherein the transforming of the at least a portion of the input image to generate the uniform image of the user face comprising: cropping the at least a portion of the input image to generate an image of a predetermined pixel size.
24. The method of claim 1, wherein the transforming of the at least a portion of the input image to generate the uniform image of the user face comprising: adjusting a brightness of the at least a portion of the input image.
25. The method of claim 24, wherein the adjusting of the brightness comprises Single Scale Retinex (SSR) filtering.
26. The method of claim 24, wherein the adjusting of the brightness comprises histogram correcting of the at least a portion of the input image.
27. The method of claim 24, wherein the adjusting of the brightness comprises contrast enhancing.
28. A system for user authentication, the system comprising: at least one processor and a memory having processor-readable code embodied therein for programming the processor to perform a face recognition method, wherein the method comprises:
acquiring, from an image sensor, an input image associated with a user, the input image shows at least the user face;
locating multiple facial landmarks associated with the user face;
transforming, based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user;
retrieving a feature vector based upon the uniform image of the user face;
comparing the feature vector with at least one stored reference feature vector using a machine-learning algorithm; and
based on the comparison, making an authentication decision with respect to the user.
29. A non-transitory processor-readable medium having instructions stored thereon, which when executed by one or more processors, cause the one or more processors to implement a method for user authentication, the method comprising:
acquiring, from an image sensor, an input image associated with a user, the input image shows at least the user face;
locating multiple facial landmarks associated with the user face;
transforming, based on the multiple facial landmarks, at least a portion of the input image into a uniform image of the user;
retrieving a feature vector based upon the uniform image of the user face;
comparing the feature vector with at least one stored reference feature vector; and
based on the comparison, making an authentication decision with respect to the user.
30. A method for facial recognition, the method comprising:
acquiring, by one or more processors, a facial image of an individual;
determining, by the one or more processors, a facial position based on the facial image;
locating, by the one or more processors, multiple facial landmarks associated with the individual;
transforming, by the one or more processors and based on the multiple facial landmarks, at least a portion of the facial image into a uniform facial image, wherein the uniform facial image represents a full-face image of the individual;
creating, by the one or more processors, a feature vector based upon the uniform facial image;
comparing, by the one or more processors, the feature vector with at least one stored reference feature vector; and
based on the comparison, determining, by the one or more processors, identity of the individual.
31. A system for facial recognition, the system comprising: at least one processor and a memory having processor-readable code embodied therein for programming the processor to perform a face recognition method, wherein the method comprises:
acquiring, by one or more processors, a facial image of an individual;
determining, by the one or more processors, a facial position based on the facial image;
locating, by the one or more processors, multiple facial landmarks associated with the individual;
transforming, by the one or more processors and based on the multiple facial landmarks, at least a portion of the facial image into a uniform facial image, wherein the uniform facial image represents a full-face image of the individual;
creating, by the one or more processors, a feature vector based upon the uniform facial image;
comparing, by the one or more processors, the feature vector with at least one stored reference feature vector; and
based on the comparison, determining, by the one or more processors, identity of the individual.
32. A non-transitory processor-readable medium having instructions stored thereon, which when executed by one or more processors, cause the one or more processors to implement a method for facial recognition, the method comprising:
acquiring, by one or more processors, a facial image of an individual;
determining, by the one or more processors, a facial position based on the facial image;
locating, by the one or more processors, multiple facial landmarks associated with the individual;
transforming, by the one or more processors and based on the multiple facial landmarks, at least a portion of the facial image into a uniform facial image, wherein the uniform facial image represents a full-face image of the individual;
creating, by the one or more processors, a feature vector based upon the uniform facial image;
comparing, by the one or more processors, the feature vector with at least one stored reference feature vector; and
based on the comparison, determining, by the one or more processors, identity of the individual.
PCT/RU2014/000089 2014-02-11 2014-02-11 Facial recognition and user authentication method WO2015122789A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/RU2014/000089 WO2015122789A1 (en) 2014-02-11 2014-02-11 Facial recognition and user authentication method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/RU2014/000089 WO2015122789A1 (en) 2014-02-11 2014-02-11 Facial recognition and user authentication method

Publications (1)

Publication Number Publication Date
WO2015122789A1 true WO2015122789A1 (en) 2015-08-20

Family

ID=53800427

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/RU2014/000089 WO2015122789A1 (en) 2014-02-11 2014-02-11 Facial recognition and user authentication method

Country Status (1)

Country Link
WO (1) WO2015122789A1 (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100189342A1 (en) * 2000-03-08 2010-07-29 Cyberextruder.Com, Inc. System, method, and apparatus for generating a three-dimensional representation from one or more two-dimensional images
US20070086627A1 (en) * 2005-10-18 2007-04-19 Samsung Electronics Co., Ltd. Face identification apparatus, medium, and method
US20130195320A1 (en) * 2006-08-11 2013-08-01 DigitalOptics Corporation Europe Limited Real-Time Face Tracking in a Digital Image Acquisition Device
US20090060290A1 (en) * 2007-08-27 2009-03-05 Sony Corporation Face image processing apparatus, face image processing method, and computer program
WO2011055164A1 (en) * 2009-11-06 2011-05-12 Vesalis Method for illumination normalization on a digital image for performing face recognition
US20130070973A1 (en) * 2011-09-15 2013-03-21 Hiroo SAITO Face recognizing apparatus and face recognizing method

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018064214A1 (en) * 2016-09-30 2018-04-05 Alibaba Group Holding Limited Facial recognition-based authentication
US11551482B2 (en) 2016-09-30 2023-01-10 Alibaba Group Holding Limited Facial recognition-based authentication
US10997445B2 (en) 2016-09-30 2021-05-04 Alibaba Group Holding Limited Facial recognition-based authentication
US10762368B2 (en) 2016-09-30 2020-09-01 Alibaba Group Holding Limited Facial recognition-based authentication
RU2647689C1 (en) * 2017-03-01 2018-03-16 Общество с ограниченной ответственностью "Рилейшн Рейт" Method of the client's portrait construction
WO2019067223A1 (en) * 2017-09-29 2019-04-04 General Electric Company Automatic authentication for access control using facial recognition
CN111133433A (en) * 2017-09-29 2020-05-08 通用电气公司 Automatic authentication for access control using facial recognition
CN111133433B (en) * 2017-09-29 2023-09-05 通用电气公司 Automatic authentication for access control using face recognition
WO2019074240A1 (en) * 2017-10-11 2019-04-18 삼성전자주식회사 Server, method for controlling server, and terminal device
US11552944B2 (en) 2017-10-11 2023-01-10 Samsung Electronics Co., Ltd. Server, method for controlling server, and terminal device
US11138305B1 (en) 2017-11-30 2021-10-05 Wells Fargo Bank, N.A. Pupil dilation response for authentication
US10685101B1 (en) 2017-11-30 2020-06-16 Wells Fargo Bank, N.A. Pupil dilation response for authentication
US11687636B1 (en) 2017-11-30 2023-06-27 Wells Fargo Bank, N.A. Pupil dilation response for authentication
CN108036746A (en) * 2017-12-26 2018-05-15 太原理工大学 A kind of Gabor transformation based on Spectrum Method realizes carbon fibre composite surface texture analysis method
CN108090983A (en) * 2017-12-29 2018-05-29 新开普电子股份有限公司 A kind of device of registering based on recognition of face
CN109001702A (en) * 2018-06-04 2018-12-14 桂林电子科技大学 Carrier-free ultra-wideband radar human body action identification method
US11394705B2 (en) 2018-07-10 2022-07-19 Ademco Inc. Systems and methods for verifying credentials to perform a secured operation in a connected system
WO2020114135A1 (en) * 2018-12-06 2020-06-11 西安光启未来技术研究院 Feature recognition method and apparatus
US20210049291A1 (en) * 2019-08-13 2021-02-18 Caleb Sima Securing Display of Sensitive Content from Ambient Interception
WO2022001097A1 (en) * 2020-06-30 2022-01-06 公安部第三研究所 Algorithm evaluation system and test method for performance test of person and certificate verification device
US11921831B2 (en) 2021-03-12 2024-03-05 Intellivision Technologies Corp Enrollment system with continuous learning and confirmation
CN112990101A (en) * 2021-04-14 2021-06-18 深圳市罗湖医院集团 Facial organ positioning method based on machine vision and related equipment
CN112990101B (en) * 2021-04-14 2021-12-28 深圳市罗湖医院集团 Facial organ positioning method based on machine vision and related equipment

Similar Documents

Publication Publication Date Title
WO2015122789A1 (en) Facial recognition and user authentication method
US11915515B2 (en) Facial verification method and apparatus
CN108073889B (en) Iris region extraction method and device
US11188734B2 (en) Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices
KR102299847B1 (en) Face verifying method and apparatus
US10956719B2 (en) Depth image based face anti-spoofing
KR102290392B1 (en) Method and apparatus for registering face, method and apparatus for recognizing face
US11625954B2 (en) Method and apparatus with liveness testing
US11869272B2 (en) Liveness test method and apparatus and biometric authentication method and apparatus
US10922399B2 (en) Authentication verification using soft biometric traits
US11238271B2 (en) Detecting artificial facial images using facial landmarks
JP2008015871A (en) Authentication device and authenticating method
KR102380426B1 (en) Method and apparatus for verifying face
Lin et al. A novel framework for automatic 3D face recognition using quality assessment
JP6430987B2 (en) Reference point position determination device
Radu et al. On combining information from both eyes to cope with motion blur in iris recognition
Pavithra et al. Scale Invariant Feature Transform Based Face Recognition from a Single Sample per Person
Hesson Detecting GAN-generated face morphs using human iris characteristics
Shiwani et al. PCA Based Improved Algorithm for Face Recognition
Mahmood et al. MATLAB Implementation of Face Identification Using Principal Component Analysis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14882241

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14882241

Country of ref document: EP

Kind code of ref document: A1