WO2013159686A1 - Three-dimensional face recognition for mobile devices - Google Patents
- Publication number
- WO2013159686A1 (PCT/CN2013/074511)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- person
- image
- dimensional model
- images
- determining
- Prior art date
- 2012-04-25
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
- G06V10/993—Evaluation of the quality of the acquired pattern
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/653—Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/60—Static or dynamic means for assisting the user to position a body part for biometric acquisition
- G06V40/67—Static or dynamic means for assisting the user to position a body part for biometric acquisition by interactive indications to the user
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/24—Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- FIG. 3 illustrates a plurality of detected facial features from a two-dimensional image 300 in accordance with an embodiment. Image 300 illustrates a plurality of feature points (illustrated using cross marks) for a set of facial features, such as left eye features 302 and right eye features 304, as well as left eyebrow features 306 and right eyebrow features 308.
- The detected features can also include nose features 310, lip features 312, and jawline features 314. Other possible features include the hairline, the chin, the ears, etc.
- The detected features can also include feature points surrounding other facial anomalies that are not found on every face, such as a dimple, a birthmark, a scar, or a tattoo.
- FIG. 4 presents a flow chart illustrating a method for capturing a set of images of a local user in accordance with an embodiment.
- During operation, the image-capture device can determine whether it is ready to capture an image (operation 402). For example, the device may determine that the user is sweeping the image-capture device too fast, which could result in a blurry image. If the device is not ready, the device can notify the user that it cannot capture an image (operation 404), for example, by playing a sound, generating a certain vibration pattern, generating a flash pattern (e.g., using the camera's flash), or displaying an image on the device's screen. When the user notices the notification, the user can respond by slowing down his/her sweeping motion of the image-capture device.
- If the device is ready, it captures the image (operation 406) and processes the image to determine facial features of the local user and a device orientation for the captured image (operation 408).
- The device then determines whether the image is suitable for detecting features of the local user (operation 410). For example, the image-capture device can determine whether it can detect a face, and/or whether it can detect a sufficient number of facial features. If the captured image corresponds to the front of the user's face, the device may expect to detect at least six facial features. However, if the device determines that the captured image is a profile view of the user's face, the device may expect to detect at least three or four facial features.
- If the image is not suitable, the device can return to operation 404 to notify the user of this problem. The user can respond by re-aligning the image-capture device so that the user's face is visible in the captured image, by ensuring there is sufficient ambient light for capturing an image, and/or by ensuring that the device is steady enough for capturing an in-focus image.
- Otherwise, the device stores the image, the detected feature points, and the device orientation for the captured image (operation 412).
- The device then determines whether it has captured enough images for generating the three-dimensional model (operation 414). If so, the device can proceed to an end terminal.
- Otherwise, the device monitors the change in its orientation from that of the previously stored image (operation 416), and determines whether the orientation has changed by at least a minimum threshold (operation 418). If the device's orientation has not changed beyond this threshold (e.g., a captured image would be too similar to that of a previously stored image), the device can return to operation 416 after a short delay (e.g., a few milliseconds).
- Once the orientation has changed sufficiently, the device can return to operation 402 to capture another image. The device can continue to perform method 400 until it has captured enough images from which it can generate a three-dimensional model of the user's face; a minimal sketch of this capture loop follows.
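The gating logic of FIG. 4 can be summarized in a short sketch. The thresholds, the number of required views, and the `camera`/`gyro`/`notify`/`detect_points` interfaces below are all assumptions for illustration; the patent specifies the control flow but not concrete values or APIs.

```python
import time
import numpy as np

MIN_ANGLE_DELTA_DEG = 10.0   # assumed minimum orientation change between views
MAX_ANGULAR_RATE_DEG = 15.0  # assumed rate below which the device counts as stable
NUM_VIEWS_NEEDED = 7         # assumed number of stored views for a usable model

def capture_sweep(camera, gyro, notify, detect_points):
    """Sketch of the FIG. 4 loop. camera.capture() -> image,
    gyro.orientation() -> (pitch, yaw, roll) in degrees,
    gyro.angular_rate() -> degrees/second, notify(msg), and
    detect_points(image) -> list | None are hypothetical platform hooks."""
    stored = []
    last = None
    while len(stored) < NUM_VIEWS_NEEDED:            # operation 414
        orient = np.array(gyro.orientation())
        # Operations 416/418: require a minimum orientation change.
        if last is not None and np.abs(orient - last).max() < MIN_ANGLE_DELTA_DEG:
            time.sleep(0.005)                        # short delay, a few milliseconds
            continue
        # Operations 402/404: only capture while the device is stabilized.
        if gyro.angular_rate() > MAX_ANGULAR_RATE_DEG:
            notify("Hold the device steadier")       # sound/vibration/flash/screen
            continue
        image = camera.capture()                     # operation 406
        points = detect_points(image)                # operation 408
        if not points or len(points) < 3:            # operation 410
            notify("Face not fully visible; re-aim the camera")
            continue
        stored.append((image, points, orient))       # operation 412
        last = orient
    return stored
```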
- FIG. 5A illustrates a motion trajectory 500 of an image-capture device 502 during an image capture operation in accordance with an embodiment.
- For example, the image-capture device captures image 506.1 and orientation data 508.1 while the device is in orientation 504.1. Similarly, the device can capture images 506.2 through 506.j and orientation data 508.2 through 508.j for device orientations 504.2 through 504.j, respectively.
- The image-capture device can determine orientation data 508 using any motion sensor, now known or later developed, that can determine absolute or relative three-dimensional coordinates for each captured image.
- For example, the motion sensor can include a gyroscope that provides three rotation angles about axes of the device (e.g., the pitch, yaw, and roll angles about the X, Y, and Z axes, respectively) for each captured image; a sketch of how these angles yield a rotation matrix follows.
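A rotation matrix can be assembled from the reported pitch, yaw, and roll angles along these lines. This is a minimal numpy sketch; the axis convention and composition order are assumptions, since the patent only states that the gyroscope reports the three angles per captured image.

```python
import numpy as np

def rotation_from_euler(pitch, yaw, roll):
    """3x3 rotation from gyroscope angles (radians) about X, Y, Z respectively.

    The composition order R = Rz @ Ry @ Rx is an assumption; the patent does
    not fix a convention."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Relative rotation of view j with respect to view 0 (used later as R_{j,0}):
R0 = rotation_from_euler(0.0, 0.0, 0.0)
Rj = rotation_from_euler(0.05, 0.6, 0.0)     # e.g., a roughly 34-degree yaw sweep
R_j0 = Rj @ R0.T
```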
- The device then processes the captured images to detect the image coordinates of certain facial features across the various captured images. For example, the device can determine feature points 510.1, 510.2, and 510.j that correspond to a nose feature captured by images 506.1, 506.2, and 506.j, respectively.
- The image coordinates of a feature point $i$ within an image $j$ are hereinafter denoted using the tuple $(u_i^{(j)}, v_i^{(j)})$.
- The device then processes the orientation data 508 and the feature points 510 to generate a three-dimensional model in a global coordinate system.
- A point of the three-dimensional model is hereinafter denoted using the tuple $(x^{(0)}, y^{(0)}, z^{(0)})$, such that the superscript $(0)$ indicates the model is represented using the global coordinate system under which all captured images are processed.
- The relationship between the two-dimensional coordinates of a feature point 510 and the 3D physical space of the three-dimensional model can be represented by the projection transformation:

  $\lambda\,(u, v, 1)^{T} = K_{3\times3}\,[R_{3\times3} \mid T_{3\times1}]\,(x, y, z, 1)^{T}$  (1)

- $(u, v, 1)$ provides the homogeneous image coordinates of a feature point, and $\lambda$ is the arbitrary scale factor of the homogeneous representation.
- $K_{3\times3}$ provides a 3×3 matrix consisting of intrinsic camera parameters, such as focal length, principal point, aspect ratio, skew factor, and radial distortion. The value for $K$ can be computed beforehand using any camera-calibration technique, such as the technique described by Zhengyou Zhang in "A flexible new technique for camera calibration" (IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, issue 11, pages 1330-1334, 2000), which is hereby incorporated by reference.
- $[R_{3\times3} \mid T_{3\times1}]$ provides a 3D rotation and translation matrix, which facilitates converting a point in the 3D physical space to a point in the camera's local 3D coordinate system. The device generates the 3×4 matrix $[R_{3\times3} \mid T_{3\times1}]$ by concatenating the 3×3 rotation matrix $R_{3\times3}$ and the 3×1 translation matrix $T_{3\times1}$. A numerical sketch of this projection follows.
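The following sketch evaluates equation (1) for a single point. The intrinsic matrix values are illustrative placeholders; a real device would use the calibrated $K$ described above.

```python
import numpy as np

# Illustrative intrinsics (focal lengths and principal point in pixels);
# a calibrated device would substitute its own K.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(K, R, T, point_3d):
    """Equation (1): map a 3D point to 2D image coordinates."""
    P = K @ np.hstack([R, T.reshape(3, 1)])   # 3x4 projection matrix [R | T]
    uvw = P @ np.append(point_3d, 1.0)        # homogeneous image coordinates
    return uvw[:2] / uvw[2]                   # divide out the scale factor

# A point half a meter straight ahead projects to the principal point.
print(project(K, np.eye(3), np.zeros(3), np.array([0.0, 0.0, 0.5])))  # [320. 240.]
```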
- Because each captured image has its own local 3D coordinate system, the device uses a single global coordinate system to generate the three-dimensional model of the local user from the captured images.
- In some embodiments, the device can select the coordinate system of one captured image (e.g., the frontal view of the user, hereinafter referred to as view 0) as the global coordinate system for the three-dimensional model. The global 3D coordinate system is hereinafter denoted using the notation $\{X^{(0)}, Y^{(0)}, Z^{(0)}\}$.
- FIG. 5B illustrates modeling data that is computed while generating the three-dimensional model of the local user in accordance with an embodiment.
- To generate a three-dimensional model 564, the device first selects one captured image to use as a reference point for processing all other images. For example, the device can select an orientation 554.2, which corresponds to a front-facing image 556.2 of the local user, as the global coordinate system 562. The device then generates a system of linear equations based on equation (1) for all feature points detected across all images, relative to the global coordinate system 562, and generates three-dimensional model 564 by solving this system of linear equations.
- The system of linear equations includes equations for each feature point of each captured image (e.g., for a feature point $i$ from an image view $j$, represented using the tuple $(u_i^{(j)}, v_i^{(j)})$).
- To relate a view $j$ to the global coordinate system $\{X^{(0)}, Y^{(0)}, Z^{(0)}\}$, the image-capture device computes the 3D rotation $R_{j,0}$ and translation $T_{j,0}$ from view 0 to view $j$.
- In some embodiments, the device uses the gyroscope data to compute an accurate rotation matrix $R_{j,0}$, which facilitates generating the three-dimensional model by solving a set of linear equations, making the computations extremely lightweight for mobile devices.
- To determine the translation $T_{j,0}$, the device needs to solve the system of linear equations.
- Specifically, each detected facial feature point $i$ introduces 3 unknowns and 4 linear equations:

  $\lambda\,(u_i^{(0)}, v_i^{(0)}, 1)^{T} = K\,[\,I_{3\times3} \mid 0\,]\,(\tilde{x}_i^{(0)}, \tilde{y}_i^{(0)}, \tilde{z}_i^{(0)}, 1)^{T}$  (2)

  $\lambda\,(u_i^{(j)}, v_i^{(j)}, 1)^{T} = K\,[\,R_{j,0} \mid \tilde{T}_{j,0}\,]\,(\tilde{x}_i^{(0)}, \tilde{y}_i^{(0)}, \tilde{z}_i^{(0)}, 1)^{T}$  (3)

- Equation (2) corresponds to a projection transformation within view 0 (the view selected as the global coordinate system for generating the three-dimensional model).
- Equation (3) corresponds to a projection transformation for a view $j$, relative to the global coordinate system of view 0.
- The device can determine input values for the variables in equations (2) and (3) as follows.
- The variable $K$ takes as input the 3×3 intrinsic matrix that is computed for the device ahead of time when calibrating the device's camera.
- The 3×3 matrix $R_{j,0}$ takes as input the rotation matrix computed from gyroscope data, which corresponds to a rotation of the device from view 0 to view $j$.
- The tuple $(u_i^{(0)}, v_i^{(0)})$ takes as input the image coordinates detected for facial marker $i$ in the image captured from view 0, and the tuple $(u_i^{(j)}, v_i^{(j)})$ takes as input the image coordinates detected for facial marker $i$ in the image captured from view $j$.
- The variables in equations (2) and (3) denoted with a tilde correspond to unknown values that the device solves for (e.g., during operation 210 of FIG. 2).
- The tuple $(\tilde{x}_i^{(0)}, \tilde{y}_i^{(0)}, \tilde{z}_i^{(0)})$ provides the three-dimensional coordinates of facial marker $i$ with respect to the global coordinate system $\{X^{(0)}, Y^{(0)}, Z^{(0)}\}$.
- The 3×1 matrix $\tilde{T}_{j,0}$ provides a translation matrix from view 0 to view $j$, which is common for all facial markers in view $j$.
- In some embodiments, the device can capture images for a plurality of views.
- Each additional view $j$ contributes an additional $4n$ equations for $n$ facial markers (based on equations (2) and (3)), and introduces 3 new unknowns (based on the 3×1 translation matrix $\tilde{T}_{j,0}$ for view $j$).
- In some embodiments, the device solves the linear equations generated for all views together (e.g., during operation 210 of FIG. 2). Solving the system of equations provides the three-dimensional coordinates $(\tilde{x}_i^{(0)}, \tilde{y}_i^{(0)}, \tilde{z}_i^{(0)})$ for all facial markers $i$, and provides the translation matrices $\tilde{T}_{j,0}$ for all views $j$; one way to assemble and solve this system is sketched below.
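The sketch below assembles and solves the system with numpy. The patent does not name a solver; formulating each projection as two homogeneous linear constraints and taking the smallest singular vector, which fixes the solution only up to the global scale that monocular capture cannot observe, is an assumption of this sketch.

```python
import numpy as np

def triangulate_views(K, Rs, obs):
    """Solve equations (2)-(3) jointly for marker positions and view translations.

    K   : 3x3 intrinsic matrix.
    Rs  : list of 3x3 rotations R_{j,0} for views j = 0..m-1 (Rs[0] = identity).
    obs : obs[j][i] = (u, v) image coordinates of marker i in view j.
    Returns (points, translations), defined only up to a global scale; the scale
    is fixed later (e.g., by the eye-to-eye normalization described below)."""
    m, n = len(obs), len(obs[0])
    # Unknown vector: [x_0..x_{n-1} (3 each), T_1..T_{m-1} (3 each)]; T_0 = 0.
    dim = 3 * n + 3 * (m - 1)
    rows = []
    for j in range(m):
        KR = K @ Rs[j]
        for i in range(n):
            u, v = obs[j][i]
            for img_coord, row_idx in ((u, 0), (v, 1)):
                # img_coord * (P_3 . X) - (P_row . X) = 0, linear in x_i and T_j
                r = np.zeros(dim)
                r[3 * i:3 * i + 3] = img_coord * KR[2] - KR[row_idx]
                if j > 0:
                    t0 = 3 * n + 3 * (j - 1)
                    r[t0:t0 + 3] = img_coord * K[2] - K[row_idx]
                rows.append(r)
    A = np.vstack(rows)
    _, _, Vt = np.linalg.svd(A)          # least-squares null-space direction
    theta = Vt[-1]
    points = theta[:3 * n].reshape(n, 3)
    translations = theta[3 * n:].reshape(m - 1, 3)
    return points, translations
```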
- After generating the three-dimensional model, the device transforms the model to generate a normalized three-dimensional model.
- For example, the image-capture device can generate the normalized model by performing a translation operation, a rotation operation, and a scale-change operation so that the two eyes are fixed to certain coordinates (e.g., coordinates (1,0,0) and (-1,0,0) for the user's left and right eyes, respectively); a sketch of this transformation follows.
- This computation-efficient transformation facilitates normalizing the three-dimensional model at the image-capture device, and prevents the device from having to fit two models to a common coordinate system before comparing the two models, which can be time-consuming when comparing the local user's face to those of other users in a large user-profile database.
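A sketch of this normalization, assuming the model is stored as an array of 3D feature points with known indices for the two eye centers. The rotation about the eye axis is left free here; a production system could pin it down with a third landmark such as the nose tip.

```python
import numpy as np

def rotation_between(a, b):
    """Smallest rotation taking unit vector a to unit vector b (Rodrigues)."""
    v = np.cross(a, b)
    c = np.dot(a, b)
    if np.isclose(c, -1.0):            # opposite vectors: rotate pi about a normal
        axis = np.eye(3)[np.argmin(np.abs(a))]
        v = np.cross(a, axis)
        v /= np.linalg.norm(v)
        return 2.0 * np.outer(v, v) - np.eye(3)
    vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + vx + vx @ vx / (1.0 + c)

def normalize_model(points, left_eye_idx, right_eye_idx):
    """Translate, rotate, and scale so the eyes land at (1,0,0) and (-1,0,0)."""
    pts = np.asarray(points, dtype=float)
    center = 0.5 * (pts[left_eye_idx] + pts[right_eye_idx])
    pts = pts - center                               # translation
    axis = pts[left_eye_idx]
    s = 1.0 / np.linalg.norm(axis)                   # scale: eye half-distance -> 1
    R = rotation_between(axis * s, np.array([1.0, 0.0, 0.0]))
    return (s * pts) @ R.T                           # rotation + scale change
```

Because the eyes are symmetric about their midpoint by construction, mapping the left eye to (1,0,0) places the right eye exactly at (-1,0,0).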
- FIG. 6 illustrates a normalized three-dimensional model 600 of a user's face in accordance with an embodiment. Specifically, the scale and orientation of normalized three-dimensional model 600 are transformed so that the left and right eyes (e.g., features 604 and 606) are positioned at feature coordinates (1,0,0) and (-1,0,0) of global coordinate system 602, respectively.
- After normalization, the device can compare the model of the user's face to other three-dimensional models (e.g., to perform face recognition or to authenticate the user) without first fitting them to a common coordinate system.
- In some embodiments, the device can compute the difference between features of two three-dimensional models by computing a distance between corresponding feature points of the two models.
- For example, the device can compute the distance as a Euclidean distance over all feature points $i$ that occur in the two models:

  $\mathrm{diff} = \sum_{i} \sqrt{(x_i - x_i')^2 + (y_i - y_i')^2 + (z_i - z_i')^2}$  (4)

- The two coordinates $(x_i, y_i, z_i)$ and $(x_i', y_i', z_i')$ correspond to a feature point $i$ that occurs in the two three-dimensional models being compared.
- The computed difference, $\mathrm{diff}$, provides a numeric value indicating a difference between the two three-dimensional models (e.g., as a Euclidean distance relative to the global coordinate system).
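Equation (4) reduces to a few lines when the two normalized models are keyed by feature-point identity; the dictionary representation below is an assumption for illustration.

```python
import numpy as np

def model_difference(model_a, model_b):
    """Equation (4): sum of Euclidean distances over shared feature points.

    model_a, model_b: dicts mapping a feature-point id to an (x, y, z) tuple;
    only points present in both normalized models are compared."""
    shared = set(model_a) & set(model_b)
    return sum(np.linalg.norm(np.asarray(model_a[i]) - np.asarray(model_b[i]))
               for i in shared)

# Example: identical models differ by 0.
m = {"left_eye": (1, 0, 0), "right_eye": (-1, 0, 0), "nose": (0, -0.4, 0.6)}
print(model_difference(m, m))  # 0.0
```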
- In some embodiments, the image-capture device can compare two three-dimensional models in a way that accounts for differences in the coordinate systems of the two models. For example, if a stored three-dimensional model of a registered user's face has not been normalized, or has been normalized to a different coordinate system, the device can perform the comparison operation by solving the following linear equation for each shared feature point $i$:

  $(x_i', y_i', z_i')^{T} = s\,R\,(x_i, y_i, z_i)^{T} + T$  (5)

- The device can compute the rotation matrix $R$ using gyroscope data, and can solve for the translation matrix $T$ and the scale factor $s$ by solving equation (5), for example, using linear least-squares fitting.
- The device can then compute the fitting error for each three-dimensional model, and can use the fitting error as the difference between the two three-dimensional models.
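A linear least-squares sketch of equation (5): with $R$ fixed from gyroscope data, the scale $s$ and translation $T$ enter linearly and can be recovered with `numpy.linalg.lstsq`; the residual serves as the fitting error. The design-matrix layout here is one possible formulation, not the patent's.

```python
import numpy as np

def fit_scale_translation(points_a, points_b, R):
    """Equation (5): given rotation R, find the scale s and translation T that
    minimize sum ||b_i - (s * R @ a_i + T)||^2, and report the residual error."""
    A_rot = np.asarray(points_a, dtype=float) @ R.T   # rows are R @ a_i
    n = len(A_rot)
    # Design matrix for the unknowns [s, Tx, Ty, Tz]: each point yields 3 rows.
    M = np.zeros((3 * n, 4))
    M[:, 0] = A_rot.ravel()
    M[np.arange(3 * n), 1 + np.arange(3 * n) % 3] = 1.0
    rhs = np.asarray(points_b, dtype=float).ravel()
    sol, _, _, _ = np.linalg.lstsq(M, rhs, rcond=None)
    s, T = sol[0], sol[1:]
    fit_error = np.linalg.norm(M @ sol - rhs)
    return s, T, fit_error

# Example: recover s = 2 and T = (0.1, -0.2, 0.3) from synthetic points.
rng = np.random.default_rng(0)
A = rng.normal(size=(10, 3))
R = np.eye(3)
B = 2.0 * A @ R.T + np.array([0.1, -0.2, 0.3])
print(fit_scale_translation(A, B, R))   # s ~ 2.0, T ~ [0.1, -0.2, 0.3], error ~ 0
```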
- To identify the local user, the device can compute the distance between the three-dimensional model of the user's face and those of other registered users (e.g., using equation (4) or equation (5)). If the confidence is high for the closest match (e.g., the difference of the closest match is less than a certain threshold), the device can provide the identity of the closest match as the user's identity. Otherwise, the device can provide a result indicating that the local user is not recognized.
- FIG. 7 illustrates an exemplary apparatus 700 that facilitates generating a three-dimensional model of a local user in accordance with an embodiment.
- Apparatus 700 can comprise a plurality of hardware and/or software modules which may communicate with one another via a wired or wireless communication channel.
- Apparatus 700 may be realized using one or more integrated circuits, and may include fewer or more modules than those shown in FIG. 7.
- Apparatus 700 may be integrated in a computer system, or realized as a separate device which is capable of communicating with other computer systems and/or devices.
- Specifically, apparatus 700 can comprise a communication module 702, an interface module 704, an image-capture module 706, a motion sensor 708, a feature-detecting module 710, a model-generating module 712, and an authentication module 714.
- Communication module 702 can communicate with third-party systems, such as an authentication server.
- Interface module 704 can provide feedback to the local user during the authentication process, for example, to alert the user of a potential problem that prevents apparatus 700 from detecting the local user's facial features.
- Image-capture module 706 can capture a set of images of the local user from various orientations, and motion sensor 708 can determine orientation information for the captured images.
- Feature-detecting module 710 can detect a plurality of features of the local user's face from the captured images, and model-generating module 712 can generate a three-dimensional model of the local user's face from the detected features and orientation information for their corresponding images.
- Authentication module 714 can compare the generated three-dimensional model to those of registered users to identify or authenticate the local user.
- FIG. 8 illustrates an exemplary computer system 802 that facilitates generating a three-dimensional model of a local user in accordance with an embodiment.
- Computer system 802 includes a processor 804, a memory 806, and a storage device 808.
- Memory 806 can include a volatile memory (e.g., RAM) that serves as a managed memory, and can be used to store one or more memory pools.
- Computer system 802 can be coupled to a display device 810, a keyboard 812, and a pointing device 814.
- Storage device 808 can store an operating system 816, an image-capture system 818, and data 834.
- In some embodiments, display 810 includes a touch-screen display, such that keyboard 812 includes a virtual keyboard presented on display 810, and pointing device 814 includes a touch-sensitive device coupled to display 810 (e.g., a capacitive-touch sensor or a resistive-touch sensor layered on display 810).
- For example, the user can tap on a portion of display 810 that presents a desired key, and can select any other display object presented on display 810 by tapping on the display object.
- Image-capture system 818 can include instructions, which when executed by computer system 802, can cause computer system 802 to perform methods and/or processes described in this disclosure. Specifically, image-capture system 818 may include instructions for communicating with third-party systems, such as an authentication server (communication module 820). Further, image-capture system 818 can include instructions for providing feedback to the local user during the authentication process, for example, to alert the user of a potential problem that prevents image-capture system 818 from detecting the local user's facial features (interface module 822).
- Image-capture system 818 can also include instructions for capturing a set of images of the local user from various orientations (image-capture module 824), and for determining orientation information for the captured images (motion-sensing module 826). Image-capture system 818 can also include instructions for detecting a plurality of features of the local user's face from the captured images (feature-detecting module 828), and for generating a three-dimensional model of the local user's face from the detected features and orientation information for their corresponding images (model-generating module 830). Image-capture system 818 can also include instructions for comparing the generated three-dimensional model to those of registered users to identify or authenticate the local user (authentication module 832).
- Data 834 can include any data that is required as input or that is generated as output by the methods and/or processes described in this disclosure. Specifically, data 834 can store at least user profiles for one or more registered users, access privileges for the registered users, and at least one three-dimensional model for each registered user's face.
- The data structures and code described in this detailed description are typically stored on a computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system.
- The computer-readable storage medium includes, but is not limited to, volatile memory, non-volatile memory, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other media capable of storing code and/or data now known or later developed.
- The methods and processes described in the detailed description section can be embodied as code and/or data, which can be stored in a computer-readable storage medium as described above.
- When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium.
- Furthermore, the methods and processes described above can be included in hardware modules. For example, the hardware modules can include, but are not limited to, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs), and other programmable-logic devices now known or later developed.
- When the hardware modules are activated, the hardware modules perform the methods and processes included within the hardware modules.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
- Collating Specific Patterns (AREA)
Abstract
A mobile device can generate a three-dimensional model of a person's face by capturing and processing a plurality of two-dimensional images. During operation, the mobile device uses an image-capture device to capture a set of images of the person from various orientations as the person or any other user sweeps the mobile device in front of the person's face from one side of his/her face to the opposing side. The device determines orientation information for the captured images, and detects a plurality of features of the person's face from the captured images. The device then generates a three-dimensional model of the person's face from the detected features and their orientation information. The three-dimensional model of the person's face facilitates identifying and/or authenticating the person's identity.
Description
THREE-DIMENSIONAL FACE RECOGNITION FOR
MOBILE DEVICES
Inventors: Fengjun Lv and Antontius Kalker
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The present application claims the benefit of U.S. Non-Provisional Patent Application No. 13/456,074, filed April 25, 2012, entitled "Three-Dimensional Face Recognition for Mobile Devices," which is incorporated herein by reference as if reproduced in its entirety.
BACKGROUND
Field
[0001] This disclosure is generally related to using face recognition to identify or authenticate a user. More specifically, this disclosure is related to using a mobile device that includes an image sensor and a motion sensor to generate a three-dimensional model of a user's face.
Related Art
[0002] Nowadays users can use mobile devices, such as smartphones, to perform their computing tasks while on the go. They can check their bank account balances while shopping at a local store, compare merchandise prices with their favorite online retailers, and even purchase items online from their mobile device. Users also oftentimes use their mobile devices to interact with their friends and colleagues, regardless of where they are, for example, by collaborating in online games or by communicating with their friends via an online social network.
[0003] Face recognition can provide the most effective and natural way to identify and/or authenticate a user if it is implemented properly. However, two-dimensional (2-D) image-based face recognition is prone to errors caused by variations in ambient lighting or variations in the user's pose, expression, make-up, and aging. The effectiveness of 2-D image-based face recognition is also limited by how easy it can be for others to deceive it by capturing an image of a printed picture of a privileged user. Further, while three-dimensional (3-D) image-based face recognition can be more secure, it is typically implemented using stereoscopic image-capture devices that use multiple cameras, which are not often found on mobile devices. Moreover, typical 3-D image-based face recognition involves performing complicated computations that are too computationally expensive for a mobile computing device.
SUMMARY
[0004] One embodiment provides a mobile device that generates a three-dimensional model of a person's face by capturing and processing a set of two-dimensional images. During operation, the device uses an image-capture device to capture a set of images of a person from various orientations as the person or any other user sweeps the mobile device across the person's face. The device determines orientation information for the captured images, and detects a plurality of features of the person's face from the captured images. The device then generates a three-dimensional model of the person's face from the detected features and their orientation information. The three-dimensional model of the person's face facilitates identifying and/or authenticating the person's identity.
[0005] In some embodiments, to capture the set of images, the device monitors a change in orientation of the mobile device. The device determines whether the orientation has changed by at least a minimum amount from an orientation of a previous captured image, and determines whether the mobile device is stabilized. The device captures an image in response to determining that the orientation has changed by at least a minimum amount and that the mobile device is stabilized. The device then stores the captured image in response to determining that the image is suitable for detecting facial features of the person.
[0006] In some embodiments, while capturing the set of images, the device provides a notification to the person or any other user in response to determining that the mobile device is not stabilized or determining that no more images need to be captured. The device can also provide a notification in response to determining that the person's face is not in the image frame, or determining that the current orientation of the device is not suitable for detecting features of the person's face.
[0007] In some embodiments, the notification includes one or more of: a sound; a vibration pattern; a flashing pattern from a light source of the mobile device; and a displayed image on a screen of the mobile device.
[0008] In some embodiments, the device captures the set of images in response to receiving a request to register the person as a user. The device then stores the three-dimensional model in association with a user profile of the person.
[0009] In some embodiments, the device captures the set of images in response to receiving a request to authenticate the person, and uses the generated three-dimensional model to authenticate the person.
[0010] In a variation on these embodiments, the device authenticates the person by determining whether the generated three-dimensional model of the person matches a stored three-dimensional model of a registered user.
[0011] In a variation on these embodiments, while authenticating the person, the device sends the generated three-dimensional model of the person to a remote authentication device, and receives an authentication response which indicates whether the person is a registered user, access privileges for the person, and/or identifying profile information for the person.
[0012] In some embodiments, the device captures the set of images in response to receiving a request to generate an avatar for the person. The device then generates an avatar for the person, such that the avatar's face is generated based on the three-dimensional model of the person.
BRIEF DESCRIPTION OF THE FIGURES
[0013] FIG. 1 illustrates an exemplary application for an image-capture device in accordance with an embodiment.
[0014] FIG. 2 presents a flow chart illustrating a process for generating and using a three-dimensional model of a local user's face in accordance with an embodiment.
[0015] FIG. 3 illustrates a plurality of detected facial features from a two-dimensional image in accordance with an embodiment.
[0016] FIG. 4 presents a flow chart illustrating a method for capturing a set of images of a local user in accordance with an embodiment.
[0017] FIG. 5A illustrates a motion trajectory of an image-capture device during an image capture operation in accordance with an embodiment.
[0018] FIG. 5B illustrates modeling data that is computed while generating the three- dimensional model of the local user in accordance with an embodiment.
[0019] FIG. 6 illustrates a normalized three-dimensional model of a user's face in accordance with an embodiment.
[0020] FIG. 7 illustrates an exemplary apparatus that facilitates generating a three-dimensional model of a local user in accordance with an embodiment.
[0021] FIG. 8 illustrates an exemplary computer system that facilitates generating a three-dimensional model of a local user in accordance with an embodiment.
[0022] In the figures, like reference numerals refer to the same figure elements.
DETAILED DESCRIPTION
[0023] The following description is presented to enable any person skilled in the art to make and use the embodiments, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present invention is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Overview
[0024] Embodiments of the present invention provide an image-capture device that solves the problem of generating a three-dimensional model of a user's face using a single camera. The device can use an on-board motion sensor, such as a gyroscope, while capturing multiple images of the user from various viewpoints to monitor position and orientation information about the individual images. The device uses this position and orientation information to generate the three-dimensional model of the user's face, and can use this three-dimensional model to identify or authenticate the user when the user requests access to the device or other restricted resources.
[0025] For example, smartphones typically include at least one camera facing a certain direction, such as a front-facing camera and/or a rear-facing camera. When the user attempts to access the smartphone device, the user can be asked to sweep the device's camera in front of his/her face from one side to the opposing side so that the device can capture images of his/her face from various angles and viewpoints. The device can also use the on-board motion sensor and its face-detection capabilities to determine the right moments to capture an image as the user sweeps the device in front of his/her face, and can inform the user if the user is performing the sweeping motion incorrectly. When the device does capture an image, the device uses the on-board motion sensor to capture motion or orientation information of the device at the time the image was captured, and stores this information along with the captured image.
[0026] The device analyzes these captured images to detect position information for certain facial features on the images, and uses the device motion or orientation information to efficiently compute the 3-D positions of these features and generate a corresponding three-dimensional facial model for the user. Once the device generates the three-dimensional model, the device can normalize the scale and orientation of the model with respect to a global coordinate system, which facilitates comparing the user's three-dimensional model directly with other stored model(s) (e.g., to identify the user).
[0027] FIG. 1 illustrates an exemplary application for an image-capture device 102 in accordance with an embodiment. Image-capture device 102 can include any computing device that includes a digital camera and a motion sensor (e.g., a gyroscope, a compass, an accelerometer, etc.). For example, image-capture device 102 can include a smartphone that includes a display, a digital camera (e.g., a front-facing or rear-facing camera), a storage device, and a communication device for interfacing with other devices (e.g., via a network 112). Device 102 can use the on-board camera and motion sensor to generate a three-dimensional model of user 104 using a single camera, and can use the three-dimensional model to identify or authenticate user 104.
[0028] In some embodiments, user 104 can create or update a user profile for accessing device 102 (or a remote device such as server 110) without having to manually enter a passcode. To create or update the user profile, device 102 generates a three-dimensional model of user 104, and can use this three-dimensional model to identify or authenticate user 104. Device 102 can allow
user 104 to create multiple three-dimensional models, which can improve the likelihood that device 102 recognizes user 104.
[0029] When device 102 is ready to generate the three-dimensional model, device 102 instructs user 104 to sweep device 102 across his/her face to capture his/her face from various positions and orientations (e.g., positions 106.1, 106.2, and 106.j). User 104 then uses device 102 to capture images of his/her face by holding device 102 with a single hand so that an on-board camera is aimed at his/her face, and steadily changes the position and orientation of device 102 until the on-board camera has captured a sufficient number of images of user 104. The image-capturing procedure is continuous and automatic, such that user 104 does not need to manually press a shutter button, and does not need to be concerned about whether the captured images are motion-blurred, whether the face is out of sight, etc.
[0030] In some embodiments, device 102 ensures that it captures quality images that capture facial features of user 104 by using the motion sensor and the face-detection capabilities to determine the moments that result in the best pictures, and can notify user 104 of any problems during the image-capture procedure. Device 102 uses these images and their orientation to generate the three-dimensional model of user 104, for example, by determining the position in the three-dimensional model for the facial features detected in the captured images.
[0031] If user 104 has a registered user profile, user 104 can use the face recognition capability of device 102 without having to manually enter a passcode. User 104 can also use device 102 to gain access to other restricted resources, such as software or data, a computer system, or a secured room. For example, server 110 may store profile information for a set of users that have access to the restricted resource. Device 102 may be a trusted resource that interacts with server 110 to communicate the three-dimensional model of user 104 to server 110. If server 110 determines that the three-dimensional model matches that of a trusted user, server 110 can grant user 104 access to the trusted resource. Otherwise, server 110 can deny user 104 access to the trusted resource. A rough sketch of this exchange follows.
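As a rough illustration of the device-to-server exchange, the sketch below posts the generated model to an authentication endpoint. The URL, the JSON schema, and the use of HTTP are all assumptions; the patent does not specify a transport or message format.

```python
import requests  # assumed transport library; any secure channel would do

AUTH_SERVER = "https://example.com/authenticate"  # placeholder URL

def authenticate_remotely(model_points, user_hint=None):
    """Send the normalized model to a remote authentication server and return
    its decision. The payload and response fields are purely illustrative."""
    payload = {"model": [list(map(float, p)) for p in model_points],
               "user_hint": user_hint}
    resp = requests.post(AUTH_SERVER, json=payload, timeout=10)
    resp.raise_for_status()
    # e.g., {"registered": true, "privileges": [...], "profile": {...}}
    return resp.json()
```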
[0032] FIG. 2 presents a flow chart illustrating a process for generating and using a three-dimensional model of a local user's face in accordance with an embodiment. During operation, the image-capture device can receive a request that requires a three-dimensional model of the local user's face (operation 202). The request can include, for example, a command to register a user profile that includes a three-dimensional model of the local user's face, or a request to identify or
authenticate the local user using a three-dimensional model of the user's face. The request can also include other commands that require a model of the local user's face, such as a command to generate a three-dimensional avatar for the local user.
[0033] To generate the three-dimensional model, the device captures a set of images of the local user's face (operation 204), and processes the captured images to detect facial features of the local user (operation 206). The device then determines orientation information for each captured image (operation 208), and generates the three-dimensional model of the user's face from the orientation information for the captured images and the image coordinates of the detected features (operation 210).
[0034] During operation 206, the device detects a set of predefined facial feature points, such as points along the contour of the eyebrows, the eyes, the nose, the jawline, and the mouth. The device can use the position of each feature point that occurs in several different images during operation 210 to compute, using projective geometry, a position for this feature point in the three-dimensional model.
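The patent assumes a facial feature-point detector without naming one. As a stand-in, the sketch below uses OpenCV's stock Haar cascades to locate the face and eye centers; a production system would substitute a dense landmark detector covering the eyebrows, nose, jawline, and mouth contours.

```python
import cv2

# Stock Haar cascades shipped with opencv-python; a stand-in only.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_feature_points(image_bgr):
    """Return a list of (u, v) feature coordinates, or None if no face is seen."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None                          # face not in the image frame
    x, y, w, h = faces[0]
    eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w], 1.1, 5)
    # Eye centers in full-image coordinates; real systems return many more points.
    return [(x + ex + ew // 2, y + ey + eh // 2) for ex, ey, ew, eh in eyes]
```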
[0035] The device then processes the request using the three-dimensional model of the user's face (operation 212). In some embodiments, the request can include a command to register the local user, for example, by creating a user profile that includes the three-dimensional model of the local user. The device performs the command by storing the three-dimensional model and a profile of the local user in a local profile repository, and can also provide the three-dimensional model and the local user's profile to a remote authentication system.
[0036] In some embodiments, the request can include a request to identify the local user, at which point the device processes the request by searching for a user profile whose three-dimensional model matches that of the local user. If the device finds a closest match that has a high confidence value, the device provides the identity of the closest match as the user's identity. Otherwise, the device provides a result indicating that the local user is not recognized.
[0037] In some embodiments, the device stores the three-dimensional models of various registered user profiles in a local repository, and searches for the local user's profile by comparing the three-dimensional model of the local user to the stored models associated with the registered user profiles. The device can also search for the local user's profile by sending the three-dimensional model of the local user to the remote authentication system, and receiving an
authentication response from the authentication system. If the authentication system recognizes the local user, the authentication response can indicate the identity of the local user, access privileges for the local user, and/or the local user's profile information.
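A minimal sketch of this closest-match search follows; the model representation, the model_diff measure (e.g., equation (4) below), and the acceptance threshold are all assumptions for illustration:

```python
# MATCH_THRESHOLD is an assumed confidence cutoff, not a disclosed value;
# model_diff is any numeric model-difference measure such as equation (4).
MATCH_THRESHOLD = 0.5

def identify(local_model, profiles, model_diff):
    """Return the best-matching profile, or None if not recognized."""
    best_profile, best_diff = None, float("inf")
    for profile in profiles:
        d = model_diff(local_model, profile.face_model)
        if d < best_diff:
            best_profile, best_diff = profile, d
    # Accept the closest match only when its difference is small enough.
    return best_profile if best_diff < MATCH_THRESHOLD else None
```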
[0038] In some embodiments, the request from operation 202 can include a command to generate an avatar for the local user, at which point the device processes the command by generating the avatar from the three-dimensional model. The avatar can include a pre-designed body and costume (e.g., selected or designed by the local user), and can include facial features that match features from the three-dimensional model of the local user's face. For example, the look and texture of these facial features can be selected from a pre-designed feature repository based on the three-dimensional model of the local user's face, and their placement on the avatar's face can also be determined from that model.
Interactive Process for Capturing the User's Facial Features
[0039] The image-capture device generates the three-dimensional model of the local user's face by capturing and processing a plurality of images that show the user's facial features from various viewpoints. The device makes this image-capture process fast and cost-effective by allowing the user to sweep the device's on-board camera across the front and sides of his/her face, for example, in a left-to-right or a right-to-left motion. However, to generate a quality three-dimensional model, the user needs to avoid moving the device so fast that the captured images are blurred, and also needs to make sure that the images capture enough of his/her facial features.
[0040] In some embodiments, the device can monitor its motion and the quality of the captured images to let the user know when he/she needs to slow down his/her motion, repeat his/her motion, reposition the device to better capture his/her face, or move the device to a specific viewpoint to capture facial features from any necessary orientations. For example, the device can monitor the motion using an on-board gyroscope, and can monitor the quality of a captured image by analyzing its brightness, contrast, sharpness, and/or by counting the number of detectable facial features. The device interacts with the local user to facilitate capturing images that include a sufficient number of detectable facial features.
[0041] FIG. 3 illustrates a plurality of detected facial feature points from a two-dimensional image 300 in accordance with an embodiment. The feature points indicate the size, shape, and/or position of a set of facial features that the device is programmed or trained to recognize. For example, image 300 illustrates a plurality of feature points (illustrated using cross marks) for a set of facial features, such as left eye features 302 and right eye features 304, as well as left eyebrow features 306 and right eyebrow features 308. The detected features can also include nose features 310, lips features 312, and jawline features 314. Other possible features include a hairline, the chin, ears, etc. In some embodiments, the detected features can also include feature points surrounding other facial anomalies that are not found on every face, such as a dimple, a birthmark, a scar, a tattoo, etc.
[0042] FIG. 4 presents a flow chart illustrating a method for capturing a set of images of a local user in accordance with an embodiment. During operation, the image-capture device can determine whether it is ready to capture an image (operation 402). For example, the device may determine that the user is sweeping the image-capture device too fast, which could result in a blurry image. If the device is not ready, the device can notify the user that it cannot capture an image (operation 404), for example, by playing a sound, generating a certain vibration pattern, generating a flash pattern (e.g., using the camera's flash), or displaying an image on the device's screen. When the user notices the notification, the user can respond by slowing down his/her sweeping motion of the image-capture device.
[0043] Otherwise, the device can capture the image (operation 406), and process the image to determine facial features of the local user and a device orientation for the captured image (operation 408). The device then determines whether the image is suitable for detecting features of the local user (operation 410). For example, the image-capture device can determine whether it can detect a face, and/or whether it can detect a sufficient number of facial features. If the captured image corresponds to the front of the user's face, the device may expect to detect at least six facial features. However, if the device determines that the captured image is a profile view of the user's face, the device may expect to detect at least three or four facial features.
[0044] If the device cannot detect a sufficient number of features from the captured image, the device can return to operation 404 to notify the user of this problem. When the user notices this notification, the user can respond by re-aligning the image-capture device so that the user's face is
visible in the captured image, by ensuring there is sufficient ambient light for capturing an image, and/or by ensuring that the device is steady enough for capturing an in-focus image. However, if the image is suitable for detecting features, the device stores the image, the detected feature points, and a device orientation for the captured image (operation 412).
[0045] The device then determines whether it has captured enough images for generating the three-dimensional model (operation 414). If so, the device can proceed to an end terminal.
Otherwise, the device monitors a change in its orientation from that of a previous stored image (operation 416), and determines whether the orientation has changed by at least a minimum threshold (operation 418). If the device's orientation has not changed beyond this threshold (e.g., a captured image would be too similar to that of a previous stored image), the device can return to operation 416 after a short delay (e.g., a few milliseconds).
[0046] However, if the device's orientation is sufficiently different from that of previous captured images, the device can return to operation 402 to capture another image. The device can continue to perform method 400 until it has captured enough images from which it can generate a three-dimensional model of the user's face.
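The loop of FIG. 4 might be sketched as follows; the camera and gyro interfaces, the detect_features and notify_user callables, and both thresholds are assumptions, since the disclosure leaves these implementation details open:

```python
import time
import numpy as np

NUM_IMAGES = 15      # assumed number of images for a usable model
MIN_ROTATION = 0.10  # rad; assumed minimum orientation change between images

def rotation_angle(r_a, r_b):
    """Angle of the relative rotation between two 3x3 rotation matrices."""
    cos = (np.trace(r_a.T @ r_b) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def capture_images(camera, gyro, detect_features, notify_user):
    stored, last_rotation = [], None
    while len(stored) < NUM_IMAGES:                      # operation 414
        rotation = gyro.rotation_matrix()
        if last_rotation is not None and \
           rotation_angle(last_rotation, rotation) < MIN_ROTATION:
            time.sleep(0.005)                            # operations 416-418
            continue
        if not camera.is_steady():                       # operation 402
            notify_user("hold the device steady")        # operation 404
            continue
        frame = camera.capture()                         # operation 406
        features = detect_features(frame)                # operation 408
        if len(features) < 3:                            # operation 410
            notify_user("face not fully visible")        # back to 404
            continue
        stored.append((frame, features, rotation))       # operation 412
        last_rotation = rotation
    return stored
```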
Generating a Three-Dimensional Model
[0047] FIG. 5A illustrates a motion trajectory 500 of an image-capture device 502 during an image capture operation in accordance with an embodiment. When the user begins the image-capture operation, the image-capture device captures image 506.1 and orientation data 508.1 while the device is in orientation 504.1. As the user sweeps the device in front of his/her face, the device can capture images 506.2 through 506.j and orientation data 508.2 through 508.j for device orientations 504.2 through 504.j, respectively.
[0048] The image-capture device can determine orientation data 508 using any motion sensor, now known or later developed, that can determine absolute or relative three-dimensional coordinates for each captured image. For example, the motion sensor can include a gyroscope that provides three rotation angles relative to the device's plane (e.g., the pitch, yaw, and roll angles about the X, Y, and Z axes, respectively) for each captured image.
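For example, pitch, yaw, and roll readings can be composed into such a rotation matrix as sketched below; the Z-Y-X composition order is an assumption, and a real device may report a different rotation order or a quaternion:

```python
import numpy as np

# Compose a 3x3 rotation matrix from pitch, yaw, and roll (radians) about
# the X, Y, and Z axes; the Z @ Y @ X order is one common convention.
def rotation_from_euler(pitch, yaw, roll):
    rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch),  np.cos(pitch)]])
    ry = np.array([[ np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    rz = np.array([[np.cos(roll), -np.sin(roll), 0],
                   [np.sin(roll),  np.cos(roll), 0],
                   [0, 0, 1]])
    return rz @ ry @ rx
```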
[0049] The device then processes the captured images to detect the image coordinates of certain facial features across the various captured images. For example, the device can determine
feature points 510.1, 510.2, and 510.j that correspond to a nose feature captured by images 506.1, 506.2, and 506.j, respectively. The coordinates of a feature point $i$ within an image $j$ are hereinafter denoted using the tuple $(u_i^{(j)}, v_i^{(j)})$. The device then processes the orientation data 508 and the feature points 510 to generate a three-dimensional model in a global coordinate system. The three-dimensional model is hereinafter denoted using the coordinates $(x^{(0)}, y^{(0)}, z^{(0)})$, such that the superscript $(0)$ indicates the model is represented using the global coordinate system under which all captured images are processed.
[0050] Under perspective projection, the relationship between the two-dimensional coordinates of a feature point 510 and its coordinates in the 3D physical space of the three-dimensional model can be represented by the projection transformation as follows:

$$[u, v, 1]^T = K_{3\times3}\,[R_{3\times3} \mid T_{3\times1}]\,[x, y, z, 1]^T \tag{1}$$

In equation (1), $[u, v, 1]^T$ provides the homogeneous image coordinates of a feature point, and $[x, y, z, 1]^T$ provides the homogeneous three-dimensional coordinates for the feature point in the 3D physical space. $K_{3\times3}$ provides a 3×3 matrix consisting of intrinsic camera parameters, such as focal length, principal point, aspect ratio, skew factor, and radial distortion. The value for $K$ can be computed beforehand using any camera calibration technique, such as the technique described by Zhengyou Zhang in "A flexible new technique for camera calibration" (IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, issue 11, pages 1330-1334, 2000), which is hereby incorporated by reference.

[0051] $[R_{3\times3} \mid T_{3\times1}]$ provides a 3D rotation and translation matrix, which facilitates converting a point in the 3D physical space to a point in the camera's local 3D coordinate system. The device generates the 3×4 matrix $[R_{3\times3} \mid T_{3\times1}]$ by concatenating the 3×3 rotation matrix $R_{3\times3}$ and the 3×1 translation matrix $T_{3\times1}$.
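For illustration, the 3×3 intrinsic matrix $K$ of equation (1) can be assembled from calibrated parameters as sketched below; the numeric values are hypothetical, and radial distortion is typically handled by undistorting image coordinates rather than inside $K$ itself:

```python
import numpy as np

# Build K from focal lengths, principal point, and skew obtained offline
# via any camera calibration technique (values below are placeholders).
def intrinsic_matrix(fx, fy, cx, cy, skew=0.0):
    return np.array([[fx, skew, cx],
                     [0.0,  fy, cy],
                     [0.0, 0.0, 1.0]])

K = intrinsic_matrix(fx=520.0, fy=520.0, cx=320.0, cy=240.0)
```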
[0052] Although each captured image has a local 3D coordinate system, the device uses a global coordinate system to generate the three-dimensional model of the local user from the captured images. In some embodiments, the device can select the coordinate system of one captured image (e.g., the frontal view of the user, hereinafter referred to as view 0) as the global coordinate system for the three-dimensional model. The global 3D coordinate system is hereinafter denoted using the notation $\{X^{(0)}, Y^{(0)}, Z^{(0)}\}$.
[0053] FIG. 5B illustrates modeling data that is computed while generating the three-dimensional model of the local user in accordance with an embodiment. To generate a three-dimensional model 564, the device first selects one captured image to use as a reference point for processing all other images. For example, the device can select an orientation 554.2, which corresponds to a front-facing image 556.2 of the local user, as the global coordinate system 562. The device then generates a system of linear equations based on equation (1) for all feature points detected across all images, and relative to the global coordinate system 562. The device generates three-dimensional model 564 by solving this system of linear equations.
[0054] The system of linear equations includes linear equations for each feature point of each captured image (e.g., for a feature point $i$ from an image view $j$, represented using $(u_i^{(j)}, v_i^{(j)})$). These equations map the coordinates of these feature points from the global coordinate system using a transformation represented using $[R_{j,0} \mid T_{j,0}]$ (e.g., transformations 558.1 and 558.j for feature points 560.1 and 560.j, respectively).
[0055] To determine the camera orientation for an image captured at a view $j$, the image-capture device computes the 3D rotation $R_{j,0}$ and translation $T_{j,0}$ from the global coordinate system $\{X^{(0)}, Y^{(0)}, Z^{(0)}\}$ for view $j$. The device uses the gyroscope data to compute an accurate rotation matrix $R_{j,0}$, which facilitates generating the three-dimensional model by solving the set of linear equations, making the computations extremely lightweight for mobile devices. To determine the translation $T_{j,0}$, the device needs to solve the system of linear equations.
Setting Up the System of Linear Equations
[0056] From equation (1), each detected facial feature point i introduces 3 unknowns and 4 linear equations:
«:o),v;o),if = ^[/ | o][¾0),¾0), ;o),i] (2)
(3)
Equation (2) corresponds to a projection transformation within view 0 (the view selected as the global coordinate system for generating the three-dimensional model). Equation (3) corresponds to a projection transformation for a view j, relative to the global coordinate system corresponding of view 0.
[0057] The device can determine input values for the following variables in equations (2) and (3). The variable $K$ takes as input the 3×3 intrinsic matrix that is computed for the device ahead of time when calibrating the device's camera. The 3×3 matrix $R_{j,0}$ takes as input the rotation matrix computed from gyroscope data, which corresponds to a rotation of the device from view 0 to view $j$. The tuple $(u_i^{(0)}, v_i^{(0)})$ takes as input an image coordinate detected for the facial marker $i$ from the image captured from view 0, and the tuple $(u_i^{(j)}, v_i^{(j)})$ takes as input an image coordinate detected for the facial marker $i$ from the image captured from view $j$.
[0058] The symbols in equations (2) and (3) denoted with a tilde (~) correspond to unknown values that the device solves for (e.g., during operation 210 of FIG. 2). Specifically, the tuple $(\tilde{x}_i^{(0)}, \tilde{y}_i^{(0)}, \tilde{z}_i^{(0)})$ provides the three-dimensional coordinates of the facial marker $i$ with respect to the global coordinate system $\{X^{(0)}, Y^{(0)}, Z^{(0)}\}$. The 3×1 matrix $T_{j,0}$ provides a translation matrix from view 0 to view $j$, which is common for all facial markers in view $j$.
[0059] For $n$ detected facial markers, because $T_{j,0}$ is common for all facial markers in view $j$, there are $4n$ equations and $3n+3$ unknowns. If $n$ is sufficiently large, the system of equations (based on equations (2) and (3)) can provide more equations than unknowns. The device can solve this system of linear equations for view $j$ using techniques such as linear least-square fitting.
[0060] As the user sweeps the image-capture device across the front of his/her face, the device can capture images for a plurality of views. Each additional view $j$ contributes an additional $4n$ equations (based on equations (2) and (3)), and introduces 3 new unknowns (based on the 3×1 translation matrix $T_{j,0}$ for view $j$).
Solving the System of Equations
[0061] In some embodiments, the device solves the linear equations generated for all views together (e.g., during operation 210 of FIG. 2). Solving the system of equations provides the three-dimensional coordinates $(\tilde{x}_i^{(0)}, \tilde{y}_i^{(0)}, \tilde{z}_i^{(0)})$ for all facial markers $i$, and provides the translation matrix $T_{j,0}$ for all views $j$, both relative to the global coordinate system of view 0. Solving the complete set of equations together provides several advantages: doing so overcomes the limitation that some facial markers may not be detected in all views, and provides a solution that is robust toward errors in detecting feature coordinates from the individual views.
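A sketch of one possible joint solve follows, assuming the homogeneous formulation implied by equations (2) and (3): with the rotations fixed by gyroscope data, the point coordinates and per-view translations are the only unknowns, and the SVD yields the least-squares solution up to the global scale that the eye-based normalization (next section) later fixes:

```python
import numpy as np

# Stack equations (2) and (3) for every observation (i, j, u, v) into one
# homogeneous system A z = 0 and solve it jointly. Inputs (assumed shapes):
#   K: 3x3 intrinsics; rotations: {view j: 3x3 R_{j,0}}, with view 0 -> I;
#   views: view ids other than 0 (each gets an unknown 3x1 T_{j,0}).
def solve_model(K, rotations, observations, n_points, views):
    n_unknowns = 3 * n_points + 3 * len(views)
    view_col = {j: 3 * n_points + 3 * k for k, j in enumerate(views)}
    rows = []
    for i, j, u, v in observations:
        KR = K @ rotations[j]
        for img, r in ((u, 0), (v, 1)):
            row = np.zeros(n_unknowns)
            row[3 * i:3 * i + 3] = KR[r] - img * KR[2]   # point-i terms
            if j in view_col:                            # T_{j,0} terms
                row[view_col[j]:view_col[j] + 3] = K[r] - img * K[2]
            rows.append(row)
    _, _, vt = np.linalg.svd(np.asarray(rows))
    z = vt[-1]                                   # solution up to scale/sign
    points = z[:3 * n_points].reshape(n_points, 3)
    translations = z[3 * n_points:].reshape(len(views), 3)
    return points, translations
```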
Normalizing the Three-Dimensional Model
[0062] Once the device generates the three-dimensional model of the user's face (e.g., either in a face-enrollment or a face-recognition operation), the device transforms the model to generate a normalized three-dimensional model. For example, the image-capture device can generate the normalized model by performing a translation operation, a rotation operation, and a scale-change operation so that the two eyes are fixed to certain coordinates (e.g., coordinates (1,0,0) and (-1,0,0) for the user's left and right eyes, respectively).
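Under assumed conventions (the model as an n×3 array, with eye centers taken from its eye feature points), the normalization might look like the sketch below; fixing the residual roll about the eye axis is left open above, so an arbitrary orthonormal frame is chosen here, assuming the eye axis is not parallel to the Z axis:

```python
import numpy as np

# Translate, rotate, and scale the model so the left and right eye centers
# land at (1, 0, 0) and (-1, 0, 0), as described in the text.
def normalize_model(points, left_eye, right_eye):
    center = (left_eye + right_eye) / 2.0
    axis = left_eye - right_eye
    scale = 2.0 / np.linalg.norm(axis)       # eye-to-eye distance becomes 2
    x_axis = axis / np.linalg.norm(axis)
    y_axis = np.cross([0.0, 0.0, 1.0], x_axis)   # arbitrary roll choice
    y_axis /= np.linalg.norm(y_axis)
    z_axis = np.cross(x_axis, y_axis)
    R = np.stack([x_axis, y_axis, z_axis])   # rows form the new basis
    return scale * (points - center) @ R.T
```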
[0063] This computation-efficient transformation facilitates normalizing the three-dimensional model at the image-capture device, and prevents the device from having to fit two models to a common coordinate system before comparing them, which can be time-consuming when comparing the local user's face to those of other users in a large user-profile database.
[0064] FIG. 6 illustrates a normalized three-dimensional model 600 of a user's face in accordance with an embodiment. Specifically, the scale and orientation of normalized three-dimensional model 600 are transformed so that the left and right eyes (e.g., features 604 and 606) are positioned at feature coordinates (1,0,0) and (-1,0,0) of global coordinate system 602, respectively.
Computing a Difference Between Three-Dimensional Models
[0065] Once the device generates the three-dimensional model, the device can compare the model of the user's face to other three-dimensional models (e.g., to perform face recognition or to authenticate the user), without first fitting them to a common coordinate system. To compare two models, the device can compute the difference between features of the two three-dimensional models by computing a distance between corresponding feature points of the two models.
[0066] For example, the device can compute the distance as a Euclidean distance between all feature points $i$ that occur in the two models as follows:

$$\mathrm{diff} = \sum_i \sqrt{(x_i - x_i')^2 + (y_i - y_i')^2 + (z_i - z_i')^2} \tag{4}$$

In equation (4), the two coordinates $(x_i, y_i, z_i)$ and $(x_i', y_i', z_i')$ correspond to a feature point $i$ that occurs in the two three-dimensional models being compared. The computed difference, diff, provides a numeric value indicating a difference between the two three-dimensional models (e.g., as a Euclidean distance relative to the global coordinate system).
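Read directly, equation (4) can be computed as below, assuming each model is stored as a mapping from feature identifiers to 3D coordinates so that only feature points present in both models are compared:

```python
import numpy as np

# Sum of Euclidean distances between corresponding feature points, per
# equation (4); models are assumed to be dicts {feature_id: (x, y, z)}.
def model_diff(model_a, model_b):
    common = model_a.keys() & model_b.keys()
    return sum(np.linalg.norm(np.asarray(model_a[i]) - np.asarray(model_b[i]))
               for i in common)
```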
[0067] In some embodiments, the image-capture device can compare two three-dimensional models in a way that accounts for differences in coordinate systems between the two models. For example, if a stored three-dimensional model of a registered user's face has not been normalized, or has been normalized to a different coordinate system, the device can perform the comparison operation by solving the following linear equation:

$$[x_i', y_i', z_i']^T = s\,R\,[x_i, y_i, z_i]^T + T \tag{5}$$

The device can compute the rotation matrix $R$ using gyroscope data, and can solve for the translation matrix $T$ and the scale factor $s$ by solving equation (5), for example, using linear least-square fitting. The device can then compute the fitting error for each three-dimensional model, and can use the fitting error as the difference between the two three-dimensional models.
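A sketch of that fit: with $R$ fixed from gyroscope data, $s$ and $T$ enter equation (5) linearly, so each corresponding point pair contributes three linear equations solvable by least squares:

```python
import numpy as np

# Fit s and T in equation (5) given R; points_a and points_b are n x 3
# arrays of corresponding feature points from the two models.
def fit_similarity(points_a, points_b, R):
    rotated = points_a @ R.T                  # R applied to every point
    n = len(points_a)
    A = np.zeros((3 * n, 4))
    A[:, 0] = rotated.reshape(-1)             # coefficients of the scale s
    A[:, 1:] = np.tile(np.eye(3), (n, 1))     # coefficients of T
    b = points_b.reshape(-1)
    sol, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    s, T = sol[0], sol[1:]
    fit_error = np.linalg.norm(s * rotated + T - points_b)
    return s, T, fit_error                    # fit_error = model difference
```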
[0068] To perform face recognition, the device can compute the distance between the three-dimensional model of the user's face and those of other registered users (e.g., using equation (4) or equation (5)). If the confidence is high for the closest match (e.g., the difference of the closest match is less than a certain threshold), the device can provide the identity of the closest match as the user's identity. Otherwise, the device can provide a result indicating that the local user is not recognized.
[0069] If the image-capture device is verifying the identity of the local user, the device can compare the three-dimensional model of the local user to that of the user profile that the local user claims as his/her own. If the confidence is high (e.g., the difference is less than the threshold), the device can grant the local user access. Otherwise, the device denies the local user access.
[0070] FIG. 7 illustrates an exemplary apparatus 700 that facilitates generating a three-dimensional model of a local user in accordance with an embodiment. Apparatus 700 can comprise a plurality of hardware and/or software modules which may communicate with one another via a wired or wireless communication channel. Apparatus 700 may be realized using one or more integrated circuits, and may include fewer or more modules than those shown in FIG. 7. Further, apparatus 700 may be integrated in a computer system, or realized as a separate device which is capable of communicating with other computer systems and/or devices. Specifically, apparatus 700 can comprise a communication module 702, an interface module 704, an image-capture module 706, a motion sensor 708, a feature-detecting module 710, a model-generating module 712, and an authentication module 714.
[0071] In some embodiments, communication module 702 can communicate with third-party systems, such as an authentication server. Interface module 704 can provide feedback to the local user during the authentication process, for example, to alert the user of a potential problem that prevents apparatus 700 from detecting the local user's facial features.
[0072] Image-capture module 706 can capture a set of images of the local user from various orientations, and motion sensor 708 can determine orientation information for the captured images. Feature-detecting module 710 can detect a plurality of features of the local user's face from the captured images, and model-generating module 712 can generate a three-dimensional model of the local user's face from the detected features and orientation information for their corresponding images. Authentication module 714 can compare the generated three-dimensional model to those of registered users to identify or authenticate the local user.
[0073] FIG. 8 illustrates an exemplary computer system 802 that facilitates generating a three-dimensional model of a local user in accordance with an embodiment. Computer system 802 includes a processor 804, a memory 806, and a storage device 808. Memory 806 can include a volatile memory (e.g., RAM) that serves as a managed memory, and can be used to store one or more memory pools. Furthermore, computer system 802 can be coupled to a display device 810, a keyboard 812, and a pointing device 814. Storage device 808 can store an operating system 816, an image-capture system 818, and data 834.
[0074] In some embodiments, display 810 includes a touch screen display, such that keyboard 812 includes a virtual keyboard presented on display 810, and pointing device 814
includes a touch-sensitive device coupled to display 810 (e.g., a capacitive-touch sensor or a resistive-touch sensor layered on display 810). To type using keyboard 812, the user can tap on a portion of display 810 that presents a desired key. The user can also select any other display object presented on display 810 by tapping on the display object, and can interact with the display object using a set of predetermined touch-screen gestures.
[0075] Image-capture system 818 can include instructions, which when executed by computer system 802, can cause computer system 802 to perform methods and/or processes described in this disclosure. Specifically, image-capture system 818 may include instructions for communicating with third-party systems, such as an authentication server (communication module 820). Further, image-capture system 818 can include instructions for providing feedback to the local user during the authentication process, for example, to alert the user of a potential problem that prevents image-capture system 818 from detecting the local user's facial features (interface module 822).
[0076] Image-capture system 818 can also include instructions for capturing a set of images of the local user from various orientations (image-capture module 824), and for determining orientation information for the captured images (motion-sensing module 826). Image-capture system 818 can also include instructions for detecting a plurality of features of the local user's face from the captured images (feature-detecting module 828), and for generating a three-dimensional model of the local user's face from the detected features and orientation information for their corresponding images (model-generating module 830). Image-capture system 818 can also include instructions for comparing the generated three-dimensional model to those of registered users to identify or authenticate the local user (authentication module 832).
[0077] Data 834 can include any data that is required as input or that is generated as output by the methods and/or processes described in this disclosure. Specifically, data 834 can store at least user profiles for one or more registered users, access privileges for the registered users, and at least one three-dimensional model for each registered user's face.
[0078] The data structures and code described in this detailed description are typically stored on a computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system. The computer-readable storage medium includes, but is not limited to, volatile memory, non-volatile memory, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other media capable of storing code and/or data now known or later developed.
[0079] The methods and processes described in the detailed description section can be embodied as code and/or data, which can be stored in a computer-readable storage medium as described above. When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium.
[0080] Furthermore, the methods and processes described above can be included in hardware modules. For example, the hardware modules can include, but are not limited to, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs), and other programmable-logic devices now known or later developed. When the hardware modules are activated, the hardware modules perform the methods and processes included within the hardware modules.
[0081] The foregoing descriptions of embodiments of the present invention have been presented for purposes of illustration and description only. They are not intended to be exhaustive or to limit the present invention to the forms disclosed. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. Additionally, the above disclosure is not intended to limit the present invention. The scope of the present invention is defined by the appended claims.
Claims
1. A computer-implemented method, comprising:
capturing, by an image-capture device on a mobile device, a set of images of a person from various orientations;
determining orientation information for a respective captured image;
detecting a plurality of features of the person's face from the respective captured image;
generating a three-dimensional model of the person's face from the detected features and orientation information for their corresponding images; and
authenticating the person's identity based on the three-dimensional model.
2. The method of claim 1, wherein capturing the set of images comprises:
monitoring a change in orientation of the image-capture device;
determining that the orientation has changed by at least a minimum amount from an orientation of a previous captured image;
capturing an image in response to determining that the image-capture device is stabilized; and
storing the captured image in response to determining that the image is suitable for detecting facial features.
3. The method of claim 1, wherein capturing the set of images further comprises providing a notification in response to:
determining that the image-capture device is not stabilized;
determining that the person's face is not in the image frame;
determining that the current orientation of the device is not suitable for detecting features of the person's face; or
determining that no more images need to be captured.
4. The method of claim 3, wherein the notification includes one or more of:
a sound;
a vibration pattern;
a flashing pattern from a light source of the image-capture device; and
an image displayed on a screen of the image-capture device.
5. The method of claim 1, wherein capturing the set of images is performed in response to receiving a request to register the person as a user; and
wherein the method further comprises storing the three-dimensional model in association with a user profile for the person.
6. The method of claim 1, wherein capturing the set of images is performed in response to receiving a request to authenticate the person.
7. The method of claim 6, further comprising:
authenticating the person by determining whether the generated three-dimensional model of the person matches a stored three-dimensional model of a registered user.
8. The method of claim 6, further comprising authenticating the person, which involves:
sending the generated three-dimensional model of the person to a remote authentication system; and
receiving an authentication response which indicates whether the person is a registered user.
9. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method, the method comprising:
capturing a set of images of a person from various orientations using an image-capture device on a mobile device;
determining orientation information for a respective captured image;
detecting a plurality of features of the person's face from the respective captured image;
generating a three-dimensional model of the person's face from the detected features and orientation information for their corresponding images; and
authenticating the person's identity based on the three-dimensional model.
10. The storage medium of claim 9, wherein capturing the set of images comprises:
monitoring a change in orientation of the image-capture device;
determining that the orientation has changed by at least a minimum amount from an orientation of a previous captured image;
capturing an image in response to determining that the image-capture device is stabilized; and
storing the captured image in response to determining that the image is suitable for detecting facial features.
11. The storage medium of claim 9, wherein capturing the set of images further comprises providing a notification in response to:
determining that the image-capture device is not stabilized;
determining that the person's face is not in the image frame;
determining that the current orientation of the device is not suitable for detecting features of the person's face; or
determining that no more images need to be captured.
12. The storage medium of claim 11, wherein the notification includes one or more of:
a sound;
a vibration pattern;
a flashing pattern from a light source of the image-capture device; and
an image displayed on a screen of the image-capture device.
13. The storage medium of claim 9, wherein capturing the set of images is performed in response to receiving a request to register the person as a user; and
wherein the method further comprises storing the three-dimensional model in association with a user profile for the person.
14. The storage medium of claim 9, wherein capturing the set of images is performed in response to receiving a request to authenticate the person.
15. The storage medium of claim 14, wherein the method further comprises authenticating the person by determining whether the generated three-dimensional model of the person matches a stored three-dimensional model of a registered user.
16. The storage medium of claim 14, wherein the method further comprises
authenticating the person, which involves:
sending the generated three-dimensional model of the person to a remote authentication system; and
receiving an authentication response which indicates whether the person is a registered user.
17. A mobile device, comprising:
an image-capture module configured to capture a set of images of a person from various orientations;
a motion sensor configured to determine orientation information for a respective captured image;
a feature-detecting module configured to detect a plurality of features of the person's face from the respective captured image;
a model-generating module configured to generate a three-dimensional model of the person's face from the detected features and orientation information for their corresponding images; and
an authentication module configured to authenticate the person's identity based on the three-dimensional model.
18. The mobile device of claim 17, wherein while capturing the set of images, the image-capture module is further configured to:
monitor a change in orientation;
determine that the orientation has changed by at least a minimum amount from an orientation of a previous captured image;
capture an image in response to determining that the image-capture device is stabilized; and
store the captured image in response to determining that the image is suitable for detecting facial features.
19. The mobile device of claim 17, further comprising an interface module configured to provide a notification in response to:
determining that the image-capture device is not stabilized;
determining that the person's face is not in the image frame;
determining that the current orientation of the device is not suitable for detecting features of the person's face; or
determining that no more images need to be captured.
20. The mobile device of claim 19, wherein the notification includes one or more of:
a sound;
a vibration pattern;
a flashing pattern from a light source of the image-capture device; and
an image displayed on a screen of the image-capture device.
21. The mobile device of claim 17, further comprising an interface module configured to receive a request to register the person as a user;
wherein the image-capture module is configured to capture the set of images in response to the request to register the person as a user; and
wherein the mobile device further comprises a profile-managing module configured to store the three-dimensional model in association with a user profile for the person.
22. The mobile device of claim 17, further comprising an interface module configured to receive a request to authenticate the person;
wherein the image-capture module is configured to capture the set of images in response to the request to authenticate the person.
23. The mobile device of claim 22, further comprising an authentication module configured to authenticate the person by determining whether the generated three-dimensional model of the person matches a stored three-dimensional model of a registered user.
24. The mobile device of claim 22, further comprising an authentication module configured to authenticate the person, wherein authenticating the person involves:
sending the generated three-dimensional model of the person to a remote authentication system; and
receiving an authentication response which indicates whether the person is a registered user.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP13781046.1A EP2842075B1 (en) | 2012-04-25 | 2013-04-22 | Three-dimensional face recognition for mobile devices |
CN201380022051.8A CN104246793A (en) | 2012-04-25 | 2013-04-22 | Three-dimensional face recognition for mobile devices |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/456,074 | 2012-04-25 | ||
US13/456,074 US20130286161A1 (en) | 2012-04-25 | 2012-04-25 | Three-dimensional face recognition for mobile devices |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013159686A1 true WO2013159686A1 (en) | 2013-10-31 |
Family
ID=49476903
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2013/074511 WO2013159686A1 (en) | 2012-04-25 | 2013-04-22 | Three-dimensional face recognition for mobile devices |
Country Status (4)
Country | Link |
---|---|
US (1) | US20130286161A1 (en) |
EP (1) | EP2842075B1 (en) |
CN (1) | CN104246793A (en) |
WO (1) | WO2013159686A1 (en) |
Families Citing this family (84)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8600120B2 (en) | 2008-01-03 | 2013-12-03 | Apple Inc. | Personal computing device control using face detection and recognition |
TWI439960B (en) | 2010-04-07 | 2014-06-01 | Apple Inc | Avatar editing environment |
US9002322B2 (en) | 2011-09-29 | 2015-04-07 | Apple Inc. | Authentication with secondary approver |
CN107257403A (en) | 2012-04-09 | 2017-10-17 | 英特尔公司 | Use the communication of interaction incarnation |
EP2672426A3 (en) * | 2012-06-04 | 2014-06-04 | Sony Mobile Communications AB | Security by z-face detection |
BR112015004867B1 (en) | 2012-09-05 | 2022-07-12 | Element, Inc. | IDENTITY MISTIFICATION PREVENTION SYSTEM |
US9892761B2 (en) * | 2013-02-22 | 2018-02-13 | Fuji Xerox Co., Ltd. | Systems and methods for creating and using navigable spatial overviews for video |
US9288471B1 (en) * | 2013-02-28 | 2016-03-15 | Amazon Technologies, Inc. | Rotatable imaging assembly for providing multiple fields of view |
US9886622B2 (en) * | 2013-03-14 | 2018-02-06 | Intel Corporation | Adaptive facial expression calibration |
US9898642B2 (en) | 2013-09-09 | 2018-02-20 | Apple Inc. | Device, method, and graphical user interface for manipulating user interfaces based on fingerprint sensor inputs |
JP2015088096A (en) * | 2013-11-01 | 2015-05-07 | 株式会社ソニー・コンピュータエンタテインメント | Information processor and information processing method |
US10146299B2 (en) | 2013-11-08 | 2018-12-04 | Qualcomm Technologies, Inc. | Face tracking for additional modalities in spatial interaction |
KR102263695B1 (en) * | 2014-01-20 | 2021-06-10 | 삼성전자 주식회사 | Apparatus and control method for mobile device using multiple cameras |
GB201402856D0 (en) * | 2014-02-18 | 2014-04-02 | Right Track Recruitment Uk Ltd | System and method for recordal of personnel attendance |
US10482461B2 (en) | 2014-05-29 | 2019-11-19 | Apple Inc. | User interface for payments |
US9589362B2 (en) | 2014-07-01 | 2017-03-07 | Qualcomm Incorporated | System and method of three-dimensional model generation |
US20160070952A1 (en) * | 2014-09-05 | 2016-03-10 | Samsung Electronics Co., Ltd. | Method and apparatus for facial recognition |
CN105488371B (en) * | 2014-09-19 | 2021-04-20 | 中兴通讯股份有限公司 | Face recognition method and device |
US9607388B2 (en) | 2014-09-19 | 2017-03-28 | Qualcomm Incorporated | System and method of pose estimation |
KR101997500B1 (en) | 2014-11-25 | 2019-07-08 | 삼성전자주식회사 | Method and apparatus for generating personalized 3d face model |
EP3241187A4 (en) * | 2014-12-23 | 2018-11-21 | Intel Corporation | Sketch selection for rendering 3d model avatar |
US9799133B2 (en) | 2014-12-23 | 2017-10-24 | Intel Corporation | Facial gesture driven animation of non-facial features |
US9830728B2 (en) | 2014-12-23 | 2017-11-28 | Intel Corporation | Augmented facial animation |
US10931933B2 (en) * | 2014-12-30 | 2021-02-23 | Eys3D Microelectronics, Co. | Calibration guidance system and operation method of a calibration guidance system |
US9852543B2 (en) | 2015-03-27 | 2017-12-26 | Snap Inc. | Automated three dimensional model generation |
US10304203B2 (en) | 2015-05-14 | 2019-05-28 | Qualcomm Incorporated | Three-dimensional model generation |
US10373366B2 (en) | 2015-05-14 | 2019-08-06 | Qualcomm Incorporated | Three-dimensional model generation |
US9911242B2 (en) | 2015-05-14 | 2018-03-06 | Qualcomm Incorporated | Three-dimensional model generation |
JP6507046B2 (en) * | 2015-06-26 | 2019-04-24 | 株式会社東芝 | Three-dimensional object detection device and three-dimensional object authentication device |
CN106559387B (en) * | 2015-09-28 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Identity verification method and device |
US10220172B2 (en) | 2015-11-25 | 2019-03-05 | Resmed Limited | Methods and systems for providing interface components for respiratory therapy |
WO2017101094A1 (en) | 2015-12-18 | 2017-06-22 | Intel Corporation | Avatar animation system |
CN105654035B (en) * | 2015-12-21 | 2019-08-09 | 湖南拓视觉信息技术有限公司 | Three-dimensional face identification method and the data processing equipment for applying it |
KR102365721B1 (en) | 2016-01-26 | 2022-02-22 | 한국전자통신연구원 | Apparatus and Method for Generating 3D Face Model using Mobile Device |
US10257505B2 (en) * | 2016-02-08 | 2019-04-09 | Microsoft Technology Licensing, Llc | Optimized object scanning using sensor fusion |
CN105993022B (en) * | 2016-02-17 | 2019-12-27 | 香港应用科技研究院有限公司 | Method and system for recognition and authentication using facial expressions |
US9619723B1 (en) | 2016-02-17 | 2017-04-11 | Hong Kong Applied Science and Technology Research Institute Company Limited | Method and system of identification and authentication using facial expression |
US9967262B1 (en) * | 2016-03-08 | 2018-05-08 | Amazon Technologies, Inc. | Account verification based on content submission |
FI20165211A (en) * | 2016-03-15 | 2017-09-16 | Ownsurround Ltd | Arrangements for the production of HRTF filters |
JP2017194301A (en) * | 2016-04-19 | 2017-10-26 | 株式会社デジタルハンズ | Face shape measuring device and method |
AU2017100670C4 (en) | 2016-06-12 | 2019-11-21 | Apple Inc. | User interfaces for retrieving contextually relevant media content |
US10395099B2 (en) * | 2016-09-19 | 2019-08-27 | L'oreal | Systems, devices, and methods for three-dimensional analysis of eyebags |
WO2018057272A1 (en) * | 2016-09-23 | 2018-03-29 | Apple Inc. | Avatar creation and editing |
DK179978B1 (en) | 2016-09-23 | 2019-11-27 | Apple Inc. | Image data for enhanced user interactions |
CN106331854A (en) * | 2016-09-29 | 2017-01-11 | 深圳Tcl数字技术有限公司 | Smart television control method and device |
GB2554674B (en) * | 2016-10-03 | 2019-08-21 | I2O3D Holdings Ltd | 3D capture: object extraction |
US10341568B2 (en) | 2016-10-10 | 2019-07-02 | Qualcomm Incorporated | User interface to assist three dimensional scanning of objects |
US10488195B2 (en) | 2016-10-25 | 2019-11-26 | Microsoft Technology Licensing, Llc | Curated photogrammetry |
USD836654S1 (en) * | 2016-10-28 | 2018-12-25 | General Electric Company | Display screen or portion thereof with graphical user interface |
US20180137663A1 (en) * | 2016-11-11 | 2018-05-17 | Joshua Rodriguez | System and method of augmenting images of a user |
US10311593B2 (en) * | 2016-11-16 | 2019-06-04 | International Business Machines Corporation | Object instance identification using three-dimensional spatial configuration |
US10586379B2 (en) | 2017-03-08 | 2020-03-10 | Ebay Inc. | Integration of 3D models |
US20180357819A1 (en) * | 2017-06-13 | 2018-12-13 | Fotonation Limited | Method for generating a set of annotated images |
US10540489B2 (en) | 2017-07-19 | 2020-01-21 | Sony Corporation | Authentication using multiple images of user from different angles |
CN107277053A (en) * | 2017-07-31 | 2017-10-20 | 广东欧珀移动通信有限公司 | Auth method, device and mobile terminal |
CN107590434A (en) * | 2017-08-09 | 2018-01-16 | 广东欧珀移动通信有限公司 | Identification model update method, device and terminal device |
KR102389678B1 (en) * | 2017-09-09 | 2022-04-21 | 애플 인크. | Implementation of biometric authentication |
AU2018334318B2 (en) | 2017-09-18 | 2021-05-13 | Element, Inc. | Methods, systems, and media for detecting spoofing in mobile authentication |
CN108171182B (en) * | 2017-12-29 | 2022-01-21 | Oppo广东移动通信有限公司 | Electronic device, face recognition method and related product |
FI20185300A1 (en) | 2018-03-29 | 2019-09-30 | Ownsurround Ltd | An arrangement for generating head related transfer function filters |
US12033296B2 (en) | 2018-05-07 | 2024-07-09 | Apple Inc. | Avatar creation user interface |
US11722764B2 (en) | 2018-05-07 | 2023-08-08 | Apple Inc. | Creative camera |
DK180078B1 (en) | 2018-05-07 | 2020-03-31 | Apple Inc. | USER INTERFACE FOR AVATAR CREATION |
US11170085B2 (en) | 2018-06-03 | 2021-11-09 | Apple Inc. | Implementation of biometric authentication |
US11727656B2 (en) | 2018-06-12 | 2023-08-15 | Ebay Inc. | Reconstruction of 3D model with immersive experience |
CN110826045B (en) * | 2018-08-13 | 2022-04-05 | 深圳市商汤科技有限公司 | Authentication method and device, electronic equipment and storage medium |
US11100349B2 (en) | 2018-09-28 | 2021-08-24 | Apple Inc. | Audio assisted enrollment |
US10860096B2 (en) | 2018-09-28 | 2020-12-08 | Apple Inc. | Device control using gaze information |
EP3794587B1 (en) * | 2018-10-08 | 2024-07-17 | Google LLC | Selective enrollment with an automated assistant |
US11238294B2 (en) | 2018-10-08 | 2022-02-01 | Google Llc | Enrollment with an automated assistant |
EP3651057B1 (en) * | 2018-11-09 | 2023-06-14 | Tissot S.A. | Procedure for facial authentication of a wearer of a watch |
CN109753930B (en) * | 2019-01-03 | 2021-12-24 | 京东方科技集团股份有限公司 | Face detection method and face detection system |
AU2020237108B2 (en) | 2019-03-12 | 2022-12-15 | Element Inc. | Detecting spoofing of facial recognition with mobile devices |
US11507248B2 (en) | 2019-12-16 | 2022-11-22 | Element Inc. | Methods, systems, and media for anti-spoofing using eye-tracking |
DE102020100565A1 (en) | 2020-01-13 | 2021-07-15 | Aixtron Se | Process for depositing layers |
USD968990S1 (en) * | 2020-03-26 | 2022-11-08 | Shenzhen Sensetime Technology Co., Ltd. | Face recognition machine |
DK181103B1 (en) | 2020-05-11 | 2022-12-15 | Apple Inc | User interfaces related to time |
US11921998B2 (en) | 2020-05-11 | 2024-03-05 | Apple Inc. | Editing features of an avatar |
US11899566B1 (en) | 2020-05-15 | 2024-02-13 | Google Llc | Training and/or using machine learning model(s) for automatic generation of test case(s) for source code |
DE102020119531A1 (en) * | 2020-07-23 | 2022-01-27 | Bundesdruckerei Gmbh | Method for personalizing an ID document and method for identifying a person using biometric facial features and ID document |
US11823327B2 (en) * | 2020-11-19 | 2023-11-21 | Samsung Electronics Co., Ltd. | Method for rendering relighted 3D portrait of person and computing device for the same |
EP4264460A1 (en) | 2021-01-25 | 2023-10-25 | Apple Inc. | Implementation of biometric authentication |
US11714536B2 (en) | 2021-05-21 | 2023-08-01 | Apple Inc. | Avatar sticker editor user interfaces |
US11776190B2 (en) | 2021-06-04 | 2023-10-03 | Apple Inc. | Techniques for managing an avatar on a lock screen |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5870217A (en) * | 1981-10-23 | 1983-04-26 | Fuji Photo Film Co Ltd | Camera-shake detecting device |
WO1997003416A1 (en) * | 1995-07-10 | 1997-01-30 | Sarnoff Corporation | Method and system for rendering and combining images |
US6970580B2 (en) * | 2001-10-17 | 2005-11-29 | Qualcomm Incorporated | System and method for maintaining a video image in a wireless communication device |
US7746404B2 (en) * | 2003-11-10 | 2010-06-29 | Hewlett-Packard Development Company, L.P. | Digital camera with panoramic image capture |
JP2006338092A (en) * | 2005-05-31 | 2006-12-14 | Nec Corp | Pattern collation method, pattern collation system and pattern collation program |
JP4753801B2 (en) * | 2006-06-07 | 2011-08-24 | ソニー・エリクソン・モバイルコミュニケーションズ株式会社 | Information processing device, information processing method, information processing program, and portable terminal device |
US7916897B2 (en) * | 2006-08-11 | 2011-03-29 | Tessera Technologies Ireland Limited | Face tracking for controlling imaging parameters |
JP2008191816A (en) * | 2007-02-02 | 2008-08-21 | Sony Corp | Image processor, image processing method, and computer program |
CN102413282B (en) * | 2011-10-26 | 2015-02-18 | 惠州Tcl移动通信有限公司 | Self-shooting guidance method and equipment |
US20130215239A1 (en) * | 2012-02-21 | 2013-08-22 | Sen Wang | 3d scene model from video |
- 2012-04-25: US US13/456,074 patent/US20130286161A1/en not_active Abandoned
- 2013-04-22: CN CN201380022051.8A patent/CN104246793A/en active Pending
- 2013-04-22: WO PCT/CN2013/074511 patent/WO2013159686A1/en active Application Filing
- 2013-04-22: EP EP13781046.1A patent/EP2842075B1/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070116457A1 (en) * | 2005-11-22 | 2007-05-24 | Peter Ljung | Method for obtaining enhanced photography and device therefor |
CN101395613A (en) * | 2006-01-31 | 2009-03-25 | 南加利福尼亚大学 | 3D face reconstruction from 2D images |
EP2137670A2 (en) | 2007-03-05 | 2009-12-30 | Fotonation Vision Limited | Illumination detection using classifier chains |
CN101377814A (en) * | 2007-08-27 | 2009-03-04 | 索尼株式会社 | Face image processing apparatus, face image processing method, and computer program |
US20120027270A1 (en) * | 2007-11-29 | 2012-02-02 | Viewdle Inc. | Method and System of Person Identification by Facial Image |
US20090279784A1 (en) * | 2008-05-07 | 2009-11-12 | Microsoft Corporation | Procedural authoring |
US20110150300A1 (en) * | 2009-12-21 | 2011-06-23 | Hon Hai Precision Industry Co., Ltd. | Identification system and method |
Non-Patent Citations (3)
Title |
---|
Douglas Fidaleo et al., "Model-Assisted 3D Face Reconstruction from Video"
Rita Wong et al., "Interactive Quality-Driven Feedback for Biometric Systems"
See also references of EP2842075A4 |
Also Published As
Publication number | Publication date |
---|---|
US20130286161A1 (en) | 2013-10-31 |
EP2842075A1 (en) | 2015-03-04 |
EP2842075B1 (en) | 2018-01-03 |
CN104246793A (en) | 2014-12-24 |
EP2842075A4 (en) | 2015-05-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2842075B1 (en) | Three-dimensional face recognition for mobile devices | |
JP7365445B2 (en) | Computing apparatus and method | |
JP6610906B2 (en) | Activity detection method and device, and identity authentication method and device | |
CN108369653B (en) | Eye pose recognition using eye features | |
EP3332403B1 (en) | Liveness detection | |
CN110688930B (en) | Face detection method and device, mobile terminal and storage medium | |
CN111194449A (en) | System and method for human face living body detection | |
CN109684951A (en) | Face identification method, bottom library input method, device and electronic equipment | |
US8860795B2 (en) | Masquerading detection system, masquerading detection method, and computer-readable storage medium | |
JP6809226B2 (en) | Biometric device, biometric detection method, and biometric detection program | |
US20150161435A1 (en) | Frontal face detection apparatus and method using facial pose | |
KR101510312B1 (en) | 3D face-modeling device, system and method using Multiple cameras | |
WO2021218568A1 (en) | Image depth determination method, living body recognition method, circuit, device, and medium | |
CN112257696A (en) | Sight estimation method and computing equipment | |
JP7264308B2 (en) | Systems and methods for adaptively constructing a three-dimensional face model based on two or more inputs of two-dimensional face images | |
WO2023034251A1 (en) | Spoof detection based on challenge response analysis | |
KR20090115738A (en) | Information extracting method, registering device, collating device and program | |
CN114202677B (en) | Method and system for authenticating an occupant in a vehicle interior | |
KR101844367B1 (en) | Apparatus and Method for Head pose estimation using coarse holistic initialization followed by part localization | |
JP2022048817A (en) | Information processing device and information processing method | |
WO2020095400A1 (en) | Characteristic point extraction device, characteristic point extraction method, and program storage medium | |
CN115348438B (en) | Control method and related device for three-dimensional display equipment | |
CN113837053B (en) | Biological face alignment model training method, biological face alignment method and device | |
KR101509934B1 (en) | Device of a front head pose guidance, and method thereof | |
EP3989184A1 (en) | Liveness detection of facial image data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13781046 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
WWE | Wipo information: entry into national phase |
Ref document number: 2013781046 Country of ref document: EP |