US20220284732A1 - Iris liveness detection for mobile devices - Google Patents
Iris liveness detection for mobile devices
- Publication number
- US20220284732A1 (US17/704,822)
- Authority
- US
- United States
- Prior art keywords
- image
- eye
- characteristic
- image data
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/04—Synchronising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G06K9/6289—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/803—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/197—Matching; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
- G06V40/45—Detection of the body part being alive
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/10—Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths
- H04N23/11—Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths for generating image signals from visible and infrared light wavelengths
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
- H04N23/84—Camera processing pipelines; Components thereof for processing colour signals
- H04N23/843—Demosaicing, e.g. interpolating colour pixel values
-
- H04N5/332—
-
- H04N9/04515—
Definitions
- Embodiments described herein relate generally to iris liveness detection, and more specifically, to techniques for capturing and using information about iris liveness to authenticate a user to a mobile device.
- Since the introduction of the first smartphone in 1994, there has been rapid growth in smartphone technology. Smartphones have become much more than just computers; they also provide functionalities of personal databases, jukeboxes, cameras, communications hubs and communications gateways.
- FIG. 1 is an example mobile device environment for implementing an iris liveness detection according to an example embodiment
- FIG. 2 is a flow diagram depicting an example iris liveness detection process according to an example embodiment
- FIG. 3 depicts examples of RGB/NIR image pairs acquired according to an example embodiment
- FIG. 4 depicts examples of RGB/NIR image pairs acquired from a live person and examples of RGB/NIR image pairs acquired from photographs and computer displays;
- FIG. 5 depicts examples of RGB/NIR image pairs acquired from a 3-D model of a face
- FIG. 6 depicts a pupil localization process according to an example embodiment
- FIG. 7 depicts an example sequence of images showing an eye-blinking effect
- FIG. 8 is an example binary decision tree used to determine whether images depict a live person or are part of a presentation attack
- FIG. 9 is a flow diagram of an example process for detecting presentation attacks according to an example embodiment
- FIG. 10 illustrates a computer system upon which one or more embodiments may be implemented.
- the techniques include a workflow for acquiring iris biometric information of a person attempting to use a mobile device.
- the iris biometric information may be acquired using electronic sensors integrated with the device.
- the sensors may be configured to capture images of an eye or an eye's iris region.
- Examples of mobile devices may include smartphones, tablets, PDAs, laptops, electronic watches, and the like.
- Electronic sensors may be configured to capture image pairs.
- An image pair includes a visible red-green-blue (RGB) spectrum image and a near infra-red (NIR) spectrum image of an eye or an eye's iris region. Capturing of the RGB and NIR images may be performed synchronously in terms of timing.
- An RGB/NIR hybrid sensor is an example of a sensor configured to capture RGB/NIR image pairs synchronously.
- The RGB/NIR hybrid sensor is an electronic sensor configured to capture both an RGB image and an NIR image of the same scene at the same time.
- The images of a captured RGB/NIR image pair depict objects in the same spatial relationships to each other. Therefore, the images of an RGB/NIR pair depicting, for example, an eye will depict the eye at the same location in each of the two images of the pair.
- RGB and NIR images may be collectively referred to as incoming images.
- An incoming image may capture multi-spectral information specific to characteristics of a human eye and/or the eye's iris region.
- the multi-spectral information may be mapped onto one or more discrete feature vectors representing the characteristics of the eye's iris region.
- Discrete feature vectors may be processed by applying one or more classifiers to the vectors to generate a classified image.
- a classifier is a means for analyzing contents of an image and contents of feature vectors generated for the image.
- Examples of classifiers may include intermediate classifiers which use a distance metric to determine whether the discrete feature vectors match predetermined training feature vectors. For example, if a distance value computed from a discrete feature vector of an image depicting a person attempting to use a mobile device and a training feature vector of an image depicting the actual owner of the mobile device exceeds a certain threshold, then it may be concluded that the person attempting to use the device is not the device's owner.
- Classified images of an eye and/or an iris region may be further processed using multi-frame pupil localization techniques.
- Pupil localization techniques may include processing of pupil regions identified in the classified images and determining one or more characteristics of the pupil. The characteristics of the pupil may be used to determine liveness of the depicted iris. For example, the characteristics may be used to determine whether the images depict an iris of an owner of a mobile device or an iris of an imposter attempting to use the mobile device.
- the techniques described herein allow detecting spoofing attacks and security breaches committed with respect to mobile devices.
- the techniques are applicable to implementations involving actual human faces as well as 3-D face models made of materials that have properties similar to properties of human faces.
- a method comprises acquiring a plurality of image pairs using one or more image sensors.
- the sensors may be integrated with a mobile device, and the image pairs may depict a person who attempts to use the mobile device.
- Each image pair of the plurality of image pairs may include an RGB image and a NIR image, both acquired in a synchronized manner. Acquiring a pair of images in the synchronized manner may include acquiring the images of the pair at the same time.
- the sensors may include at least one hybrid RGB/NIR sensor.
- a particular image pair that depicts an eye-iris region in-focus is selected from a plurality of image pairs. Based on, at least in part, the particular image pair, a hyperspectral image is generated.
- the hyperspectral image may be generated by fusing two images included in the particular image pair.
- a particular feature vector for the eye-iris region depicted in the particular image pair is generated.
- the particular feature vector may numerically represent a particular feature, such as an iris region depicted in the image pair.
- One or more trained model feature vectors are retrieved from a storage unit.
- the trained model feature vectors may be generated based on images depicting an owner of a mobile device.
- the images depicting the particular user depict valid biometric characteristics of the owner of the device.
- The trained model feature vectors are used to determine whether they are similar to the particular feature vector generated from the image pairs depicting the person attempting to use the mobile device.
- the similarities may be quantified using a distance metric computed based on the particular feature vector and the one or more trained model feature vectors.
- a distance metric represents a similarity measure between the trained model feature vectors and a particular feature vector.
- a distance metric represents a similarity measure of the particular image pair, acquired from a person attempting to use a mobile device, and the trained model feature vectors generated based on the valid biometric characteristics of an owner of the mobile device.
- a distance metric may be compared with a predefined first threshold.
- the first threshold may be determined empirically. If the distance metric exceeds the first threshold, then a first message indicating that the plurality of image pairs fails to depict the particular user of a mobile device is generated.
- the first message may also indicate that the person whose depictions were acquired by the sensors of the mobile device is not the owner of the mobile device. Furthermore, the first message may indicate that a presentation attack on the mobile device is in progress.
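The threshold test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not name the distance metric, so Euclidean distance is assumed here, as is taking the smallest distance over all trained model feature vectors. The function names, the vectors, and the threshold value are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exceeds_first_threshold(probe, trained_vectors, first_threshold):
    """Return (flag, distance).

    distance is the smallest distance from the probe vector to any trained
    model vector (an assumed aggregation rule); flag is True when that
    distance exceeds the first threshold.
    """
    distance = min(euclidean_distance(probe, t) for t in trained_vectors)
    return distance > first_threshold, distance


# Hypothetical values, chosen only for illustration.
trained = [[0.10, 0.80, 0.30], [0.12, 0.79, 0.33]]
probe = [0.90, 0.10, 0.70]

rejected, distance = exceeds_first_threshold(probe, trained, first_threshold=0.5)
if rejected:
    print("first message: image pairs fail to depict the particular user")
```

In practice the threshold would be determined empirically, as the text notes, rather than fixed in code.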
- two or more image pairs that depict an iris are selected from the acquired plurality of image pairs. For each NIR image of each image pair, of the two or more image pairs, one or more characteristics of the iris depicted in the image pair are determined.
- the second message may also indicate that the person whose depictions were acquired by the sensors of the mobile device is the owner of the mobile device. Furthermore, the second message may indicate that an authentication of the owner to the mobile device was successful. Otherwise, a third message may be generated to indicate that a presentation attack on the mobile device is in progress.
- Biometric information has traditionally been used by law enforcement to secure and restrict access to resources and facilities, and to establish identities of individuals. Biometric technology has been employed at, for example, airports, train stations, and other public areas. In these situations, biometric information is acquired in so-called supervised settings. In a supervised setting, one individual oversees an acquisition of biometric information from another individual to ensure validity of the acquired information. Because the acquisition of the biometric information in these settings is supervised, spoofing of the biometric information of the individual is rather rare.
- When biometric technology is adopted in unsupervised settings, spoofing of biometric information of an individual is not uncommon.
- When biometric information is used to authenticate an individual to a consumer device such as a mobile device, the biometric data acquisition process is usually unsupervised.
- acquiring biometric information of an individual in an unsupervised setting may be prone to spoofing.
- Fingerprint authentication, which has been widely adopted in mobile devices, may be easily targeted by various spoofing techniques.
- Human supervision may be an effective way of detecting spoofing attacks and is widely used in many applications, including border security patrol.
- Such supervision is impractical for mobile devices and other consumer electronic devices.
- An iris of an eye is an annular region between a pupil and a sclera of the eye.
- An iris region usually has a distinct pattern, and due to its distinctiveness, the pattern may be used to uniquely identify a person.
- an iris pattern contains complex and distinctive ligaments, furrows, ridges, rings, coronas, freckles and collarets.
- An iris pattern becomes relatively stable at the eighth month of gestation, and remains stable throughout the person's lifetime.
- Iris patterns usually demonstrate high variability. For example, even twin children may have different iris patterns. In fact, an iris pattern of the left eye of a person is most likely different than an iris pattern of the right eye of the same person. The unique characteristics of an iris region make the iris a suitable source of biometric information useful to authenticate individuals.
- biometric characteristics of an iris are collected and analyzed using mobile devices such as smartphones, tablets, PDAs, laptops, watches, and the like.
- the process of collecting and analyzing the biometric characteristics may be implemented to authenticate a user to a mobile device, to detect spoofing attempts, and/or to detect liveness of the iris in general.
- Authentication of a person to a mobile device based on the person's iris biometrics is usually unsupervised. It is unsupervised because it does not require any monitoring of the person authenticating himself to the device. Indeed, usually only the person who authenticates himself to the device participates in the authentication process.
- Unsupervised authentication approaches based on biometric data are more susceptible to spoofing than traditional authentication techniques. This is because in the unsupervised authentication no one is monitoring a user as the user's biometric data is acquired. Since there is no monitoring, an imposter may attempt to provide intercepted or false information to gain access to a mobile device of another person.
- Spoofing attacks on an unsupervised authentication system may include presenting to a mobile device biometric data of a person other than a user of the device, and mimicking real biometric information of the user of the device to gain access to the user's device.
- The mimicking may include providing to the device an iris biometric sample that was recorded without the co-operation or knowledge of the user. This may include an imposter presenting a picture, a recorded video, or a high quality iris image of the user in front of the device to gain access to the user's device.
- An iris liveness detection approach presented herein is an anti-spoofing technique.
- the iris liveness detection allows determining whether biometric information presented to a device is an actual biometric measurement obtained from a live person and whether it was captured at the time when the biometric information is presented to the device.
- An automatic liveness detection approach may include an analysis of intrinsic properties of a live person, an analysis of involuntary body signals, and a challenge-response analysis.
- the analysis of intrinsic properties may include analyzing spectrographic properties of a human eye, analyzing a red-eye effect, and analyzing a 3-D curvature of an iris surface.
- An analysis of involuntary body signals may include analyzing eyelid movements and hippus.
- A challenge-response analysis may include analyzing a user's response when the user is prompted to blink or to look in different directions.
- an automatic iris liveness detection approach is implemented as part of an iris recognition system, and is used as a countermeasure against spoofing. It may be implemented in hardware, software, or both. It is applicable to a variety of electronic devices and its implementation may be optimized to minimally affect performance of the iris recognition system built into the devices.
- Iris liveness detection techniques may be implemented in mobile devices.
- the techniques allow recognizing static images such as high quality printed images of an iris, iris images projected on a screen, or high resolution video frames, and determining whether such images are presentation attacks on mobile devices.
- The techniques may be implemented in a variety of mobile devices without requiring any special hardware. Therefore, the techniques may be inexpensive solutions against presentation attacks. Furthermore, the techniques may not depend on user interactions, and thus they may be widely adopted for everyday use by consumers.
- an iris liveness detection technique includes acquiring and processing visible spectrum RGB images as well as NIR images by a mobile device.
- the images may be captured using cameras or sensors integrated in the device. If a mobile device is equipped with cameras, then at least one camera may be a hybrid front facing camera configured to perform an iris recognition, and at least one camera may be configured to carry out video calls or selfie imaging. If a mobile device is equipped with RGB/NIR hybrid sensors, then the sensors may be configured to synchronously capture RGB/NIR image pairs.
- Captured RGB/NIR image pairs may be processed using components of a mobile device configured to perform a visible spectrum iris recognition and an NIR iris recognition.
- FIG. 1 is an example mobile device environment for implementing an iris liveness detection according to an example embodiment.
- a mobile device environment 100 may include various mobile devices.
- Non-limiting examples of mobile devices include various types and models of smartphones 104 a - 104 b , laptops 106 a - 106 b , PDAs 108 a , and tablets 108 b .
- Each mobile device may be configured to capture visual spectrum RGB images 102 a and NIR images 102 b of a person facing the device.
- the examples described in the following section refer to the approaches implemented in smartphone 104 a ; however, the approaches may be implemented on any type of mobile device.
- visual spectrum RGB images 102 a and NIR images 102 b of a person facing smartphone 104 a are captured by cameras and/or sensors integrated in smartphone 104 a .
- the RGB images 102 a and NIR images 102 b may be further processed by components of smartphone 104 a .
- the processing may include determining liveness of an iris depicted in the captured images. If the iris liveness is detected in the images, then the person facing smartphone 104 a may be granted access to smartphone 104 a and resources of smartphone 104 a . However, if the iris liveness is not detected in the images, then the person facing smartphone 104 a is denied access to the smartphone 104 a and its resources.
- Processing of RGB and NIR images by a mobile device may include determining locations of an iris in the images, determining locations of a pupil within the iris in the respective images, and analyzing the determined locations for the purpose of detecting the iris' liveness. Detecting the iris' liveness may allow identifying incidents of presentation attacks on the mobile device. For example, the technique may allow identifying presentation attacks when mannequins, having engineered artificial eyes used to duplicate the optical behavior of human eyes, are used to gain access to mobile devices.
- an iris liveness detection process is part of an authentication process performed to authenticate a user to a mobile device.
- the iris liveness detection process may comprise two stages.
- the first stage of the process may include acquiring a plurality of RGB and NIR image pairs depicting the user facing the mobile device, and selecting a particular RGB/NIR image pair that depicts the user's eyes in-focus.
- the second stage of the process may include processing the particular image pair to detect liveness of the iris depicted in the image pair, and determining whether the user may access the mobile device and its resources.
- FIG. 2 is a flow diagram depicting an example iris liveness detection process according to an example embodiment.
- the example iris liveness detection process comprises a first stage 202 and a second stage 212 .
- an image stream is acquired by a mobile device.
- the image stream may include RGB and NIR image pairs and depict a user facing a mobile device.
- The pairs may be acquired using one or more cameras and/or one or more sensors integrated in the mobile device.
- the cameras and the sensors may be separate devices, hybrid devices, or both, and may be configured to capture and acquire the images in a synchronized manner.
- Capturing images in a synchronized manner may include synchronizing the capturing in terms of timing.
- a hybrid RGB/NIR sensor may be used to capture both an RGB image and a NIR image at the same time. Synchronizing the capturing of both images allows capturing the images in such a way that the images depict objects shown in the same spatial relationships to each other in each of the images.
- Capturing of the images may be initiated by a user as the user tries to use a mobile device.
- the user may press a certain key, or touch a certain icon displayed on the device to “wake up” the device.
- A mobile device may be equipped with a “wake up” key, or an “unlock” key, used to request access to the mobile device and to initiate the image acquisition process. The selection of the keys configured to initiate the image acquisition and the naming convention for the keys depend on the specific implementation and the type of the mobile device.
- a user facing a mobile device presses a “wake up” key of the mobile device to initiate an image acquisition process.
- Upon detecting that the key was pressed, the mobile device initiates an RGB/NIR hybrid sensor, or cameras and sensors, integrated in the device, and causes the hybrid sensor to synchronously acquire RGB and NIR images of the eyes of the user.
- the RGB/NIR image pairs are acquired synchronously to ensure that the locations of certain features in one image correspond to the location of the certain features in another image.
- RGB and NIR image pairs may be acquired in a normal office situation with active illumination of 850 nm. Examples of image pairs acquired at different stand-off distances are shown in FIG. 3 .
- FIG. 3 depicts examples of RGB/NIR image pairs acquired according to an example embodiment.
- the examples depicted in FIG. 3 include an RGB image 302 a , an NIR image 302 b , an RGB image 304 a , and an NIR image 304 b .
- Images 302 a - 302 b depict one person and images 304 a - 304 b depict another person.
- the RGB/NIR image pairs may be synchronously acquired by an RGB/NIR hybrid sensor at the time when a user is trying to authenticate himself to a mobile device.
- the image pairs may be compared to training RGB/NIR images acquired from an owner of the device.
- an obtained image stream of RGB/NIR image pairs is processed to select an RGB/NIR image pair that depicts an eye-iris region in-focus.
- This may include applying detectors configured to detect eye-iris regions in the image pairs and select a subset of the image pairs that depict the eye-iris regions, and comparators configured to select, from the subset, an RGB/NIR image pair that depicts the eye-iris region in focus. If the eyes are detected in one image pair, the eyes' locations in the subsequently captured image pairs may be tracked until one or more image pairs depicting the eyes in-focus are found.
- The visible spectrum (wavelength) of the image stream may be subjected to a certain type of processing to determine images that depict a sequence of good quality, in-focus eye regions. The processing may be performed using state-of-the-art face detectors, eye location detectors, and eye trackers.
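One way to picture the in-focus selection is to score each candidate eye crop with a sharpness measure and keep the best-scoring pair. The source does not say how focus is measured; the sketch below swaps in a common heuristic, the variance of a 4-neighbour Laplacian, and assumes each pair is given as a grayscale eye crop (nested lists, at least 3 x 3) plus its NIR counterpart. All names and sample data are hypothetical.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Focus score: variance of a 4-neighbour Laplacian over interior pixels."""
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def select_in_focus_pair(pairs):
    """Return the (gray_eye_crop, nir_image) pair with the sharpest crop."""
    return max(pairs, key=lambda pair: laplacian_variance(pair[0]))


# Toy data: a flat (defocused-looking) crop and a high-contrast crop.
blurry = [[128] * 4 for _ in range(4)]
sharp = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]
pairs = [(blurry, "nir_blurry"), (sharp, "nir_sharp")]
best = select_in_focus_pair(pairs)
```

Because the RGB and NIR images of a pair are captured synchronously, scoring only the visible-spectrum crop is enough to pick the whole pair in this sketch.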
- a hyperspectral image is generated.
- a hyperspectral image is generated from an RGB image and a NIR image of the image pair by fusing both images into one image. Fusing of an RGB image and a NIR image may be accomplished by applying a fusing operator to a mathematical representation of the RGB image and a mathematical representation of the NIR image.
- A mathematical representation $I_v$ of an RGB image and a mathematical representation $I_i$ of a NIR image of an RGB/NIR image pair are obtained and used to generate a hyperspectral image $I_h$.
- The mathematical representations of the RGB image and the NIR image capture ambient light and surface reflectance on an eye, represented at four different wavebands (Blue, Green, Red and NIR), respectively.
- The hyperspectral image $I_h$, obtained by fusing the mathematical representations of the RGB and NIR images, captures the ambient light and the surface reflectance on an eye represented at the four different wavebands, and is derived by applying a fusing operator to the respective mathematical representations.
- RGB image and NIR image of an image pair are generated.
- the RGB and NIR image formation by an RGB/NIR hybrid sensor may be captured using the following expression:
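- $$I_v = \int_{\lambda_v} \int_{p} E(p, \lambda_v)\, R(p)\, Q(\lambda_v)\, dp\, d\lambda_v \qquad (1)$$
- $$I_i = \int_{\lambda_i} \int_{p} E(p, \lambda_i)\, R(p)\, Q(\lambda_i)\, dp\, d\lambda_i \qquad (2)$$
- $I_v \in \mathbb{R}^{m \times n}$ is the RGB image;
- $I_i \in \mathbb{R}^{k \times l}$ is the NIR image;
- $\lambda_v \in [350\ \text{nm}, 700\ \text{nm}]$ and $\lambda_i \in [750\ \text{nm}, 900\ \text{nm}]$ are the wavelength ranges of the RGB and NIR images, respectively;
- $R$ is the spatial response of the sensor;
- $Q$ is the quantum efficiency of the sensor.
- $I_h \in \mathbb{R}^{m \times n \times 4}$ is the hyperspectral image obtained by applying a fusing operator to $I_v$ and $I_i$.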
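The source does not define the fusing operator beyond its output shape. As a minimal sketch, assuming the operator simply stacks the channels and that the NIR image has already been resampled from its own size to the RGB size, fusion could look as follows. The function name and the pixel layout (rows of `(r, g, b)` tuples, rows of scalar NIR values) are hypothetical.

```python
def fuse_rgb_nir(rgb, nir):
    """Stack an m x n RGB image and an m x n NIR image into an m x n x 4 image.

    rgb: rows of (r, g, b) tuples; nir: rows of scalar intensities.
    Output channel order follows the waveband list in the text:
    Blue, Green, Red, NIR.
    """
    if len(rgb) != len(nir) or any(len(a) != len(b) for a, b in zip(rgb, nir)):
        raise ValueError("NIR image must be resampled to the RGB size first")
    return [
        [(b, g, r, n) for (r, g, b), n in zip(rgb_row, nir_row)]
        for rgb_row, nir_row in zip(rgb, nir)
    ]


# Toy 1 x 2 image pair.
rgb_image = [[(1, 2, 3), (4, 5, 6)]]
nir_image = [[7, 8]]
hyperspectral = fuse_rgb_nir(rgb_image, nir_image)
```

Plain nested lists keep the sketch dependency-free; an array library would be the usual choice for real image sizes.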
- A hyperspectral image $I_h$ is further processed to minimize the effect of ambient light. This may be accomplished by obtaining metadata from a camera or a sensor, and using the metadata to perform white balance, gamma correction, and/or auto-exposure correction of the hyperspectral image $I_h$.
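The ambient-light correction can be illustrated with per-channel gains (white balance) followed by gamma encoding. The source does not specify the metadata format or the correction formulas, so the metadata keys `channel_gains` and `gamma`, the 8-bit value range, and the function names below are all assumptions made for this sketch.

```python
def correct_pixel(pixel, gains, gamma, max_value=255.0):
    """Apply per-channel gains, clip, then gamma-encode one pixel."""
    corrected = []
    for value, gain in zip(pixel, gains):
        # Clip the gained value and scale it to [0, 1].
        balanced = min(value * gain, max_value) / max_value
        corrected.append(max_value * balanced ** (1.0 / gamma))
    return tuple(corrected)


def correct_image(image, metadata):
    """Correct every pixel of an image given as rows of channel tuples."""
    gains = metadata["channel_gains"]  # hypothetical metadata key
    gamma = metadata["gamma"]          # hypothetical metadata key
    return [[correct_pixel(px, gains, gamma) for px in row] for row in image]


# Hypothetical metadata for a four-channel (Blue, Green, Red, NIR) image.
metadata = {"channel_gains": (2.0, 1.0, 1.0, 1.0), "gamma": 1.0}
image = [[(200, 100, 0, 255)]]
corrected = correct_image(image, metadata)
```

With a gamma of 1.0 only the gains and clipping take effect; a gamma above 1.0 brightens mid-range values.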
- an iris liveness detection process includes a second stage.
- a hyperspectral image I h is processed to identify one or more multispectral features depicted in the hyperspectral image. Since the hyperspectral image I h represents an ambient light and a surface reflectance on an eye represented at four different wavebands (Blue, Green, Red and NIR), image data in each of the wavebands of the hyperspectral image I h may be processed individually to extract the features from each waveband separately.
- Extracting features from a hyperspectral image may include clustering image data of the hyperspectral image based on the intensity values within each of the wavebands and determining the features based on the clustered image data. Extracted features may be represented as feature vectors.
- a feature vector generated for an image is a vector that contains information describing one or more characteristics of an object depicted in the image.
- An example feature vector may include a numerical value representing characteristics of an eye region depicted in the image. The numerical value may be computed based on raw intensity values of the pixels that constitute the eye region.
- In step 214 of second stage 212 , one or more feature vectors are generated based on a hyperspectral image obtained in first stage 202 .
- a hyperspectral image $I_h$ is viewed as comprising four image planes $(I_{c1}, I_{c2}, I_{c3}, I_{c4})$ having the size $m \times n$ and representing four different wavebands.
- the planes may also be referred to as channels.
- the pixels in each plane are clustered separately to form $\alpha$ predefined clusters.
- the clustering process may be represented using the following expression:
- $I_{cj}^u \in [1, \alpha]^{m \times n}$ represents the cluster labels assigned to the pixels in $I_{cj}$
- $j \in [1, 4]$ denotes the image channel (waveband)
- $\alpha$ is a count of the clusters
- the clustering operator is a nearest neighborhood clustering operator configured to assign each pixel in a plane to one of the $\alpha$ clusters, one plane at a time, based on the intensity values of the pixels in that plane.
- the cluster labels are concatenated to obtain:
- $$I_h^u = \Gamma'(I_{c1}^u, I_{c2}^u, I_{c3}^u, I_{c4}^u), \qquad (5)$$
- where $\Gamma'$ is a concatenation operator
- the normalized frequency distribution of each combination may be calculated using a transform operator H:
- where the mapping $(f_1, f_2, \ldots, f_s)$ gives the number of times each unique cluster combination appears in $I_h^u$.
- the mapping defined using expression (6) may be used as feature vectors determined for the hyperspectral image $I_h$.
- the feature extraction technique presented herein represents a unique distribution of information across various image planes in a hyperspectral image $I_h$. Furthermore, the presented technique is computationally inexpensive and generates relatively compact feature vectors.
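- The cluster, concatenate, and count steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the clustering operator is approximated by assigning each intensity to the nearest of evenly spaced cluster centers, the feature vector lists every possible label combination in a fixed order (including combinations that never occur), and the names `cluster_plane` and `extract_features` are hypothetical.

```python
from collections import Counter
from itertools import product


def cluster_plane(plane, alpha, max_value=255):
    """Label each pixel 1..alpha by its nearest of alpha evenly spaced centers.

    Assumption: evenly spaced centers stand in for the clustering operator.
    """
    step = (max_value + 1) / alpha
    centers = [step * (k + 0.5) for k in range(alpha)]
    return [[min(range(alpha), key=lambda k: abs(v - centers[k])) + 1 for v in row]
            for row in plane]


def extract_features(hyper, alpha):
    """Return the normalized frequency of every cluster-label combination.

    hyper: m x n image whose pixels are tuples with one value per waveband.
    """
    channels = len(hyper[0][0])
    # Split the image into planes and cluster each plane separately.
    labels = []
    for j in range(channels):
        plane = [[pixel[j] for pixel in row] for row in hyper]
        labels.append(cluster_plane(plane, alpha))
    # Concatenate the per-plane labels pixel by pixel.
    rows, cols = len(hyper), len(hyper[0])
    combined = [tuple(labels[j][y][x] for j in range(channels))
                for y in range(rows) for x in range(cols)]
    counts = Counter(combined)
    total = len(combined)
    # Fixed ordering over all alpha**channels combinations keeps vectors comparable.
    return [counts.get(combo, 0) / total
            for combo in product(range(1, alpha + 1), repeat=channels)]


# Made-up 2 x 2 four-band image and two clusters per band.
example = [[(0, 0, 0, 0), (255, 255, 255, 255)],
           [(0, 0, 0, 255), (0, 0, 0, 0)]]
features = extract_features(example, alpha=2)
```

With two clusters and four wavebands the vector has 16 entries that sum to one, which illustrates why the resulting feature vectors stay compact.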
- one or more trained model feature vectors are obtained or retrieved.
- the trained model feature vectors may be generated based on actual and reliable images of a “live” user of a mobile device, and stored in storage units of the device.
- Trained model feature vectors for a live user may be calculated when the user's mobile device is configured to implement an iris liveness detection approach.
- the vectors may be generated based on one or more images depicting for example, facial features of the user, and may be used to train an image classifier to predict whether other images most likely depict the user of the mobile device or whether the other images are presentation attacks on the device.
- a distance metric is computed based on a feature vector, generated from a hyperspectral image, and one or more trained model feature vectors retrieved from a storage unit.
- a storage unit may be a volatile memory unit of a mobile device, a non-volatile memory unit of the mobile device, or any other unit configured to store data.
- a distance metric is a numerical representation of similarities between a feature vector generated from a hyperspectral image and trained model feature vectors generated from images of a user of a mobile device. If a distance value computed from the feature vector and the trained model feature vector exceeds a certain threshold, then the feature represented by the feature vector is dissimilar to the feature represented by the trained model feature vector. This may indicate that an individual whose depictions were used to generate the hyperspectral image is an imposter, and not the user of the mobile device.
- Conversely, if the distance value does not exceed the threshold, then the feature represented by the feature vector is similar, or maybe even identical, to the feature represented by the trained model feature vector. This may indicate that the individual whose depictions were used to generate the hyperspectral image is the user of the mobile device.
- a distance metric is computed as a deviation (error) d.
- the deviation d may be computed using a Bayesian approach.
- F q denotes a feature vector of a query image, such as a hyperspectral image generated from an RGB-NIR image pair acquired by an RGB-NIR hybrid sensor.
- F db denotes one or more trained model feature vectors of a trained model.
- the trained model may be trained on actual images of a user of a mobile device.
- a deviation d is measured as the square root of the entropy approximation to the logarithm of evidence ratio when testing whether the query image can be represented as the same underlying distribution of the live images. This can be mathematically represented as:
- $D(F_q \,\|\, F_{db})$ is the Kullback-Leibler divergence of $F_{db}$ obtained from $F_q$, which is a measure of information lost when the database feature vector $F_{db}$ is approximated from the query feature vector $F_q$.
- the above presented choice of distance metric $d_{q,db}$ is based on the observations that it is a close relative of the Jensen-Shannon divergence and an asymptotic approximation of the $\chi^2$ distance. Furthermore, $d_{q,db}$ is symmetric and fulfills the triangle inequality.
- a distance metric $d_{q,db}$ computed using expressions (7)-(8) is used to determine whether an incoming query image depicts a live person. If $d_{q,db}$ does not exceed a predetermined threshold, then, in step 222 , it is determined that the query image depicts a live person. Otherwise, in step 224 , it is determined that the query image does not depict a live person.
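- Expressions (7)-(8) are not reproduced in this text, so the following is only a hedged sketch of a distance with the properties listed above (symmetric, built from Kullback-Leibler divergences, closely related to the Jensen-Shannon divergence). It assumes the deviation is the square root of the summed divergences of both feature vectors from their midpoint distribution; the function names and the threshold value are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_style_distance(f_q, f_db):
    """Square root of the summed KL divergences to the midpoint distribution.

    Assumption: this stands in for the deviation d; it is not taken from
    expressions (7)-(8), which are missing from the text.
    """
    mid = [(a + b) / 2 for a, b in zip(f_q, f_db)]
    total = kl_divergence(f_q, mid) + kl_divergence(f_db, mid)
    return math.sqrt(max(0.0, total))  # guard against tiny negative rounding


def depicts_live_person(f_q, f_db, threshold):
    """Accept the query when the deviation does not exceed the threshold."""
    return js_style_distance(f_q, f_db) <= threshold


# Made-up normalized feature vectors for a query and a trained model.
f_query = [0.5, 0.25, 0.25, 0.0]
f_model = [0.4, 0.3, 0.2, 0.1]
is_live = depicts_live_person(f_query, f_model, threshold=0.5)
```

Because the midpoint distribution is positive wherever either input is positive, the divergences stay finite even when a feature vector contains zeros.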
- Presentation attacks may include various types of spoofing attacks on a mobile device. They may include mimicking real biometric information of a user of a mobile device to gain access to the user's device. The mimicking may include for example, providing to the device an iris biometric sample that was recorded without knowledge of the user of the device.
- One of the most common presentation attacks is presenting a high quality printed photograph in front of the device. For example, an imposter may try to use a high quality color photograph of the user of the mobile device to access the device.
- Effectiveness of approaches for detecting presentation attacks may be measured using various approaches.
- One approach includes determining a Normal Presentation Classification Error Rate (NPCER).
- NPCER is defined as the proportion of live users incorrectly classified as a presentation attack.
- Another approach includes determining an Attack Presentation Classification Error Rate (APCER). APCER is defined as the proportion of presentation attack attempts incorrectly classified as live users.
- Yet another approach includes determining an “Average Classification Error Rate” (ACER), which is computed as the mean value of the NPCER and the APCER error rates.
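- The three error rates defined above can be computed directly from classifier outputs. The sketch below assumes each output is a boolean that is True when the classifier reports a live user; the function name `error_rates` and the sample outcomes are illustrative only.

```python
def error_rates(live_predictions, attack_predictions):
    """Compute (NPCER, APCER, ACER).

    live_predictions: classifier outputs for genuine live presentations.
    attack_predictions: classifier outputs for presentation attacks.
    Each output is True when the classifier says "live".
    """
    # Live users wrongly flagged as attacks.
    npcer = sum(1 for p in live_predictions if not p) / len(live_predictions)
    # Attacks wrongly accepted as live users.
    apcer = sum(1 for p in attack_predictions if p) / len(attack_predictions)
    # Mean of the two error rates.
    acer = (npcer + apcer) / 2
    return npcer, apcer, acer


# Made-up outcomes: 1 of 4 live users rejected, 2 of 4 attacks accepted.
npcer, apcer, acer = error_rates([True, True, True, False],
                                 [False, True, True, False])
```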
- the ability to detect presentation attacks depends on a variety of factors. For example, detecting the presentation attacks may depend on the surface reflection and refraction of the material that is presented in front of a hybrid sensor of a mobile device. There are many differences between reflection and refraction factors determined for a printed image and reflection and refraction factors determined for human skin.
- an iris liveness detection process detects presentation attacks conducted using photographs shown on either reflective paper or a matte paper, and presentation attacks conducted by projecting images on a screen or a display device.
- the approach takes advantage of the fact that the photographic material (reflective paper or matte paper) and the displays of devices have properties that are significantly different than the properties of the human skin or the human eye.
- FIG. 4 depicts examples of RGB/NIR image pairs acquired from a live person and examples of RGB/NIR image pairs acquired from photographs and computer displays.
- Images 402 a , 404 a , 406 a , 408 a and 409 a are visible spectrum RGB images.
- Images 402 b , 404 b , 406 b , 408 b and 409 b are NIR images.
- Images 402 a and 402 b are images acquired from a live person; all remaining images depicted in FIG. 4 are examples of presentation attacks.
- images 404 a and 404 b are high quality visible printed images.
- Images 406 a and 406 b are high quality glossy NIR printed images.
- Images 408 a and 408 b are NIR images printed on a matte paper. Images 409 a and 409 b are NIR images shown in a laptop screen having a high resolution display. Each of pairs 404 - 409 may be compared with image pair 402 to show the differences in surface reflections depicted in pairs 404 - 409 provided during presentation attacks and surface reflections depicted in pair 402 obtained from a live person.
- presentation attacks may include techniques that go beyond using known printing materials and image displaying devices. New materials and display devices may be used to conduct presentation attacks in the future. For example, a new presentation attack may be conducted using a realistic 3-D face model of a user of a mobile device.
- FIG. 5 depicts examples of RGB/NIR image pairs acquired from a 3-D model of a face.
- a 3-D face model may be a mannequin that has engineered artificial eyes with iris regions to duplicate the optical behavior of human eyes, including a red-eye effect.
- the mannequin may be made out of a skin-like material that has properties similar to the properties of human skin.
- the mannequin may also have realistically reproduced hair, eyebrows, lashes, and so forth.
- images 502 a , 504 a , and 506 a are visible spectrum RGB images, while images 502 b , 504 b , and 506 b are NIR images.
- Images 502 a and 502 b in the first row in FIG. 5 depict a realistic 3-D face model.
- Images 504 a and 504 b in the second row in FIG. 5 depict close up images showing the human like skin, hair and ocular properties.
- Images 506 a and 506 b in the third row in FIG. 5 are side views of the mannequin.
- reflectance and refraction properties in the images of a mannequin in FIG. 5 more or less correspond to reflectance and refraction properties of photographs of a live person, such as pair 402 a - 402 b in FIG. 4 .
- the eye regions in images 504 a and 504 b capture a red-eye effect.
- the mannequin may easily be misclassified as a live person.
- a mannequin may be equipped with printed contact lenses with an iris pattern of a live person. If an imposter uses images of such a mannequin to conduct a presentation attack on a mobile device, then there is a possibility that the imposter may obtain access to the mobile device. Therefore, analyzing the spectral response of the presented images alone may be insufficient to identify sophisticated presentation attacks.
- an iris liveness detection approach for mobile devices is enhanced using techniques for a pupil analysis performed on the acquired images.
- An analysis of a pupil of a human eye depicted in the images increases the chances that even sophisticated presentation attacks on a mobile device may be identified. This is because mimicking both the pupil dynamics and properties of the human eye region is unlikely to be feasible at the current state of image-based technologies.
- a pupil detection and a pupil analysis are performed on a sequence of NIR images.
- Detecting a pupil in the NIR images may include cropping the images so that the images represent only the eye regions, and then processing the cropped images using an edge-localization approach and a gradient-based approach to determine a location of the pupil in the images.
- Characteristics of an iris region of the eye depicted in digital images may be impacted by illumination variations and shadows created by eyelashes surrounding the eye. The issue, however, may be addressed by representing the images using a representation that is less sensitive to the illumination variations.
- An example of such a representation is a representation generated using one-dimensional image processing.
- characteristics of an iris region and a pupil in the iris region are captured using one-dimensional image processing.
- One-dimensional image processing usually requires no thresholding, and therefore allows reducing the effect of edge smearing.
- One-dimensional processing of an image may include applying a smoothing operator along a first direction of the image, and applying a derivative operator along a second (the orthogonal) direction.
- Let $I \in \mathbb{R}^{m \times n}$ be a cropped image depicting an eye region.
- Let the cropped eye image be an NIR image denoted as $I_i$.
- the smoothed eye image may be represented using the following expression:
- $$I_\theta^s = I\!\left(x,\ r + x\,\frac{\sin(\theta)}{\cos(\theta)}\right) * S_\theta(x), \qquad (9)$$
- $$I_\theta^g = I_\theta^s\!\left(x,\ r + x\,\frac{\sin(\theta + 90)}{\cos(\theta + 90)}\right) * G_{\theta + 90}(x), \qquad (11)$$
- where $\sigma_g \in \mathbb{R}$ is the standard deviation of the derivative operator.
- the magnitude representation of an edge gradient may be obtained using the following expression:
- $$I_\theta^M = \sqrt{(I_\theta^g)^2 + (I_{\theta+90}^g)^2}. \qquad (13)$$
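- The smoothing, derivative, and magnitude steps described above can be sketched for the axis-aligned case, where one direction is the image x axis and the orthogonal direction is the y axis. The smoothing and derivative kernels are not given in this text, so the three-tap kernels below, the border clamping, and the names `filter_1d` and `gradient_magnitude` are assumptions of this sketch.

```python
import math


def filter_1d(image, kernel, along_rows):
    """Apply a 1-D kernel along rows (x) or columns (y), clamping at borders."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for k, weight in enumerate(kernel):
                offset = k - half
                if along_rows:
                    xx = min(max(x + offset, 0), cols - 1)
                    acc += weight * image[y][xx]
                else:
                    yy = min(max(y + offset, 0), rows - 1)
                    acc += weight * image[yy][x]
            out[y][x] = acc
    return out


SMOOTH = [0.25, 0.5, 0.25]      # assumed smoothing kernel
DERIVATIVE = [-0.5, 0.0, 0.5]   # assumed central-difference kernel


def gradient_magnitude(image):
    """Smooth along one axis, differentiate along the other, then combine."""
    # First direction: smooth along x, differentiate along y.
    g_first = filter_1d(filter_1d(image, SMOOTH, True), DERIVATIVE, False)
    # Orthogonal direction: smooth along y, differentiate along x.
    g_orth = filter_1d(filter_1d(image, SMOOTH, False), DERIVATIVE, True)
    # Per-pixel magnitude of the two edge gradient images.
    return [[math.hypot(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(g_first, g_orth)]


# Made-up 4 x 4 image with a vertical edge between columns 1 and 2.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
magnitude = gradient_magnitude(edge_image)
```

No threshold is applied at this stage; the output is a real-valued magnitude image that peaks along the edge.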
- a transform operator T is applied on $I_\theta^M$, as shown below:
- where $I_d$ is the transformed image.
- the transformation operator T is chosen in such a way that it expresses the image $I_\theta^M$ in a binary form, followed by the detection of the largest connected region in the image;
- the binarization threshold is selected in such a way that the pixel count of the detected region stays between $n_{min}^p$ and $n_{max}^p$, where $n_{min}^p$ and $n_{max}^p$ are the minimum and maximum numbers of pixels which could possibly be in the pupil region in the particular frame.
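- The binarize-then-keep-the-largest-region behavior attributed to the transformation operator T can be sketched as below. This is an illustration under assumptions: pixels at or above the threshold count as foreground, regions are 4-connected, a candidate is rejected when its pixel count falls outside the allowed bounds, and the names `largest_connected_region` and `localize_pupil` are hypothetical.

```python
from collections import deque


def largest_connected_region(image, threshold):
    """Binarize the image and return the pixel set of the largest region."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    best = set()
    for y in range(rows):
        for x in range(cols):
            if seen[y][x] or image[y][x] < threshold:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            region, queue = set(), deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best


def localize_pupil(image, threshold, n_min, n_max):
    """Return the candidate pupil region, or None if its size is implausible."""
    region = largest_connected_region(image, threshold)
    if n_min <= len(region) <= n_max:
        return region
    return None


# Made-up magnitude image: one 2 x 2 blob and one isolated bright pixel.
sample = [[0, 0, 0, 0, 9],
          [0, 9, 9, 0, 0],
          [0, 9, 9, 0, 0],
          [0, 0, 0, 0, 0]]
pupil = localize_pupil(sample, threshold=5, n_min=3, n_max=6)
```

The size bounds act as a sanity check, so a stray bright speck or an oversized blob is not reported as the pupil.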
- FIG. 6 depicts a pupil localization process according to an example embodiment.
- image 602 depicts an original NIR image I i .
- Image 604 is an edge gradient image generated along one direction of the original NIR image.
- Image 606 is an edge gradient image generated along an orthogonal direction.
- Image 608 is a magnitude image.
- Image 609 depicts the localized pupil.
- Image 602 represents the original image.
- Images 604 - 606 represent the output of one-dimensional image processing for the angular direction $\theta$ and its orthogonal value.
- Image 608 is the magnitude image obtained from the result of the one dimensional image processing, and the localized pupil is shown in image 609 .
- the images are further processed to determine dynamic characteristics of the depicted pupil.
- Dynamic characteristics of a pupil may include the eye's saccades, hippus, and pupil dilation/constriction which may arise naturally as the person moves toward the camera.
- the dynamic characteristics may also include an eye-blinking, which alters the size of a pupil area. Examples of images that were captured as a person was blinking are depicted in FIG. 7 .
- FIG. 7 depicts an example sequence of images showing an eye-blinking effect.
- Images 702 , 704 , 706 and 708 are NIR images depicting an eye of a live person and acquired as the person was blinking.
- Images 712 , 714 , 716 and 718 are black-and-white images depicting locations and sizes of the pupils identified in the images 702 , 704 , 706 , and 708 , respectively.
- Images 712 , 714 , 716 and 718 show that the sizes of the pupil and the pupil's locations were changing as the person was blinking. The changes appear to be significant in detecting the iris liveness, and may be measured with an acceptable accuracy using the presented pupil analysis technique.
- a pupil analysis may include an analysis of a pupil area in general, and an analysis of a pixel intensity in the pupil region in particular.
- a pupil analysis performed on the images may involve determining whether a size of the pupil area depicted in the images is changing from image-to-image, or whether an eye-blinking is depicted in the images. If such changes are detected in the images, then it may be concluded that the images depict a live person. However, if such changes cannot be detected, then the images are most likely provided as a presentation attack.
- the images may be images taken from a mannequin whose eyes have no dynamic characteristics, such as an eye-blinking.
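- The image-to-image pupil check described above can be sketched as a test on the pupil area measured in each frame. The text gives no numeric criterion, so the relative-change tolerance of 10 percent and the function name `pupil_is_dynamic` are assumptions made for illustration.

```python
def pupil_is_dynamic(pupil_areas, min_relative_change=0.1):
    """Return True when the pupil area varies noticeably across frames.

    pupil_areas: pupil pixel counts, one per frame of the sequence.
    min_relative_change: assumed tolerance, relative to the largest area.
    """
    if len(pupil_areas) < 2:
        raise ValueError("at least two frames are needed to detect a change")
    largest = max(pupil_areas)
    if largest == 0:
        return False  # no pupil found in any frame
    return (largest - min(pupil_areas)) / largest >= min_relative_change


# Made-up sequences: a blink shrinks the visible pupil; a mannequin eye does not.
blinking_eye = pupil_is_dynamic([100, 60, 5, 90])
static_eye = pupil_is_dynamic([100, 100, 101])
```

A static sequence such as the second one is what a mannequin or a printed photograph would be expected to produce.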
- a pixel intensity in a pupil region of any of NIR images 712 , 714 , 716 , and 718 of FIG. 7 is determined using a Purkinje image.
- a Purkinje image is an image formed by the light reflected from the four optical surfaces of the human eye. Purkinje images may be used in various applications, including an iris liveness detection, an eye tracking, and a red-eye effect detection.
- a binary decision tree is used to classify a sequence of images captured by a mobile device and depicting human eyes.
- the binary decision tree may be used to classify the images as either images of a live person or images presented as part of a presentation attack.
- a binary decision tree may be designed to interface with different models and approaches, including an intermediate decision approach of FIG. 2 for an iris liveness detection, and a pupil analysis described in FIG. 6 and FIG. 7 .
- the binary decision tree usually has one root node and one or more intermediate nodes.
- An example of the binary decision tree is depicted in FIG. 8 .
- FIG. 8 is an example binary decision tree used to determine whether images depict a live person or are part of a presentation attack.
- An example binary decision tree 800 comprises a root node 802 , an intermediary decision node 804 , and result nodes 806 , 808 and 810 .
- Root node 802 is used to determine whether an incoming image depicts a live iris or a presentation attack image. This may be determined based on a distance metric d q,db computed using expressions (7)-(8) described above, and where q represents an incoming image (a query image) and db represents a feature vector F db described above.
- In root node 802 , a decision is made whether $d_{q,db}$ is within a predetermined threshold value. If it is, then it may be concluded that the incoming image depicts a live person, and further processing is performed at intermediary decision node 804 . Otherwise, it may be concluded in result node 810 that the incoming image does not depict a live person, but is part of a presentation attack.
- If it was determined that the incoming image is an image of a live person, then, in intermediary decision node 804 , one or more image recognition modules are invoked to perform an iris recognition on the incoming image.
- a pupil localization result derived as described in FIG. 6 and FIG. 7 , may be provided to intermediate decision node 804 along with additional input images acquired along with the incoming image. The provided result and the images may be used by an iris recognition module to determine whether the images show any changes in characteristics of the depicted pupil.
- If it is determined that the provided information indicates changes in characteristics of the depicted pupil, then result node 806 is reached to indicate that the incoming image depicts a live person. However, if it is determined that the provided information does not indicate any changes in characteristics of the depicted pupil, then result node 808 is reached to indicate a presentation attack.
- a decision process depicted in FIG. 8 provides an effective approach for detecting presentation attacks. It combines the approaches for determining whether incoming images depict a live iris, and the approaches for determining whether the incoming images depict a live pupil.
- the performance of the system implementing the decision process depicted in FIG. 8 may be measured using indicators such as ACER, NPCER and APCER, described above. A comparison of the results obtained when both the iris and the pupil analyses were performed with the results obtained when only the iris analysis was performed indicates that the approaches implementing both analyses are more effective.
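- The two-level decision process of FIG. 8 can be summarized in a few lines. The sketch assumes the distance metric and the pupil-change flag were computed beforehand; the function name `classify` and the returned labels are illustrative, while the node numbers follow the figure description.

```python
def classify(distance, threshold, pupil_changes_detected):
    """Walk the binary decision tree and return (label, result_node)."""
    # Root node 802: compare the distance metric with the threshold.
    if distance > threshold:
        return "presentation attack", 810
    # Intermediary decision node 804: look for changes in the pupil.
    if pupil_changes_detected:
        return "live person", 806
    return "presentation attack", 808


# Made-up inputs covering the three result nodes.
outcomes = [
    classify(0.9, 0.5, True),    # fails the iris check
    classify(0.2, 0.5, True),    # passes both checks
    classify(0.2, 0.5, False),   # passes the iris check, static pupil
]
```

Ordering the checks this way means the pupil analysis only runs on images that already passed the spectral check.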
- FIG. 9 is a flow diagram of an example process for detecting presentation attacks according to an example embodiment.
- In step 902 , an image stream of RGB and NIR image pairs is acquired using a mobile device. In an embodiment, this step corresponds to step 204 in FIG. 2 .
- An image stream may include a plurality of image pairs, and each image pair of the plurality of image pairs may include an RGB image and an NIR image, both acquired in a synchronized manner.
- the image pairs may be acquired using for example, an RGB/NIR hybrid sensor that synchronously captures both the RGB image and the NIR image.
- an acquired stream of images may be processed to identify at least one image pair that depicts an eye region in-focus.
- the identified image pairs may be further reviewed to determine one image pair that includes the images that provide the highest quality depiction of the eye region.
- a hyperspectral image is generated from a selected RGB/NIR image pair. This step corresponds to step 208 of FIG. 2 .
- a hyperspectral image is generated by fusing an RGB image with an NIR image of the RGB/NIR image pair using a fusing operator.
- a fusing operator may be expressed using for example, expression (3).
- a feature vector for a hyperspectral image is generated. This step corresponds to step 214 of FIG. 2 .
- a feature vector generated for an image represents one or more characteristics of an object depicted in the image. An example of characteristics may be a depiction of eyes in the image. In this example, a feature vector may be generated for an eye region detected in the image.
- In step 908 , one or more trained model feature vectors are retrieved from a storage unit. This step corresponds to step 216 of FIG. 2 .
- Trained model feature vectors are vectors that were generated based on actual and reliable images of a live user of a mobile device.
- the trained model feature vectors are used as references in determining whether a feature vector generated from a hyperspectral image in step 906 matches the trained model feature vectors within some threshold.
- a first classifier 910 is applied to the trained model feature vectors and a feature vector generated for a hyperspectral image. Applying first classifier 910 may include steps 912 , 914 and 916 .
- a classifier is a means or an approach for classifying an image based on visual contents of the image. Applying a classifier to an image allows analyzing contents of the image and analyzing the numerical properties of the image. Image classification allows processing the image's contents to determine one or more image features and represent the image features as numerical properties.
- a distance metric (DM) is determined based on a feature vector generated from a hyperspectral image, and one or more trained model feature vectors retrieved from a storage unit. This step corresponds to step 218 in FIG. 2 .
- the DM may be computed using, for example, a Bayesian approach. The approach may utilize, for example, expressions (7)-(8).
- a test is performed to determine whether a DM exceeds a predefined threshold.
- a threshold may be a numeric value determined empirically based on for example, some training or experience. If the DM exceeds the threshold, then step 916 is performed. Otherwise, step 922 is performed.
- an indication is generated to specify that an acquired stream of images does not depict a live person, and instead it is a presentation attack.
- the indication may include an error message, a text message, an email, an audio signal, or any other form of communications.
- This step is performed when it has been determined that a distance between a feature vector and one or more trained model feature vectors exceeds a threshold, and therefore, there is no sufficient similarity between the RGB/NIR image pair and the actual/reliable images of the user of a mobile device. Because the RGB/NIR image pair is not sufficiently similar to the actual/reliable images of the user, it may be concluded that the RGB/NIR images do not depict the user of the mobile device, and instead they depict an imposter.
- Steps 922 , 924 , 926 and 928 include an application of a second classifier 920 to NIR images of two or more RGB/NIR image pairs. Alternatively, this process may be performed on two or more image pairs.
- a pupil characteristics analysis and an iris recognition are performed on NIR images of RGB/NIR image pairs.
- This may include cropping each of the NIR images so that they depict only eye regions.
- This may also include smoothing the cropped images using, for example, a smoothing function described in expression (10).
- this may include generating an intermediate edge gradient image from the smoothed image, as described in expression (11).
- the intermediate edge gradient image may be further transformed using a transformation operator T, as in expression (14).
- a test is performed based on the identified characteristics to determine whether there are any changes in the characteristics of the identified pupil from image-to-image.
- An analysis of characteristics of the identified pupil may include an analysis of a pixel intensity in the pupil region in two or more NIR images.
- an analysis of pupil's characteristics may include determining whether a size of the pupil area, depicted in the images, is changing from image-to-image, or whether an eye-blinking is depicted in the images.
- If such changes are detected, step 928 is performed, in which an indication is generated that the images depict a live person.
- Otherwise, an indication is generated that the images are most likely provided as a presentation attack.
- the images may be images taken from a mannequin whose eyes have no dynamic characteristics, such as an eye-blinking.
- the indication may include an error message, a text message, an email, an audio signal, or any other form of communications.
- An iris liveness detection technique is provided for use in iris recognition applications implemented in mobile devices.
- the technique employs the ability to acquire a plurality of RGB/NIR image pairs by a mobile device in a synchronized manner.
- the technique also employs the ability to collect and process iris biometrics using the mobile device.
- the approach allows detecting whether acquired RGB/NIR image pairs depict a live person or whether the images are presented as a presentation attack.
- the approach may be utilized to authenticate a user to the mobile device by detecting whether the user is indeed an authorized owner of the mobile device.
- the approach may be implemented on any type of mobile device. It does not require implementing or integrating any additional hardware. It may be implemented as an authentication mechanism to authenticate a user to a mobile device and to detect authentication spoofing attempts.
- the approach may be further developed to include the ability to utilize various types of biometrics information, not only biometrics of an iris or a pupil.
- the approach may be extended to take into consideration biometrics of fingerprints, noses, eyebrows, and the like.
- the approach may also be enhanced by developing and providing a database containing various types of biometrics data, and a database containing information about different types of advanced presentation attacks.
- the approach may be implemented using the latest visible spectrum/NIR CMOS image sensor technologies.
- the techniques described herein are implemented by one or more special-purpose computing devices.
- the special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination.
- Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
- the special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
- FIG. 10 is a block diagram that depicts a computer system 1000 upon which an embodiment may be implemented.
- Computer system 1000 includes a bus 1002 or other communication mechanism for communicating information, and a hardware processor 1004 coupled with bus 1002 for processing information.
- Hardware processor 1004 may be, for example, a general purpose microprocessor.
- Computer system 1000 also includes a main memory 1006 , such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1002 for storing information and instructions to be executed by processor 1004 .
- Main memory 1006 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1004 .
- Such instructions when stored in non-transitory storage media accessible to processor 1004 , render computer system 1000 into a special-purpose machine that is customized to perform the operations specified in the instructions.
- Computer system 1000 further includes a read only memory (ROM) 1008 or other static storage device coupled to bus 1002 for storing static information and instructions for processor 1004 .
- a storage device 1010 such as a magnetic disk, optical disk, or solid-state drive is provided and coupled to bus 1002 for storing information and instructions.
- Computer system 1000 may be coupled via bus 1002 to a display 1012 , such as a plasma display and the like, for displaying information to a computer user.
- An input device 1014 is coupled to bus 1002 for communicating information and command selections to processor 1004 .
- Another type of user input device is cursor control 1016 , such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 1004 and for controlling cursor movement on display 1012 .
- This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
- Computer system 1000 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1000 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1000 in response to processor 1004 executing one or more sequences of one or more instructions contained in main memory 1006 . Such instructions may be read into main memory 1006 from another storage medium, such as storage device 1010 . Execution of the sequences of instructions contained in main memory 1006 causes processor 1004 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
- Non-volatile media includes, for example, optical disks, magnetic disks, or solid-state drives, such as storage device 1010 .
- Volatile media includes dynamic memory, such as main memory 1006 .
- Storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid-state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, and any other memory chip or cartridge.
- Storage media is distinct from but may be used in conjunction with transmission media.
- Transmission media participates in transferring information between storage media.
- Transmission media includes coaxial cables, copper wire, and fiber optics, including the wires that comprise bus 1002.
- Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
- Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1004 for execution.
- The instructions may initially be carried on a magnetic disk or solid-state drive of a remote computer.
- The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
- A modem local to computer system 1000 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
- An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1002 .
- Bus 1002 carries the data to main memory 1006 , from which processor 1004 retrieves and executes the instructions.
- The instructions received by main memory 1006 may optionally be stored on storage device 1010 either before or after execution by processor 1004.
- Computer system 1000 also includes a communication interface 1018 coupled to bus 1002 .
- Communication interface 1018 provides a two-way data communication coupling to a network link 1020 that is connected to a local network 1022 .
- Communication interface 1018 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
- Communication interface 1018 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
- Wireless links may also be implemented.
- Communication interface 1018 sends and receives electrical, electromagnetic, or optical signals that carry digital data streams representing various types of information.
- Network link 1020 typically provides data communication through one or more networks to other data devices.
- Network link 1020 may provide a connection through local network 1022 to a host computer 1024 or to data equipment operated by an Internet Service Provider (ISP) 1026.
- ISP 1026 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 1028 .
- Internet 1028 uses electrical, electromagnetic or optical signals that carry digital data streams.
- The signals through the various networks and the signals on network link 1020 and through communication interface 1018, which carry the digital data to and from computer system 1000, are example forms of transmission media.
- Computer system 1000 can send messages and receive data, including program code, through the network(s), network link 1020 and communication interface 1018 .
- A server 1030 might transmit a requested code for an application program through Internet 1028, ISP 1026, local network 1022, and communication interface 1018.
- The received code may be executed by processor 1004 as it is received, and/or stored in storage device 1010 or other non-volatile storage for later execution.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Ophthalmology & Optometry (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Collating Specific Patterns (AREA)
- Eye Examination Apparatus (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
Description
- This application is a continuation of U.S. application Ser. No. 16/240,120 filed Jan. 4, 2019, which is a continuation of U.S. application Ser. No. 15/340,926, filed Nov. 1, 2016, issued on Jan. 8, 2019 as U.S. Pat. No. 10,176,377 which claims the benefit under 35 U.S.C. § 119 of U.S. provisional application 62/249,798, filed Nov. 2, 2015, the entire contents of which are hereby incorporated by reference for all purposes as fully set forth herein.
- Embodiments described herein relate generally to iris liveness detection, and more specifically, to techniques for capturing and using information about iris liveness to authenticate a user to a mobile device.
- Since the introduction of the first smartphone in 1994, there has been rapid growth in smartphone technology. Smartphones have become much more than just computers; they also provide the functionality of personal databases, jukeboxes, cameras, communications hubs, and communications gateways.
- As today's smartphones are increasingly used to store and communicate sensitive financial and personal information, a reliable assessment of an identity of the smartphone's user is emerging as an important new service. Personal identification numbers or passwords appear to be insufficient for this purpose.
- In the drawings:
- FIG. 1 is an example mobile device environment for implementing an iris liveness detection according to an example embodiment;
- FIG. 2 is a flow diagram depicting an example iris liveness detection process according to an example embodiment;
- FIG. 3 depicts examples of RGB/NIR image pairs acquired according to an example embodiment;
- FIG. 4 depicts examples of RGB/NIR image pairs acquired from a live person and examples of RGB/NIR image pairs acquired from photographs and computer displays;
- FIG. 5 depicts examples of RGB/NIR image pairs acquired from a 3-D model of a face;
- FIG. 6 depicts a pupil localization process according to an example embodiment;
- FIG. 7 depicts an example sequence of images showing an eye-blinking effect;
- FIG. 8 is an example binary decision tree used to determine whether images depict a live person or are part of a presentation attack;
- FIG. 9 is a flow diagram of an example process for detecting presentation attacks according to an example embodiment;
- FIG. 10 illustrates a computer system upon which one or more embodiments may be implemented.
- In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments. It will be apparent, however, that the embodiments may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring embodiments.
- Techniques are described herein for detecting liveness of a human iris using a mobile device. In an embodiment, the techniques include a workflow for acquiring iris biometric information of a person attempting to use a mobile device. The iris biometric information may be acquired using electronic sensors integrated with the device. The sensors may be configured to capture images of an eye or an eye's iris region. Examples of mobile devices may include smartphones, tablets, PDAs, laptops, electronic watches, and the like.
- Electronic sensors may be configured to capture image pairs. An image pair includes a visible red-green-blue (RGB) spectrum image and a near infra-red (NIR) spectrum image of an eye or an eye's iris region. Capturing of the RGB and NIR images may be performed synchronously in terms of timing.
- An RGB/NIR hybrid sensor is an example of a sensor configured to capture RGB/NIR image pairs synchronously. The RGB/NIR hybrid sensor is an electronic sensor configured to capture both an RGB image and an NIR image of the same scene and at the same time. The captured RGB/NIR image pair includes images that depict objects shown in the same spatial relationships to each other in each of the images. Therefore, the images of an RGB/NIR pair depicting, for example, an eye will depict the eye at the same location in each of the two images of the pair.
- RGB and NIR images may be collectively referred to as incoming images. An incoming image may capture multi-spectral information specific to characteristics of a human eye and/or the eye's iris region. The multi-spectral information may be mapped onto one or more discrete feature vectors representing the characteristics of the eye's iris region.
- Discrete feature vectors may be processed by applying one or more classifiers to the vectors to generate a classified image. A classifier is a means for analyzing contents of an image and contents of feature vectors generated for the image. Examples of classifiers may include intermediate classifiers which use a distance metric to determine whether the discrete feature vectors match predetermined training feature vectors. For example, if a distance value computed based on a discrete feature vector of an image depicting a person attempting to use a mobile device and a training feature vector of an image depicting an actual owner of the mobile device exceeds a certain threshold, then it may be concluded that the person attempting to use the device is not the device's owner.
- Classified images of an eye and/or an iris region may be further processed using multi-frame pupil localization techniques. Pupil localization techniques may include processing of pupil regions identified in the classified images and determining one or more characteristics of the pupil. The characteristics of the pupil may be used to determine liveness of the depicted iris. For example, the characteristics may be used to determine whether the images depict an iris of an owner of a mobile device or an iris of an imposter attempting to use the mobile device.
- In an embodiment, the techniques described herein allow detecting spoofing attacks and security breaches committed with respect to mobile devices. The techniques are applicable to implementations involving actual human faces as well as 3-D face models made of materials that have properties similar to properties of human faces.
- In an embodiment, a method comprises acquiring a plurality of image pairs using one or more image sensors. The sensors may be integrated with a mobile device, and the image pairs may depict a person who attempts to use the mobile device. Each image pair, of the plurality of image pairs, may include an RGB image and a NIR image, both images acquired in a synchronized manner. Acquiring a pair of images in the synchronized manner may include acquiring the images of the pair at a same time. The sensors may include at least one hybrid RGB/NIR sensor.
- A particular image pair that depicts an eye-iris region in-focus is selected from a plurality of image pairs. Based on, at least in part, the particular image pair, a hyperspectral image is generated. The hyperspectral image may be generated by fusing two images included in the particular image pair.
- Based on, at least in part, a hyperspectral image, a particular feature vector for the eye-iris region depicted in the particular image pair is generated. The particular feature vector may numerically represent a particular feature, such as an iris region depicted in the image pair.
- One or more trained model feature vectors are retrieved from a storage unit. The trained model feature vectors may be generated based on images depicting an owner of a mobile device. The images depicting the particular user depict valid biometric characteristics of the owner of the device. The trained model feature vectors are used to determine whether they are similar to the particular feature vector generated from image pairs depicting a person attempting to use the mobile device. The similarities may be quantified using a distance metric computed based on the particular feature vector and the one or more trained model feature vectors.
- A distance metric represents a similarity measure between the trained model feature vectors and a particular feature vector. Stated differently, a distance metric represents a similarity measure between the particular image pair, acquired from a person attempting to use a mobile device, and the trained model feature vectors generated based on the valid biometric characteristics of an owner of the mobile device.
- A distance metric may be compared with a predefined first threshold. The first threshold may be determined empirically. If the distance metric exceeds the first threshold, then a first message indicating that the plurality of image pairs fails to depict the particular user of a mobile device is generated. The first message may also indicate that the person whose depictions were acquired by the sensors of the mobile device is not the owner of the mobile device. Furthermore, the first message may indicate that a presentation attack on the mobile device is in progress.
- However, if the distance metric does not exceed the first threshold, then two or more image pairs that depict an iris are selected from the acquired plurality of image pairs. For each NIR image of each image pair, of the two or more image pairs, one or more characteristics of the iris depicted in the image pair are determined.
- It is also determined whether at least one characteristic, of the one or more characteristics determined for NIR images, changes from image-to-image by at least a second threshold. If so, then a second message indicating that the plurality of image pairs depicts the particular user of a mobile device is generated. The second message may also indicate that the person whose depictions were acquired by the sensors of the mobile device is the owner of the mobile device. Furthermore, the second message may indicate that an authentication of the owner to the mobile device was successful. Otherwise, a third message may be generated to indicate that a presentation attack on the mobile device is in progress.
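- The two-threshold decision described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the Euclidean distance, the use of the smallest distance over all trained model vectors, the single scalar characteristic per NIR frame (for example, a pupil radius), and every function name and return string are assumptions introduced for the example.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def decide(feature, trained_vectors, characteristic_per_frame,
           first_threshold, second_threshold):
    """Return a hypothetical label for the first, second, or third message."""
    # Stage 1: compare the distance metric with the first threshold.
    distance = min(euclidean_distance(feature, t) for t in trained_vectors)
    if distance > first_threshold:
        return "not_owner_or_attack"      # first message

    # Stage 2: look for an image-to-image change of at least the
    # second threshold in the characteristic measured on NIR frames.
    values = characteristic_per_frame
    changes = [abs(b - a) for a, b in zip(values, values[1:])]
    if changes and max(changes) >= second_threshold:
        return "owner_authenticated"      # second message
    return "presentation_attack"          # third message
```

- In this sketch a static photograph of the owner's iris would pass the first test but fail the second, because the measured characteristic would not change between frames.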
- Biometric information has been traditionally used by law enforcement to secure and restrict access to resources and facilities, and to establish identities of individuals. Biometric technology has been employed at, for example, airports, train stations, and other public areas. In these situations, biometric information is acquired in so-called supervised settings. In a supervised setting, one individual oversees an acquisition of biometric information from another individual to ensure validity of the acquired information. Because the acquisition of the biometric information in these settings is supervised, spoofing of the biometric information of the individual is rather rare.
- However, when biometric technology is adapted in unsupervised settings, spoofing of biometric information of an individual is not uncommon. For example, when biometric information is used to authenticate an individual to a consumer device such as a mobile device, a biometric data acquisition process is usually unsupervised. Thus, acquiring biometric information of an individual in an unsupervised setting may be prone to spoofing. For instance, fingerprint authentication, which has been widely adopted in mobile devices, may be easily targeted by various spoofing techniques.
- Arguably, human supervision may be an effective way of detecting spoofing attacks, and it is widely used in many applications, including border security patrol. However, supervision is impractical in the case of mobile devices and other consumer electronic devices.
- An iris of an eye is an annular region between a pupil and a sclera of the eye. An iris region usually has a distinct pattern, and due to its distinctiveness, the pattern may be used to uniquely identify a person. Typically, an iris pattern contains complex and distinctive ligaments, furrows, ridges, rings, coronas, freckles, and collarets. An iris pattern becomes relatively stable at the eighth month of gestation, and remains stable throughout the person's lifetime.
- Iris patterns usually demonstrate high variability. For example, even twin children may have different iris patterns. In fact, an iris pattern of the left eye of a person is most likely different from an iris pattern of the right eye of the same person. The unique characteristics of an iris region make the iris a suitable source of biometric information useful to authenticate individuals.
- In an embodiment, biometric characteristics of an iris are collected and analyzed using mobile devices such as smartphones, tablets, PDAs, laptops, watches, and the like. The process of collecting and analyzing the biometric characteristics may be implemented to authenticate a user to a mobile device, to detect spoofing attempts, and/or to detect liveness of the iris in general.
- Authentication of a person to a mobile device based on the person's iris biometrics is usually unsupervised. It is unsupervised because it does not require any monitoring of the person authenticating himself to the device. Indeed, usually only the person who authenticates himself to the device participates in the authentication process.
- Unsupervised authentication approaches based on biometric data are more susceptible to spoofing than traditional authentication techniques. This is because in the unsupervised authentication no one is monitoring a user as the user's biometric data is acquired. Since there is no monitoring, an imposter may attempt to provide intercepted or false information to gain access to a mobile device of another person.
- Spoofing attacks on an unsupervised authentication system may include presenting to a mobile device biometric data of a person other than a user of the device, and mimicking real biometric information of the user of the device to gain access to the user's device. The mimicking may include providing to the device an iris biometric sample that was recorded without co-operation or knowledge of the user. This may include presenting, by an imposter, a picture, a recorded video, or a high quality iris image of the user in front of the device to gain access to the user's device. These types of attacks are collectively referred to as presentation attacks.
- An iris liveness detection approach presented herein is an anti-spoofing technique. The iris liveness detection allows determining whether biometric information presented to a device is an actual biometric measurement obtained from a live person and whether it was captured at the time when the biometric information is presented to the device.
- An automatic liveness detection approach may include an analysis of intrinsic properties of a live person, an analysis of involuntary body signals, and a challenge-response analysis. In the context of an iris liveness detection, the analysis of intrinsic properties may include analyzing spectrographic properties of a human eye, analyzing a red-eye effect, and analyzing a 3-D curvature of an iris surface. An analysis of involuntary body signals may include analyzing eyelid movements and hippus. A challenge-response analysis may include analyzing a user's response when the user is prompted to blink or look in different directions.
- In an embodiment, an automatic iris liveness detection approach is implemented as part of an iris recognition system, and is used as a countermeasure against spoofing. It may be implemented in hardware, software, or both. It is applicable to a variety of electronic devices and its implementation may be optimized to minimally affect performance of the iris recognition system built into the devices.
- Iris liveness detection techniques may be implemented in mobile devices. The techniques allow recognizing static images such as high quality printed images of an iris, iris images projected on a screen, or high resolution video frames, and determining whether such images are presentation attacks on mobile devices. The techniques may be implemented in a variety of mobile devices without requiring any special hardware. Therefore, the techniques may be inexpensive solutions against presentation attacks. Furthermore, the techniques may not depend on user interactions, and thus they may be widely adopted for everyday use by consumers. Moreover, iris liveness detection techniques may be cost-effective yet powerful mechanisms incorporated into mobile devices. Implementations of the techniques may be computationally light, and may be embedded in a camera pipeline of the mobile device or in digital signal processors dedicated to iris recognition.
- In an embodiment, an iris liveness detection technique includes acquiring and processing visible spectrum RGB images as well as NIR images by a mobile device. The images may be captured using cameras or sensors integrated in the device. If a mobile device is equipped with cameras, then at least one camera may be a hybrid front facing camera configured to perform an iris recognition, and at least one camera may be configured to carry out video calls or selfie imaging. If a mobile device is equipped with RGB/NIR hybrid sensors, then the sensors may be configured to synchronously capture RGB/NIR image pairs.
- Captured RGB/NIR image pairs may be processed using components of a mobile device configured to perform a visible spectrum iris recognition and an NIR iris recognition.
- FIG. 1 is an example mobile device environment for implementing an iris liveness detection according to an example embodiment. A mobile device environment 100 may include various mobile devices. Non-limiting examples of mobile devices include various types and models of smartphones 104a-104b, laptops 106a-106b, PDAs 108a, and tablets 108b. Each mobile device may be configured to capture visual spectrum RGB images 102a and NIR images 102b of a person facing the device. For the clarity of the description, the examples described in the following section refer to the approaches implemented in smartphone 104a; however, the approaches may be implemented on any type of mobile device.
- In an embodiment, visual spectrum RGB images 102a and NIR images 102b of a person facing smartphone 104a are captured by cameras and/or sensors integrated in smartphone 104a. The RGB images 102a and NIR images 102b may be further processed by components of smartphone 104a. The processing may include determining liveness of an iris depicted in the captured images. If the iris liveness is detected in the images, then the person facing smartphone 104a may be granted access to smartphone 104a and resources of smartphone 104a. However, if the iris liveness is not detected in the images, then the person facing smartphone 104a is denied access to smartphone 104a and its resources.
- Processing of RGB and NIR images by a mobile device may include determining locations of an iris in the images, determining locations of a pupil within the iris in the respective images, and analyzing the determined locations for the purpose of detecting the iris' liveness. Detecting the iris' liveness may allow identifying incidents of presentation attacks on the mobile device. For example, the technique may allow identifying presentation attacks when mannequins, having engineered artificial eyes used to duplicate the optical behavior of human eyes, are used to gain access to mobile devices.
- In an embodiment, an iris liveness detection process is part of an authentication process performed to authenticate a user to a mobile device. The iris liveness detection process may comprise two stages. The first stage of the process may include acquiring a plurality of RGB and NIR image pairs depicting the user facing the mobile device, and selecting a particular RGB/NIR image pair that depicts the user's eyes in-focus. The second stage of the process may include processing the particular image pair to detect liveness of the iris depicted in the image pair, and determining whether the user may access the mobile device and its resources.
- FIG. 2 is a flow diagram depicting an example iris liveness detection process according to an example embodiment. The example iris liveness detection process comprises a first stage 202 and a second stage 212.
- In step 204 of stage 202, an image stream is acquired by a mobile device. The image stream may include RGB and NIR image pairs and depict a user facing a mobile device. The pairs may be acquired using one or more cameras and/or one or more sensors integrated in the mobile device. The cameras and the sensors may be separate devices, hybrid devices, or both, and may be configured to capture and acquire the images in a synchronized manner.
- Capturing images in a synchronized manner may include synchronizing the capturing in terms of timing. For example, a hybrid RGB/NIR sensor may be used to capture both an RGB image and a NIR image at the same time. Synchronizing the capturing of both images allows capturing the images in such a way that the images depict objects shown in the same spatial relationships to each other in each of the images.
- Capturing of the images may be initiated by a user as the user tries to use a mobile device. For example, the user may press a certain key, or touch a certain icon displayed on the device, to "wake up" the device. A mobile device may be equipped with a "wake up" key, or an "unlock" key, used to request access to the mobile device and to initiate the image acquisition process. Selection of the keys configured to initiate the image acquisition and a naming convention for the keys depend on the specific implementation and the type of the mobile device.
- In an embodiment, a user facing a mobile device presses a "wake up" key of the mobile device to initiate an image acquisition process. Upon detecting that the key was pressed, the mobile device initiates an RGB/NIR hybrid sensor, or cameras and sensors, integrated in the device, and causes the hybrid sensor to synchronously acquire RGB and NIR images of the eyes of the user. The RGB/NIR image pairs are acquired synchronously to ensure that the locations of certain features in one image correspond to the locations of the same features in the other image.
- RGB and NIR image pairs may be acquired in a normal office situation with active illumination of 1350 nm. Examples of the image pairs acquired at different stand-off distances are shown in FIG. 3.
- FIG. 3 depicts examples of RGB/NIR image pairs acquired according to an example embodiment. The examples depicted in FIG. 3 include an RGB image 302a, an NIR image 302b, an RGB image 304a, and an NIR image 304b. Images 302a-302b depict one person and images 304a-304b depict another person. The RGB/NIR image pairs may be synchronously acquired by an RGB/NIR hybrid sensor at the time when a user is trying to authenticate himself to a mobile device. The image pairs may be compared to training RGB/NIR images acquired from an owner of the device.
- In step 206, an obtained image stream of RGB/NIR image pairs is processed to select an RGB/NIR image pair that depicts an eye-iris region in-focus. This may include applying detectors configured to detect eye-iris regions in the image pairs and select a subset of the image pairs that depict the eye-iris regions, and comparators configured to select, from the subset, an RGB/NIR image pair that depicts the eye-iris region in focus. If the eyes are detected in one image pair, the eyes' locations in the subsequently captured image pairs may be tracked until one or more image pairs depicting the eyes in-focus are found. For example, the visible spectrum (wavelength) of the image stream may be subjected to a certain type of processing to determine images that depict a sequence of good quality, in-focus eye regions. The processing may be performed using state-of-the-art face detectors, eye location detectors, and eye trackers.
- In step 208, based on an RGB/NIR image pair depicting an eye region in-focus, a hyperspectral image is generated. A hyperspectral image is generated from an RGB image and a NIR image of the image pair by fusing both images into one image. Fusing of an RGB image and a NIR image may be accomplished by applying a fusing operator to a mathematical representation of the RGB image and a mathematical representation of the NIR image.
- In an embodiment, a mathematical representation Iv of an RGB image and a mathematical representation Ii of a NIR image of an RGB/NIR image pair are obtained and used to generate a hyperspectral image Ih. The mathematical representations of the RGB image and the NIR image capture ambient light and a surface reflectance on an eye represented at four different wavebands (Blue, Green, Red and NIR), respectively. The hyperspectral image Ih, obtained by fusing the mathematical representations of the RGB and NIR images, will capture an ambient light and a surface reflectance on an eye represented at the four different wavebands and derived by applying a fusing operator to the respective mathematical representations.
- In an embodiment, mathematical representations of an RGB image and an NIR image of an image pair are generated. The RGB and NIR image formation by an RGB/NIR hybrid sensor may be captured using the following expression:
-
- where $I_v \in \mathbb{R}^{m \times n}$ is the RGB image and $I_i \in \mathbb{R}^{k \times l}$ is the NIR image;
- where $\lambda_v \in [350\ \text{nm}, 700\ \text{nm}]$ and $\lambda_i \in [750\ \text{nm}, 900\ \text{nm}]$ are the wavelength ranges of the RGB and NIR images, respectively;
- where p is the spatial domain of the sensor;
- where R is the spatial response of the sensor;
- where E is the irradiance; and
- where Q is the quantum efficiency of the sensor.
- In an embodiment, $I_i \in \mathbb{R}^{k \times l}$ is demosaiced/interpolated to obtain m=k, n=l. That means that Ii (the mathematical representation of the NIR image) is demosaiced/interpolated so that the mathematical representation of the NIR image has the same size m×n as the mathematical representation of the RGB image.
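- The resizing step can be illustrated with a short sketch. The text does not name an interpolation method, so nearest-neighbor sampling is used here purely as a stand-in, and the function name `resize_nearest` is hypothetical.

```python
import numpy as np


def resize_nearest(nir, m, n):
    """Resample a k x l NIR plane to m x n by nearest-neighbor sampling.

    Nearest-neighbor sampling is an assumption; any demosaicing or
    interpolation scheme that yields an m x n plane would serve.
    """
    nir = np.asarray(nir)
    k, l = nir.shape
    rows = np.arange(m) * k // m   # source row index for each output row
    cols = np.arange(n) * l // n   # source column index for each output column
    return nir[rows[:, None], cols[None, :]]
```

- After this step the NIR plane can be indexed with the same pixel coordinates as the RGB image, which is what the later fusion relies on.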
- The two images, Iv and Ii, are fused together to generate a hyperspectral image Ih using the following expression:
- $I_h = \Gamma(I_v, I_i)$ (3)
- where $I_h \in \mathbb{R}^{m \times n \times 4}$ and $\Gamma$ is a fusing operator.
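- One simple way to realize a fusing operator with the stated output size m×n×4 is to stack the channels. The sketch below assumes that choice; the text does not define Γ beyond its output size, and the function name `fuse`, the R, G, B input channel order, and the Blue, Green, Red, NIR output order are assumptions.

```python
import numpy as np


def fuse(rgb, nir):
    """Stack an m x n x 3 RGB image and an m x n NIR plane into m x n x 4.

    Assumes the RGB array stores channels in R, G, B order and that the
    NIR plane has already been resized to the RGB image size.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.shape[:2] != nir.shape:
        raise ValueError("expected an m x n x 3 RGB image and an m x n NIR plane")
    red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Output waveband order: Blue, Green, Red, NIR.
    return np.stack([blue, green, red, nir], axis=-1)
```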
- In an embodiment, a hyperspectral image Ih is further processed to minimize the effect of ambient light. This may be accomplished by obtaining metadata from a camera or a sensor, and using the metadata to perform a white color balance, a gamma correction, and/or an auto exposure correction of the hyperspectral image Ih.
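- As a rough illustration of such a correction, the sketch below applies per-waveband gains followed by a gamma curve. The gain values, the gamma value, the assumption that intensities are normalized to [0, 1], and the function name `correct_ambient` are all hypothetical; real values would come from the camera or sensor metadata.

```python
import numpy as np


def correct_ambient(hyper, gains, gamma):
    """Apply per-waveband gains, then gamma correction, to an m x n x 4 image.

    `gains` holds one multiplier per waveband (a stand-in for white-balance
    metadata) and `gamma` is the display gamma; both are assumed inputs.
    """
    hyper = np.asarray(hyper, dtype=np.float64)
    gains = np.asarray(gains, dtype=np.float64)
    balanced = np.clip(hyper * gains, 0.0, 1.0)   # white balance, kept in range
    return balanced ** (1.0 / gamma)              # gamma correction
```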
- In an embodiment, an iris liveness detection process includes a second stage. In the second stage, a hyperspectral image Ih is processed to identify one or more multispectral features depicted in the hyperspectral image. Since the hyperspectral image Ih represents an ambient light and a surface reflectance on an eye represented at four different wavebands (Blue, Green, Red and NIR), image data in each of the wavebands of the hyperspectral image Ih may be processed individually to extract the features from each waveband separately.
- Extracting features from a hyperspectral image may include clustering image data of the hyperspectral image based on the intensity values within each of the wavebands and determining the features based on the clustered image data. Extracted features may be represented as feature vectors.
- A feature vector generated for an image is a vector that contains information describing one or more characteristics of an object depicted in the image. An example feature vector may include a numerical value representing characteristics of an eye region depicted in the image. The numerical value may be computed based on raw intensity values of the pixels that constitute the eye region.
- Referring again to FIG. 2, in step 214 of second stage 212, one or more feature vectors are generated based on a hyperspectral image obtained in first stage 202.
- In an embodiment, a hyperspectral image $I_h$ is viewed as comprising four image planes $(I_{c1}, I_{c2}, I_{c3}, I_{c4})$, each of size m×n and each representing a different waveband. The planes may also be referred to as channels. The pixels in each plane are clustered separately into α predefined clusters. The clustering process may be represented using the following expression:
-
$$I_{cj}^{u} = \Omega(I_{cj}), \qquad (4)$$
- where $I_{cj}^{u} \in [1, \alpha]^{m \times n}$ holds the cluster labels assigned to the pixels in $I_{cj}$, $j \in [1, 4]$ denotes the image channel (waveband), α is the count of clusters, and Ω is the clustering operator. While the count α of clusters may be chosen in any manner, in an embodiment, α=8 is chosen based on the dimensionality and computational complexity of expression (4).
- In an embodiment, the clustering operator Ω is a nearest neighborhood clustering operator configured to assign each pixel of a plane to one of the α clusters, one plane at a time, based on the intensity values of the pixels in the plane.
- In an embodiment, the label clusters are concatenated to obtain:
-
$$I_h^{u} = \Gamma'(I_{c1}^{u}, I_{c2}^{u}, I_{c3}^{u}, I_{c4}^{u}), \qquad (5)$$
- where $\Gamma'$ is a concatenation operator.
- Because the cluster labels of the four channels are concatenated, each element in $I_h^{u}$ takes one of $s = \alpha^4$ unique combinations. The normalized frequency distribution of the combinations may be calculated using a transform operator H:
-
$$H : I_h^{u} \to F, \qquad (6)$$
- where $F = (f_1, f_2, \ldots, f_s)$ holds the normalized number of times each unique cluster combination appears in $I_h^{u}$. The mapping defined using expression (6) may be used as the feature vector determined for the hyperspectral image $I_h$.
- The feature extraction technique presented herein represents a unique distribution of information across the image planes of a hyperspectral image $I_h$. Furthermore, the presented technique is computationally inexpensive and generates relatively compact feature vectors.
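The pipeline of expressions (4) through (6) can be sketched as follows. This is an illustration under stated assumptions, not the clustering the text describes: uniform intensity binning stands in for the nearest neighborhood clustering operator Ω, intensities are assumed to be 8-bit, labels are 0-based, and the four labels of a pixel are combined into one index by base-α encoding. The names `cluster_label` and `feature_vector` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence

ALPHA = 8  # cluster count per channel, as chosen in the text


def cluster_label(value: int, alpha: int = ALPHA, max_value: int = 255) -> int:
    """Stand-in for the clustering operator: uniform bins over [0, max_value]."""
    return min(value * alpha // (max_value + 1), alpha - 1)


def feature_vector(pixels: Sequence[Sequence[int]], alpha: int = ALPHA) -> List[float]:
    """Normalized frequency of each of the alpha**4 label combinations.

    `pixels` is a flat sequence of 4-channel pixels (B, G, R, NIR).
    """
    if not pixels:
        raise ValueError("image must contain at least one pixel")
    counts: Counter = Counter()
    for pixel in pixels:
        index = 0
        for channel_value in pixel:  # base-alpha encoding of the four labels
            index = index * alpha + cluster_label(channel_value, alpha)
        counts[index] += 1
    total = len(pixels)
    return [counts.get(i, 0) / total for i in range(alpha ** 4)]
```

With α=8 the vector has 8⁴ = 4096 entries that sum to 1, regardless of the image size.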
- In step 216 of stage 212, one or more trained model feature vectors are obtained or retrieved. The trained model feature vectors may be generated based on actual and reliable images of a “live” user of a mobile device, and stored in storage units of the device.
- Trained model feature vectors for a live user may be calculated when the user's mobile device is configured to implement an iris liveness detection approach. The vectors may be generated based on one or more images depicting, for example, facial features of the user, and may be used to train an image classifier to predict whether other images most likely depict the user of the mobile device or are presentation attacks on the device.
- In step 218, a distance metric (DM) is computed based on a feature vector generated from a hyperspectral image and one or more trained model feature vectors retrieved from a storage unit. A storage unit may be a volatile memory unit of a mobile device, a non-volatile memory unit of the mobile device, or any other unit configured to store data.
- A distance metric is a numerical representation of the similarity between a feature vector generated from a hyperspectral image and trained model feature vectors generated from images of a user of a mobile device. If a distance value computed from the feature vector and the trained model feature vector exceeds a certain threshold, then the feature represented by the feature vector is dissimilar to the feature represented by the trained model feature vector. This may indicate that an individual whose depictions were used to generate the hyperspectral image is an imposter, and not the user of the mobile device.
- However, if the distance value does not exceed the certain threshold, then the feature represented by the feature vector is similar, or maybe even identical, to the feature represented by the trained model feature vector. This may indicate that the individual whose depictions were used to generate the hyperspectral image is the user of the mobile device.
- In an embodiment, a distance metric is computed as a deviation (error) d. The deviation d may be computed using a Bayesian approach. Assume that Fq denotes a feature vector of a query image, such as a hyperspectral image generated from an RGB-NIR image pair acquired by an RGB-NIR hybrid sensor. Furthermore, assume that Fdb denotes one or more trained model feature vectors of a trained model. The trained model may be trained on actual images of a user of a mobile device. In a Bayesian approach, a deviation d is measured as the square root of the entropy approximation to the logarithm of evidence ratio when testing whether the query image can be represented as the same underlying distribution of the live images. This can be mathematically represented as:
-
- where $D(F_q \| F_{db})$ is the Kullback-Leibler divergence of $F_{db}$ obtained from $F_q$, which is a measure of information lost when the database feature vector $F_{db}$ is approximated from the query feature vector $F_q$. The above choice of distance metric $d_{q,db}$ is based on the observations that it is a close relative of the Jensen-Shannon divergence and an asymptotic approximation of the $\chi^2$ distance. Furthermore, $d_{q,db}$ is symmetric and fulfills the triangle inequality.
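Expressions (7)-(8) did not survive in this text, so the sketch below does not reproduce them. It assumes one well-known metric that has the stated properties (symmetric, obeys the triangle inequality, closely related to the Jensen-Shannon divergence): the square root of the sum of the two Kullback-Leibler divergences measured against the midpoint distribution. The function names and the choice of the natural logarithm are assumptions.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence D(p || q); terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def liveness_distance(f_q: Sequence[float], f_db: Sequence[float]) -> float:
    """Assumed metric: sqrt(D(f_q || m) + D(f_db || m)), m being the midpoint."""
    if len(f_q) != len(f_db):
        raise ValueError("feature vectors must have the same length")
    mid = [(a + b) / 2 for a, b in zip(f_q, f_db)]
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, kl_divergence(f_q, mid) + kl_divergence(f_db, mid)))


def is_live(f_q: Sequence[float], f_db: Sequence[float], beta: float) -> bool:
    """Decision rule from the text: live when the distance is below beta."""
    return liveness_distance(f_q, f_db) < beta
```

Both inputs are expected to be normalized frequency vectors, such as the output of expression (6). Because the midpoint is nonzero wherever either input is nonzero, the logarithm is always defined.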
- In step 220 of stage 212, a distance metric $d_{q,db}$ computed using expressions (7)-(8) is used to determine whether an incoming query image depicts a live person. If $d_{q,db} < \beta$, where $\beta \in \mathbb{R}$ is a predetermined threshold, then, in step 222, it is determined that the query image depicts a live person. Otherwise, in step 224, it is determined that the query image does not depict a live person.
- Presentation attacks may include various types of spoofing attacks on a mobile device. They may include mimicking real biometric information of a user of a mobile device to gain access to the user's device. The mimicking may include, for example, providing to the device an iris biometric sample that was recorded without the knowledge of the user of the device. One of the most common presentation attacks is presenting a high-quality printed photograph in front of the device. For example, an imposter may use a high-quality color photograph of the user of the mobile device to try to access the device.
- The effectiveness of approaches for detecting presentation attacks may be measured in various ways. One approach includes determining a “Normal Presentation Classification Error Rate” (NPCER). The NPCER is defined as the proportion of live users incorrectly classified as a presentation attack. Another approach includes determining an “Attack Presentation Classification Error Rate” (APCER). The APCER is defined as the proportion of presentation attack attempts incorrectly classified as live users. Yet another approach includes determining an “Average Classification Error Rate” (ACER), which is computed as the mean value of the NPCER and the APCER.
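The three error rates follow directly from their definitions and can be computed as in the sketch below. The function name and the boolean encoding of classifier outputs (True meaning "classified as live") are assumptions made for illustration.

```python
from typing import Dict, Sequence


def presentation_error_rates(
    live_predictions: Sequence[bool],
    attack_predictions: Sequence[bool],
) -> Dict[str, float]:
    """Compute NPCER, APCER and ACER.

    Each prediction is True when the classifier labeled the sample "live".
    `live_predictions` come from genuine live presentations and
    `attack_predictions` from presentation attack attempts.
    """
    if not live_predictions or not attack_predictions:
        raise ValueError("both sample sets must be non-empty")
    # Live users wrongly classified as presentation attacks.
    npcer = sum(1 for p in live_predictions if not p) / len(live_predictions)
    # Presentation attacks wrongly classified as live users.
    apcer = sum(1 for p in attack_predictions if p) / len(attack_predictions)
    return {"NPCER": npcer, "APCER": apcer, "ACER": (npcer + apcer) / 2}
```

For example, one rejected live sample out of four and one accepted attack out of two give NPCER = 0.25, APCER = 0.5, and ACER = 0.375.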
- The ability to detect presentation attacks depends on a variety of factors. For example, detecting the presentation attacks may depend on the surface reflection and refraction of the material that is presented in front of a hybrid sensor of a mobile device. There are many differences between the reflection and refraction factors determined for a printed image and those determined for human skin.
- In an embodiment, an iris liveness detection process detects presentation attacks conducted using photographs shown on either reflective or matte paper, and presentation attacks conducted by projecting images on a screen or a display device. The approach takes advantage of the fact that photographic material (reflective paper or matte paper) and device displays have properties that differ significantly from the properties of human skin or the human eye.
- FIG. 4 depicts examples of RGB/NIR image pairs acquired from a live person and examples of RGB/NIR image pairs acquired from photographs and computer displays. The image pairs in FIG. 4 that were acquired from photographs and computer displays are examples of presentation attacks.
- However, in some cases, relying on differences in the respective properties alone may be insufficient to differentiate presentation attacks from legitimate access attempts. As spoofing techniques evolve, presentation attacks may include techniques that go beyond using known printing materials and image displaying devices. New materials and display devices may be used to conduct presentation attacks in the future. For example, a new presentation attack may be conducted using a realistic 3-D face model of a user of a mobile device.
- FIG. 5 depicts examples of RGB/NIR image pairs acquired from a 3-D model of a face. A 3-D face model may be a mannequin that has engineered artificial eyes with iris regions that duplicate the optical behavior of human eyes, including a red-eye effect. The mannequin may be made of a skin-like material that has properties similar to those of human skin. The mannequin may also have realistically reproduced hair, eyebrows, lashes, and so forth.
- The images in FIG. 5 depict a realistic 3-D face model. They include close-up images showing the human-like skin, hair, and ocular properties, and side views of the mannequin.
- It appears that the reflectance and refraction properties in the images of the mannequin in FIG. 5 more or less correspond to the reflectance and refraction properties of photographs of a live person, such as pair 402a-402b in FIG. 4.
- Furthermore, a mannequin may be equipped with printed contact lenses with an iris pattern of a live person. If an imposter uses images of such a mannequin to conduct a presentation attack on a mobile device, then there is a possibility that the imposter may obtain access to the mobile device. Therefore, analyzing the spectral response of the presented images alone may be insufficient to identify sophisticated presentation attacks.
- In an embodiment, an iris liveness detection approach for mobile devices is enhanced using techniques for a pupil analysis performed on the acquired images. An analysis of a pupil of a human eye depicted in the images increases the chances that even sophisticated presentation attacks on a mobile device are identified. This is because mimicking both the pupil dynamics and the properties of the human eye region is unlikely to be feasible at the current state of image-based technologies.
- Current smartphones have capabilities to acquire 120-240 frames per second, but that capability will most likely be doubled with the next-generation technology. For example, very soon it might be possible to acquire as many as 30-40 images within the time window that is now required to acquire only two images. If it is assumed that on average 30 frames are acquired within a particular time window, then about 60 images may be acquired within that time window in the future. The 60 images may include 30 RGB images and 30 NIR images acquired in a synchronous manner. Therefore, the advances in the smartphone technology may enable the smartphones to also perform a complex analysis of pupils depicted in the acquired images.
- In an embodiment, a pupil detection and a pupil analysis are performed on a sequence of NIR images. Detecting a pupil in the NIR images may include cropping the images so that the images represent only the eye regions, and then processing the cropped images using an edge-localization approach and a gradient-based approach to determine a location of the pupil in the images.
- Characteristics of an iris region of the eye depicted in digital images may be impacted by illumination variations and shadows created by eyelashes surrounding the eye. The issue, however, may be addressed by representing the images using a representation that is less sensitive to the illumination variations. An example of such a representation is a representation generated using one-dimensional image processing.
- In an embodiment, characteristics of an iris region and a pupil in the iris region are captured using one-dimensional image processing. One-dimensional image processing usually requires no thresholding, and therefore allows reducing the effect of edge smearing.
- One-dimensional processing of an image may include applying a smoothing operator along a first direction of the image, and applying a derivative operator along a second (orthogonal) direction. Let $I \in \mathbb{R}^{m \times n}$ be a cropped image depicting an eye region. Let the cropped eye image be an NIR image denoted as $I_i$. The smoothed eye image may be represented using the following expression:
-
- where Iθ s∈˜m×n is the smoothed iris image, Sθ(x)∈˜m×1 is the one dimensional smoothing function along a line which has a perpendicular distance of r∈• from the origin and makes an angle θ∈• with the X-axis, and ⊗ is the one-dimensional convolution operator. The convolution operation may be carried out for each value of r to obtain the smoothed image Iθ s. The smoothing function used here may be defined using the following expression:
-
- where $\sigma_s \in \mathbb{R}$ is the standard deviation of the Gaussian function used in the smoothing process. The one-dimensional derivative operator along the orthogonal direction θ+90° is applied to the smoothed image for different values of r to obtain an intermediate edge gradient image, expressed as:
-
- where
-
- where $\sigma_g \in \mathbb{R}$ is the standard deviation of the derivative operator. The magnitude representation of an edge gradient may be obtained using the following expression:
-
$$I_\theta^{M} = \sqrt{(I_\theta^{g})^2 + (I_{\theta+90}^{g})^2}. \qquad (13)$$
- In an embodiment, a transform operator T is applied to $I_\theta^{M}$, as shown below:
-
$$I_d = T_\delta\, I_\theta^{M}, \qquad (14)$$
- where $I_d$ is the transformed image. The transformation operator T is chosen so that it expresses the image $I_\theta^{M}$ in binary form and then detects the largest connected region in the image;
- where δ∈• is a threshold selected in such a way that nmin p≤δ≤nmax p, where nmin p and nmax p are the minimum and maximum numbers of pixels which could possibly be in the pupil region in the particular frame. Based on metadata obtained from a face and eye tracking system and based on the camera parameters, an approximate number of pixels in the pupil region may be determined. The value of δ may be learned for each individual frame.
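The behavior of the operator T (binarize, then keep the largest connected region) can be sketched as below. Several details are assumptions: the binarization level `level` is a hypothetical parameter, 4-connectivity is assumed, and the pixel-count bounds n_min and n_max are used here only as a plausibility check on the size of the detected region, which is one possible reading of how δ is applied.

```python
from collections import deque
from typing import List, Optional, Set, Tuple

Point = Tuple[int, int]  # (row, column)


def largest_region(mask: List[List[bool]]) -> Set[Point]:
    """Return the largest 4-connected set of True pixels (empty set if none)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen: Set[Point] = set()
    best: Set[Point] = set()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            region = {(r, c)}
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        region.add((ny, nx))
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best


def localize_pupil(magnitude: List[List[float]], level: float,
                   n_min: int, n_max: int) -> Optional[Set[Point]]:
    """Binarize the magnitude image and keep the largest region if its
    pixel count lies within [n_min, n_max]; otherwise return None."""
    mask = [[value >= level for value in row] for row in magnitude]
    region = largest_region(mask)
    return region if n_min <= len(region) <= n_max else None
```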
- FIG. 6 depicts a pupil localization process according to an example embodiment. In FIG. 6, image 602 depicts an original NIR image $I_i$. Image 604 is an edge gradient image generated along one direction of the original NIR image. Image 606 is an edge gradient image generated along an orthogonal direction. Image 608 is a magnitude image. Image 609 depicts the localized pupil.
- Images 602-609 depicted in FIG. 6 may be obtained using expressions (9)-(14), assuming that θ=90°. Image 602 represents the original image. Images 604-606 represent the output of one-dimensional image processing for the angular direction θ and its orthogonal direction. Image 608 is the magnitude image obtained from the result of the one-dimensional image processing, and the localized pupil is shown in image 609.
- In an embodiment, after a pupil is localized in images depicting a human eye, the images are further processed to determine dynamic characteristics of the depicted pupil. Dynamic characteristics of a pupil may include the eye's saccades, hippus, and pupil dilation/constriction, which may arise naturally as the person moves toward the camera. The dynamic characteristics may also include eye blinking, which alters the size of the pupil area. Examples of images that were captured as a person was blinking are depicted in FIG. 7.
- FIG. 7 depicts an example sequence of images showing an eye-blinking effect.
- A pupil analysis may include an analysis of a pupil area in general, and an analysis of a pixel intensity in the pupil region in particular. For example, a pupil analysis performed on the images may involve determining whether the size of the pupil area depicted in the images changes from image to image, or whether eye blinking is depicted in the images. If such changes are detected in the images, then it may be concluded that the images depict a live person. However, if such changes cannot be detected, then the images are most likely provided as a presentation attack. For example, the images may be images taken from a mannequin whose eyes have no dynamic characteristics, such as eye blinking.
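The check for image-to-image changes in the pupil area can be sketched as follows. The sketch assumes that the pupil area is measured as the pixel count of the localized pupil region in each NIR frame; the tolerance `min_relative_change` is a hypothetical parameter that is not given in the text.

```python
from typing import Sequence


def pupil_is_dynamic(areas: Sequence[int], min_relative_change: float = 0.1) -> bool:
    """Report whether the pupil area varies across a sequence of frames.

    `areas` holds one pupil pixel count per frame. A blink or a
    dilation/constriction shows up as a spread between the largest and
    smallest area; a static target (photo, mannequin) shows almost none.
    """
    if len(areas) < 2:
        raise ValueError("at least two frames are needed to detect a change")
    largest = max(areas)
    if largest == 0:
        return False  # pupil never visible, so no measurable dynamics
    return (largest - min(areas)) / largest >= min_relative_change
```

For example, `pupil_is_dynamic([100, 60, 5, 90])` reports a change consistent with a blink, while `pupil_is_dynamic([100, 100, 100])` does not.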
- In an embodiment, a pixel intensity in a pupil region of any of the NIR images in FIG. 7 is determined using a Purkinje image. A Purkinje image is an image formed by the light reflected from the four optical surfaces of the human eye. Purkinje images may be used in various applications, including iris liveness detection, eye tracking, and red-eye effect detection.
- In an embodiment, a binary decision tree is used to classify a sequence of images captured by a mobile device and depicting human eyes. The binary decision tree may be used to classify the images as either images of a live person or images presented as part of a presentation attack.
- A binary decision tree may be designed to interface with different models and approaches, including the intermediate decision approach of FIG. 2 for iris liveness detection and the pupil analysis described in FIG. 6 and FIG. 7. The binary decision tree usually has one root node and one or more intermediate nodes. An example of the binary decision tree is depicted in FIG. 8.
- FIG. 8 is an example binary decision tree used to determine whether images depict a live person or are part of a presentation attack. An example binary decision tree 800 comprises a root node 802, an intermediary decision node 804, and result nodes 806, 808, and 810. Root node 802 is used to determine whether an incoming image depicts a live iris or a presentation attack image. This may be determined based on a distance metric $d_{q,db}$ computed using expressions (7)-(8) described above, where q represents an incoming image (a query image) and db represents a feature vector $F_{db}$ described above.
- In root node 802, a decision is made whether $d_{q,db} < \beta$, where $\beta \in \mathbb{R}$ corresponds to a predetermined threshold value. If $d_{q,db} < \beta$, then it may be concluded that the incoming image depicts a live person, and further processing is performed at intermediary decision node 804. Otherwise, it may be concluded in result node 810 that the incoming image does not depict a live person, but is part of a presentation attack.
- If it was determined that the incoming image is an image of a live person, then, in intermediary decision node 804, one or more image recognition modules are invoked to perform iris recognition on the incoming image. A pupil localization result, derived as described in FIG. 6 and FIG. 7, may be provided to intermediary decision node 804 along with additional input images acquired along with the incoming image. The provided result and the images may be used by an iris recognition module to determine whether the images show any changes in characteristics of the depicted pupil.
- If it was determined that the provided information indicates some changes in characteristics of the depicted pupil, then result node 806 is reached to indicate that the incoming image depicts a live person. However, if it is determined that the provided information does not indicate any changes in characteristics of the depicted pupil, then result node 808 is reached to indicate a presentation attack.
- In an embodiment, a decision process depicted in FIG. 8 provides an effective approach for detecting presentation attacks. It combines the approaches for determining whether incoming images depict a live iris with the approaches for determining whether the incoming images depict a live pupil. The performance of a system implementing the decision process depicted in FIG. 8 may be measured using indicators such as ACER, NPCER, and APCER, described above. A comparison of the results obtained when both the iris and the pupil analyses were performed with the results obtained when only the iris analysis was performed indicates that the approaches implementing both analyses are more effective.
- A decision process of FIG. 8 for detecting presentation attacks may also be represented using a flow diagram. FIG. 9 is a flow diagram of an example process for detecting presentation attacks according to an example embodiment. In step 902, an image stream of RGB and NIR image pairs is acquired using a mobile device. In an embodiment, this step corresponds to step 204 in FIG. 2.
- An image stream may include a plurality of image pairs, and each image pair of the plurality may include an RGB image and an NIR image, both acquired in a synchronized manner. The image pairs may be acquired using, for example, an RGB/NIR hybrid sensor that synchronously captures both the RGB image and the NIR image.
- In an embodiment, an acquired stream of images may be processed to identify at least one image pair that depicts an eye region in focus. The identified image pairs may be further reviewed to select the one image pair whose images provide a high-quality depiction of the eye region.
- In step 904, a hyperspectral image is generated from a selected RGB/NIR image pair. This step corresponds to step 208 of FIG. 2. A hyperspectral image is generated by fusing an RGB image with an NIR image of the RGB/NIR image pair using a fusing operator. A fusing operator may be expressed using, for example, expression (3).
- In step 906, a feature vector for a hyperspectral image is generated. This step corresponds to step 214 of FIG. 2. A feature vector generated for an image represents one or more characteristics of an object depicted in the image. An example of characteristics may be a depiction of eyes in the image. In this example, a feature vector may be generated for an eye region detected in the image.
- In step 908, one or more trained model feature vectors are retrieved from a storage unit. This step corresponds to step 216 of FIG. 2. Trained model feature vectors are vectors that were generated based on actual and reliable images of a live user of a mobile device. The trained model feature vectors are used as references in determining whether a feature vector generated from a hyperspectral image in step 906 matches the trained model feature vectors within some threshold.
- Once one or more trained model feature vectors are retrieved, a first classifier 910 is applied to the trained model feature vectors and a feature vector generated for a hyperspectral image. Applying first classifier 910 may include several steps, described below.
- A classifier is a means or an approach for classifying an image based on visual contents of the image. Applying a classifier to an image allows analyzing contents of the image and analyzing the numerical properties of the image. Image classification allows processing the image's contents to determine one or more image features and represent the image features as numerical properties.
- In step 912, a distance metric (DM) is determined based on a feature vector generated from a hyperspectral image and one or more trained model feature vectors retrieved from a storage unit. This step corresponds to step 218 in FIG. 2. The DM may be computed using, for example, a Bayesian approach. The approach may utilize, for example, expressions (7)-(8).
- In step 914, a test is performed to determine whether a DM exceeds a predefined threshold. A threshold may be a numeric value determined empirically based on, for example, some training or experience. If the DM exceeds the threshold, then step 916 is performed. Otherwise, step 922 is performed.
- In step 916, an indication is generated to specify that an acquired stream of images does not depict a live person, and instead is a presentation attack. The indication may include an error message, a text message, an email, an audio signal, or any other form of communications. This step is performed when it has been determined that a distance between a feature vector and one or more trained model feature vectors exceeds a threshold, and therefore, there is insufficient similarity between the RGB/NIR image pair and the actual/reliable images of the user of a mobile device. Because the RGB/NIR image pair is not sufficiently similar to the actual/reliable images of the user, it may be concluded that the RGB/NIR images do not depict the user of the mobile device, and instead depict an imposter.
- The steps that follow apply a second classifier 920 to NIR images of two or more RGB/NIR image pairs. Alternatively, this process may be performed on two or more image pairs.
- In step 922, a pupil characteristics analysis and an iris recognition are performed on NIR images of RGB/NIR image pairs. This may include cropping each of the NIR images so that they depict only eye regions. This may also include smoothing the cropped images using, for example, the smoothing function described in expression (10). Furthermore, this may include generating an intermediate edge gradient image from the smoothed image, as described in expression (11). The intermediate edge gradient image may be further transformed using a transformation operator T, as in expression (14). Once the locations of a pupil in the images are determined, one or more characteristics of the pupil are determined.
- In step 924, a test is performed based on the identified characteristics to determine whether there are any changes in the characteristics of the identified pupil from image to image. An analysis of characteristics of the identified pupil may include an analysis of a pixel intensity in the pupil region in two or more NIR images. For example, an analysis of the pupil's characteristics may include determining whether the size of the pupil area depicted in the images changes from image to image, or whether eye blinking is depicted in the images.
- If such changes are detected in the images, then step 928 is performed, in which an indication is generated that the images depict a live person. However, if no change can be detected, then in step 926, an indication is generated that the images are most likely provided as a presentation attack. For example, the images may be images taken from a mannequin whose eyes have no dynamic characteristics, such as eye blinking. The indication may include an error message, a text message, an email, an audio signal, or any other form of communications.
- In an embodiment, an iris liveness detection technique is presented for iris recognition applications implemented in mobile devices. The technique employs the ability to acquire a plurality of RGB/NIR image pairs by a mobile device in a synchronized manner. The technique also employs the ability to collect and process iris biometrics using the mobile device. The approach allows detecting whether acquired RGB/NIR image pairs depict a live person or whether the images are presented as a presentation attack. The approach may be utilized to authenticate a user to the mobile device by detecting whether the user is indeed an authorized owner of the mobile device.
- The approach may be implemented on any type of mobile device. It does not require implementing or integrating any additional hardware. It may be implemented as an authentication mechanism to authenticate a user to a mobile device and to detect authentication spoofing attempts.
- The approach may be further developed to utilize various types of biometric information, not only biometrics of an iris or a pupil. For example, the approach may be extended to take into consideration biometrics of fingerprints, noses, eyebrows, and the like.
- The approach may also be enhanced by developing and providing a database containing various types of biometrics data, and a database containing information about different types of advanced presentation attacks.
- The approach may be implemented using the latest visible spectrum/NIR CMOS image sensor technologies.
- According to some embodiments, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
- For example,
FIG. 10 is a block diagram that depicts acomputer system 1000 upon which an embodiment may be implemented.Computer system 1000 includes abus 1002 or other communication mechanism for communicating information, and ahardware processor 1004 coupled withbus 1002 for processing information.Hardware processor 1004 may be, for example, a general purpose microprocessor. -
Computer system 1000 also includes amain memory 1006, such as a random access memory (RAM) or other dynamic storage device, coupled tobus 1002 for storing information and instructions to be executed byprocessor 1004.Main memory 1006 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed byprocessor 1004. Such instructions, when stored in non-transitory storage media accessible toprocessor 1004, rendercomputer system 1000 into a special-purpose machine that is customized to perform the operations specified in the instructions. -
Computer system 1000 further includes a read only memory (ROM) 1008 or other static storage device coupled tobus 1002 for storing static information and instructions forprocessor 1004. Astorage device 1010, such as a magnetic disk, optical disk, or solid-state drive is provided and coupled tobus 1002 for storing information and instructions. -
Computer system 1000 may be coupled viabus 1002 to adisplay 1012, such as a plasma display and the like, for displaying information to a computer user. Aninput device 1014, including alphanumeric and other keys, is coupled tobus 1002 for communicating information and command selections toprocessor 1004. Another type of user input device iscursor control 1016, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections toprocessor 1004 and for controlling cursor movement ondisplay 1012. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. -
Computer system 1000 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1000 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1000 in response to processor 1004 executing one or more sequences of one or more instructions contained in main memory 1006. Such instructions may be read into main memory 1006 from another storage medium, such as storage device 1010. Execution of the sequences of instructions contained in main memory 1006 causes processor 1004 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
The term “storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical disks, magnetic disks, or solid-state drives, such as storage device 1010. Volatile media includes dynamic memory, such as main memory 1006. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid-state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.
Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 1002. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1004 for execution. For example, the instructions may initially be carried on a magnetic disk or solid-state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 1000 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1002. Bus 1002 carries the data to main memory 1006, from which processor 1004 retrieves and executes the instructions. The instructions received by main memory 1006 may optionally be stored on storage device 1010 either before or after execution by processor 1004.
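As a loose software analogy only, the path described above (instructions arrive in memory, are executed, and may optionally be stored either before or after execution) might be sketched as follows. The function names `run_instructions` and `load_and_run`, the file name, and the use of Python source text as the "instructions" are all illustrative assumptions; none of them come from the specification, which describes hardware-level behavior.

```python
import tempfile
from pathlib import Path
from typing import Optional


def run_instructions(source: str, persist_to: Optional[Path] = None) -> dict:
    """Execute instruction text already held in memory.

    If persist_to is given, the text is also written to storage before
    execution (hypothetical stand-in for storage device 1010).
    """
    if persist_to is not None:
        persist_to.write_text(source, encoding="utf-8")
    namespace: dict = {}
    # exec runs arbitrary code; only use it on trusted input.
    exec(compile(source, "<received>", "exec"), namespace)
    return namespace


def load_and_run(path: Path) -> dict:
    """Read stored instructions back into memory and execute them later."""
    return run_instructions(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    received = "result = 6 * 7\n"  # stand-in for instructions received over a link
    with tempfile.TemporaryDirectory() as tmp:
        stored = Path(tmp) / "instructions.py"
        first = run_instructions(received, persist_to=stored)  # execute and store
        later = load_and_run(stored)                           # execute from storage
        print(first["result"], later["result"])
```

The sketch stores before executing; swapping the two statements in `run_instructions` would model the "after execution" alternative.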
Computer system 1000 also includes a communication interface 1018 coupled to bus 1002. Communication interface 1018 provides a two-way data communication coupling to a network link 1020 that is connected to a local network 1022. For example, communication interface 1018 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 1018 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 1018 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 1020 typically provides data communication through one or more networks to other data devices. For example, network link 1020 may provide a connection through local network 1022 to a host computer 1024 or to data equipment operated by an Internet Service Provider (ISP) 1026. ISP 1026 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 1028. Local network 1022 and Internet 1028 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 1020 and through communication interface 1018, which carry the digital data to and from computer system 1000, are example forms of transmission media.
Computer system 1000 can send messages and receive data, including program code, through the network(s), network link 1020 and communication interface 1018. In the Internet example, a server 1030 might transmit a requested code for an application program through Internet 1028, ISP 1026, local network 1022 and communication interface 1018.
The received code may be executed by processor 1004 as it is received, and/or stored in storage device 1010, or other non-volatile storage for later execution.
In the foregoing specification, embodiments of the approach have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the approach, and what is intended by the applicants to be the scope of the approach, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.
Claims (21)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/704,822 US20220284732A1 (en) | 2015-11-02 | 2022-03-25 | Iris liveness detection for mobile devices |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562249798P | 2015-11-02 | 2015-11-02 | |
US15/340,926 US10176377B2 (en) | 2015-11-02 | 2016-11-01 | Iris liveness detection for mobile devices |
US16/240,120 US10810423B2 (en) | 2015-11-02 | 2019-01-04 | Iris liveness detection for mobile devices |
US17/073,247 US11288504B2 (en) | 2015-11-02 | 2020-10-16 | Iris liveness detection for mobile devices |
US17/704,822 US20220284732A1 (en) | 2015-11-02 | 2022-03-25 | Iris liveness detection for mobile devices |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/073,247 Continuation US11288504B2 (en) | 2015-11-02 | 2020-10-16 | Iris liveness detection for mobile devices |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220284732A1 (en) | 2022-09-08 |
Family
ID=58635562
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/340,926 Active 2036-12-05 US10176377B2 (en) | 2015-11-02 | 2016-11-01 | Iris liveness detection for mobile devices |
US16/240,120 Active 2037-02-15 US10810423B2 (en) | 2015-11-02 | 2019-01-04 | Iris liveness detection for mobile devices |
US17/073,247 Active US11288504B2 (en) | 2015-11-02 | 2020-10-16 | Iris liveness detection for mobile devices |
US17/704,822 Pending US20220284732A1 (en) | 2015-11-02 | 2022-03-25 | Iris liveness detection for mobile devices |
Country Status (1)
Country | Link |
---|---|
US (4) | US10176377B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024209367A1 (en) * | 2023-04-03 | 2024-10-10 | Securiport Llc | Liveness detection |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10635894B1 (en) * | 2016-10-13 | 2020-04-28 | T Stamp Inc. | Systems and methods for passive-subject liveness verification in digital media |
CN107403147B (en) * | 2017-07-14 | 2020-09-01 | Oppo广东移动通信有限公司 | Iris living body detection method and related product |
EP3447684A1 (en) | 2017-08-22 | 2019-02-27 | Eyn Limited | Verification method and system |
GB2567798A (en) * | 2017-08-22 | 2019-05-01 | Eyn Ltd | Verification method and system |
GB2570620A (en) * | 2017-08-22 | 2019-08-07 | Eyn Ltd | Verification method and system |
US12073676B2 (en) * | 2017-09-18 | 2024-08-27 | Legic Identsystems Ag | Personal identity verification system and method for verifying the identity of an individual |
SG10201808116WA (en) * | 2017-09-21 | 2019-04-29 | Tascent Inc | Binding of selfie face image to iris images for biometric identity enrollment |
CN107862304B (en) * | 2017-11-30 | 2021-11-26 | 西安科锐盛创新科技有限公司 | Eye state judging method |
CN108427937A (en) * | 2018-03-29 | 2018-08-21 | 武汉真元生物数据有限公司 | Stability region choosing method and device |
WO2019205007A1 (en) | 2018-04-25 | 2019-10-31 | Beijing Didi Infinity Technology And Development Co., Ltd. | Systems and methods for blink action recognition based on facial feature points |
US11093771B1 (en) | 2018-05-04 | 2021-08-17 | T Stamp Inc. | Systems and methods for liveness-verified, biometric-based encryption |
US11496315B1 (en) | 2018-05-08 | 2022-11-08 | T Stamp Inc. | Systems and methods for enhanced hash transforms |
CN109271863B (en) * | 2018-08-15 | 2022-03-18 | 北京小米移动软件有限公司 | Face living body detection method and device |
CN109685105B (en) * | 2018-11-16 | 2019-10-25 | 中国矿业大学 | A kind of high spectrum image clustering method based on the study of unsupervised width |
US10922845B2 (en) * | 2018-12-21 | 2021-02-16 | Here Global B.V. | Apparatus and method for efficiently training feature detectors |
US11138302B2 (en) | 2019-02-27 | 2021-10-05 | International Business Machines Corporation | Access control using multi-authentication factors |
US11301586B1 (en) | 2019-04-05 | 2022-04-12 | T Stamp Inc. | Systems and processes for lossy biometric representations |
CN110490044B (en) * | 2019-06-14 | 2022-03-15 | 杭州海康威视数字技术股份有限公司 | Face modeling device and face modeling method |
IL277564A (en) * | 2019-09-23 | 2021-05-31 | Sensority Ltd | Living skin tissue tracking in video stream |
US11200670B2 (en) * | 2020-05-05 | 2021-12-14 | International Business Machines Corporation | Real-time detection and correction of shadowing in hyperspectral retinal images |
US11967173B1 (en) | 2020-05-19 | 2024-04-23 | T Stamp Inc. | Face cover-compatible biometrics and processes for generating and using same |
CN112232109B (en) * | 2020-08-31 | 2024-06-04 | 奥比中光科技集团股份有限公司 | Living body face detection method and system |
US11080516B1 (en) * | 2020-12-30 | 2021-08-03 | EyeVerify, Inc. | Spoof detection based on red-eye effects |
US20220255924A1 (en) * | 2021-02-05 | 2022-08-11 | Cisco Technology, Inc. | Multi-factor approach for authentication attack detection |
US12079371B1 (en) | 2021-04-13 | 2024-09-03 | T Stamp Inc. | Personal identifiable information encoder |
WO2022226478A1 (en) * | 2021-04-21 | 2022-10-27 | Tascent, Inc. | Thermal based presentation attack detection for biometric systems |
CN113240758B (en) * | 2021-05-28 | 2022-03-08 | 珠江水利委员会珠江水利科学研究院 | Remote sensing image fusion method, system, equipment and medium based on fusion derivative index |
CN114582008A (en) * | 2022-03-03 | 2022-06-03 | 北方工业大学 | Living iris detection method based on two wave bands |
CN116740796B (en) * | 2022-05-24 | 2024-08-09 | 湖南金康光电有限公司 | Iris recognition method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130272570A1 (en) * | 2012-04-16 | 2013-10-17 | Qualcomm Incorporated | Robust and efficient learning object tracker |
US20160019420A1 (en) * | 2014-07-15 | 2016-01-21 | Qualcomm Incorporated | Multispectral eye analysis for identity authentication |
US20170046583A1 (en) * | 2015-08-10 | 2017-02-16 | Yoti Ltd | Liveness detection |
US20170061251A1 (en) * | 2015-08-28 | 2017-03-02 | Beijing Kuangshi Technology Co., Ltd. | Liveness detection method, liveness detection system, and liveness detection device |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5291560A (en) * | 1991-07-15 | 1994-03-01 | Iri Scan Incorporated | Biometric personal identification system based on iris analysis |
US6813380B1 (en) * | 2001-08-14 | 2004-11-02 | The United States Of America As Represented By The Secretary Of The Army | Method of determining hyperspectral line pairs for target detection |
US7336806B2 (en) * | 2004-03-22 | 2008-02-26 | Microsoft Corporation | Iris-based biometric identification |
US7450740B2 (en) * | 2005-09-28 | 2008-11-11 | Facedouble, Inc. | Image classification and information retrieval over wireless digital networks and the internet |
US8600174B2 (en) * | 2005-09-28 | 2013-12-03 | Facedouble, Inc. | Method and system for attaching a metatag to a digital image |
US8260008B2 (en) * | 2005-11-11 | 2012-09-04 | Eyelock, Inc. | Methods for performing biometric recognition of a human eye and corroboration of same |
US7801335B2 (en) * | 2005-11-11 | 2010-09-21 | Global Rainmakers Inc. | Apparatus and methods for detecting the presence of a human eye |
US20100202669A1 (en) * | 2007-09-24 | 2010-08-12 | University Of Notre Dame Du Lac | Iris recognition using consistency information |
US8600120B2 (en) * | 2008-01-03 | 2013-12-03 | Apple Inc. | Personal computing device control using face detection and recognition |
US8374404B2 (en) * | 2009-02-13 | 2013-02-12 | Raytheon Company | Iris recognition using hyper-spectral signatures |
US8364971B2 (en) * | 2009-02-26 | 2013-01-29 | Kynen Llc | User authentication system and method |
US20130278631A1 (en) * | 2010-02-28 | 2013-10-24 | Osterhout Group, Inc. | 3d positioning of augmented reality information |
US20160019421A1 (en) * | 2014-07-15 | 2016-01-21 | Qualcomm Incorporated | Multispectral eye analysis for identity authentication |
US20170091550A1 (en) * | 2014-07-15 | 2017-03-30 | Qualcomm Incorporated | Multispectral eye analysis for identity authentication |
WO2016013090A1 (en) * | 2014-07-24 | 2016-01-28 | 富士通株式会社 | Face authentication device, face authentication method, and face authentication program |
TWI553565B (en) * | 2014-09-22 | 2016-10-11 | 銘傳大學 | Utilizing two-dimensional image to estimate its three-dimensional face angle method, and its database establishment of face replacement and face image replacement method |
US9836591B2 (en) * | 2014-12-16 | 2017-12-05 | Qualcomm Incorporated | Managing latency and power in a heterogeneous distributed biometric authentication hardware |
US9424458B1 (en) * | 2015-02-06 | 2016-08-23 | Hoyos Labs Ip Ltd. | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
US9898835B2 (en) * | 2015-02-06 | 2018-02-20 | Ming Chuan University | Method for creating face replacement database |
CN112667069A (en) * | 2015-03-13 | 2021-04-16 | 苹果公司 | Method for automatically identifying at least one user of an eye tracking device and eye tracking device |
US10769255B2 (en) * | 2015-11-11 | 2020-09-08 | Samsung Electronics Co., Ltd. | Methods and apparatuses for adaptively updating enrollment database for user authentication |
EP3427185B1 (en) * | 2016-03-07 | 2024-07-31 | Magic Leap, Inc. | Blue light adjustment for biometric security |
CN107818305B (en) * | 2017-10-31 | 2020-09-22 | Oppo广东移动通信有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
US20210034864A1 (en) | 2021-02-04 |
US20190138807A1 (en) | 2019-05-09 |
US10176377B2 (en) | 2019-01-08 |
US11288504B2 (en) | 2022-03-29 |
US10810423B2 (en) | 2020-10-20 |
US20170124394A1 (en) | 2017-05-04 |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: FOTONATION LIMITED, IRELAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:THAVALENGAL, SHEJIN;REEL/FRAME:060009/0791 Effective date: 20161028 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
AS | Assignment | Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA Free format text: SECURITY INTEREST;ASSIGNORS:ADEIA GUIDES INC.;ADEIA IMAGING LLC;ADEIA MEDIA HOLDINGS LLC;AND OTHERS;REEL/FRAME:063529/0272 Effective date: 20230501 |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
AS | Assignment | Owner name: XPERI HOLDING CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:XPERI PRODUCT SPINCO CORPORATION;REEL/FRAME:066226/0749 Effective date: 20220926 Owner name: XPERI PRODUCT SPINCO CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:FOTONATION LIMITED;REEL/FRAME:066226/0640 Effective date: 20220926 Owner name: ADEIA IMAGING LLC, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:XPERI HOLDING CORPORATION;REEL/FRAME:066237/0375 Effective date: 20220926 |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |