WO2022171970A1 - Dispositif et procede de traitement de donnees videos pour detection du vivant - Google Patents
- Publication number: WO2022171970A1 (application PCT/FR2022/050271)
- Authority: WIPO (PCT)
- Prior art keywords: video data, neural network, interest, human presence, signal
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/162—Detection; Localisation; Normalisation using pixel segmentation or colour matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/42—Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
- G06V10/431—Frequency domain transformation; Autocorrelation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/446—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering using Haar-like filters, e.g. using integral image techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
- G06V40/45—Detection of the body part being alive
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/15—Biometric patterns based on physiological signals, e.g. heartbeat, blood flow
Definitions
- the invention relates to the field of human presence detection, also called liveness detection.
- liveness detection is a booming field, and generally aims to verify that a person appearing in a video is a person actually being filmed, and not a spoof in one of the forms mentioned above, or in another form.
- document US 2016/0371555 describes a method comprising an acoustic analysis, a pulse presence measurement from video data, and a comparison between the pulse presence measurement from video data and a physical measurement by the user wishing to be authenticated/identified.
- the invention improves the situation.
- a device for analyzing video data comprising: a first analyzer arranged to perform a remote photoplethysmography measurement on video data to be analyzed received as input, comprising a separator arranged to determine regions of interest in the video data to be analyzed, an aggregator arranged to determine a remote photoplethysmography signal from the video data to be analyzed relating to each region of interest, and a calculator arranged to calculate a spectral signal from the photoplethysmography signal and to derive one or more physiological signals therefrom; a tester arranged to receive said one or more physiological signals and one or more of the photoplethysmography signal and said spectral signal, and to return a first human presence value; and a second analyzer arranged to receive the video data to be analyzed and to apply a neural network to it in order to derive a second human presence value, the neural network being trained on video data similar to the video data to be analyzed and on sets of characteristics extracted from this video data.
- This device is particularly advantageous because it allows liveness detection in a way that is non-intrusive yet extremely reliable and robust against known spoofing methods. It is based exclusively on the analysis of video data, which minimizes its intrusive aspect, without sacrificing reliability. The first human presence value makes it possible to increase the signal-to-noise ratio compared to known remote photoplethysmography measurements used in liveness detection, while protecting against a 3D mask attack, whether partial or not. Simultaneously, the second human presence value helps protect against common video replay and other attacks.
- the device may have one or more of the following characteristics:
- the separator is arranged to apply one or more of the group comprising the Haar cascade method and a deep neural network to determine the contours of the face in each frame of the video data, and to cut these frames into regions of interest,
- the separator is arranged to cut out the video data in which the contours of the face have been determined, by colorimetric analysis and/or from the recognition of characteristic points of the face,
- the aggregator is arranged to determine a remote photoplethysmography signal, for each frame, from the average of the respective R, G, B components of the video data of each region of interest,
- the aggregator is further arranged to determine a remote photoplethysmography signal from normalization and infinite or finite impulse response band-pass filtering applied to the average of the respective R, G, B components of the video data of each region of interest,
- the aggregator is also arranged to determine a remote photoplethysmography signal from the combination of the signals drawn from the respective R, G, B components of the video data of each region of interest,
- the calculator is arranged to receive a remote photoplethysmography signal and to derive one or more physiological signals therefrom by applying a Welch algorithm or a fast Fourier transform, drawing one or more spectra, and determining one or more physiological data chosen from a group comprising heart rate, respiratory rate, and heart rate variability,
- the tester comprises a neural network which has been trained with a database of videos labeled to indicate a human presence or not, the data provided to the input layer of this neural network being formed by the physiological data signal determined for each of these videos,
- the second analyzer comprises, on the one hand, a neural network of the LSTM type which receives as input face characteristics extracted from the video data by applying an LBP type extraction and/or a SURF type extraction, and which is trained with a base of video data labeled to indicate a human presence or not, and, on the other hand, a deep neural network based on the MobileNetV3 or ResNeXt architecture comprising at its output a dense layer of neurons normalized by a layer applying the Softmax function, the cost function possibly mixing cross-entropy loss, focal loss, label smoothing and maximum entropy loss, and optionally one or more auxiliary cost functions based on a depth map, the rPPG signal, attributes relating to the video quality, skin color attributes, and device type attributes,
- the unifier is arranged to perform one operation among: a weighted product of the input values, the application of logistic regression models, a min/max/average type combination, or a random forest algorithm.
- the invention also relates to a device for analyzing video data, comprising:
- an analyzer arranged to receive the video data and to apply a neural network thereto in order to derive deep characteristics therefrom, the neural network being trained on video data similar to the video data to be analyzed and feature sets extracted from this video data obtained by local analysis and/or machine learning,
- a separator arranged to determine regions of interest in the video data to be analyzed and to extract characteristics from the regions of interest (127), coupled to a neural network arranged to extract face characteristics,
- an aggregator arranged to determine a remote photoplethysmography signal from the video data to be analyzed relating to each region of interest and coupled to a neural network arranged to extract remote photoplethysmography characteristics
- a calculator arranged to calculate a remote photoplethysmography score from data from the aggregator or separator
- an analyzer arranged to calculate a brightness score by image processing which analyzes the brightness of the video data, looking for a colorimetric drift in order to characterize the probability that the video data has been refilmed,
- a unifier arranged to receive the feature map score, the photoplethysmography score and the luminosity score, and to return a unified human presence value.
- the invention also relates to a computer-implemented video data processing method, comprising receiving video data, processing it with the device according to the invention, and returning a unified human presence value; to a computer program comprising instructions for implementing the device according to the invention; and to a storage medium on which this computer program is recorded.
- the invention finally relates to a computer program product comprising instructions for implementing the method when it is executed on a computer, and to a storage medium on which the computer program product is recorded.
- Figure 1 shows a schematic example of a device according to the invention.
- Figure 2 shows a schematic example of the first analyzer of Figure 1.
- Figure 3 shows an alternative embodiment of the device of Figure 1.
- Figure 1 shows a schematic example of implementation of the invention.
- the device 2 comprises a memory 4, a first analyzer 6, a tester 8, a second analyzer 10 and a unifier 12.
- the memory 4 can be any type of data storage capable of receiving digital data: hard disk, hard disk with flash memory, flash memory in any form, random access memory, magnetic disk, storage distributed locally or in the cloud, etc.
- the data calculated by the device can be stored on any type of memory similar to memory 4, or on the latter. This data can be erased after the device has performed its tasks or retained.
- the memory 4 receives all the data necessary for the implementation of the device 2. These data are of several kinds. They may include parameters and/or sets of parameters to implement the device 2 or one of the elements it comprises, video data to be analyzed and optionally video data that can be used to drive one of the elements comprising device 2.
- the first analyzer 6, the tester 8, the second analyzer 10 and the unifier 12 are elements directly or indirectly accessing the memory 4. They can be made in the form of appropriate computer code executed on one or more processors. By processor is meant any processor suitable for the calculations described below. Such a processor can be produced in any known way: as a microprocessor for a personal computer, a dedicated chip of the FPGA or SoC type, a computing resource on a grid or in the cloud, a microcontroller, or any other form capable of providing the computing power necessary for the implementation described below. One or more of these elements can also be made in the form of specialized electronic circuits such as an ASIC. A combination of processor and electronic circuits can also be envisaged.
- the function of the first analyzer 6 is to receive video data to be analyzed, and to process it to carry out all or part of a remote photoplethysmography measurement (or rPPG measurement, for "remote photoplethysmography") and return data that can be processed by the tester 8.
- the tester 8, for its part, has the role of processing the data from the first analyzer 6 in order to return a first human presence value which qualifies the liveness detection by rPPG measurement.
- the first analyzer 6 and the tester 8 could be seen as one and the same unit.
- remote photoplethysmography is an optical measurement technique from a video stream allowing access to a cardiac signal by measuring changes in blood volume in the tissues.
- Figure 2 shows an exemplary embodiment of the first analyzer 6. As can be seen in this figure, it comprises a separator 20, an aggregator 22, and a calculator 24. Since these are elements of the first analyzer 6, the paragraph above concerning their possible implementations applies to them identically.
- Figure 2 also makes it possible to better understand the operations executed by the first analyzer 6.
- video data 25 received at the input of the device 2, and possibly stored in the memory 4 at least temporarily, are transmitted to the separator 20.
- the separator 20 is arranged to determine regions of interest in the video data 25.
- the video data contains the faces of the users seeking to be authenticated.
- the separator 20 applies conventional algorithms such as the Haar cascade method or a deep neural network (DNN) such as retinaface_mnet025_v2 or res10_300x300_ssd_iter_140000, in order to initially determine the contours of the face in each frame of the video data 25, then cuts the face into several regions identified again in each frame, in particular by detecting the variations in facial skin.
- Skin detection can be performed by colorimetric analysis (from the probability that a pixel color is skin, obtained using one of several possible methods), from the recognition of characteristic points of the face (eyes, nose, contours, etc.), or by combining the two (extending the color of a particular area, the nose for example, and subtracting the eyes and mouth).
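To illustrate the colorimetric approach, a minimal sketch of a skin-probability rule in RGB follows. The patent leaves the exact colorimetric method open, so the thresholds below (a classic "R dominates G and B" heuristic) are illustrative assumptions, not values from the source.

```python
import numpy as np

def skin_mask(frame):
    """Crude colorimetric skin detection on an RGB frame of shape (H, W, 3).

    Returns a boolean mask. The thresholds are an illustrative heuristic
    (skin pixels tend to have a dominant red component); a real system
    would use a trained per-pixel skin-probability model.
    """
    r = frame[..., 0].astype(int)
    g = frame[..., 1].astype(int)
    b = frame[..., 2].astype(int)
    return (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & ((r - g) > 15)
```

Such a mask would then be refined using the characteristic points of the face, as the text describes (e.g. growing from the nose area and subtracting the eyes and mouth).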
- the result is a set of region-of-interest data 27, each of which contains the portion of the video data 25 relating to a particular region of interest identified by the separator 20.
- the aggregator 22 works on each of the region-of-interest data 27 in order to prepare them to derive an rPPG signal therefrom.
- the aggregator 22 performs one or more of the following operations, as listed in the characteristics above: averaging of the respective R, G, B components of each region of interest per frame, normalization, infinite or finite impulse response band-pass filtering, and combination of the signals drawn from the respective components. The result is a set of rPPG measurement signals 28.
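The aggregation steps named in the claims (per-frame averaging of color components, normalization, band-pass filtering) can be sketched as follows. For brevity this sketch uses only the green channel, and the 0.7-4 Hz cardiac band and order-3 Butterworth filter are illustrative assumptions; the claims also cover combining the R, G, B signals.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def rppg_signal(roi_frames, fps=30.0):
    """Derive a raw rPPG signal from the frames of one region of interest.

    roi_frames: array (T, H, W, 3) of RGB frames for a single ROI.
    """
    # per-frame spatial average of the green component
    g = roi_frames[..., 1].reshape(roi_frames.shape[0], -1).mean(axis=1)
    # normalization (zero mean, unit variance)
    g = (g - g.mean()) / (g.std() + 1e-9)
    # infinite impulse response band-pass filtering in a plausible cardiac band
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    return filtfilt(b, a, g)
```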
- the calculator 24 is arranged to receive all the rPPG measurement signals 28 and to derive one or more spectra therefrom by applying the Welch algorithm or a fast Fourier transform (FFT), and to determine one or more physiological data, such as heart rate, respiratory rate, or HRV (heart rate variability).
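A minimal sketch of the heart-rate estimation via the Welch algorithm follows; the 40-240 bpm search band is an assumption introduced for illustration.

```python
import numpy as np
from scipy.signal import welch

def heart_rate_bpm(rppg, fps=30.0):
    """Estimate heart rate from an rPPG signal via the Welch periodogram.

    Takes the spectral peak inside a plausible cardiac band (40-240 bpm);
    the band limits are illustrative, not taken from the patent.
    """
    freqs, psd = welch(rppg, fs=fps, nperseg=min(256, len(rppg)))
    band = (freqs >= 40 / 60) & (freqs <= 240 / 60)
    return 60.0 * freqs[band][np.argmax(psd[band])]
```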
- the output of the calculator 24 is a physiological data signal 29 which is transmitted to the tester 8 in order to calculate a first human presence value.
- the tester 8 is implemented by means of a neural network which has been trained with a database of videos labeled "spoofing" or "alive", and for which the data provided to the input layer are formed by the physiological data signal 29 determined for each of these videos.
- This neural network can be a model that works on the spectrum (one-dimensional or two-dimensional CNN), or a model that works on the spatio-temporal signals coming from each of the previously determined sub-zones, each sub-zone providing either a mixed temporal signal, or three R, G, B signals, or six R, G, B, Y, U, V signals.
- the architecture of this neural network is inspired by the ResNet 18 model (18 layers) (https://arxiv.org/pdf/1512.03385.pdf).
- the loss function estimates the error on the heart rate (mean absolute error, MAE, or root mean squared error, RMSE).
- the first human presence detection value at the output of the tester 8 can be a score between two extrema, one of which is associated with spoofing and the other with liveness detection.
- the output may be a Boolean indicating either spoofing or liveness.
- the tester 8 could alternatively be implemented by means of a "classic" algorithm, which processes the physiological data signal 29 to calculate a score for the corresponding video data to be analyzed 25.
- such a score can be between two extrema, one of which is associated with spoofing and the other with liveness detection.
- the output can be a Boolean indicating either spoofing or live detection. For example, each time the models are updated, a test dataset can be used to define a threshold such that in the test dataset all attacks are detected (i.e. the case where a video to be analyzed does not correspond to the presence of a person).
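The threshold-setting step described above can be sketched as follows, assuming higher scores mean "alive"; the score orientation and the margin are hypothetical parameters introduced for illustration.

```python
import numpy as np

def attack_rejection_threshold(scores, labels, margin=1e-6):
    """Pick a decision threshold so every attack in the test set is rejected.

    scores: liveness scores (here assumed higher = more likely alive);
    labels: 1 for genuine ("alive") videos, 0 for attacks.
    Returns the smallest threshold strictly above all attack scores;
    a video is then declared live only if its score exceeds the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    return scores[labels == 0].max() + margin
```

As the text notes, this threshold would be recomputed on a test dataset each time the models are updated.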
- the function of the second analyzer 10 is to receive the video data 25 to be analyzed, and to analyze it by performing an extraction of characteristics making it possible to determine whether it is video data captured from a real 3D scene or a video of a 2D image (therefore typically a spoof).
- the second analyzer 10 implements an extraction of face data to isolate this data in the video data 25, similar to what is done in the first analyzer 6, followed by the determination of both so-called "classic" characteristics in the face data and characteristics resulting from deep learning in the video data 25.
- the classic characteristics can be obtained by implementing an LBP (Local Binary Pattern) type extraction.
- the "local binary patterns” type characteristics encode the distribution of the binary differences of each of the pixels compared to its neighboring pixels.
- the final representation drawn from it is then a discrete distribution (histogram), which allows the use of a machine learning model of the random forest or SVM (Support Vector Machine) type.
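A minimal pure-NumPy sketch of the LBP histogram described above; real systems typically use an optimized implementation (e.g. scikit-image), and the basic 8-neighbour, 256-bin variant shown here is just one assumed form.

```python
import numpy as np

def lbp_histogram(img):
    """Compute a basic 8-neighbour LBP code per pixel and its histogram.

    img: 2D grayscale array. Each interior pixel gets an 8-bit code whose
    bits encode whether each neighbour is >= the centre pixel; the result
    is a normalized discrete distribution over the 256 possible patterns.
    """
    c = img[1:-1, 1:-1]
    # 8 neighbours, ordered clockwise from top-left
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neigh = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= (neigh >= c).astype(np.uint8) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

The resulting 256-bin histogram is exactly the kind of discrete distribution the text says can feed a random forest or SVM classifier.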
- the classic characteristics can also be obtained by a SURF (Speeded Up Robust Features) type extraction, which encodes points of interest (orientation, intensity) at different places in the image, thus making it possible to obtain a robust representation.
- the selected points of interest can be those identified for a face.
- This extraction is particularly interesting because the Applicant's research has revealed that the reflections induced by the 2D nature of spoofs tend to generate noisy points of interest that are not localized in the expected places (eyes, mouth, etc.), contrary to what happens in "real" videos.
- the classical characteristics obtained can be further enriched, for example with characteristics resulting from temporal correlations between different areas of the face (example: division into 25 areas).
- the conventional characteristics are then used by a neural network of the LSTM (Long Short Term Memory) type to determine a first score for the second human presence detection value.
- the training of this neural network can be based on the use of a cross-entropy type cost function.
- the work of the Applicant has shown that this type of neural network is more efficient than models of the random forest/gradient-boosting/SVM type because it makes it possible to learn the dependencies between the frames of the same video.
- the Softmax function is a function that applies logistic regression across multiple classes to assign decimal probabilities to each class of a multi-class problem, the sum of the probabilities being equal to 1; here its input is the average of the characteristics of the frames of the video data to be analyzed.
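The Softmax normalization just described can be sketched in a few lines (a standard numerically stable form, not code from the patent):

```python
import numpy as np

def softmax(x):
    """Map a vector of class scores to probabilities that sum to 1.

    Subtracting the max before exponentiating is the usual trick to
    avoid overflow; it does not change the result.
    """
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()
```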
- the second analyzer 10 can then return on the one hand the value returned for the classic characteristics and on the other hand the value returned for the characteristics resulting from deep learning or a combination of the two.
- the second human presence detection value can be a pair or a composition of these values.
- the unifier 12 performs one of: a weighted product of the input values, the application of logistic regression models, a min/max/average type combination, or a random forest algorithm.
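The simpler fusion operations named above (weighted product, min/max/average) can be sketched as follows; logistic regression and random forest fusion would require a trained model and are omitted. The weights and mode names are illustrative.

```python
import numpy as np

def fuse_scores(scores, weights=None, mode="weighted_product"):
    """Fuse per-analyzer human-presence scores into one unified value."""
    s = np.asarray(scores, dtype=float)
    if mode == "weighted_product":
        w = np.ones_like(s) if weights is None else np.asarray(weights, dtype=float)
        return float(np.prod(s ** w))
    if mode == "min":
        return float(s.min())
    if mode == "max":
        return float(s.max())
    if mode == "average":
        return float(s.mean())
    raise ValueError(f"unknown mode: {mode}")
```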
- the returned result is a unified human presence value.
- Figure 3 represents an example of another embodiment of the device of Figure 1, in which the device is designed as the aggregation of several neural networks whose purpose is to deduce characteristics of the video signals allowing the unifier 12 to return a score.
- the second analyzer 10 is used to produce a set 100 of 512 characteristics and the separator 20 is used on the one hand to feed a neural network 30 of the RhythmNet type (https://arxiv.org/pdf/1910.11515.pdf) to extract another set 300 of 512 characteristics, and on the other hand to define a set 127 comprising 128 characteristics drawn from the region-of-interest data 27.
- the neural network 30 can be replaced by a model of the ResNext 18 type.
- the aggregator 22 is used to supply a correlator 32 which determines a set 320 of 256 characteristics from the correlations between the complete rPPG signals.
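One plausible way to derive features from the correlations between the per-region rPPG signals is the upper triangle of their correlation matrix; this layout (and projecting it to a fixed size such as the 256 characteristics mentioned) is an assumption, as the patent does not detail the correlator 32.

```python
import numpy as np

def correlation_features(signals):
    """Feature vector from pairwise correlations between rPPG signals.

    signals: array (n_regions, T), one rPPG signal per region of interest.
    Returns the upper triangle (excluding the diagonal) of the Pearson
    correlation matrix, i.e. one value per pair of regions.
    """
    c = np.corrcoef(signals)
    iu = np.triu_indices(c.shape[0], k=1)
    return c[iu]
```

The intuition is that on a live face the regions pulse coherently, so genuine videos should show strong inter-region correlations.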
- Feature set 100, feature set 127, feature set 300 and feature set 320 together form a feature map 33 which is processed by a dense neural layer normalized by a layer applying the Softmax function 34, which returns a feature map score to unifier 12.
- the device 2 further comprises:
- an optional analyzer 36 which comprises a neural network which analyzes the Moiré of the video in order to characterize the probability that the video has been refilmed, and which produces a Moiré score 360,
- an analyzer 38 which includes conventional image processing that analyzes the luminosity of the video, looking for a colorimetric drift in order to characterize the probability that the video has been refilmed, and which produces a luminosity score 380,
- an optional analyzer 40 which includes a neural network which analyzes the blur of the video in order to characterize the probability that the video has been refilmed, and which produces a blur score 400.
- the Moiré score 360, the luminosity score 380 and the blur score 400 are also sent to the unifier 12, along with an rPPG score 80 which can come from the tester 8 or the neural network 30.
- the unifier 12 operates in a manner similar to that of Figure 1, and processes the set of scores transmitted to it to return a unified human presence value.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22708998.4A EP4292013A1 (fr) | 2021-02-15 | 2022-02-15 | Dispositif et procede de traitement de donnees videos pour detection du vivant |
CA3207705A CA3207705A1 (fr) | 2021-02-15 | 2022-02-15 | Dispositif et procede de traitement de donnees videos pour detection du vivant |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR2101447A FR3119915B1 (fr) | 2021-02-15 | 2021-02-15 | Dispositif et procédé de traitement de données vidéos pour détection du vivant |
FRFR2101447 | 2021-02-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022171970A1 (fr) | 2022-08-18 |
Family
ID=77021375
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FR2022/050271 WO2022171970A1 (fr) | 2021-02-15 | 2022-02-15 | Dispositif et procede de traitement de donnees videos pour detection du vivant |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP4292013A1 (fr) |
CA (1) | CA3207705A1 (fr) |
FR (1) | FR3119915B1 (fr) |
WO (1) | WO2022171970A1 (fr) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160371555A1 (en) | 2015-06-16 | 2016-12-22 | EyeVerify Inc. | Systems and methods for spoof detection and liveness analysis |
- 2021
  - 2021-02-15 FR FR2101447A patent/FR3119915B1/fr active Active
- 2022
  - 2022-02-15 EP EP22708998.4A patent/EP4292013A1/fr active Pending
  - 2022-02-15 CA CA3207705A patent/CA3207705A1/fr active Pending
  - 2022-02-15 WO PCT/FR2022/050271 patent/WO2022171970A1/fr active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160371555A1 (en) | 2015-06-16 | 2016-12-22 | EyeVerify Inc. | Systems and methods for spoof detection and liveness analysis |
Non-Patent Citations (11)
Title |
---|
ATOUM YOUSEF ET AL: "Face anti-spoofing using patch and depth-based CNNs", 2017 IEEE INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS (IJCB), 1 October 2017 (2017-10-01), pages 319 - 328, XP055783469, ISBN: 978-1-5386-1124-1, DOI: 10.1109/BTAS.2017.8272713 * |
DE HAAN GERARD ET AL: "Robust Pulse Rate From Chrominance-Based rPPG", IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, IEEE, USA, vol. 60, no. 10, 1 October 2013 (2013-10-01), pages 2878 - 2886, XP011526965, ISSN: 0018-9294, [retrieved on 20130916], DOI: 10.1109/TBME.2013.2266196 * |
HERNANDEZ-ORTEGA JAVIER ET AL: "Continuous Presentation Attack Detection in Face Biometrics Based on Heart Rate", 19 January 2019, ADVANCES IN DATABASES AND INFORMATION SYSTEMS; [LECTURE NOTES IN COMPUTER SCIENCE; LECT.NOTES COMPUTER], SPRINGER INTERNATIONAL PUBLISHING, CHAM, PAGE(S) 72 - 86, ISBN: 978-3-319-10403-4, XP047500870 * |
LI XIAOBAI ET AL: "Remote Heart Rate Measurement from Face Videos under Realistic Situations", 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, IEEE, 23 June 2014 (2014-06-23), pages 4264 - 4271, XP032649361, DOI: 10.1109/CVPR.2014.543 * |
LIU YAOJIE ET AL: "Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, IEEE, 18 June 2018 (2018-06-18), pages 389 - 398, XP033475999, DOI: 10.1109/CVPR.2018.00048 * |
NIU XUESONG ET AL: "SynRhythm: Learning a Deep Heart Rate Estimator from General to Specific", 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), IEEE, 20 August 2018 (2018-08-20), pages 3580 - 3585, XP033459951, DOI: 10.1109/ICPR.2018.8546321 * |
RINKU DATTA RAKSHIT ET AL: "Face Spoofing and Counter-Spoofing: A Survey of State-of-the-art", TRANSACTIONS ON MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE, VOL. 5, NO. 2, 9 May 2017 (2017-05-09), pages 31 - 73, XP055559503, Retrieved from the Internet <URL:http://sseuk.org/index.php/TMLAI/article/view/3130> [retrieved on 20190220], DOI: 10.14738/tmlai.52.3130 * |
SONG RENCHENG ET AL: "Heart Rate Estimation From Facial Videos Using a Spatiotemporal Representation With Convolutional Neural Networks", IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, IEEE, USA, vol. 69, no. 10, 30 March 2020 (2020-03-30), pages 7411 - 7421, XP011808817, ISSN: 0018-9456, [retrieved on 20200914], DOI: 10.1109/TIM.2020.2984168 * |
SUMAN SAHA ET AL: "Domain Agnostic Feature Learning for Image and Video Based Face Anti-spoofing", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 15 December 2019 (2019-12-15), XP081642335 * |
XU ZHENQI ET AL: "Learning temporal features using LSTM-CNN architecture for face anti-spoofing", 2015 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), IEEE, 3 November 2015 (2015-11-03), pages 141 - 145, XP032910078, DOI: 10.1109/ACPR.2015.7486482 * |
ZUHENG MING ET AL: "A Survey On Anti-Spoofing Methods For Face Recognition with RGB Cameras of Generic Consumer Devices", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 8 October 2020 (2020-10-08), XP081781682 * |
Also Published As
Publication number | Publication date |
---|---|
CA3207705A1 (fr) | 2022-08-18 |
EP4292013A1 (fr) | 2023-12-20 |
FR3119915B1 (fr) | 2024-01-19 |
FR3119915A1 (fr) | 2022-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | Fuzzified image enhancement for deep learning in iris recognition | |
FR2884007A1 (fr) | Method for identifying faces from face images, and corresponding device and computer program | |
Malu et al. | Learning photography aesthetics with deep cnns | |
EP2901370B1 (fr) | Method for detecting a real face | |
EP3018615B1 (fr) | Improved data comparison method | |
WO2013098512A1 (fr) | Method and device for detecting and quantifying skin signs on an area of skin | |
Thavalengal et al. | Iris liveness detection for next generation smartphones | |
EP3582141B1 (fr) | Method for learning the parameters of a convolutional neural network | |
FR3073311A1 (fr) | Method for estimating the pose of a camera in the frame of reference of a three-dimensional scene, and associated device, augmented-reality system and computer program | |
FR3087558A1 (fr) | Method for extracting features from a fingerprint represented by an input image | |
FR3103938A1 (fr) | Method for detecting at least one element of interest visible in an input image by means of a convolutional neural network | |
Kotwal et al. | Multispectral deep embeddings as a countermeasure to custom silicone mask presentation attacks | |
Malgheet et al. | Iris recognition development techniques: a comprehensive review | |
Jiang et al. | Face anti-spoofing with generated near-infrared images | |
WO2012001289A1 (fr) | Method and device for detecting and quantifying skin signs on an area of skin | |
Proenca | Iris recognition: What is beyond bit fragility? | |
EP3620970A1 (fr) | Method for extracting features from a fingerprint represented by an input image | |
FR3100074A1 (fr) | Method for analysing a facial feature of a face | |
EP3633544B1 (fr) | Method for associating elements of interest visible in a video | |
Crisan et al. | Low cost, high quality vein pattern recognition device with liveness Detection. Workflow and implementations | |
FR3102600A1 (fr) | Method for segmenting an input image representing at least one biometric print by means of a convolutional neural network | |
FR2979727A1 (fr) | Identification by iris recognition | |
Proença | Iris recognition in the visible wavelength | |
WO2022171970A1 (fr) | Dispositif et procede de traitement de donnees videos pour detection du vivant | |
George et al. | A comprehensive evaluation on multi-channel biometric face presentation attack detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22708998; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 3207705; Country of ref document: CA |
| WWE | Wipo information: entry into national phase | Ref document number: 2022708998; Country of ref document: EP |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2022708998; Country of ref document: EP; Effective date: 20230915 |