WO2016015170A1 - A method for face recognition and a system thereof - Google Patents

A method for face recognition and a system thereof

Info

Publication number
WO2016015170A1
WO2016015170A1 · PCT/CN2014/000716
Authority
WO
WIPO (PCT)
Prior art keywords
view
face
features
generated
image
Prior art date
Application number
PCT/CN2014/000716
Other languages
English (en)
French (fr)
Inventor
Xiaoou Tang
Xiaogang Wang
Zhenyao ZHU
Ping Luo
Original Assignee
Xiaoou Tang
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaoou Tang filed Critical Xiaoou Tang
Priority to PCT/CN2014/000716 priority Critical patent/WO2016015170A1/en
Priority to CN201480080815.3A priority patent/CN106663186B/zh
Publication of WO2016015170A1 publication Critical patent/WO2016015170A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1914Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries, e.g. user dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/192Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194References adjustable by an adaptive method, e.g. learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/11Technique with transformation invariance effect

Definitions

  • the present application relates to a method for face recognition and a system thereof.
  • Deep neural networks are inspired by the understanding of the hierarchical cortex in the human brain and mimic some aspects of its activity. Humans not only can recognize identity, but can also imagine face images of a person under different viewpoints, making face recognition in the human brain robust to view changes. In some sense, the human brain can infer a 3D model from a 2D face image, even without actually perceiving 3D data.
  • a method for multi-view perceptron comprising:
  • a multi-view perceptron system comprising:
  • an identity feature determination unit configured to determine a plurality of identity features for an input face image in a given view point of the image
  • a view representation capture unit configured to capture a view representation of the input face image
  • a feature combination unit configured to yield one or more features for face recovery from the determined identity features and the view representation
  • a recovery unit configured to generate a face image from the yielded features for face recovery, and then unite the generated face image and the view representation into a view label of the generated face image.
  • the identity feature determination unit, together with the view representation capture unit, the feature combination unit and the recovery unit, may be coupled to form a neural network that mimics a biological neural network.
  • the parameters of the neural network, i.e. its weights and biases, may be determined by maximizing a lower-bound of a probability distribution formed from the generated face image and the view representation, in view of the view labels of the input face image.
  • Fig. 1 is a schematic diagram illustrating a system for face recognition consistent with one disclosed embodiment.
  • Fig. 2 is a schematic diagram illustrating a neural network simulated for the system for face recognition according to one embodiment of the present application.
  • Fig. 3 is a schematic flowchart illustrating face recognition consistent with some disclosed embodiments of the present application.
  • Fig. 4 is a schematic flowchart illustrating a training process for the neural networks consistent with some disclosed embodiments of the present application.
  • Fig. 5 is a schematic diagram illustrating a system for face recognition consistent with another disclosed embodiment of the present application.
  • Fig. 6 is a schematic flowchart illustrating a face test procedure consistent with some disclosed embodiments of the present application.
  • Fig. 1 is a schematic diagram illustrating an exemplary multi-view perceptron system 100 according to one embodiment of the present application.
  • the multi-view perceptron system 100 receives face images as training data, which may be denoted as {x_ij, (y_ik, v_ik)}, i = 1, …, I, j = 1, …, J, k = 1, …, K, where:
  • x_ij is the input image of the i-th identity under the j-th viewpoint;
  • y_ik denotes the output image of the same identity in the k-th viewpoint; and
  • v_ik is the view label of the output, which may be an M-dimensional binary vector with the k-th element set to 1 and the remaining elements zero.
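  • By way of a hedged illustration (not part of the patent text), a training tuple and the binary view label described above may be laid out as follows; the image size and the number of views M are assumed values:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot_view_label(k: int, num_views: int) -> np.ndarray:
    """The M-dimensional binary view label v_ik: k-th element 1, the rest 0."""
    v = np.zeros(num_views)
    v[k] = 1.0
    return v

M = 7                                  # assumed number of discrete viewpoints
x_ij = rng.random(32 * 32)             # flattened input image: identity i, viewpoint j
y_ik = rng.random(32 * 32)             # flattened output image: identity i, viewpoint k
v_ik = one_hot_view_label(3, M)        # view label of the output (k = 3 here)
```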
  • system 100 may be implemented using certain hardware, software, or a combination thereof.
  • embodiments of the present invention may be embodied in a computer program product on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical memory and the like) containing computer program code.
  • the system 100 may include a general-purpose computer, a computer cluster, a mainstream computer, a computing device dedicated to providing online content, or a computer network comprising a group of computers operating in a centralized or distributed fashion.
  • the system 100 may comprise a deterministic unit (neurons) 10 configured to learn the identity features h^id for an input face image x in a given arbitrary view, and a random unit (neurons) 20 configured to capture a view representation h^v of the input face image x.
  • the view representation h^v is naturally coupled with many types of face variations, such as viewpoints (angles of view), illuminations, and facial expressions.
  • the identity feature determination unit 10 operates to determine a plurality of identity features for an input face image in a given viewpoint (angle of view) of the image.
  • the identity feature determination unit 10 may generate a first plurality of identity features h_1^id from the input face image in accordance with an activation function, i.e. the sigmoid function σ(·), and then generate a second plurality of identity features h_2^id based on the generated first identity features h_1^id.
  • h_1^id = σ(U_0 x) Formula 1)
  • h_2^id = σ(U_1 h_1^id) Formula 2)
  • where σ(·) is the sigmoid function, σ(x) = 1/(1 + e^(−x)).
  • U_0 and U_1 are predetermined weight matrices, whose values may be numbers ranging from 0 to 1, as will be discussed later.
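  • As a hedged, minimal sketch of Formulas 1) and 2) (the layer sizes, the [0, 1] random initialization and all helper names are assumptions for illustration, not the patent's implementation):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
D, H1, H2 = 32 * 32, 512, 512                 # illustrative input and layer sizes
U0 = rng.uniform(0, 1, size=(H1, D))          # predetermined weights in [0, 1]
U1 = rng.uniform(0, 1, size=(H2, H1))

def identity_features(x):
    """Formula 1): h1_id = sigmoid(U0 x); Formula 2): h2_id = sigmoid(U1 h1_id)."""
    h1_id = sigmoid(U0 @ x)
    h2_id = sigmoid(U1 @ h1_id)
    return h1_id, h2_id

h1_id, h2_id = identity_features(rng.random(D))   # a flattened input face image
```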
  • the multi-view perceptron system 100 further comprises a feature combination unit 30 configured to yield one or more features for face recovery from the determined identity features and the view representation.
  • the feature combination unit 30 may combine the generated second identity features h_2^id with the generated view representation h^v to yield one or more third features h_3 for face recovery, and then generate one or more fourth features h_4 for face recovery from the yielded third features h_3.
  • the third and the fourth features h_3 and h_4 for face recovery may be determined by rule of
  • h_3 = σ(U_2 h_2^id + V_2 h_2^v) Formula 3)
  • h_4 = σ(U_3 h_3 + V_3 h_3^v) Formula 4)
  • where h_2^v and h_3^v denote the (random) view representation neurons, h_3^v being generated from h_2^v through the weight W_2 as discussed below.
  • the multi-view perceptron system 100 may further comprise a recovery unit 40 configured to generate a face image y from the generated recovery features h_4, and then unite the generated face image y and the view representation h^v into a view label of the generated face image.
  • the viewpoint of the face image y and the view label v may be determined by rule of
  • y = σ(U_4 h_4) Formula 5)
  • v = F(W_3 h_3^v) Formula 6)
  • where F(·) maps the view neurons to the M-dimensional view label (e.g. a softmax-like normalization).
  • U_4 and W_3 are predetermined weights, which may be numbers ranging from 0 to 1.
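  • Extending the sketch to Formulas 3) to 6), the full forward pass may look as below; the wiring follows the parameter list U_0–U_4, V_2, V_3, W_2, W_3 given in this description, while the softmax on the view label and all sizes are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
D, H, HV, M = 32 * 32, 512, 64, 7                # illustrative sizes; M = number of view labels
U0, U1 = rng.uniform(0, 1, (H, D)), rng.uniform(0, 1, (H, H))
U2, U3, U4 = rng.uniform(0, 1, (H, H)), rng.uniform(0, 1, (H, H)), rng.uniform(0, 1, (D, H))
V2, V3 = rng.uniform(0, 1, (H, HV)), rng.uniform(0, 1, (H, HV))
W2, W3 = rng.uniform(0, 1, (HV, HV)), rng.uniform(0, 1, (M, HV))

def forward(x, h2_v):
    """Forward pass of the sketch, Formulas 1) to 6)."""
    h1_id = sigmoid(U0 @ x)                      # Formula 1)
    h2_id = sigmoid(U1 @ h1_id)                  # Formula 2)
    h3_v = sigmoid(W2 @ h2_v)                    # view path: h3_v from h2_v through W2
    h3 = sigmoid(U2 @ h2_id + V2 @ h2_v)         # Formula 3)
    h4 = sigmoid(U3 @ h3 + V3 @ h3_v)            # Formula 4)
    y = sigmoid(U4 @ h4)                         # Formula 5): recovered face image
    v = softmax(W3 @ h3_v)                       # Formula 6): softmax here is an assumption
    return y, v

y, v = forward(rng.random(D), rng.uniform(0, 1, HV))   # h2_v sampled as h_v ~ U(0, 1)
```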
  • the system 100 may be implemented as a network which mimics a biological neural network and is formed by a plurality of artificial nodes, known as "neurons" or "units", which are connected together.
  • an artificial neuron is a mathematical function conceived as a model of biological neurons.
  • the artificial neuron receives one or more inputs (representing dendrites) and sums them to produce an output (representing a neuron's axon).
  • the above-mentioned U_0, U_1, U_2, U_3, U_4, V_2, V_3, W_2 and W_3 represent the weights and biases of the formed neural network.
  • FIG. 2 illustrates a schematic configuration of neural network according to one embodiment of the present application.
  • in step S401, the parameters Θ, i.e. U_0, U_1, U_2, U_3, U_4, V_2, V_3, W_2 and W_3, are randomly initialized with values ranging from 0 to 1.
  • in step S402, a number of view representations h^v are sampled based on the current parameters Θ.
  • the view representation h_2^v is sampled from a prior distribution q(h^v), i.e. a uniform distribution.
  • the set of h^v are assigned values such that h_2^v follows a uniform distribution, i.e. {h^v} ~ U(0, 1).
  • h_3^v is generated from h_2^v through W_2 of the current parameters Θ.
  • in step S403, a face image x is inputted to the identity feature determination unit 10, i.e. the lowest layer in the simulated network shown in Fig. 2, so as to generate the first and the second identity features in accordance with Formulas 1) and 2), based on the randomly initialized U_0 and U_1.
  • the feature combination unit 30 then operates to combine the generated second identity features h_2^id with the assigned h_2^v to yield one or more third features h_3 for face recovery, and then generate one or more fourth features h_4 for face recovery from the yielded third features h_3, in accordance with Formulas 3) and 4).
  • the recovery unit 40 then generates the face image y from the generated recovery features h_4, and then unites the generated face image y and the view representation h^v, which has been assigned a value, into a view label of the generated face image by rule of Formulas 5) and 6).
  • in step S404, the generated face image y and the view labels v are used to form/compute a distribution (i.e. importance weights) over the different view representations h^v, which may be represented as p(h^v | y, v; Θ_old).
  • in step S405, gradient ascent is used to maximize the lower-bound of p(y, v | h^v; Θ_old).
  • the lower-bound may be particularized as Σ_{h^v} q(h^v) log[p(y, v, h^v; Θ) / q(h^v)], as shown in Formula 7).
  • ∇E denotes the gradient of the lower-bound with respect to the parameters Θ.
  • Importance sampling is a basic sampling algorithm, which estimates expectations under a complex distribution p(x) using samples from a proposal distribution q(x).
  • since p(x) is too complex to sample from directly, in the embodiments of the present application one may sample from a simple distribution, i.e. a uniform distribution; the ratios p(x)/q(x) are known as importance weights, which correct the bias introduced by sampling from a different distribution, as below:
  • E_p[f(x)] = E_q[f(x) p(x)/q(x)] ≈ (1/N) Σ_{n=1}^{N} f(x_n) p(x_n)/q(x_n), where x_n ~ q(x).
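  • The identity above can be illustrated with a short sketch; the target density p and the test function f below are placeholders chosen for the example, not quantities from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def p(x):
    """A 'complex' target density on [0, 1] (illustrative: Beta(2, 2))."""
    return 6.0 * x * (1.0 - x)

def f(x):
    """Any function whose expectation under p we want to estimate."""
    return x ** 2

xs = rng.uniform(0.0, 1.0, 100_000)   # sample from the simple proposal q = U(0, 1)
weights = p(xs) / 1.0                 # importance weights p(x)/q(x); q(x) = 1 here
print(np.mean(f(xs) * weights))       # ~0.3, the true value of E_p[f(x)]
```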
  • in step S406, the parameters are updated by gradient ascent by rule of:
  • Θ_new = Θ_old + η ∇E, where η denotes a step size (learning rate).
  • in step S407, it is determined whether the lower bound is reached or convergence of the data-likelihood of the joint probability is observed; if not, steps S402-S407 are iterated; otherwise, the parameters Θ (U_0, U_1, U_2, U_3, U_4, V_2, V_3, W_2 and W_3) are learnt/determined.
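  • Putting steps S402-S407 together, a schematic (non-authoritative) training step might look as below; the likelihood and gradient routines are stubs, since the description only states that gradient ascent on the lower-bound of Formula 7) is used, and the learning rate eta is an assumed hyper-parameter:

```python
import numpy as np

rng = np.random.default_rng(0)

def likelihood(x, y, v, h_v, theta):
    """Stub for p(y, v | h_v; theta); a real version would score the forward pass."""
    return float(np.exp(-np.sum(h_v ** 2)))

def grad_lower_bound(x, y, v, h_v, theta):
    """Stub for the gradient of the lower-bound w.r.t. each parameter in theta."""
    return {name: np.zeros_like(w) for name, w in theta.items()}

def train_step(x, y, v, theta, hv_dim=64, num_samples=16, eta=0.01):
    # S402: sample view representations from the uniform proposal q(h_v)
    h_vs = rng.uniform(0.0, 1.0, size=(num_samples, hv_dim))
    # S403-S404: importance weights, proportional to p(y, v | h_v) under uniform q
    w = np.array([likelihood(x, y, v, h, theta) for h in h_vs])
    w /= w.sum()
    # S405-S406: gradient ascent on the sample-weighted lower-bound
    for wi, h in zip(w, h_vs):
        g = grad_lower_bound(x, y, v, h, theta)
        for name in theta:
            theta[name] += eta * wi * g[name]
    return theta                      # S407: caller iterates until convergence

theta = {"U0": rng.uniform(0, 1, (8, 16))}
theta = train_step(np.zeros(16), np.zeros(16), np.zeros(7), theta, hv_dim=4)
```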
  • the process 200 comprises a series of steps that may be performed by one or more processors, which may be embedded in or arranged on the computer, or by each module/unit of the system 100, to implement a data processing operation.
  • the following discussion is made in reference to the situation where each module/unit of the system 100 is made in hardware or a combination of hardware and software.
  • those skilled in the art shall appreciate that other suitable devices or systems are also applicable to carry out the following process; the system 100 is merely used as an illustration for carrying out the process.
  • in step S201, a plurality of identity features for an input face image in a given viewpoint of the image are determined. The first plurality of identity features h_1^id is generated from the input face image in accordance with an activation function, and then a second plurality of identity features h_2^id is generated based on the generated first identity features h_1^id, by rule of Formula 1) and Formula 2).
  • in step S202, the process 200 captures a view representation h^v of the input face image x.
  • in step S203, the process yields one or more features for face recovery from the determined identity features and the view representation.
  • the generated second identity features h_2^id are combined with the generated view representation h^v to yield one or more third features h_3 for face recovery, and then one or more fourth features h_4 for face recovery are generated from the yielded third features h_3.
  • the third and the fourth features h_3 and h_4 for face recovery may be determined by rule of Formula 3) and Formula 4), as discussed above.
  • in step S204, the face image y is generated from the generated recovery features h_4, and then the generated y is united with the view representation h^v into a view label of the generated face image.
  • the face image y and the view label v may be determined by rule of Formula 5) and Formula 6).
  • Fig. 5 illustrates a multi-view perceptron system 500 according to another embodiment of the present application. The system 500 may reconstruct a full spectrum of multi-view images for all the possible view labels v of a given image.
  • the system 500 may comprise an identity feature determination unit 10, a view representation capture unit 20, a feature combination unit 30, a recovery unit 40 and an image selection unit 50.
  • Fig. 6 illustrates a process 600 for the system 500 to reconstruct the full spectrum of multi-view images for all the possible view labels v of the given image. The cooperation of the units 10-50 will be discussed in reference to Fig. 6 as below.
  • in step S601, the identity feature determination unit 10 operates to learn a plurality of identity features for an input face image x with a given view label v.
  • in step S602, the view representation capture unit 20 operates to capture a view representation h^v of the input face image x.
  • in step S603, the feature combination unit 30 operates to combine the generated second identity features h_2^id with the view representation h^v to yield one or more third features h_3 for face recovery, and then generate one or more fourth features h_4 for face recovery from the yielded third features h_3.
  • in step S604, the recovery unit 40 operates to generate a face image y from the generated recovery features h_4, and then unites the generated y (which may be represented as a set of outputs {y_s}, s = 1, …, S) and the view representation h^v into a view label of the generated face image. Since the configuration of the units 10-40 is the same as that of Fig. 1, and the processes of steps S601-S604 are the same as those of steps S201-S204, the detailed description thereof is omitted.
  • in step S605, the image selection unit 50 operates to compute a probability for each generated output image and to select the output most similar to the input x.
  • the system 500 repeats the above procedure to obtain the most similar image to the input x under different view labels v, such that a full spectrum of multi-view images is reconstructed for all the possible view labels v of the input image x.
  • a set of corresponding output images {y_z} may be generated through the above steps S601-S605, where z indicates the index of the generated (or interpolated) view values. If the y_z that is most similar to x is selected from the output images {y_z}, the view label of the z-th output y_z may be assigned to the face image x.
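  • A sketch of steps S601-S605 under stated assumptions follows: the forward pass is a stand-in for the one sketched earlier, and the similarity score (negative squared error between output and input) is an assumed concretization of "most similar":

```python
import numpy as np

rng = np.random.default_rng(0)

def dummy_forward(x, h_v):
    """Stand-in for the MVP forward pass; returns an output image and a view label."""
    return x + 0.01 * h_v.mean(), np.eye(3)[0]

def reconstruct_spectrum(x, forward, hv_dim=64, num_views=32):
    """S601-S604: sweep sampled view representations and collect the outputs {y_z};
    S605: select the output most similar to the input x (assumed score: -||y - x||^2)."""
    outputs = []
    for _ in range(num_views):
        h_v = rng.uniform(0.0, 1.0, hv_dim)      # sampled/interpolated view representation
        outputs.append(forward(x, h_v))
    scores = [-np.sum((y - x) ** 2) for y, _ in outputs]
    best = int(np.argmax(scores))
    y_best, v_best = outputs[best]
    return outputs, y_best, v_best               # v_best may be assigned to the input x

outs, y_b, v_b = reconstruct_spectrum(rng.random(16), dummy_forward, hv_dim=8, num_views=5)
```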

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2014/000716 WO2016015170A1 (en) 2014-07-28 2014-07-28 A method for face recognition and a system thereof
CN201480080815.3A CN106663186B (zh) 2014-07-28 2014-07-28 Method and system for face recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/000716 WO2016015170A1 (en) 2014-07-28 2014-07-28 A method for face recognition and a system thereof

Publications (1)

Publication Number Publication Date
WO2016015170A1 (en) 2016-02-04

Family

ID=55216543

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/000716 WO2016015170A1 (en) 2014-07-28 2014-07-28 A method for face recognition and a system thereof

Country Status (2)

Country Link
CN (1) CN106663186B (zh)
WO (1) WO2016015170A1 (zh)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110581974B (zh) * 2018-06-07 2021-04-02 China Telecom Corporation Limited Face picture improvement method, user terminal and computer-readable storage medium


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015176305A1 (zh) * 2014-05-23 2015-11-26 Institute of Automation, Chinese Academy of Sciences Human figure image segmentation method
CN103984959B (zh) * 2014-05-26 2017-07-21 Institute of Automation, Chinese Academy of Sciences Image classification method based on data-driven and task-driven learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013218605A (ja) * 2012-04-11 2013-10-24 Canon Inc Image recognition apparatus, image recognition method, and program
JP2013218604A (ja) * 2012-04-11 2013-10-24 Canon Inc Image recognition apparatus, image recognition method, and program
CN103020602A (zh) * 2012-10-12 2013-04-03 Beijing University of Civil Engineering and Architecture Face recognition method based on neural network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017181923A1 (zh) * 2016-04-21 2017-10-26 Tencent Technology (Shenzhen) Company Limited Face verification method and apparatus, and computer storage medium
CN112000940A (zh) * 2020-09-11 2020-11-27 Alipay (Hangzhou) Information Technology Co., Ltd. User identification method, apparatus and device under privacy protection
CN116912919A (zh) * 2023-09-12 2023-10-20 Shenzhen Xumi Yuntu Space Technology Co., Ltd. Training method and apparatus for an image recognition model
CN116912919B (zh) * 2023-09-12 2024-03-15 Shenzhen Xumi Yuntu Space Technology Co., Ltd. Training method and apparatus for an image recognition model

Also Published As

Publication number Publication date
CN106663186B (zh) 2018-08-21
CN106663186A (zh) 2017-05-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14898426

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14898426

Country of ref document: EP

Kind code of ref document: A1