CN107004115B - Method and system for face recognition - Google Patents
Method and system for face recognition
- Publication number
- CN107004115B CN107004115B CN201480083717.5A CN201480083717A CN107004115B CN 107004115 B CN107004115 B CN 107004115B CN 201480083717 A CN201480083717 A CN 201480083717A CN 107004115 B CN107004115 B CN 107004115B
- Authority
- CN
- China
- Prior art keywords
- facial image
- feature
- feature extraction
- extraction module
- convolutional layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/192—Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
- G06V30/194—References adjustable by an adaptive method, e.g. learning
Abstract
A device and method for face recognition are disclosed. The device may include an extractor configured with multiple cascaded feature extraction modules, where each of the cascaded feature extraction modules includes a convolutional layer and a fully connected layer. The convolutional layer extracts local features from the input face image or from the features extracted by the previous feature extraction module in the cascade; the fully connected layer is connected to the convolutional layer in the same feature extraction module and extracts global features from the extracted local features. The device may also include an identifier that determines, according to the distance between the extracted global features, whether two input face images come from the same identity, or whether an input image serving as a probe face image and an image in a gallery of face images containing the input image belong to the same identity.
Description
Technical field
This application relates to systems and methods for face recognition.
Background
Recently, deep learning has achieved great success in face recognition and substantially outperforms systems based on low-level features. There have been two notable breakthroughs. The first is large-scale face recognition with deep neural networks: by classifying face images into thousands or even millions of identities, the last hidden layer forms features that are highly discriminative with respect to identity. The second is supervising deep neural networks with both identification and verification tasks. The verification task minimizes the distance between features of the same identity and thus reduces intra-personal variation. By combining features learned from many face regions, the combined identification-verification approach achieves state-of-the-art face verification accuracy of 99.15% on LFW, the most extensively evaluated face recognition dataset.
Earlier work first learned attribute classifiers and then performed face recognition using attribute predictions. In addition, sparse representation-based classification has been widely studied for face recognition with occlusions. A robust Boltzmann machine has also been proposed to distinguish corrupted pixels and learn latent representations. These methods are explicitly designed to handle occluded components.
Summary of the invention
Existing approaches first learn attribute classifiers and then perform face recognition using attribute predictions, but this application attempts the inverse process: predict identity first, then predict attributes using the learned identity-related features. It is observed that features in the higher layers of a deep neural network are highly selective to identities and identity-related attributes (such as gender and race). When presented with an identity (which may be outside the training data) or an attribute, one can identify a subset of features that are consistently excited and another subset of features that are consistently inhibited. Features from either of these two subsets effectively indicate the presence or absence of the identity or attribute, and this application shows that even a single feature recognizes a specific identity or attribute with fairly high accuracy. In other words, features in deep neural networks are sparse over identities and attributes. Although the deep neural networks in this application are not instructed to distinguish attributes during training, they implicitly learn such high-level concepts. Compared with widely used hand-crafted features (such as high-dimensional LBP, local binary patterns), features learned directly by the deep neural network achieve much higher classification accuracy on identity-related attributes.
In contrast to traditional sparse representation-based classification, this application shows that, without adding artificial occlusion patterns during training, a deep neural network trained on face images has implicitly encoded invariance to occlusions.
It is observed in this application that the sparsity of the features learned by the deep neural network is moderate. For an input face image, roughly half of the features in the top hidden layer are activated. On the other hand, each feature is activated on roughly half of the face images. Such a sparsity distribution maximizes both the discriminative power of the deep neural network and the distance between images: different identities activate different subsets of features, and two images of the same identity have similar activation patterns. This motivates this application to binarize the real-valued features in the top hidden layer of the deep neural network and to perform recognition with the binary codes. The result is surprisingly good: verification accuracy on LFW drops only slightly, by less than 1%. This has a significant impact on large-scale face search, where it saves mass storage and computation time. It also indicates that the binary activation patterns matter more than the activation magnitudes in the deep neural network.
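A minimal sketch of the binarization described above; the feature values and dimensions are illustrative assumptions, not taken from the patent. Real-valued ReLU features from the top hidden layer are thresholded at zero into binary codes, which are then compared by Hamming distance:

```python
import numpy as np

def binarize(features):
    """Threshold real-valued ReLU features at zero into a binary code."""
    return (features > 0).astype(np.uint8)

def hamming_distance(code_a, code_b):
    """Number of positions where two binary codes differ."""
    return int(np.count_nonzero(code_a != code_b))

# Illustrative 8-dimensional top-layer features from two face images.
feat1 = np.array([0.0, 1.2, 0.0, 0.7, 3.1, 0.0, 0.0, 0.5])
feat2 = np.array([0.0, 0.9, 0.4, 0.0, 2.2, 0.0, 0.0, 1.1])

d = hamming_distance(binarize(feat1), binarize(feat2))
print(d)  # codes differ at indices 2 and 3 -> 2
```

Only the on/off activation pattern survives binarization, which is exactly the information the application argues matters most.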
In one aspect of this application, a device for face recognition is disclosed. The device may include a feature extractor and an identifier. The feature extractor is configured with multiple cascaded feature extraction modules, where each feature extraction module includes: a convolutional layer for extracting local features from the input face image or from the features extracted by the previous feature extraction module in the cascade; and a fully connected layer, connected to the convolutional layer in the same feature extraction module, which extracts global features from the extracted local features. According to the distance between the extracted global features, the identifier determines whether two input face images come from the same identity, or whether an input image serving as a probe face image and an image in a gallery of face images containing the input image belong to the same identity.
In one embodiment of this application, the convolutional layer in the first feature extraction module of the cascade is connected to the input face image, and the convolutional layer in each subsequent feature extraction module is connected to the convolutional layer in the previous feature extraction module. The fully connected layer in each feature extraction module is connected to the convolutional layer in the same feature extraction module.
The device may also include a trainer configured to update the neuron weights of the connections between each convolutional layer and the corresponding fully connected layer in the same feature extraction module, by back-propagating identification supervisory signals and verification supervisory signals through the cascaded feature extraction modules.
The updating process may include: inputting two face images into the neural network respectively, obtaining a feature representation for each of the two face images; computing an identification error by classifying the feature representation of each face image in each fully connected layer of the neural network into one of multiple identities; computing a verification error by verifying whether the respective feature representations of the two face images in each fully connected layer come from the same identity, where the identification error and the verification error are regarded as the identification supervisory signal and the verification supervisory signal, respectively; and back-propagating all identification and verification supervisory signals through the neural network to update the neuron weights of the connections between each convolutional layer and the corresponding fully connected layer in the same feature extraction module. This application finds and verifies three properties of the features extracted in the later feature extraction modules, namely sparsity, selectivity, and robustness, all of which are critical for face recognition. The features are sparse in the sense that, for each face image, about half of the features are zero and half take positive values, and each feature is zero about half of the time and positive about half of the time over all face images. The features are selective to identities and identity-related attributes (such as gender and race) in the sense that, for all face images of a given identity or containing a given identity-related attribute, there exist features that consistently take positive values (are activated) or zero (are inhibited). The features are robust to image corruption (such as occlusion) in the sense that, under moderate image corruption, the feature values remain largely unchanged.
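The joint identification-verification supervision described above can be sketched as follows. This is a hedged illustration, not the patent's exact formulation: the layer sizes, margin, and the contrastive form of the verification signal are assumptions. The identification signal is a softmax cross-entropy over identities; the verification signal pulls features of the same identity together and pushes different identities apart by a margin:

```python
import numpy as np

def identification_error(features, weights, label):
    """Softmax cross-entropy of classifying one feature vector into an identity."""
    logits = features @ weights
    logits -= logits.max()                      # for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[label])

def verification_error(f1, f2, same_identity, margin=1.0):
    """Contrastive-style verification signal on a pair of feature vectors."""
    dist = np.linalg.norm(f1 - f2)
    if same_identity:
        return 0.5 * dist ** 2                  # pull same-identity pairs together
    return 0.5 * max(0.0, margin - dist) ** 2   # push different identities apart

rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=16), rng.normal(size=16)   # illustrative FC-layer features
w = rng.normal(size=(16, 10))                       # 10 illustrative identities
total = identification_error(f1, w, label=3) + verification_error(f1, f2, same_identity=True)
print(total > 0)  # True
```

In training, both error terms would be back-propagated through the cascade to update the connection weights, as the paragraph above describes.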
In one aspect of this application, a device for face recognition is disclosed, comprising:
an extractor including a neural network, the neural network configured with multiple cascaded feature extraction modules, wherein each of the cascaded feature extraction modules includes:
a convolutional layer for extracting local features from the input face image or from the features extracted by the previous feature extraction module in the cascade; and
a fully connected layer, connected to the convolutional layer of the same feature extraction module, which extracts global features from the extracted local features; and
an identifier for determining, according to the distance between the extracted global features:
whether two of the input face images come from the same identity, or
whether one of the input face images serving as a probe face image and an image in a gallery of face images containing the input face image belong to the same identity.
In one embodiment of this application, the convolutional layer of the first feature extraction module in the cascade is configured to extract the local features from the input face image, and the convolutional layer in each subsequent feature extraction module is connected to the convolutional layer in the previous feature extraction module of the cascade.
In one embodiment of this application, the fully connected layer in each feature extraction module is connected to the convolutional layer in the same feature extraction module of the cascade.
In one embodiment of this application, the device further includes:
a trainer configured to update, by back-propagating identification supervisory signals and verification supervisory signals through the cascaded feature extraction modules, the neural weights of the following connections:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input face image;
the connections between each convolutional layer in the second through last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connections between each convolutional layer and the corresponding fully connected layer in the same feature extraction module.
In one embodiment of this application, for each input face image, the features extracted in the last feature extraction module are sparsely organized in 2D: about half of the features are zero and half take positive values, and each feature is zero about half of the time and positive about half of the time over all input face images.
In one embodiment of this application, the features extracted in the last feature extraction module are selective to identities and identity-related attributes, so that for all input face images of a given identity or containing a given identity-related attribute there exist features that are consistently activated or inhibited.
In one embodiment of this application, the identity-related attributes include gender and/or race.
In one embodiment of this application, the features extracted in the last feature extraction module are robust to image corruption: under moderate image corruption, the values of the extracted features remain largely unchanged.
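The sparsity embodiment above (about half the features positive per image, each feature positive on about half of the images) can be checked empirically. A sketch with synthetic stand-in data — real top-layer features would come from the trained network, so the numbers here only illustrate the measurement:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for top-layer ReLU features: rows = face images, cols = features.
raw = rng.normal(size=(1000, 512))
features = np.maximum(raw, 0.0)            # ReLU leaves ~half positive, ~half zero

per_image = (features > 0).mean(axis=1)    # fraction of activated features per image
per_feature = (features > 0).mean(axis=0)  # fraction of images activating each feature

print(round(per_image.mean(), 2))    # ~0.5: about half the features fire per image
print(round(per_feature.mean(), 2))  # ~0.5: each feature fires on about half the images
```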
In one embodiment of this application, the identifier determines that two faces belong to the same identity if the determined feature distance is less than a threshold; or determines that one of the input face images serving as a probe face image and an image in a gallery of face images containing the input face image belong to the same identity if their feature distance is the smallest compared with the feature distances between the probe face image and all other face images in the gallery.
In one embodiment of this application, the feature distance is one selected from the group consisting of Euclidean distance, Joint Bayesian distance, cosine distance, and Hamming distance.
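Three of the listed feature distances can be sketched directly (Joint Bayesian requires a separately learned model and is omitted); the feature vectors below are illustrative:

```python
import numpy as np

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    """1 - cosine similarity, so smaller means more similar."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hamming_distance(a, b):
    """For binarized features: count of positions whose on/off state differs."""
    return int(np.count_nonzero((a > 0) != (b > 0)))

a = np.array([1.0, 0.0, 2.0, 0.0])
b = np.array([1.0, 0.0, 2.0, 0.0])
print(euclidean_distance(a, b))  # 0.0
print(cosine_distance(a, b))     # 0.0 (identical direction)
print(hamming_distance(a, b))    # 0
```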
In one embodiment of this application, for an individual face image, the features output from each fully connected layer are classified into one of multiple identities, and the classification error is regarded as the identification supervisory signal.
In one embodiment of this application, for two compared face images, the features output from each fully connected layer are separately verified to determine whether the two compared face images belong to the same identity, and the verification error is regarded as the verification supervisory signal.
In one aspect of this application, a method for face recognition is disclosed, comprising:
extracting local features of two or more input face images by a trained neural network;
extracting global features from the extracted local features by the trained neural network;
determining the distance between the extracted global features; and
determining, according to the determined distance:
whether two of the input face images come from the same identity, for face verification, or
whether one of the input face images serving as a probe face image and an image in a gallery of face images containing the input image belong to the same identity,
wherein the neural network includes multiple cascaded feature extraction modules, each feature extraction module having a convolutional layer; the convolutional layer in the first feature extraction module of the cascade is connected to the input face image, and the convolutional layer in each subsequent feature extraction module is connected to the convolutional layer in the previous feature extraction module.
In one embodiment of this application, each feature extraction module further includes a fully connected layer, and the fully connected layer in each feature extraction module is connected to the convolutional layer in the same feature extraction module.
In one embodiment of this application, the method further comprises:
updating, by back-propagating identification supervisory signals and verification supervisory signals through the cascaded feature extraction modules, the neural weights of the following connections:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input face image;
the connections between each convolutional layer in the second through last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connections between each convolutional layer and the corresponding fully connected layer in the same feature extraction module.
In one embodiment of this application, the updating further comprises:
inputting two face images into the neural network respectively, obtaining a feature representation for each face image;
computing an identification error by classifying the feature representation of each face image in each fully connected layer of the neural network into one of multiple identities;
computing a verification error by verifying whether the respective feature representations of the two face images in each fully connected layer come from the same identity, where the identification error and the verification error are regarded as the identification supervisory signal and the verification supervisory signal, respectively; and
simultaneously back-propagating the identification supervisory signals and the verification supervisory signals through the neural network to update the neural weights of the following connections:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input face image;
the connections between each convolutional layer in the second through last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connections between each convolutional layer and the corresponding fully connected layer in the same feature extraction module.
In one embodiment of this application, for each face image, the features extracted in the last feature extraction module are sparsely organized in 2D: about half of the features are zero and half take positive values, and each feature is zero about half of the time and positive about half of the time over all face images.
In one embodiment of this application, the features extracted in the last feature extraction module are selective to identities and identity-related attributes, so that for all input face images of a given identity or containing a given identity-related attribute there exist features that are consistently activated or inhibited.
In one embodiment of this application, the identity-related attributes include gender and/or race.
In one embodiment of this application, the determining further comprises:
determining that two faces belong to the same identity if the determined feature distance is less than a threshold; or
determining that one of the input face images serving as a probe face image and an image in a gallery of face images containing the input face image belong to the same identity if their feature distance is the smallest compared with the feature distances between the probe face image and all other face images in the gallery.
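The two decision rules above — thresholding for verification, nearest neighbor for identification — can be sketched as follows; the threshold value and feature vectors are illustrative assumptions:

```python
import numpy as np

def verify(f1, f2, threshold=1.0):
    """Face verification: same identity iff the feature distance is below a threshold."""
    return np.linalg.norm(f1 - f2) < threshold

def identify(probe, gallery):
    """Face identification: index of the gallery feature closest to the probe."""
    dists = [np.linalg.norm(probe - g) for g in gallery]
    return int(np.argmin(dists))

probe = np.array([1.0, 0.0, 1.0])
gallery = [np.array([0.0, 1.0, 0.0]),
           np.array([0.9, 0.1, 1.1]),   # nearest to the probe
           np.array([5.0, 5.0, 5.0])]

print(identify(probe, gallery))   # 1
print(verify(probe, gallery[1]))  # True (distance ~0.17 < 1.0)
```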
Description of the Drawings
Exemplary non-limiting embodiments of the invention are described below with reference to the accompanying drawings. The drawings are illustrative and generally not drawn to exact scale. The same or similar elements in different figures are referenced by identical reference numerals.
Fig. 1 is a schematic diagram showing the device for face recognition consistent with some disclosed embodiments.
Fig. 2 is a schematic diagram showing the sparsity, selectivity, and robustness of the features extracted in the later feature extraction modules.
Fig. 3 is a schematic diagram showing the structure of the cascaded feature extraction modules in the feature extractor, and the input face images and supervisory signals in the trainer.
Fig. 4 is a schematic histogram showing the sparsity of the activated features (neurons) on an individual face image and the sparsity of an individual activated feature (neuron) over all face images.
Fig. 5 is a schematic histogram showing selective activation and inhibition on the face images of a particular identity.
Fig. 6 is a schematic histogram showing selective activation and inhibition on face images containing a particular attribute.
Fig. 7 is a schematic diagram showing face images with random block occlusions, used to test the robustness of the features extracted by the feature extractor to image corruption.
Fig. 8 is a schematic diagram showing the average feature activations on the face images of individual identities under various degrees of random block occlusion.
Fig. 9 is a schematic flow diagram showing the trainer shown in Fig. 1, consistent with some disclosed embodiments.
Fig. 10 is a schematic flow diagram showing the feature extractor shown in Fig. 1, consistent with some disclosed embodiments.
Fig. 11 is a schematic flow diagram showing the identifier shown in Fig. 1, consistent with some disclosed embodiments.
Detailed Description
Reference will now be made in detail to some specific embodiments of the invention, including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it should be understood that this is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents that may be included within the spirit and scope of the invention as defined by the appended claims. Numerous details are set forth in the following description in order to provide a thorough understanding of the present application. The invention may be practiced without some or all of these details. In other instances, well-known process operations have not been described in detail so as not to unnecessarily obscure the invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. Unless the context clearly indicates otherwise, the singular forms "a", "an", and "the" used herein may also include the plural forms. It should also be understood that the terms "comprises" and/or "comprising" used in this specification specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or combinations thereof.
As will be appreciated by those skilled in the art, the invention may be embodied as a system, a method, or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit", "device", "module", or "system". Furthermore, the invention may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
It should also be understood that relational terms such as first and second (if any) are used solely to distinguish one entity, item, or action from another, and do not necessarily require or imply any actual relationship or order between these entities, items, or actions.
Much of the inventive functionality and many of the inventive principles, when implemented, are best supported by software or integrated circuits (ICs), such as digital signal processors with accompanying software, or application-specific integrated circuits. Despite the possibly significant effort and the many design choices motivated by, for example, available time, current technology, and economic considerations, it is expected that those skilled in the art, when guided by the concepts and principles disclosed herein, will be readily capable of generating such software instructions or ICs with minimal experimentation. Therefore, in the interest of brevity and to minimize any risk of obscuring the principles and concepts of the present invention, further discussion of such software and ICs (if any) is limited to the essentials with respect to the principles and concepts used by the preferred embodiments.
Fig. 1 is a schematic diagram showing an exemplary device 100 for face recognition consistent with some disclosed embodiments. As shown, the device 100 may include a feature extractor 10 and an identifier 20. The feature extractor 10 is configured to extract features from input face images. In one embodiment of this application, the feature extractor 10 may include a neural network, which may be configured with multiple cascaded feature extraction modules. Each feature extraction module in the cascade includes a convolutional layer and a fully connected layer. The cascaded feature extraction modules may be implemented by software, integrated circuits (ICs), or a combination thereof. Fig. 3 shows a schematic diagram of the structure of the cascaded feature extraction modules in the feature extractor 10. As shown, the convolutional layer in the first feature extraction module of the cascade is connected to the input face image, and the convolutional layer in each subsequent feature extraction module is connected to the convolutional layer in the previous feature extraction module. The fully connected layer in each feature extraction module is connected to the convolutional layer in the same feature extraction module.
With reference to Fig. 1, in order for the neural network to work effectively, the device 100 further includes a trainer 30, which is configured to update, by back-propagating identification supervisory signals and verification supervisory signals through the cascaded feature extraction modules, the neural weights of the following connections:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input face image;
the connections between each convolutional layer in the second through last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connections between each convolutional layer and the corresponding fully connected layer in the same feature extraction module,
so that the features extracted in the last/highest feature extraction module of the cascade have sparsity, selectivity, and robustness, as discussed later.
The identifier 20 may be implemented by software, integrated circuits (ICs), or a combination thereof, and is configured to compute the distances between the features extracted from different face images in order to determine, for face verification, whether two face images come from the same identity, or whether an input image serving as a probe face image and an image in a gallery of face images containing the input image belong to the same identity.
Feature Extractor 10
The feature extractor 10 contains multiple cascaded feature extraction modules and extracts features hierarchically from input face images. Fig. 3 shows an example of the structure of the cascaded feature extraction modules in the feature extractor 10: the feature extractor includes four cascaded feature extraction modules, each including a convolutional layer Conv-n and a fully connected layer FC-n, where n = 1, ..., 4. The convolutional layer Conv-1 in the first feature extraction module of the feature extractor 10 is connected to the input face image as the input layer, and the convolutional layer Conv-n (n > 1) in each subsequent feature extraction module is connected to the convolutional layer Conv-(n-1) in the previous feature extraction module. The fully connected layer FC-n in each feature extraction module of the feature extractor 10 is connected to the convolutional layer Conv-n in the same feature extraction module.
Fig. 10 is a schematic flow diagram showing the feature extraction process in the feature extractor 10. In step 201, the feature extractor 10 forward-propagates the input face image through the convolutional layers in all feature extraction modules of the feature extractor 10. Then, in step 202, the feature extractor 10 forward-propagates the output of each convolutional layer to the corresponding fully connected layer in the same feature extraction module. Finally, in step 203, the output/representation of the last fully connected layer is taken as the features, as discussed below.
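The three steps above can be sketched as a toy cascade. This is an illustrative simplification, not the patent's network: each "convolutional layer" is a single-map valid convolution with ReLU, each fully connected layer a random matrix, and the output of the last FC layer is taken as the feature vector:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_relu(x, kernel):
    """Toy 'valid' 2D convolution followed by ReLU (single feature map)."""
    kh, kw = kernel.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * kernel)
    return np.maximum(out, 0.0)

def fully_connected(x, weights):
    """FC layer: global features from the flattened feature map, with ReLU."""
    return np.maximum(x.ravel() @ weights, 0.0)

# A cascade of 4 modules; each module owns a conv kernel and an FC matrix.
image = rng.normal(size=(20, 20))
x = image
features = None
for n in range(4):                       # modules 1..4 (Conv-n, FC-n)
    kernel = rng.normal(size=(3, 3))
    x = conv_relu(x, kernel)             # Conv-n feeds Conv-(n+1)   (step 201)
    fc_w = rng.normal(size=(x.size, 8))
    features = fully_connected(x, fc_w)  # FC-n reads Conv-n         (step 202)

print(features.shape)  # (8,) -- output of the last FC layer is the feature (step 203)
```

A real implementation would use multiple feature maps per layer, max-pooling, and learned weights; the sketch only shows how the cascade is wired.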
Each convolutional layer in the feature extractor 10 is configured to extract local facial features (that is, features extracted from local regions of the input image or input feature maps) from the input image (for the first convolutional layer) or from feature maps (known in the art as the output feature maps of the previous convolutional layer, followed by max-pooling), so as to form the output feature maps of the current convolutional layer. Each feature map is a certain kind of feature organized in 2D. Where the same set of neural connection weights w is used between corresponding input and output feature maps of the previous convolutional layer (followed by max-pooling) and the current convolutional layer, the same features are extracted from the input feature maps everywhere in the output feature maps, or within local regions of the feature maps when weights are only locally shared. The convolution operation in each convolutional layer can be expressed as

y_j = max(0, b_j + Σ_i k_ij * x_i),  (1)

where x_i and y_j are the i-th input feature map and the j-th output feature map, respectively, k_ij is the convolution kernel between the i-th input feature map and the j-th output feature map, * denotes convolution, and b_j is the bias of the j-th output feature map. Here, the ReLU nonlinearity y = max(0, x) is used for the neurons. Weights in the higher convolutional layers of the ConvNets are locally shared:

y_j^r = max(0, b_j^r + Σ_i k_ij^r * x_i^r),  (2)

where r indicates a local region within which the weights are shared.
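Equation (1) can be illustrated with a direct (unoptimized) implementation. This is a sketch with globally shared weights only; the kernel shapes and the test values below are illustrative, and the * operator is realized as "valid" cross-correlation, the usual convention in ConvNet libraries.

```python
import numpy as np

def conv_layer(x, k, b):
    """Equation (1): y_j = max(0, b_j + sum_i k_ij * x_i).

    x: (I, H, W) input feature maps; k: (I, J, kh, kw) kernels;
    b: (J,) biases.  Implemented as 'valid' cross-correlation."""
    I, H, W = x.shape
    _, J, kh, kw = k.shape
    y = np.zeros((J, H - kh + 1, W - kw + 1))
    for j in range(J):
        for i in range(I):
            for r in range(H - kh + 1):
                for c in range(W - kw + 1):
                    y[j, r, c] += np.sum(k[i, j] * x[i, r:r + kh, c:c + kw])
        y[j] += b[j]
    return np.maximum(0.0, y)  # ReLU: y = max(0, x)

# Tiny check: 3x3 input of ones, one 2x2 kernel of ones, bias -3
out = conv_layer(np.ones((1, 3, 3)), np.ones((1, 1, 2, 2)), np.array([-3.0]))
```

Each output position sums a 2x2 window (value 4), adds the bias (-3), and is clipped by the ReLU, so `out` is a (1, 2, 2) map of ones.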
Each convolutional layer may be followed by max-pooling, which is formulated as

y^i_{j,k} = max_{0 ≤ m, n < s} { x^i_{j·s+m, k·s+n} },  (3)

where each neuron in the i-th output feature map y^i pools over an s × s non-overlapping local region in the i-th input feature map x^i.
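Equation (3) amounts to taking the maximum over each non-overlapping s × s block. A compact sketch (assuming map sizes divisible by s):

```python
import numpy as np

def max_pool(x, s):
    """Equation (3): each output neuron pools over an s x s non-overlapping
    region of the corresponding input map.  x: (I, H, W); H, W divisible by s."""
    I, H, W = x.shape
    return x.reshape(I, H // s, s, W // s, s).max(axis=(2, 4))

pooled = max_pool(np.arange(16, dtype=float).reshape(1, 4, 4), 2)
```

For the 4x4 ramp input, each 2x2 block keeps its largest entry, giving [[5, 7], [13, 15]].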
Each of full articulamentum in feature extractor 10 is configured to from being obtained from same characteristic extracting module
Global characteristics (feature extracted from the whole region of input feature vector figure) is extracted in the characteristic pattern of convolutional layer.In other words, Quan Lian
It meets layer FC-n and extracts global characteristics from convolutional layer Conv-n.Full articulamentum also serve as receive during the training period supervisory signals and
The interface of feature is exported during feature extraction.Full articulamentum can be formulated into:
Wherein xiIndicate the output of i-th of neuron in previous convolutional layer (followed by maximum pond).yjIndicate current
Full articulamentum in j-th of neuron output.wi,jIt is i-th of mind in previous convolutional layer (followed by maximum pond)
Through the weight in the connection between j-th of neuron in member and current full articulamentum.bjIt is in current full articulamentum
The deviation of j-th of neuron.Max (0, x) is ReLU non-linear.
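Equation (4) is a single ReLU affine map; the toy weights below are illustrative:

```python
import numpy as np

def fc_layer(x, w, b):
    """Equation (4): y_j = max(0, sum_i x_i * w_ij + b_j), with ReLU."""
    return np.maximum(0.0, x @ w + b)

y = fc_layer(np.array([1.0, 2.0]),
             np.array([[1.0, 0.0], [0.0, -1.0]]),
             np.array([0.0, 0.0]))
```

The pre-activation is [1, -2]; the ReLU clips the negative entry, so y = [1, 0] — which is also how the half-zero, half-positive sparsity discussed next arises.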
The features extracted in the last/highest feature extraction module of the feature extractor 10 (for example, those in the FC-4 layer shown in Fig. 3) have sparsity, selectivity, and robustness. Sparsity: each facial image has approximately half of its feature values equal to zero and half taking positive values, and each feature is zero on approximately half of all facial images and positive on the other half. Selectivity: for a given identity, or for all facial images containing a given identity-related attribute, a feature is consistently either activated (positive) or inhibited (zero); in this sense the features are selective to identities and identity-related attributes (such as gender and race). Robustness: the features are robust to image corruption (such as occlusion), in that under moderate image corruption the feature values largely remain unchanged. The sparse features can be converted into binary codes by comparison with a threshold, and the binary codes can be used for face recognition.
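Thresholding the sparse activations into a binary code can be sketched as follows; the threshold of 0 and the sample activations are illustrative, and Hamming distance is one of the distances the identifier may use on such codes:

```python
import numpy as np

def binarize(features, threshold=0.0):
    """Threshold sparse activations into a binary code
    (1 = activated, 0 = inhibited)."""
    return (np.asarray(features) > threshold).astype(np.uint8)

def hamming_distance(a, b):
    """Distance between two binary codes, usable for fast face search."""
    return int(np.count_nonzero(a != b))

code1 = binarize([0.7, 0.0, 1.3, 0.0])
code2 = binarize([0.9, 0.0, 0.0, 0.0])
```

Here code1 is [1, 0, 1, 0], code2 is [1, 0, 0, 0], and their Hamming distance is 1; bitwise codes are cheap to store and fast to compare.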
Fig. 2 illustrates the three properties of the features extracted in the FC-4 layer: sparsity, selectivity, and robustness. The left side of Fig. 2 shows the features on three facial images of Bush and one facial image of Powell. The second facial image of Bush is partially occluded. In one embodiment of the application, there are 512 features in the FC-4 layer; Fig. 2 shows 32 of these features, subsampled for display. The features are sparsely activated on each facial image, with approximately half of the features positive and half zero. The features of facial images of the same identity have similar activation patterns, which differ between identities. The robustness of the features shows when occlusion is present, as on the face in the second image of Bush: the activation patterns largely remain unchanged. The right side of Fig. 2 shows activation histograms of some selected features over all facial images (as background), over all images belonging to Bush, over all images with the attribute "male", and over all images with the attribute "female". A feature is typically activated on about half of all facial images, but over all images belonging to a particular identity or attribute, it can be constantly activated (or inhibited). In this sense the features are sparse and are selective to identities and attributes.
Moderate sparsity over images maximizes the ability to distinguish faces of different identities, and moderate sparsity over features gives the features maximal discriminative power. The left side of Fig. 4 shows the histogram of the number of features activated (positive) on each of the (for example) 46594 facial images in the validation set, and the right side of Fig. 4 shows the histogram of the number of images on which each feature is activated (positive). The evaluation is based on the features extracted by the FC-4 layer. In one embodiment of the application, out of all 512 (for example) features in the FC-4 layer, the mean and standard deviation of the number of activated neurons on an image are 292 ± 34, and out of all 46594 validation images, the mean and standard deviation of the number of images on which each feature is activated are 26565 ± 5754; both are centered around half of all features/images.
The activation pattern (that is, whether a feature is activated, i.e., has a positive value) is more important than the precise activation value. Converting the feature activations into binary codes by thresholding sacrifices less than 1% of the face verification accuracy. This shows that the activated or inhibited states of the features contain most of the discriminative information. Binary codes are economical for storage and fast for image search.
Fig. 5 and Fig. 6 show examples of activation histograms of features over given identities and attributes, respectively. The histograms over given identities show strong selectivity. For a given identity, some features are constantly activated, with histograms distributed over values larger than zero, as shown in the first two rows of Fig. 5; some other features are constantly inhibited, with histograms accumulated at zero or small values, as shown in the last two rows of Fig. 5. As for attributes, each row of Fig. 6 shows the histograms of a single feature over a few related attributes (those related to gender, race, and age). The feature given on the left of each row is activated on the stated attribute. As shown in Fig. 6, the features show strong selectivity to gender, race, and certain ages (such as children and the elderly): a feature is strongly activated for the given attribute and inhibited for the other attributes of the same category. For some other attributes, such as youth and middle age, the selectivity is weaker, and no feature is activated exclusively for each of these attributes, because age does not correspond exactly to identity; for example, in face recognition, the features should be invariant to images of the same identity taken in youth and in middle age.
Fig. 7 and Fig. 8 show the robustness of the features extracted in the last feature extraction module (the FC-4 layer) to image corruption. Facial images are occluded by random blocks of sizes ranging from 10 × 10 to 70 × 70, as shown in Fig. 7. Fig. 8 shows the mean feature activations over the images occluded by random blocks, where each column lists the mean activations over the facial images of the single identity given at its top, and the left of each row gives the degree of occlusion. Feature values are mapped to a color map, with warm colors indicating positive values and cool colors indicating zero or small values. The features in each column of the figure are sorted by their activation values on the original facial images of each identity. As can be seen in Fig. 8, the activation patterns largely remain unchanged (most activated features remain activated and most inhibited features remain inhibited) until a large degree of occlusion occurs.
Identifier 20
The identifier 20 operates by computing the distances between the global features of different facial images extracted by the fully connected layers of the feature extractor 10, so as to determine whether two facial images come from the same identity, for face verification, or to determine whether one of the input images, serving as the search facial image, belongs to the same identity as one of the images in a facial image gallery including the input image, for face identification. Fig. 11 is a schematic flow diagram showing the identification process of the identifier 20. In step 301, the identifier 20 computes the distances between the features extracted from different facial images by the feature extractor 10 (that is, the global features of the different facial images extracted by the fully connected layers). Then, in step 302, the identifier 20 determines whether two facial images come from the same identity, for face verification; or, in step 303, determines whether one of the input images, serving as the search facial image, belongs to the same identity as one of the images in the facial image gallery including the input image, for face identification.
In the identifier 20, if the feature distance between two facial images is smaller than a threshold, it is determined that they belong to the same identity; or, if the feature distance between the search facial image and one image in the facial image gallery is the smallest compared with the feature distances between the search facial image and all other images in the gallery, it is determined that they belong to the same identity. The feature distance determined by the identifier 20 may be the Euclidean distance, the joint Bayesian distance, the cosine distance, the Hamming distance, or any other distance.
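The two decision rules can be sketched directly. The Euclidean distance is used here as one of the permitted metrics; the threshold and feature vectors below are illustrative:

```python
import numpy as np

def verify(f1, f2, threshold):
    """Face verification (step 302): same identity iff the feature distance
    is below a threshold."""
    return float(np.linalg.norm(f1 - f2)) < threshold

def identify(search_feature, gallery_features):
    """Face identification (step 303): the gallery image with the smallest
    feature distance to the search image is taken as the same identity."""
    dists = [float(np.linalg.norm(search_feature - g)) for g in gallery_features]
    return int(np.argmin(dists))

same = verify(np.array([0.0, 0.0]), np.array([0.0, 0.5]), threshold=1.0)
best = identify(np.array([1.0, 0.0]),
                [np.array([0.0, 1.0]), np.array([0.9, 0.1])])
```

Here the pair passes verification (distance 0.5 < 1.0), and the search feature matches gallery entry 1, its nearest neighbor.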
In one embodiment of the application, the joint Bayesian distance is used as the feature distance. Joint Bayesian has become a popular similarity metric for faces. It represents an extracted facial feature x (after subtracting the mean) as the sum of two independent Gaussian variables:

x = μ + ε,  (5)

where μ ~ N(0, S_μ) represents the face identity and ε ~ N(0, S_ε) represents the intra-personal variation. Joint Bayesian models the joint probability of two faces given the intra-personal or extra-personal hypothesis, P(x1, x2 | H_I) and P(x1, x2 | H_E). It readily follows from equation (5) that these two probabilities are also Gaussian, with variations

Σ_I = [ S_μ + S_ε , S_μ ; S_μ , S_μ + S_ε ]  (6)

and

Σ_E = [ S_μ + S_ε , 0 ; 0 , S_μ + S_ε ],  (7)

respectively. S_μ and S_ε can be learned from data with the EM algorithm. In testing, the likelihood ratio

r(x1, x2) = log [ P(x1, x2 | H_I) / P(x1, x2 | H_E) ]  (8)

is calculated; it has a closed-form solution and is efficient.
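The likelihood ratio (8) can be illustrated by evaluating the two Gaussian densities directly from the covariances in equations (6) and (7). This is a didactic sketch, not the efficient closed-form solution used in practice, and the identity covariances in the example are made up:

```python
import numpy as np

def joint_bayes_llr(x1, x2, S_mu, S_eps):
    """Equation (8): log P(x1,x2|H_I) - log P(x1,x2|H_E), built from the
    block covariances of equations (6) and (7)."""
    d = len(x1)
    z = np.concatenate([x1, x2])
    S = S_mu + S_eps
    cov_I = np.block([[S, S_mu], [S_mu, S]])        # intra-personal, eq. (6)
    cov_E = np.block([[S, np.zeros((d, d))],
                      [np.zeros((d, d)), S]])       # extra-personal, eq. (7)

    def log_gauss(v, C):
        _, logdet = np.linalg.slogdet(C)
        return -0.5 * (v @ np.linalg.solve(C, v) + logdet
                       + len(v) * np.log(2.0 * np.pi))

    return log_gauss(z, cov_I) - log_gauss(z, cov_E)

I1 = np.eye(1)
r_same = joint_bayes_llr(np.array([1.0]), np.array([1.0]), I1, I1)
r_diff = joint_bayes_llr(np.array([1.0]), np.array([-1.0]), I1, I1)
```

With equal identity and intra-personal covariances, identical features give a positive log-ratio (intra-personal hypothesis favored) and opposite features a negative one, so thresholding r at 0 acts as the verification decision.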
Trainer 30
The trainer 30 is configured to update the weights w of the connections between the neurons of the convolutional layers and fully connected layers of the feature extractor 10, starting from initial weights and using a plurality of identification supervisory signals and a plurality of verification supervisory signals, so that the features extracted by the last of the cascaded feature extraction modules in the extractor have sparsity, selectivity, and robustness.
As shown in Fig. 3, the identification supervisory signals and verification supervisory signals in the trainer 30 (denoted "Id" and "Ve") are simultaneously added to each of the fully connected layers FC-n of the feature extraction modules in the feature extractor 10, where n = 1, ..., 4, and are back-propagated to the input facial image, respectively, so as to update the weights of the connections between the neurons of all the cascaded feature extraction modules.
The identification supervisory signal "Id" used by the trainer 30 is generated by classifying the fully connected layer representation/output (that is, formula (4)) of an individual facial image into one of N identities, where the classification error serves as the identification supervisory signal.
The verification supervisory signal in the trainer 30 is generated by separately verifying the fully connected layer representations of two compared facial images in each feature extraction module, determining whether the two compared facial images belong to the same identity, where the verification error serves as the verification supervisory signal. Given a pair of training facial images, the feature extractor 10 extracts two feature vectors f_i and f_j from the two facial images in each feature extraction module, respectively. If f_i and f_j are features of facial images of the same identity, the verification error is (1/2)·‖f_i − f_j‖₂²; if f_i and f_j are features of facial images of different identities, the verification error is (1/2)·max(0, m − ‖f_i − f_j‖₂)², where ‖f_i − f_j‖₂ is the Euclidean distance between the two feature vectors and m is a margin. An error thus exists when f_i and f_j are dissimilar for the same identity, or when f_i and f_j are similar for different identities.
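The two-branch verification error above can be sketched as a single function (the margin value in the example is illustrative):

```python
import numpy as np

def verification_error(fi, fj, same_identity, m=1.0):
    """Verification supervisory error for a pair of feature vectors:
    0.5 * ||fi - fj||^2           for a same-identity pair,
    0.5 * max(0, m - ||fi - fj||)^2  for a different-identity pair,
    where m is the margin."""
    d = float(np.linalg.norm(np.asarray(fi) - np.asarray(fj)))
    if same_identity:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, m - d) ** 2

e_same = verification_error([1.0, 0.0], [1.0, 0.0], True)          # identical pair
e_diff = verification_error([1.0, 0.0], [1.0, 0.0], False, m=1.0)  # colliding pair
```

An identical same-identity pair incurs no error (e_same = 0), while the same pair labeled as different identities sits fully inside the margin and incurs the maximal error 0.5·m² = 0.5, matching the stated intent: penalize dissimilar same-identity features and similar different-identity features.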
Fig. 9 is a schematic flow diagram showing the training process of the trainer 30. In step 101, the trainer 30 samples two facial images and inputs them to the feature extractor 10, respectively, to obtain the feature representations of the two facial images in all the fully connected layers of the feature extractor 10. Then, in step 102, the trainer 30 computes the identification error by classifying the feature representation of each facial image in each fully connected layer into one of a plurality of (N) identities. Meanwhile, in step 103, the trainer 30 computes the verification error by verifying whether the corresponding feature representations of the two facial images in each fully connected layer come from the same identity. The identification error and the verification error serve as the identification supervisory signal and the verification supervisory signal, respectively. In step 104, the trainer 30 simultaneously back-propagates all the identification and verification supervisory signals through the feature extractor 10 to update the weights of the connections between the neurons in the feature extractor 10. The identification and verification supervisory signals (or errors) added to the fully connected layers FC-n (where n = 1, 2, 3, 4) are simultaneously back-propagated through the cascade of feature extraction modules until they reach the input image. During back-propagation, the errors obtained at each layer in the cascade of feature extraction modules are accumulated, and the weights of the connections between the neurons in the feature extractor 10 are updated according to the magnitudes of the errors. Finally, in step 105, the trainer 30 evaluates whether the training process has converged; if the convergence point has not been reached, steps 101 to 104 are repeated.
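The control flow of steps 101-105 can be sketched with a deliberately toy model: linear layers stand in for the ConvNet, only same-identity pairs and the verification term are used (the identification term is omitted), and the dimensions, learning rate, and epoch count are all made up. Only the loop structure mirrors the process above.

```python
import numpy as np

rng = np.random.default_rng(1)
# One weight matrix standing in for each supervised layer FC-1..FC-4.
W = [rng.standard_normal((4, 4)) * 0.5 for _ in range(4)]

def train(pairs, lr=0.5, epochs=30):
    """Toy version of steps 101-105: sample pairs, accumulate per-layer
    verification errors, take a gradient step, and track convergence."""
    history = []
    for _ in range(epochs):
        total = 0.0
        for x1, x2 in pairs:                 # step 101: an image pair
            delta = x1 - x2
            for n, w in enumerate(W):        # supervise every layer jointly
                diff = w @ delta             # step 103: verification error
                total += 0.5 * float(diff @ diff)
                # step 104: back-propagate -> gradient step on this layer
                W[n] = w - lr * np.outer(diff, delta)
        history.append(total)                # step 105: monitor convergence
    return history

x = rng.standard_normal(4)
losses = train([(x, x + 0.1 * rng.standard_normal(4))])
```

Because the pair shares an identity, the loss pulls each layer toward mapping the two images to the same feature, so the accumulated error shrinks over the epochs.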
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with the other claimed elements as specifically claimed. The foregoing description of the present invention has been presented for purposes of illustration and description only, and is not intended to be exhaustive or to limit the invention to the forms disclosed. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the invention. The embodiments described above were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others skilled in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Claims (15)
1. An apparatus for face recognition, comprising:
an extractor comprising a neural network, the neural network configured as a plurality of cascaded feature extraction modules, wherein each of the cascaded feature extraction modules comprises:
a convolutional layer for extracting local features from an input facial image or from the features extracted in a previous one of the cascaded feature extraction modules; and
a fully connected layer connected to the convolutional layer of the same feature extraction module and extracting global features from the extracted local features; and
an identifier for determining, according to the distances between the extracted global features:
whether two of the input facial images come from the same identity, or
whether one of the input facial images, serving as a search facial image, belongs to the same identity as one of the images in a facial image gallery including the input facial image,
wherein,
if the determined feature distance is smaller than a threshold, the identifier determines that the two faces belong to the same identity, or
if the feature distance between one of the input facial images, serving as the search facial image, and one of the images in the facial image gallery including the input facial image is the smallest compared with the feature distances between the search facial image and all other facial images in the gallery, it is determined that they belong to the same identity.
2. The apparatus according to claim 1, wherein the convolutional layer of the first feature extraction module in the cascaded feature extraction modules is configured to extract the local features from the input facial image, and the convolutional layer in each subsequent feature extraction module is connected to the convolutional layer in the previous feature extraction module in the cascaded feature extraction modules.
3. The apparatus according to claim 2, further comprising:
a trainer configured to update the neural weights of the following connections by back-propagating identification supervisory signals and verification supervisory signals through the cascaded feature extraction modules:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input facial image;
the connections between each convolutional layer in the second through the last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connection between each convolutional layer and the corresponding fully connected layer in the same feature extraction module.
4. The apparatus according to claim 3, wherein the features extracted from each input facial image in the last feature extraction module are sparsely activated, the features having half zeros and half positive values, and each feature being zero on half of all the input facial images and positive on the other half.
5. The apparatus according to claim 3, wherein the features extracted in the last feature extraction module are selective to identities and identity-related attributes, such that for a given identity, or for all input facial images containing a given identity-related attribute, a feature is consistently activated or inhibited.
6. The apparatus according to claim 5, wherein the identity-related attributes include gender and/or race.
7. The apparatus according to claim 1, wherein the feature distance includes one selected from the group consisting of the Euclidean distance, the joint Bayesian distance, the cosine distance, and the Hamming distance.
8. The apparatus according to claim 3, wherein, for an individual facial image, the features output from each fully connected layer are classified into one of a plurality of identities, and the classification error is taken as the identification supervisory signal.
9. The apparatus according to claim 3, wherein, for two compared facial images, the features output from each fully connected layer are separately verified to determine whether the two compared facial images belong to the same identity, and the verification error is taken as the verification supervisory signal.
10. A method for face recognition, comprising:
extracting local features of two or more input facial images by a trained neural network;
extracting global features from the extracted local features by the trained neural network;
determining the distances between the extracted global features; and
determining, according to the determined distances:
whether two of the input facial images come from the same identity, for face verification, or
whether one of the input facial images, serving as a search facial image, belongs to the same identity as one of the images in a facial image gallery including the input image,
wherein the neural network comprises a plurality of cascaded feature extraction modules, each feature extraction module having a convolutional layer, and wherein the convolutional layer in the first feature extraction module of the cascaded feature extraction modules is connected to the input facial image, and the convolutional layer in each subsequent feature extraction module is connected to the convolutional layer in the previous feature extraction module,
and wherein,
if the determined feature distance is smaller than a threshold, it is determined that the two faces belong to the same identity, or
if the feature distance between one of the input facial images, serving as the search facial image, and one of the images in the facial image gallery including the input facial image is the smallest compared with the feature distances between the search facial image and all other facial images in the gallery, it is determined that they belong to the same identity,
wherein each feature extraction module further comprises a fully connected layer, and the fully connected layer in each feature extraction module is connected to the convolutional layer in the same feature extraction module.
11. The method according to claim 10, further comprising:
updating the neural weights of the following connections by back-propagating identification supervisory signals and verification supervisory signals through the cascaded feature extraction modules:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input facial image;
the connections between each convolutional layer in the second through the last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connection between each convolutional layer and the corresponding fully connected layer in the same feature extraction module.
12. The method according to claim 11, wherein the updating further comprises:
inputting two facial images to the neural network, respectively, and obtaining the feature representations of each facial image;
computing the identification error by classifying the feature representation of each facial image in each fully connected layer of the neural network into one of a plurality of identities;
computing the verification error by verifying whether the corresponding feature representations of the two facial images in each fully connected layer come from the same identity, the identification error and the verification error being taken as the identification supervisory signal and the verification supervisory signal, respectively; and
simultaneously back-propagating the identification supervisory signals and the verification supervisory signals through the neural network, so as to update the neural weights of the following connections:
the connection between the convolutional layer in the first feature extraction module and the input layer containing the input facial image;
the connections between each convolutional layer in the second through the last feature extraction modules and the corresponding convolutional layer in the previous feature extraction module; and
the connection between each convolutional layer and the corresponding fully connected layer in the same feature extraction module.
13. The method according to claim 11, wherein the features extracted from each facial image in the last feature extraction module are sparsely activated, the features having half zeros and half positive values, and each feature being zero on half of all the facial images and positive on the other half.
14. The method according to claim 11, wherein the features extracted in the last feature extraction module are selective to identities and identity-related attributes, such that for a given identity, or for all input facial images containing a given identity-related attribute, a feature is consistently activated or inhibited.
15. The method according to claim 14, wherein the identity-related attributes include gender and/or race.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2014/001091 WO2016086330A1 (en) | 2014-12-03 | 2014-12-03 | A method and a system for face recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107004115A CN107004115A (en) | 2017-08-01 |
CN107004115B true CN107004115B (en) | 2019-02-15 |
Family
ID=56090783
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480083717.5A Active CN107004115B (en) | 2014-12-03 | 2014-12-03 | Method and system for recognition of face |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107004115B (en) |
WO (1) | WO2016086330A1 (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10579785B2 (en) * | 2017-09-29 | 2020-03-03 | General Electric Company | Automatic authentification for MES system using facial recognition |
EP3698268A4 (en) | 2017-11-22 | 2021-02-17 | Zhejiang Dahua Technology Co., Ltd. | Methods and systems for face recognition |
CN108009481A (en) * | 2017-11-22 | 2018-05-08 | 浙江大华技术股份有限公司 | A kind of training method and device of CNN models, face identification method and device |
CN108073917A (en) * | 2018-01-24 | 2018-05-25 | 燕山大学 | A kind of face identification method based on convolutional neural networks |
CN110309692B (en) * | 2018-03-27 | 2023-06-02 | 杭州海康威视数字技术股份有限公司 | Face recognition method, device and system, and model training method and device |
TWI667621B (en) * | 2018-04-09 | 2019-08-01 | 和碩聯合科技股份有限公司 | Face recognition method |
CN111079549B (en) * | 2019-11-22 | 2023-09-22 | 杭州电子科技大学 | Method for carrying out cartoon face recognition by utilizing gating fusion discrimination characteristics |
CN111968264A (en) * | 2020-10-21 | 2020-11-20 | 东华理工大学南昌校区 | Sports event time registration device |
CN116311464B (en) * | 2023-03-24 | 2023-12-12 | 北京的卢铭视科技有限公司 | Model training method, face recognition method, electronic device and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1700240A (en) * | 2004-05-17 | 2005-11-23 | 香港中文大学 | Face recognition method based on random sampling |
CN1866270A (en) * | 2004-05-17 | 2006-11-22 | 香港中文大学 | Face recognition method based on video frequency |
CN101763506A (en) * | 2008-12-22 | 2010-06-30 | Nec九州软件株式会社 | Facial image tracking apparatus and method, computer readable recording medium |
CN102629320A (en) * | 2012-03-27 | 2012-08-08 | 中国科学院自动化研究所 | Ordinal measurement statistical description face recognition method based on feature level |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100543707B1 (en) * | 2003-12-04 | 2006-01-20 | 삼성전자주식회사 | Face recognition method and apparatus using PCA learning per subgroup |
EP2672426A3 (en) * | 2012-06-04 | 2014-06-04 | Sony Mobile Communications AB | Security by z-face detection |
- 2014-12-03 CN CN201480083717.5A patent/CN107004115B/en active Active
- 2014-12-03 WO PCT/CN2014/001091 patent/WO2016086330A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
Research on deep learning algorithms and applications based on convolutional neural networks; Chen Xianchang; China Master's Theses Full-text Database, Information Science and Technology; 2014-09-15; vol. 2014, no. 9; pp. 13-17, 23-25 |
Also Published As
Publication number | Publication date |
---|---|
CN107004115A (en) | 2017-08-01 |
WO2016086330A1 (en) | 2016-06-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107004115B (en) | Method and system for recognition of face | |
Li et al. | Visual semantic reasoning for image-text matching | |
CN107423701B (en) | Face unsupervised feature learning method and device based on generative confrontation network | |
Ramakrishnan et al. | Overcoming language priors in visual question answering with adversarial regularization | |
Hassan et al. | Soft biometrics: A survey: Benchmark analysis, open challenges and recommendations | |
Dekhtyar et al. | Re data challenge: Requirements identification with word2vec and tensorflow | |
CN106415594A (en) | A method and a system for face verification | |
Zeng et al. | Efficient person re-identification by hybrid spatiogram and covariance descriptor | |
CN103778409A (en) | Human face identification method based on human face characteristic data mining and device | |
Abdulrahman et al. | Comparative study for 8 computational intelligence algorithms for human identification | |
Zalasiński et al. | New algorithm for evolutionary selection of the dynamic signature global features | |
Khan et al. | Human Gait Analysis: A Sequential Framework of Lightweight Deep Learning and Improved Moth‐Flame Optimization Algorithm | |
Al-Modwahi et al. | Facial expression recognition intelligent security system for real time surveillance | |
WO2022142903A1 (en) | Identity recognition method and apparatus, electronic device, and related product | |
CN110837777A (en) | Partial occlusion facial expression recognition method based on improved VGG-Net | |
Bharath et al. | Iris recognition using radon transform thresholding based feature extraction with Gradient-based Isolation as a pre-processing technique | |
Venkat et al. | Recognizing occluded faces by exploiting psychophysically inspired similarity maps | |
Wang et al. | Prototype-based intent perception | |
Singh et al. | A sparse coded composite descriptor for human activity recognition | |
KR101938491B1 (en) | Deep learning-based streetscape safety score prediction method | |
Saha et al. | Topomorphological approach to automatic posture recognition in ballet dance | |
Tunc et al. | Age group and gender classification using convolutional neural networks with a fuzzy logic-based filter method for noise reduction | |
Wu et al. | The use of kernel set and sample memberships in the identification of nonlinear time series | |
Liao et al. | Federated hierarchical hybrid networks for clickbait detection | |
Meraoumia et al. | 2D and 3D palmprint information and hidden Markov model for improved identification performance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||