CN110414378A - Face recognition method based on fused features of heterogeneous face images - Google Patents
- Publication number
- CN110414378A CN110414378A CN201910619187.6A CN201910619187A CN110414378A CN 110414378 A CN110414378 A CN 110414378A CN 201910619187 A CN201910619187 A CN 201910619187A CN 110414378 A CN110414378 A CN 110414378A
- Authority
- CN
- China
- Prior art keywords
- face
- image
- facial image
- sketch
- caricature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
- G06F18/2193—Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
The invention discloses a face recognition method based on fused features of heterogeneous face images. Face images in a face database are preprocessed, and fixed-size images containing the face are cropped out. A face frontalization model based on feature separation is trained and used to frontalize the preprocessed face images. Sketch and caricature generation models based on a cycle-consistent generative adversarial network are trained and used to generate sketch and caricature face images from the frontalized faces. A residual network extracts features from the real photographs (after preprocessing and frontalization) and from the generated sketch and caricature face images; the features are fused, and face recognition is performed on the fused feature. The recognition accuracy of the invention is higher, with better discriminability and robustness.
Description
Technical field
The invention belongs to the technical field of image processing, and in particular relates to a face recognition method.
Background technique
Since face recognition technology came into widespread use, a great deal of research effort has been invested in it. Early face recognition techniques, such as Eigenfaces, relied mainly on geometric facial features, identifying a face by analyzing the positions of facial components and the topological relations between them. Although simple, such methods yield a final recognition accuracy that is highly sensitive to changes in pose and expression. Around 2012, neural networks gradually entered the mainstream. At the current stage, research on face recognition falls mainly into two categories: first, proposals or improvements of network architectures; second, improvements of loss functions.
Network architecture: after neural networks rose to prominence in 2012, more and more researchers applied deep learning to face recognition. In 2014, DeepFace, proposed by Facebook, became the bellwether of face recognition research and achieved 97.35% accuracy on the LFW face database. Google subsequently released FaceNet, which learns a mapping from images to points in a Euclidean space and compares two images by the distance between their embedded points; this method reached 99.63% accuracy on LFW.
As networks continued to deepen, DeepID1 performed face recognition with high-dimensional features and generalized well. DeepID2 improved the loss function on the basis of DeepID1, reducing intra-class distance and enlarging inter-class distance, so that the trained model achieved higher accuracy. In 2014, after VGGNet's success on ImageNet, researchers applied it to face recognition with equally good results. In the same competition, GoogLeNet also achieved excellent results on several tasks. VGGNet and GoogLeNet share one common trait: ever-increasing network depth. Unlike VGGNet, however, GoogLeNet explored a more novel architecture rather than directly inheriting the characteristics of AlexNet, and its model is much smaller than VGGNet while its performance is comparatively superior. In 2015, the residual network won the ImageNet classification task, surpassing VGGNet. The residual module greatly reduces the loss of feature information during training, so the final feature representation retains richer information. Many model variants have since emerged on the basis of the residual network.
Loss function: once deep learning attracted broad attention, a variety of loss functions were gradually proposed. The earliest and best known is the Softmax loss, and many later loss functions are variations of it. Although the Softmax loss can classify samples correctly, it imposes no constraint on intra-class or inter-class distance. CenterLoss was therefore proposed in 2016, chiefly to constrain the intra-class distance of each class so that samples of one class are pulled toward their center, improving recognition accuracy. Although CenterLoss reduces intra-class distance and somewhat improves accuracy, it still does not constrain inter-class distance. To realize both constraints at once, Liu et al. proposed SphereFace in 2017. SphereFace is in fact mainly a modification of the loss function, namely the A-Softmax loss. When the Softmax loss is computed, the product of the feature vector and the weights already contains angular information; in other words, the Softmax loss can make the learned features exhibit an angular distribution. A-Softmax makes these angular distribution characteristics more pronounced, thereby reducing intra-class distance and enlarging inter-class distance.
In 2018, Wang proposed CosFace. Compared with SphereFace, CosFace is easier to implement, reduces the amount of parameter computation, and removes the supervision constraint of the Softmax loss, making the training process more concise and convergent. Most importantly, CosFace significantly improves model performance. Although the cosine can be mapped back to an angle, the two spaces still differ markedly: maximizing the classification margin in the angular space is more interpretable than in the cosine space, because in the angular space the margin corresponds to an arc distance on the hypersphere. Deng then proposed ArcFace and, in particular, its loss function, the additive angular margin loss (Angular Margin Loss). This loss places the angular margin inside the cosine function, enabling the network to learn more angular characteristics.
Besides the research on network architectures and loss functions for face recognition, part of the literature focuses on building face databases and on different face feature representations. Face recognition databases of different scales and difficulties have been published one after another. Nevertheless, the prior art still conducts face recognition research around real photographic images only.
Summary of the invention
To solve the technical problems mentioned in the above background, the invention proposes a face recognition method based on fused features of heterogeneous face images, extending research that uses only real photographs for face recognition to a method that combines real and heterogeneous face images.
To achieve the above technical purpose, the technical solution of the invention is as follows:
A face recognition method based on fused features of heterogeneous face images, comprising the following steps:
(1) preprocessing the face images in a face database, and cropping out fixed-size images containing the face;
(2) training a face frontalization model based on feature separation, and frontalizing the preprocessed face images;
(3) training sketch and caricature generation models based on a cycle-consistent generative adversarial network, and generating sketch and caricature face images from the frontalized face images;
(4) using a residual network to extract and fuse features from the real images (after preprocessing and frontalization) and from the sketch and caricature face images produced by the generation models, and performing face recognition on the fused feature.
Further, in step (1), face detection and key-point localization are first performed on the face image, and the face is cropped out using the key-point position information.
Further, in step (1), the cropping method is as follows: the face is first leveled using the detected eye positions so that the two eyes lie on the same horizontal line; the eye positions are then fixed, and an image of predetermined size is selected.
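The eye-leveling step in this claim can be sketched as follows; the eye coordinates and helper names are illustrative assumptions, and a real pipeline would apply the same rotation to the whole image rather than to isolated points:

```python
import numpy as np

def align_eyes(left_eye, right_eye):
    """Return a 2x2 rotation matrix and the eye midpoint that level the eyes."""
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    dx, dy = right_eye - left_eye
    angle = np.arctan2(dy, dx)           # tilt of the inter-ocular line
    c, s = np.cos(-angle), np.sin(-angle)
    R = np.array([[c, -s], [s, c]])      # rotate by -angle to level the eyes
    center = (left_eye + right_eye) / 2.0
    return R, center

def rotate_point(p, R, center):
    """Rotate a point about the eye midpoint."""
    return R @ (np.asarray(p, dtype=float) - center) + center

# Example: eyes detected at a 45-degree tilt
R, c = align_eyes((0.0, 0.0), (2.0, 2.0))
le = rotate_point((0.0, 0.0), R, c)
re = rotate_point((2.0, 2.0), R, c)
# After rotation both eyes share the same y coordinate
```

Once the eyes are level, a fixed-size window (256 × 256 in the embodiment below) is cropped around the fixed eye positions.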
Further, in step (2), the face frontalization model based on feature separation uses an encoder and a decoder to play the role of the generator, and uses a discriminator both to judge whether an image is real and to identify the identity and pose of the image; by providing a pose code to the decoder and pose estimation in the discriminator, facial identity features are separated from pose.
Further, in step (2), the optimization functions of the face frontalization model based on feature separation are V_D(D, G) and V_G(D, G), the objectives maximized by the discriminator and the generator respectively. Here D denotes the discriminator, G the encoder-decoder, E(·) mathematical expectation, x a real image, y the identity and pose labels, c the target pose, and z noise; x, y ~ P_data(x, y) indicates that x and y come from the given data, z ~ P_z(z) denotes noise obeying a Gaussian distribution, and c ~ P_z(c) denotes a target pose within a preset range. The discriminator has separate outputs for judging the real/fake property, the pose information, and the identity information of an image.
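The optimization functions referenced here did not survive text extraction. A hedged reconstruction, modeled on the published DR-GAN objectives that the surrounding notation appears to describe (the head labels d, p and id for the real/fake, pose and identity outputs are assumptions, not taken from the original), is:

```latex
\begin{aligned}
\max_{D} V_D(D,G) ={}& \mathbb{E}_{x,y\sim P_{data}(x,y)}
   \big[\log D^{d}(x) + \log D^{p}_{y^{p}}(x) + \log D^{id}_{y^{id}}(x)\big] \\
 &+ \mathbb{E}_{x\sim P_{data},\, z\sim P_z(z),\, c\sim P_z(c)}
   \big[\log\!\big(1 - D^{d}(G(x,c,z))\big)\big] \\[4pt]
\max_{G} V_G(D,G) ={}& \mathbb{E}_{x,y\sim P_{data},\, z\sim P_z(z),\, c\sim P_z(c)}
   \big[\log D^{d}(G(x,c,z)) + \log D^{p}_{c}(G(x,c,z)) + \log D^{id}_{y^{id}}(G(x,c,z))\big]
\end{aligned}
```

In words: the discriminator is rewarded for correctly classifying real images by realness, pose and identity while rejecting generated ones; the generator is rewarded when its output fools the realness head, matches the requested pose c, and preserves the input identity.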
Further, in step (3), the total loss function L(G_AB, G_BA, D_A, D_B) of the cycle-consistent generative adversarial network is as follows:
L(G_AB, G_BA, D_A, D_B) = L_GAN(G_AB, D_B, A, B) + L_GAN(G_BA, D_A, B, A) + λ·L_cyc(G_AB, G_BA)
In the above formula, A and B denote the two modalities; G_AB denotes the generator converting images of modality A into modality B, and G_BA the generator converting images of modality B into modality A; D_A is the discriminator judging whether an image belongs to modality A, and D_B the discriminator judging whether an image belongs to modality B. L_GAN(G_AB, D_B, A, B) denotes the loss function of discriminator D_B:
L_GAN(G_AB, D_B, A, B) = E_{b~P_data(b)}[log D_B(b)] + E_{a~P_data(a)}[log(1 - D_B(G_AB(a)))]
In the above formula, a is an image of modality A, b an image of modality B, E(·) mathematical expectation, and a ~ P_data(a), b ~ P_data(b) the data-set distributions that the images of the two modalities obey. L_GAN(G_BA, D_A, B, A) denotes the loss function of discriminator D_A:
L_GAN(G_BA, D_A, B, A) = E_{a~P_data(a)}[log D_A(a)] + E_{b~P_data(b)}[log(1 - D_A(G_BA(b)))]
L_cyc(G_AB, G_BA) denotes the cycle-consistency loss:
L_cyc(G_AB, G_BA) = E_{a~P_data(a)}[||G_BA(G_AB(a)) - a||_1] + E_{b~P_data(b)}[||G_AB(G_BA(b)) - b||_1]
In the above formula, â denotes G_BA(G_AB(a)) and b̂ denotes G_AB(G_BA(b)); λ is the weight coefficient of L_cyc(G_AB, G_BA).
Further, in step (3), a sketch face database is selected, the sketch images in it are fed into the cycle-consistent generative adversarial network for training, and a sketch generation model is obtained; the sketch images produced by the sketch generation model retain the global and contour information of the image. A caricature face database is selected, the caricature images in it are fed into the cycle-consistent generative adversarial network for training, and a caricature generation model is obtained; the caricature images produced by the caricature generation model retain the local information of the image.
The beneficial effects brought by adopting the above technical scheme are as follows: the invention performs face recognition using a face frontalization method based on feature separation and heterogeneous face images generated by a cycle-consistent generative adversarial network. Experiments show that, under identical experimental conditions, the recognition accuracy of the proposed method on the VGGFace2 face database is higher than that of other recognition methods on the same database.
Detailed description of the invention
Fig. 1 is the basic flow chart of the invention;
Fig. 2 is the flow chart of face frontalization based on feature separation;
Fig. 3 is a schematic diagram of the generative adversarial network and the feature-separation face frontalization network;
Fig. 4 shows feature-separation face frontalization results on Multi-PIE and VGGFace2;
Fig. 5 is a schematic diagram of the principle of the cycle-consistent generative adversarial network;
Fig. 6 is the framework diagram of heterogeneous face image generation based on the cycle-consistent generative adversarial network;
Fig. 7 shows sketch and caricature generation results based on the cycle-consistent generative adversarial network;
Fig. 8 is the flow chart of face recognition based on fused features of heterogeneous face images.
Specific embodiment
The technical solution of the invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the face recognition method based on fused features of heterogeneous face images designed by the invention proceeds as follows:
1. The face images in a face database are preprocessed, and fixed-size images containing the face are cropped out.
2. A face frontalization model based on feature separation is trained and used to frontalize the preprocessed face images.
3. Sketch and caricature generation models based on a cycle-consistent generative adversarial network are trained, and sketch and caricature face images are generated from the frontalized face images.
4. A residual network extracts and fuses features from the real images (after preprocessing and frontalization) and from the sketch and caricature face images produced by the generation models, and face recognition is performed on the fused feature.
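The patent does not pin down the fusion operator in step 4; a minimal sketch, assuming simple concatenation of the three per-modality embeddings followed by L2 normalization (the embedding sizes and network names are illustrative), is:

```python
import numpy as np

def fuse_features(f_real, f_sketch, f_caricature):
    """Concatenate per-modality embeddings and L2-normalize the result.

    Concatenation is an assumed fusion operator; the patent text does not
    specify the exact fusion rule."""
    fused = np.concatenate([f_real, f_sketch, f_caricature])
    norm = np.linalg.norm(fused)
    return fused / norm if norm > 0 else fused

# Toy 4-D embeddings standing in for the outputs of the three networks
f_p = np.ones(4)    # real-photo branch
f_s = np.zeros(4)   # sketch branch
f_c = np.ones(4)    # caricature branch
fused = fuse_features(f_p, f_s, f_c)
```

Recognition then reduces to comparing fused vectors, e.g. by cosine similarity between a probe and each gallery identity.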
This embodiment uses the VGGFace2 and LFW face databases. Because the images in these databases are complex, with most containing extra body regions and cluttered background information, all pictures must undergo a certain preprocessing before the subsequent face recognition research. The VGGFace2 face database provides the face key-point coordinates of every picture, so face cropping can be carried out on it directly. Since the LFW face database provides no corresponding key-point information, the invention uses MTCNN to perform face detection and key-point localization on the LFW face images, and crops the faces using the detected key-point positions.
Because different works currently crop faces in different ways, the invention first levels the face using the detected eye positions so that the two eyes lie on the same horizontal line, then fixes the eye positions and selects an image of size 256 × 256. After this preprocessing of all images, the resulting face images contain only facial information, which can improve recognition accuracy to a great extent.
Many face images in the VGGFace2 and LFW databases carry pose variation, and pose is one of the most common major challenges in current face recognition. For most face recognition methods, introducing different poses lowers recognition accuracy by about 10%, whereas for a human observer the drop is only slight. Two common remedies for the pose problem in face recognition are face frontalization and the learning of pose-specific features. The invention combines the two and proposes a face frontalization method based on feature separation, whose flow is shown in Fig. 2.
DRGAN in Fig. 2 denotes the face frontalization method based on feature separation. Compared with a plain GAN, it has two innovations: first, an encoder-decoder structure plays the role of the generator; second, by providing a pose code to the decoder and pose estimation in the discriminator, facial identity features are separated from pose. The GAN structure is shown in Fig. 3(a), and the network structure of the feature-separation face frontalization is shown in Fig. 3(b).
As can be seen from Fig. 3, in the feature-separation face frontalization network an encoder and a decoder replace the generator part of the GAN, and besides judging whether an image is real, the discriminator must also identify the identity and pose of the image. The optimization function of the feature-separation face frontalization can be obtained by modifying the loss function of the GAN:
Here, E(·) denotes mathematical expectation, x a real image, y the identity and pose labels, c the target pose, and z noise; x, y ~ P_data(x, y) indicates that x and y come from the given data, z ~ P_z(z) denotes noise obeying a Gaussian distribution, c ~ P_z(c) a target pose within a preset range, and G the encoder-decoder. The discriminator outputs used to judge the real/fake property, the pose information and the identity information of an image are denoted separately. The network parameters of the encoder, decoder and discriminator used in the feature-separation face frontalization method are listed in Table 1.
Table 1
The face database used for training is Multi-PIE. It covers pose, illumination and expression, is one of the more commonly used face databases at present, and contains each subject in 15 different poses. Nine poses of 337 subjects under neutral expression and good illumination are randomly selected, with profile angles within ±60° on either side. Of the 337 selected subjects, the images of 200 serve as training samples and those of the remaining 137 as test samples. During training, all real face images are first aligned to a 100 × 100 frontal view, and 96 × 96 regions are randomly sampled from the aligned images to achieve data augmentation. Batch training is used, with 64 face images per batch. All convolutional weights in the network are initialized from a normal distribution with mean 0 and standard deviation 0.02. Adam is chosen as the optimization method, with an initial learning rate of 0.0002 and momentum 0.5. After a long training process, some of the face frontalization results obtained on the test set are shown in Fig. 4(a).
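The initialization scheme just described (zero-mean Gaussian, standard deviation 0.02) can be sketched as follows; the layer shape is an illustrative assumption:

```python
import numpy as np

def init_conv_weights(shape, mean=0.0, std=0.02, seed=0):
    """Gaussian initialization matching the reported mean-0 / std-0.02 scheme."""
    rng = np.random.default_rng(seed)
    return rng.normal(mean, std, size=shape)

# Hypothetical first-layer conv: 64 filters, 3 input channels, 3x3 kernels.
# Optimizer per the embodiment: Adam, lr = 2e-4, momentum (beta1) = 0.5.
w = init_conv_weights((64, 3, 3, 3))
```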
Fig. 4(a) shows that the larger the pose to be corrected, the harder the task, so the frontalized faces exhibit local distortion and discontinuity. Since this embodiment performs the final face recognition task on VGGFace2, the faces with large pose variation in VGGFace2 must be frontalized, not only to extract more reasonable facial features but also to facilitate the generation of the corresponding heterogeneous face images. The feature-separation face frontalization is applied to part of the VGGFace2 face images with large pose variation; some results are shown in Fig. 4(b).
Fig. 4(b) shows that the frontalization effect is fairly good, although some pictures are distorted and some exhibit a blue ghosting artifact. Despite these minor problems, the contour and component information of the face is well preserved, and the facial components of the frontalized faces occupy nearly identical positions.
After the face images are frontalized, the next task is to generate sketch and caricature images from them. Owing to the image generation capability of GANs, more and more research has appeared in the field of image style learning and transfer, of which the cycle-consistent generative adversarial network is the most common. This network can convert images between different modalities, for example converting horses and zebras in an image into each other. Inter-modality conversion is in fact the learning of a mapping between modalities. The aim of the invention is to generate the corresponding heterogeneous images from real photographs, i.e. to learn the mapping functions from real images to sketches and from real images to caricatures. An ordinary GAN generates in one direction only, whereas a cycle-consistent generative adversarial network generates in both directions while discriminators judge whether the input images are real. Precisely because of this mutual generation, the mapping space between the two modalities is constrained, preventing different inputs from mapping to the same output. The principle of the cycle-consistent generative adversarial network is illustrated in Fig. 5.
The cycle-consistent generative adversarial network can convert images of modality A into images of modality B. To realize generation in both directions, two generators G_AB and G_BA are needed, together with two discriminators D_A and D_B that judge whether the images are real, as shown in Fig. 5(a). The two generators convert images between modality A and modality B. An image a of modality A is passed through G_AB to obtain a generated image G_AB(a), which is then passed through G_BA to obtain an image G_BA(G_AB(a)) of modality A, written â, as shown in Fig. 5(b). Likewise, the image obtained by passing b through the two generators is written b̂, as shown in Fig. 5(c). Because the model contains a mutual generation process, two discriminators are needed to judge whether the generated images are real. Besides the images obtained from the first generation, the images obtained from the second generation must also be discriminated. In other words, within one generation cycle, the losses to be computed include the adversarial loss on the real/fake property of the input and generated images and the cycle loss caused by the second generation.
Since the cycle-consistent generative adversarial network can be regarded as the superposition of two GANs, the first discriminator D_B is mainly used to discriminate between G_AB(a) and the real image b of modality B, that is:
L_GAN(G_AB, D_B, A, B) = E_{b~P_data(b)}[log D_B(b)] + E_{a~P_data(a)}[log(1 - D_B(G_AB(a)))]
In the above formula, E(·) denotes mathematical expectation, and a ~ P_data(a), b ~ P_data(b) denote the data-set distributions of the images of the two modalities. The main purpose of G_AB is to make the generated image G_AB(a) look like an image of modality B, while the purpose of D_B is to tell, as far as possible, which of G_AB(a) and b is the real image of modality B. The optimization target can therefore be expressed as:
min_{G_AB} max_{D_B} L_GAN(G_AB, D_B, A, B)
Similarly, the loss function of the second discriminator D_A can be expressed as:
L_GAN(G_BA, D_A, B, A) = E_{a~P_data(a)}[log D_A(a)] + E_{b~P_data(b)}[log(1 - D_A(G_BA(b)))]
The main purpose of G_BA is to make the generated image G_BA(b) look like an image of modality A, so that the discriminator cannot distinguish real from fake, while the purpose of D_A is to tell, as far as possible, which of G_BA(b) and a is the real image of modality A, i.e. to judge whether the input image is real. The corresponding optimization target can be expressed as:
min_{G_BA} max_{D_A} L_GAN(G_BA, D_A, B, A)
However, the adversarial loss alone cannot guarantee that the learned mapping gives every individual input a satisfactory output, and the gradient-vanishing problem easily occurs during training. To further reduce the possible variation space of the mapping functions, the invention introduces a more effective loss: the cycle-consistency loss. Its purpose is to check whether a and G_BA(G_AB(a)) both belong to modality A, and whether b and G_AB(G_BA(b)) both come from modality B. The cycle-consistency loss can therefore be expressed as:
L_cyc(G_AB, G_BA) = E_{a~P_data(a)}[||G_BA(G_AB(a)) - a||_1] + E_{b~P_data(b)}[||G_AB(G_BA(b)) - b||_1]
Thus, in the cycle-consistent generative adversarial network, the total loss function selected by the invention can be expressed as:
L(G_AB, G_BA, D_A, D_B) = L_GAN(G_AB, D_B, A, B) + L_GAN(G_BA, D_A, B, A) + λ·L_cyc(G_AB, G_BA)
In the above formula, λ denotes the weight of the cycle-consistency loss.
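The three loss terms above can be sketched numerically. The toy generators and discriminators below, and the default λ = 10 (a common CycleGAN choice, not stated in the patent), are illustrative assumptions:

```python
import numpy as np

def l1(x, y):
    """Mean absolute error, the per-image L1 term of the cycle loss."""
    return float(np.mean(np.abs(x - y)))

def cycle_loss(G_AB, G_BA, a, b):
    """L_cyc = ||G_BA(G_AB(a)) - a||_1 + ||G_AB(G_BA(b)) - b||_1."""
    return l1(G_BA(G_AB(a)), a) + l1(G_AB(G_BA(b)), b)

def gan_loss(D, G, real_target, fake_source):
    """L_GAN = E[log D(real)] + E[log(1 - D(G(fake_source)))]."""
    eps = 1e-12  # numerical guard for log
    return float(np.mean(np.log(D(real_target) + eps))
                 + np.mean(np.log(1.0 - D(G(fake_source)) + eps)))

def total_loss(G_AB, G_BA, D_A, D_B, a, b, lam=10.0):
    return (gan_loss(D_B, G_AB, b, a)
            + gan_loss(D_A, G_BA, a, b)
            + lam * cycle_loss(G_AB, G_BA, a, b))

# Toy check: identity generators make the cycle term vanish
a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
ident = lambda x: x
```

With identity generators the cycle term is exactly zero, and an undecided discriminator outputting 0.5 everywhere contributes 2·log(0.5) per adversarial term.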
On this basis the invention generates heterogeneous face images; the heterogeneous face image generation framework based on the cycle-consistent generative adversarial network is shown in Fig. 6. Before model training, this embodiment selects part of the sketch pictures from three sketch face databases compiled by the Chinese University of Hong Kong to train the sketch generation model, and part of the caricature pictures from two of the current larger caricature face databases to train the caricature generation model. The real pictures used to train both models are part of the real photographs in the sketch face databases. For ease of distinction, the two models generating sketches and caricatures are called PSGAN and PCGAN respectively. The sketch and caricature results generated by the two models are shown in Fig. 7.
The first column of Fig. 7 shows real images randomly chosen from the VGGFace2 face database; the second column shows the corresponding caricature face images generated by PCGAN, and the third the corresponding sketch face images generated by PSGAN. It is easy to see that the caricature and sketch images obtained from the trained PCGAN and PSGAN exhibit the characteristics peculiar to heterogeneous images: the sketch images well preserve the global and contour information of the image, while the caricatures strengthen local features and retain more local information. Although the generated sketches show somewhat discontinuous strokes, they can reduce the influence of illumination and background to a certain extent, as the sketch images in Fig. 7 clearly show. Compared with the sketches, the caricature contours are sometimes blurred or even lost. Analysis suggests that the caricature distortion arises because the styles of the caricature face images used for training are not fixed and vary considerably.
This embodiment chooses the VGGFace2 face database for the face recognition research, so before feature extraction and heterogeneous face image generation, the face images with large pose variation in the database must be frontalized. After face frontalization, all profile or large-pose pictures become the corresponding frontal pictures. The two trained models PSGAN and PCGAN then generate the two kinds of heterogeneous face images, sketches and caricatures, respectively.
Since the present invention mainly studies a face recognition method based on heterogeneous face image fusion features, the residual networks used to extract real-face, caricature and sketch features are referred to as PResNet, CResNet and SResNet, respectively. The experiments are divided into three groups: 1) only the frontalized and cropped real-face images are used; 2) the PSGAN model trained in step 3) generates the corresponding sketch for every cropped face image, and real-face and sketch images are then used together; 3) the PCGAN model of step 3) generates the corresponding caricature for every cropped face image, and real-face and caricature images are then used together. For ease of distinction, the three groups of experiments are referred to as ResNet-Fusion(1), ResNet-Fusion(2) and ResNet-Fusion(3). The present invention selects the joint loss of Softmax and CenterLoss as the loss function for model training, with the weight of CenterLoss denoted by λ. The overall face recognition framework based on faces and the fusion features of the corresponding heterogeneous face images is shown in Fig. 8.
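As a non-authoritative sketch of the joint loss just described (the exact training code is not given in the text, and the λ value below is a placeholder), the Softmax and CenterLoss combination can be written in plain NumPy:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Numerically stable softmax cross-entropy, averaged over the batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(features, labels, centers):
    # Half the mean squared distance between each feature and its class center.
    return 0.5 * ((features - centers[labels]) ** 2).sum(axis=1).mean()

def joint_loss(logits, features, labels, centers, lam=0.003):
    # L = L_softmax + lambda * L_center; lam stands for the CenterLoss
    # weight lambda from the text (0.003 is an assumed placeholder value).
    return softmax_cross_entropy(logits, labels) + lam * center_loss(features, labels, centers)
```

When the features coincide with their class centers, the CenterLoss term vanishes and the joint loss reduces to the Softmax term alone.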
The face frontalization step is not drawn in Fig. 8, but in practice, before model training, the face images with large pose variation are all sent to the separation-feature-based face frontalization network for frontalization. In addition, PSGAN and PCGAN in the figure denote the two heterogeneous face image generation models, both of which are trained in advance.
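The text does not fix the feature-fusion operator itself. One plausible sketch, consistent with the 2048-dimensional fused feature reported later, is to L2-normalize each modality's feature and average them; `fuse_features` and its averaging rule are assumptions for illustration, not the patent's stated method:

```python
import numpy as np

def l2_normalize(v):
    # Scale a feature vector to unit Euclidean length.
    return v / np.linalg.norm(v)

def fuse_features(real_feat, hetero_feat):
    # Hypothetical fusion: normalize the real-face feature and the
    # heterogeneous (sketch or caricature) feature, then average, so the
    # fused feature keeps the same dimension as each input.
    return (l2_normalize(real_feat) + l2_normalize(hetero_feat)) / 2.0
```

Concatenation would be another common choice, but it would double the feature dimension, whereas averaging keeps it unchanged.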
The present invention selects residual networks to extract features from each of the three kinds of images: real faces, sketches and caricatures. The image size after face frontalization is 256 × 256. Because the images are large, each training batch holds fewer images and the training process is longer. To speed up model training, this embodiment resizes the input images used in the experiments to 112 × 112. Although the smaller images may lose some detail information, this greatly accelerates the training and optimization of the model and saves a large amount of time. The network parameters for feature extraction in the face recognition framework are shown in Table 2.
Table 2
In Table 2, n denotes the number of training samples. Between Pool1 and Conv2, Pool2 and Conv3, Pool3 and Conv4, and Pool4 and FC6 there are 1, 2, 5 and 3 residual modules, respectively. The convolution used in each residual module is 3 × 3/1/1, i.e. the kernel size of a residual module is 3, the stride is 1 and the padding is 1. Therefore, the size of an image does not change after it passes through a residual module.
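The size-preserving property of the 3 × 3/1/1 convolution can be checked against the standard convolution output-size formula:

```python
def conv_out_size(size, kernel=3, stride=1, pad=1):
    # floor((n + 2p - k) / s) + 1 for a square input of side n.
    return (size + 2 * pad - kernel) // stride + 1
```

With kernel 3, stride 1 and padding 1 the spatial size is unchanged, so the 112 × 112 inputs keep their resolution through every residual module.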
The VGGFace2 face database contains 3.31 million face images in total. Using more than 3 million face images for each of the three experiments would take a great deal of time, so this embodiment divides the training set of the database into three parts by class; each part contains nearly 1 million face images and is used for one experiment. Within the network, the residual networks used to extract the features of the different image types do not share parameters. The initial weights of all residual modules follow a normal distribution with mean 0 and standard deviation 0.01, the initial learning rate of the model is 0.01, and the attenuation coefficient is 0.1. The trained models are tested on the test-set data, and the final recognition accuracies of the three experiments are shown in Table 3.
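A minimal sketch of the initialization and learning-rate schedule just described; the decay interval `decay_every` is an assumption, since the text states only the initial rate and the 0.1 attenuation coefficient:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def init_residual_weights(shape, std=0.01):
    # Normal initialization with mean 0 and standard deviation 0.01,
    # as specified for all residual modules.
    return rng.normal(loc=0.0, scale=std, size=shape)

def learning_rate(step, base_lr=0.01, decay=0.1, decay_every=100_000):
    # Step decay: multiply the learning rate by the attenuation
    # coefficient 0.1 every decay_every iterations (the interval itself
    # is a placeholder, not given in the text).
    return base_lr * decay ** (step // decay_every)
```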
Table 3
From Table 3 it can be observed that the recognition accuracy using only real-face image features is 88.2%, while the final recognition rates of the two experiments that fuse heterogeneous face images improve by 1.6% and 2.0%, respectively, over using real-face images alone. This shows that after fusing heterogeneous face image features, the feature representation finally learned by the network is more discriminative. Compared with other experimental results on this database, under the same conditions of no pre-trained model and a final feature dimension of 2048, the fusion features proposed by the present invention achieve higher face recognition accuracy. To further illustrate the discriminability and robustness of the fusion features, this embodiment also performs face verification experiments with the three trained models on the LFW face database. The three trained models are used to extract and fuse features for the 6,000 LFW image pairs. The cosine similarity of each of the 6,000 pairs is then computed, the similarities are sorted in descending order, the number of error samples among the top 3,000 face image pairs is counted, and the face verification accuracy is calculated. The final test results are shown in Table 4.
Table 4
Method | Accuracy rate (%) |
---|---|
Combined Joint Bayesian | 92.42 |
Roof-Hog | 92.80 |
Tom-vs-Pete+Attribute | 93.30 |
FR+FCN | 96.45 |
DeepID2 | 99.47 |
SphereFace | 99.42 |
PingAn AI Lab | 99.80 |
ResNet-Fusion(1) | 95.52 |
ResNet-Fusion(2) | 96.08 |
ResNet-Fusion(3) | 96.15 |
In Table 4, the first seven rows are the face verification accuracies obtained by other methods on this database. From Table 4 it can be seen that after fusing the features of the two kinds of heterogeneous face images, sketches and caricatures, the verification accuracies are 96.08% and 96.15%, which are 0.56% and 0.63% higher, respectively, than without fusing heterogeneous face image features. The face verification tests on the LFW face database illustrate, to a certain extent, that the feature representation obtained after fusing sketch and caricature features has good robustness.
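The verification procedure described above (rank the pairs by cosine similarity and accept the top half of the ranking as matching) can be sketched as follows; `verification_accuracy` is an illustrative helper, not the embodiment's code, and feature extraction is assumed to have happened already:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two fused feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verification_accuracy(pairs, labels, top_k):
    # Rank all pairs by similarity (descending) and predict "same person"
    # for the top_k pairs, "different" for the rest; accuracy is the
    # fraction of predictions matching the ground truth (1 = same, 0 = not).
    sims = np.array([cosine_similarity(a, b) for a, b in pairs])
    predicted = np.zeros(len(pairs), dtype=int)
    predicted[np.argsort(-sims)[:top_k]] = 1
    return float((predicted == np.asarray(labels)).mean())
```

On LFW this would run with the 6,000 pairs and top_k = 3000, matching the protocol in the text.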
In conclusion the face identification method proposed by the present invention based on heterogeneous facial image fusion feature using sketch and
The characteristics of caricature, so that final mark sheet is shown with preferable identification and robustness.
The embodiments merely illustrate the technical idea of the present invention and do not limit its protection scope; any change made to the technical scheme in accordance with the technical idea proposed by the present invention falls within the protection scope of the present invention.
Claims (7)
1. A face recognition method based on heterogeneous face image fusion features, characterized by comprising the following steps:
(1) preprocessing the face images in a face database and cropping out fixed-size images containing the face;
(2) training a separation-feature-based face frontalization model and frontalizing the preprocessed face images;
(3) training sketch and caricature generation models based on cycle-consistent generative adversarial networks, and generating sketch and caricature face images from the frontalized face images;
(4) using residual networks to extract and fuse features from the preprocessed and frontalized real-face images and from the sketch and caricature face images produced by the generation models, and performing face recognition according to the fusion features.
2. The face recognition method based on heterogeneous face image fusion features according to claim 1, characterized in that in step (1), face detection and key-point localization are first performed on each face image, and the face is then cropped out using the key-point location information.
3. The face recognition method based on heterogeneous face image fusion features according to claim 2, characterized in that in step (1), the cropping method is as follows: the face is first aligned using the detected eye positions so that the two eyes lie on the same horizontal line; the eye positions are then fixed and an image of predetermined size is selected.
4. The face recognition method based on heterogeneous face image fusion features according to claim 1, characterized in that in step (2), the separation-feature-based face frontalization model uses an encoder and a decoder to play the role of the generator, and uses a discriminator to judge whether an image is real and to identify the identity and pose information of the image; the face features are separated from the pose by means of the pose code supplied to the decoder and the pose estimation performed in the discriminator.
5. The face recognition method based on heterogeneous face image fusion features according to claim 1, characterized in that in step (2), the optimization objective of the separation-feature-based face frontalization model is as follows:

max_D V_D(D, G) = E_{x,y~Pdata(x,y)}[log D^r(x) + log D^p_y(x) + log D^id_y(x)] + E_{x,y~Pdata(x,y), z~Pz(z), c~Pz(c)}[log(1 - D^r(G(x, c, z)))]

max_G V_G(D, G) = E_{x,y~Pdata(x,y), z~Pz(z), c~Pz(c)}[log D^r(G(x, c, z)) + log D^p_c(G(x, c, z)) + log D^id_y(G(x, c, z))]

In the above formulas, V_D(D, G) and V_G(D, G) are the objective functions; D denotes the discriminator and G the encoder-decoder; E(·) denotes mathematical expectation; x denotes a real image; y denotes the identity and pose labels; c denotes a pose; z denotes noise; x, y ~ Pdata(x, y) indicates that x and y come from the given data; z ~ Pz(z) denotes noise following a Gaussian distribution; c ~ Pz(c) denotes a pose within the given preset range; and D^r(·), D^p(·) and D^id(·) respectively discriminate the authenticity, pose information and identity information of an image.
6. The face recognition method based on heterogeneous face image fusion features according to claim 1, characterized in that in step (3), the total loss function L(G_AB, G_BA, D_A, D_B) of the cycle-consistent generative adversarial network is as follows:

L(G_AB, G_BA, D_A, D_B) = L_GAN(G_AB, D_B, A, B) + L_GAN(G_BA, D_A, B, A) + λL_cyc(G_AB, G_BA)

In the above formula, A and B denote the two modalities; G_AB denotes the generator that converts an image in modality A into an image in modality B, and G_BA denotes the generator that converts an image in modality B into an image in modality A; D_A denotes the discriminator that judges whether an image belongs to modality A, and D_B denotes the discriminator that judges whether an image belongs to modality B; L_GAN(G_AB, D_B, A, B) denotes the loss function of discriminator D_B:

L_GAN(G_AB, D_B, A, B) = E_{b~Pdata(b)}[log D_B(b)] + E_{a~Pdata(a)}[log(1 - D_B(G_AB(a)))]

In the above formula, a is an image in modality A and b is an image in modality B; E(·) denotes mathematical expectation; a ~ Pdata(a) and b ~ Pdata(b) respectively denote the distributions that the images of the two modalities satisfy over the dataset; L_GAN(G_BA, D_A, B, A) denotes the loss function of discriminator D_A:

L_GAN(G_BA, D_A, B, A) = E_{a~Pdata(a)}[log D_A(a)] + E_{b~Pdata(b)}[log(1 - D_A(G_BA(b)))]

L_cyc(G_AB, G_BA) denotes the cycle-consistency loss function:

L_cyc(G_AB, G_BA) = E_{a~Pdata(a)}[||â - a||_1] + E_{b~Pdata(b)}[||b̂ - b||_1]

In the above formula, â denotes G_BA(G_AB(a)), b̂ denotes G_AB(G_BA(b)), and λ is the weight coefficient of L_cyc(G_AB, G_BA).
7. The face recognition method based on heterogeneous face image fusion features according to any one of claims 1-6, characterized in that in step (3), a sketch face database is selected, the sketch images in the sketch face database are input into the cycle-consistent generative adversarial network for training, and a sketch generation model is obtained, the sketch images obtained with the sketch generation model retaining the global and contour information of the image; a caricature face database is selected, the caricature images in the caricature face database are input into the cycle-consistent generative adversarial network for training, and a caricature generation model is obtained, the caricature images obtained with the caricature generation model retaining the local information of the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910619187.6A CN110414378A (en) | 2019-07-10 | 2019-07-10 | A kind of face identification method based on heterogeneous facial image fusion feature |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110414378A true CN110414378A (en) | 2019-11-05 |
Family
ID=68360876
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910619187.6A Pending CN110414378A (en) | 2019-07-10 | 2019-07-10 | A kind of face identification method based on heterogeneous facial image fusion feature |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110414378A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107239766A (en) * | 2017-06-08 | 2017-10-10 | 深圳市唯特视科技有限公司 | A kind of utilization resists network and the significantly face of three-dimensional configuration model ajusts method |
CN107437077A (en) * | 2017-08-04 | 2017-12-05 | 深圳市唯特视科技有限公司 | A kind of method that rotation face based on generation confrontation network represents study |
CN107577985A (en) * | 2017-07-18 | 2018-01-12 | 南京邮电大学 | The implementation method of the face head portrait cartooning of confrontation network is generated based on circulation |
WO2019056000A1 (en) * | 2017-09-18 | 2019-03-21 | Board Of Trustees Of Michigan State University | Disentangled representation learning generative adversarial network for pose-invariant face recognition |
CN109978792A (en) * | 2019-03-28 | 2019-07-05 | 厦门美图之家科技有限公司 | A method of generating image enhancement model |
Non-Patent Citations (3)
Title |
---|
LUAN TRAN ET AL.: "Disentangled Representation Learning GAN for Pose-Invariant Face Recognition", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) |
QIANG MA ET AL.: "Facial Components-based Representation for Caricature Face Recognition", International Journal of Performability Engineering |
ZHA WENJIN: "Research on Face Recognition under Resource-Constrained Conditions", China Master's Theses Full-text Database, Information Science and Technology |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110879985B (en) * | 2019-11-18 | 2022-11-11 | 西南交通大学 | Anti-noise data face recognition model training method |
CN110879985A (en) * | 2019-11-18 | 2020-03-13 | 西南交通大学 | Anti-noise data face recognition model training method |
CN111079549A (en) * | 2019-11-22 | 2020-04-28 | 杭州电子科技大学 | Method for recognizing cartoon face by using gating fusion discrimination features |
CN111079549B (en) * | 2019-11-22 | 2023-09-22 | 杭州电子科技大学 | Method for carrying out cartoon face recognition by utilizing gating fusion discrimination characteristics |
CN111160264A (en) * | 2019-12-30 | 2020-05-15 | 中山大学 | Cartoon figure identity recognition method based on generation of confrontation network |
CN111160264B (en) * | 2019-12-30 | 2023-05-12 | 中山大学 | Cartoon character identity recognition method based on generation countermeasure network |
CN111093029A (en) * | 2019-12-31 | 2020-05-01 | 深圳云天励飞技术有限公司 | Image processing method and related device |
CN111243051A (en) * | 2020-01-08 | 2020-06-05 | 浙江省北大信息技术高等研究院 | Portrait photo-based stroke generating method, system and storage medium |
CN111275778B (en) * | 2020-01-08 | 2023-11-21 | 杭州未名信科科技有限公司 | Face simple drawing generation method and device |
CN111275778A (en) * | 2020-01-08 | 2020-06-12 | 浙江省北大信息技术高等研究院 | Face sketch generating method and device |
CN111243051B (en) * | 2020-01-08 | 2023-08-18 | 杭州未名信科科技有限公司 | Portrait photo-based simple drawing generation method, system and storage medium |
CN111274994A (en) * | 2020-02-13 | 2020-06-12 | 腾讯科技(深圳)有限公司 | Cartoon face detection method and device, electronic equipment and computer readable medium |
CN111274994B (en) * | 2020-02-13 | 2022-08-23 | 腾讯科技(深圳)有限公司 | Cartoon face detection method and device, electronic equipment and computer readable medium |
CN111275005A (en) * | 2020-02-21 | 2020-06-12 | 腾讯科技(深圳)有限公司 | Drawn face image recognition method, computer-readable storage medium and related device |
CN111401193B (en) * | 2020-03-10 | 2023-11-28 | 海尔优家智能科技(北京)有限公司 | Method and device for acquiring expression recognition model, and expression recognition method and device |
CN111401193A (en) * | 2020-03-10 | 2020-07-10 | 海尔优家智能科技(北京)有限公司 | Method and device for obtaining expression recognition model and expression recognition method and device |
CN111539287B (en) * | 2020-04-16 | 2023-04-07 | 北京百度网讯科技有限公司 | Method and device for training face image generation model |
CN111539287A (en) * | 2020-04-16 | 2020-08-14 | 北京百度网讯科技有限公司 | Method and device for training face image generation model |
CN111797891A (en) * | 2020-05-21 | 2020-10-20 | 南京大学 | Unpaired heterogeneous face image generation method and device based on generation countermeasure network |
CN111832622A (en) * | 2020-06-11 | 2020-10-27 | 国家计算机网络与信息安全管理中心 | Method and system for identifying ugly pictures of specific figures |
WO2022068451A1 (en) * | 2020-09-30 | 2022-04-07 | 北京字节跳动网络技术有限公司 | Style image generation method and apparatus, model training method and apparatus, device, and medium |
CN112446317A (en) * | 2020-11-23 | 2021-03-05 | 四川大学 | Heterogeneous face recognition method and device based on feature decoupling |
CN112489173A (en) * | 2020-12-11 | 2021-03-12 | 杭州格像科技有限公司 | Method and system for generating portrait photo cartoon |
CN112633154A (en) * | 2020-12-22 | 2021-04-09 | 云南翼飞视科技有限公司 | Method and system for converting heterogeneous face feature vectors |
CN112907692A (en) * | 2021-04-09 | 2021-06-04 | 吉林大学 | SFRC-GAN-based sketch-to-face reconstruction method |
CN114120391A (en) * | 2021-10-19 | 2022-03-01 | 哈尔滨理工大学 | Multi-pose face recognition system and method thereof |
CN114120391B (en) * | 2021-10-19 | 2024-07-12 | 哈尔滨理工大学 | Multi-pose face recognition system and method thereof |
CN117894083A (en) * | 2024-03-14 | 2024-04-16 | 中电科大数据研究院有限公司 | Image recognition method and system based on deep learning |
Legal Events

Date | Code | Title | Description
---|---|---|---
2019-11-05 | PB01 | Publication | Application publication date: 20191105
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication |