Summary of the invention
Embodiments of the disclosure propose a method and apparatus for generating a model for generating a caricature head portrait, and a method and apparatus for generating a caricature head portrait.
In a first aspect, embodiments of the disclosure provide a method for generating a model for generating a caricature head portrait. The method includes: obtaining a preset training sample set, where a training sample includes a sample facial image and a sample caricature head portrait corresponding to the sample facial image; obtaining a pre-established initial generative adversarial network, where the initial generative adversarial network includes a caricature head portrait generation network, a facial image generation network, a caricature head portrait discrimination network, and a facial image discrimination network; and performing the following training step: using a machine learning method, taking the sample facial image included in a training sample in the training sample set as an input of the caricature head portrait generation network, taking the caricature head portrait output by the caricature head portrait generation network as an input of the facial image generation network, taking the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, taking the facial image output by the facial image generation network and the corresponding sample facial image as inputs of the facial image discrimination network, training the initial generative adversarial network, and determining the trained caricature head portrait generation network as the caricature head portrait generation model.
In some embodiments, the training step includes: using a machine learning method, taking the sample facial image included in a training sample in the training sample set as an input of the caricature head portrait generation network, taking the caricature head portrait output by the caricature head portrait generation network as an input of the facial image generation network, taking the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, taking the facial image output by the facial image generation network and the corresponding sample facial image as inputs of the facial image discrimination network, and training the initial generative adversarial network; and taking the sample caricature head portrait included in the training sample as an input of the facial image generation network, taking the facial image output by the facial image generation network as an input of the caricature head portrait generation network, taking the facial image output by the facial image generation network and the corresponding sample facial image as inputs of the facial image discrimination network, taking the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, training the initial generative adversarial network, and determining the trained caricature head portrait generation network as the caricature head portrait generation model.
In some embodiments, for a training sample in the training sample set, the similarity between the feature vectors of the sample facial image and the sample caricature head portrait included in the training sample is greater than or equal to a preset similarity threshold.
In some embodiments, training the initial generative adversarial network includes: determining a first generation loss value for characterizing the difference between the sample facial image and the facial image output by the facial image generation network, and determining a second generation loss value for characterizing the difference between the sample caricature head portrait and the caricature head portrait output by the caricature head portrait generation network; determining a first discrimination loss value corresponding to the caricature head portrait discrimination network, for characterizing the difference between the sample caricature head portrait input to the caricature head portrait discrimination network and the caricature head portrait output by the caricature head portrait generation network, and determining a second discrimination loss value corresponding to the facial image discrimination network, for characterizing the difference between the sample facial image input to the facial image discrimination network and the facial image output by the facial image generation network; and training the initial generative adversarial network based on the determined first generation loss value, second generation loss value, first discrimination loss value, and second discrimination loss value.
In some embodiments, a generation loss value is determined by either of the following loss functions: an L1 norm loss function or an L2 norm loss function.
In a second aspect, embodiments of the disclosure provide a method for generating a caricature head portrait. The method includes: obtaining a target facial image; and inputting the target facial image into a pre-trained caricature head portrait generation model to obtain and output a caricature head portrait, where the caricature head portrait generation model is generated according to the method described in any embodiment of the first aspect.
In a third aspect, embodiments of the disclosure provide an apparatus for generating a model for generating a caricature head portrait. The apparatus includes: a first acquisition unit, configured to obtain a preset training sample set, where a training sample includes a sample facial image and a sample caricature head portrait corresponding to the sample facial image; a second acquisition unit, configured to obtain a pre-established initial generative adversarial network, where the initial generative adversarial network includes a caricature head portrait generation network, a facial image generation network, a caricature head portrait discrimination network, and a facial image discrimination network; and a training unit, configured to perform the following training step: using a machine learning method, taking the sample facial image included in a training sample in the training sample set as an input of the caricature head portrait generation network, taking the caricature head portrait output by the caricature head portrait generation network as an input of the facial image generation network, taking the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, taking the facial image output by the facial image generation network and the corresponding sample facial image as inputs of the facial image discrimination network, training the initial generative adversarial network, and determining the trained caricature head portrait generation network as the caricature head portrait generation model.
In some embodiments, the training unit is further configured to: using a machine learning method, take the sample facial image included in a training sample in the training sample set as an input of the caricature head portrait generation network, take the caricature head portrait output by the caricature head portrait generation network as an input of the facial image generation network, take the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, take the facial image output by the facial image generation network and the corresponding sample facial image as inputs of the facial image discrimination network, and train the initial generative adversarial network; and take the sample caricature head portrait included in the training sample as an input of the facial image generation network, take the facial image output by the facial image generation network as an input of the caricature head portrait generation network, take the facial image output by the facial image generation network and the corresponding sample facial image as inputs of the facial image discrimination network, take the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, train the initial generative adversarial network, and determine the trained caricature head portrait generation network as the caricature head portrait generation model.
In some embodiments, for a training sample in the training sample set, the similarity between the feature vectors of the sample facial image and the sample caricature head portrait included in the training sample is greater than or equal to a preset similarity threshold.
In some embodiments, the training unit includes: a first determining module, configured to determine a first generation loss value for characterizing the difference between the sample facial image and the facial image output by the facial image generation network, and determine a second generation loss value for characterizing the difference between the sample caricature head portrait and the caricature head portrait output by the caricature head portrait generation network; a second determining module, configured to determine a first discrimination loss value corresponding to the caricature head portrait discrimination network, for characterizing the difference between the sample caricature head portrait input to the caricature head portrait discrimination network and the caricature head portrait output by the caricature head portrait generation network, and determine a second discrimination loss value corresponding to the facial image discrimination network, for characterizing the difference between the sample facial image input to the facial image discrimination network and the facial image output by the facial image generation network; and a training module, configured to train the initial generative adversarial network based on the determined first generation loss value, second generation loss value, first discrimination loss value, and second discrimination loss value.
In some embodiments, a generation loss value is determined by either of the following loss functions: an L1 norm loss function or an L2 norm loss function.
In a fourth aspect, embodiments of the disclosure provide an apparatus for generating a caricature head portrait. The apparatus includes: a facial image acquisition unit, configured to obtain a target facial image; and a caricature head portrait generation unit, configured to input the target facial image into a pre-trained caricature head portrait generation model to obtain and output a caricature head portrait, where the caricature head portrait generation model is generated according to the method described in any embodiment of the first aspect.
In a fifth aspect, embodiments of the disclosure provide an electronic device, including: one or more processors; and a storage device storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect or the second aspect.
In a sixth aspect, embodiments of the disclosure provide a computer-readable medium storing a computer program, where the computer program, when executed by a processor, implements the method described in any implementation of the first aspect or the second aspect.
According to the method and apparatus for generating a model for generating a caricature head portrait provided by embodiments of the disclosure, a preset training sample set and a pre-established initial generative adversarial network are obtained, where the initial generative adversarial network includes a caricature head portrait generation network, a facial image generation network, a caricature head portrait discrimination network, and a facial image discrimination network. Using a machine learning method, the sample facial image included in a training sample in the training sample set is taken as an input of the caricature head portrait generation network, the caricature head portrait output by the caricature head portrait generation network is taken as an input of the facial image generation network, the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are taken as inputs of the caricature head portrait discrimination network, and the facial image output by the facial image generation network and the corresponding sample facial image are taken as inputs of the facial image discrimination network. The initial generative adversarial network is trained, and the trained caricature head portrait generation network is determined as the caricature head portrait generation model. A bidirectional training scheme is thus realized: the input facial image is fed into the generative adversarial network to obtain a caricature head portrait, and the caricature head portrait is then converted back into a facial image whose similarity to the input facial image is high, which helps the caricature head portrait generation model generate caricature head portraits that closely resemble the input facial images.
Detailed description of embodiments
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the relevant disclosure, rather than to limit the disclosure. It should also be noted that, for ease of description, only the parts relevant to the disclosure are shown in the accompanying drawings.
It should be noted that, in the absence of conflict, the embodiments in the disclosure and the features in the embodiments may be combined with each other. The disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the method for generating a model for generating a caricature head portrait, the apparatus for generating a model for generating a caricature head portrait, the method for generating a caricature head portrait, or the apparatus for generating a caricature head portrait of embodiments of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications, such as image processing applications, web browser applications, instant messaging tools, and social platform software, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices mentioned above. They may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, for example, a background server that processes training sample sets uploaded by the terminal devices 101, 102, 103. The background server may train an initial generative adversarial network using the obtained training sample set, thereby obtaining a caricature head portrait generation model. In addition, the background server may also use the caricature head portrait generation model to process an input facial image, obtaining and outputting a caricature head portrait.
It should be noted that the method for generating a model for generating a caricature head portrait provided by embodiments of the disclosure may be performed by the server 105 or by the terminal devices 101, 102, 103; correspondingly, the apparatus for generating a model for generating a caricature head portrait may be provided in the server 105 or in the terminal devices 101, 102, 103. Likewise, the method for generating a caricature head portrait provided by embodiments of the disclosure may be performed by the server 105 or by the terminal devices 101, 102, 103; correspondingly, the apparatus for generating a caricature head portrait may be provided in the server 105 or in the terminal devices 101, 102, 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs. In the case where the training sample set or the target facial image required for training the model does not need to be obtained remotely, the above system architecture may not include a network and may only need a server or a terminal device.
With continued reference to Fig. 2, a flow 200 of an embodiment of the method for generating a model for generating a caricature head portrait according to the disclosure is shown. The method for generating a model for generating a caricature head portrait includes the following steps:
Step 201, a preset training sample set is obtained.
In the present embodiment, an executing body of the method for generating a model for generating a caricature head portrait (such as the server or terminal device shown in Fig. 1) may obtain a preset training sample set remotely through a wired or wireless connection, or locally. A training sample includes a sample facial image and a sample caricature head portrait corresponding to the sample facial image. In general, the sample facial image is a facial image obtained by photographing a real face, and the sample caricature head portrait is a drawn head portrait. The correspondence between sample facial images and sample caricature head portraits is established in advance. For example, a technician may manually select, from a large number of sample facial images and sample caricature head portraits, a sample facial image and a sample caricature head portrait with a high degree of similarity, and set them as a training sample.
In some optional implementations of the present embodiment, for a training sample in the training sample set, the similarity between the feature vectors of the sample facial image and the sample caricature head portrait included in the training sample is greater than or equal to a preset similarity threshold. A feature vector may be used to characterize various features of an image, such as color features and shape features. Specifically, an executing body for generating the training sample set may use an existing method for determining the feature vector of an image (such as the LBP (Local Binary Pattern) algorithm or a neural-network-based algorithm) to determine the feature vectors of the sample facial images and sample caricature head portraits included in the training samples. The feature vector of each sample facial image is then matched pairwise against the feature vector of each sample caricature head portrait, so as to extract multiple pairs of mutually matching sample facial images and sample caricature head portraits, where the similarity between the feature vectors corresponding to a mutually matching sample facial image and sample caricature head portrait is greater than or equal to the preset similarity threshold.
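The pairwise matching step above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the feature vectors are given directly (in practice they would come from LBP or a neural network), and cosine similarity is used as one common similarity measure, since the disclosure does not fix a particular one.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def match_pairs(face_features, caricature_features, threshold):
    """Return index pairs (i, j) whose feature similarity meets the threshold."""
    return [
        (i, j)
        for i, u in enumerate(face_features)
        for j, v in enumerate(caricature_features)
        if cosine_similarity(u, v) >= threshold
    ]
```

For example, `match_pairs([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], 0.9)` keeps only the first caricature, whose feature vector matches the facial image's.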
Step 202, a pre-established initial generative adversarial network is obtained.
In the present embodiment, the above executing body may obtain a pre-established initial generative adversarial network locally or remotely. The initial generative adversarial network may include a caricature head portrait generation network, a facial image generation network, a caricature head portrait discrimination network, and a facial image discrimination network. The caricature head portrait generation network is used to generate a caricature head portrait from an input facial image; the caricature head portrait discrimination network is used to distinguish a caricature head portrait output by the caricature head portrait generation network from a sample caricature head portrait input to the caricature head portrait discrimination network; the facial image generation network is used to generate a facial image from an input caricature head portrait; and the facial image discrimination network is used to distinguish a facial image output by the facial image generation network from a sample facial image input to the facial image discrimination network.
It should be appreciated that the initial generative adversarial network may be an untrained generative adversarial network whose parameters have been initialized, or a generative adversarial network that has already been trained.
It should be noted that the caricature head portrait generation network and the facial image generation network may be convolutional neural networks for image processing (for example, convolutional neural networks of various structures including convolutional layers, pooling layers, unpooling layers, and deconvolution layers). The caricature head portrait discrimination network and the facial image discrimination network may be convolutional neural networks (for example, convolutional neural networks of various structures including fully connected layers, where the fully connected layers may implement a classification function). In addition, a discrimination network may also be another model for implementing a classification function, such as a support vector machine (SVM). Here, the caricature head portrait discrimination network and the facial image discrimination network may each output a discrimination result. For example, if the caricature head portrait discrimination network determines that the image input to it is a caricature head portrait output by the caricature head portrait generation network, it may output a label 1 (or 0) corresponding to the image; if it determines that the image is not a caricature head portrait output by the caricature head portrait generation network, it may output a label 0 (or 1) corresponding to the image. Similarly, if the facial image discrimination network determines that the image input to it is a facial image output by the facial image generation network, it may output a label 1 (or 0) corresponding to the image; if it determines that the image is not a facial image output by the facial image generation network, it may output a label 0 (or 1) corresponding to the image. It should be noted that a discrimination network may also output other preset information, and is not limited to the numerical values 1 and 0.
Step 203, the following training step is performed: using a machine learning method, the sample facial image included in a training sample in the training sample set is taken as an input of the caricature head portrait generation network, the caricature head portrait output by the caricature head portrait generation network is taken as an input of the facial image generation network, the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are taken as inputs of the caricature head portrait discrimination network, and the facial image output by the facial image generation network and the corresponding sample facial image are taken as inputs of the facial image discrimination network; the initial generative adversarial network is trained, and the trained caricature head portrait generation network is determined as the caricature head portrait generation model.
In the present embodiment, the above executing body may perform the following training step: using a machine learning method, the sample facial image included in a training sample in the training sample set is taken as an input of the caricature head portrait generation network, the caricature head portrait output by the caricature head portrait generation network is taken as an input of the facial image generation network, the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are taken as inputs of the caricature head portrait discrimination network, and the facial image output by the facial image generation network and the corresponding sample facial image are taken as inputs of the facial image discrimination network; the initial generative adversarial network is trained, and the trained caricature head portrait generation network is determined as the caricature head portrait generation model.
Specifically, the above executing body may first fix the parameters of either the generation networks (including the caricature head portrait generation network and the facial image generation network) or the discrimination networks (including the caricature head portrait discrimination network and the facial image discrimination network) (referred to as the first network), and optimize the networks whose parameters are not fixed (referred to as the second network); then fix the parameters of the second network and optimize the first network. This iteration is performed continuously, so that the caricature head portrait discrimination network cannot distinguish whether an input image was generated by the caricature head portrait generation network, and the facial image discrimination network cannot distinguish whether an input image was generated by the facial image generation network. At this point, the caricature head portrait generated by the caricature head portrait generation network is close to the sample caricature head portrait, and the caricature head portrait discrimination network cannot accurately distinguish the caricature head portrait output by the caricature head portrait generation network from the sample caricature head portrait (that is, the discrimination accuracy is 50%); likewise, the facial image generated by the facial image generation network is close to the sample facial image, and the facial image discrimination network cannot accurately distinguish the facial image output by the facial image generation network from the sample facial image (that is, the discrimination accuracy is 50%). The caricature head portrait generation network at this time can be determined as the caricature head portrait generation model. In general, the above executing body may train the generation networks and the discrimination networks using existing back-propagation and gradient descent algorithms. The parameters of the generation networks and discrimination networks may be adjusted after each training pass, and the generation networks and discrimination networks obtained after each parameter adjustment are used as the initial generative adversarial network for the next training pass. During training, loss functions may be used to determine loss values, and the generation networks and discrimination networks are trained iteratively according to the loss values, so that the loss value determined at each iteration is minimized.
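The alternating schedule above can be illustrated with a deliberately simplified numerical stand-in. This is not the disclosed network code: the "generator output" and "sample" are reduced to single scalar statistics, and the gradient-descent update is a hand-written step, chosen only to show how alternating optimization drives the discrimination accuracy toward the 50% chance level.

```python
def train_adversarial(steps=50, lr=0.2):
    """Toy alternating optimization: the generated statistic chases the real one."""
    real_mean = 1.0   # stand-in statistic of the sample (real) images
    fake_mean = 0.0   # stand-in statistic of the generated images
    for _ in range(steps):
        # Phase 1: generator parameters fixed; the discriminator's best cue
        # is the current gap between real and generated statistics.
        gap = real_mean - fake_mean
        # Phase 2: discriminator fixed; the generator takes a
        # gradient-descent-like step that shrinks that gap.
        fake_mean += lr * gap
    # As the gap closes, discrimination accuracy degrades toward chance (0.5).
    accuracy = 0.5 + min(abs(real_mean - fake_mean), 1.0) / 2
    return fake_mean, accuracy
```

Running `train_adversarial()` yields a generated statistic essentially equal to the real one and a discrimination accuracy essentially at 50%, the stopping condition described above.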
As shown in Fig. 3A, G1 is the caricature head portrait generation network, G2 is the facial image generation network, D1 is the caricature head portrait discrimination network, and D2 is the facial image discrimination network. For a training sample, in the manner shown in Fig. 3A, the sample facial image is taken as the input of G1, and a facial image is finally output by G2; G1, G2, D1, and D2 are thereby trained. As can be seen from the figure, the present embodiment adopts a bidirectional training scheme: the input facial image is fed into the generative adversarial network to obtain a caricature head portrait, and the caricature head portrait is then converted back into a facial image. After training, the generative adversarial network can reduce a generated caricature head portrait back to a facial image highly similar to the input facial image, so that the finally obtained caricature head portrait generation model can generate caricature head portraits highly similar to the input facial images.
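The wiring of Fig. 3A can be sketched as a single forward pass. This is a schematic only: G1, G2, D1, and D2 are passed in as plain callables (in practice convolutional networks), and the point is the data flow, not the models.

```python
def cycle_forward(sample_face, G1, G2, D1, D2, sample_caricature):
    """One bidirectional pass per Fig. 3A: face -> G1 -> caricature -> G2 -> face.

    D1 scores the generated caricature against the sample caricature;
    D2 scores the reconstructed face against the sample face.
    """
    generated_caricature = G1(sample_face)          # face -> caricature
    reconstructed_face = G2(generated_caricature)   # caricature -> face
    d1_scores = (D1(generated_caricature), D1(sample_caricature))
    d2_scores = (D2(reconstructed_face), D2(sample_face))
    return generated_caricature, reconstructed_face, d1_scores, d2_scores
```

With toy stand-ins such as `G1 = lambda x: x + 1` and `G2 = lambda x: x - 1` (an ideal inverse pair), the reconstructed "face" equals the input, mirroring the goal that the round trip preserve the input facial image.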
In some optional implementations of the present embodiment, the above executing body may train the initial generative adversarial network according to the following steps:
Step 1, determining a first generation loss value for characterizing the difference between the sample facial image and the facial image output by the facial image generation network, and determining a second generation loss value for characterizing the difference between the sample caricature head portrait and the caricature head portrait output by the caricature head portrait generation network. In general, the first generation loss value and the second generation loss value may be loss values determined according to a regression loss function, typically expressed as L(y, y'); the loss value it yields characterizes the degree of inconsistency between the true value y (in the present embodiment, the sample facial image or the sample caricature head portrait) and the predicted value y' (in the present embodiment, the facial image output by the facial image generation network or the caricature head portrait output by the caricature head portrait generation network). During training, this loss is driven toward its minimum.
Optionally, a generation loss value may be determined by either of the following loss functions: an L1 norm loss function or an L2 norm loss function. The L1 norm loss function and the L2 norm loss function are existing pixel-level loss functions; that is, taking the pixel as the basic unit, they determine the differences between the pixels included in two images, which can improve the accuracy with which a generation loss value characterizes the difference between images.
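The two pixel-level losses can be written out directly. In this minimal sketch, images are flattened to lists of pixel values; the L1 loss is the mean absolute pixel difference and the L2 loss the mean squared pixel difference between the true image y and the generated image y_pred.

```python
def l1_loss(y, y_pred):
    """L1 norm loss: mean absolute difference per pixel."""
    return sum(abs(a - b) for a, b in zip(y, y_pred)) / len(y)

def l2_loss(y, y_pred):
    """L2 norm loss: mean squared difference per pixel."""
    return sum((a - b) ** 2 for a, b in zip(y, y_pred)) / len(y)
```

For example, with y = [0, 2, 4] and y_pred = [2, 2, 3], the per-pixel differences are 2, 0, 1, giving an L1 loss of 1.0 and an L2 loss of 5/3; the squaring in L2 penalizes the larger pixel error more heavily.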
Step 2, determining a first discrimination loss value corresponding to the caricature head portrait discrimination network, for characterizing the difference between the sample caricature head portrait input to the caricature head portrait discrimination network and the caricature head portrait output by the caricature head portrait generation network, and determining a second discrimination loss value corresponding to the facial image discrimination network, for characterizing the difference between the sample facial image input to the facial image discrimination network and the facial image output by the facial image generation network. In general, a loss function for two-class classification (such as a cross-entropy loss function) may be used to determine the discrimination loss values.
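A two-class cross-entropy loss of the kind suggested above can be sketched as follows. This is an illustrative formula, not the disclosed implementation: each label is 0 or 1 (sample image vs. generated image, in whichever convention the discrimination network uses), and each prob is the network's predicted probability for label 1.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean two-class cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)
```

A discrimination network that is confident and correct (e.g. probabilities 0.99 for label 1 and 0.01 for label 0) incurs a near-zero loss, while a network at chance (probability 0.5) incurs a loss of ln 2 per example.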
Step 3: train the initial generative adversarial network based on the determined first generation loss value, second generation loss value, first discrimination loss value, and second discrimination loss value. Specifically, the determined loss values may be summed using preset weights corresponding to each loss value, yielding a total loss value. During training, the parameters of the caricature head portrait generation network, the face image generation network, the caricature head portrait discrimination network, and the face image discrimination network are continuously adjusted so that the total loss value gradually decreases. When the total loss value satisfies a preset condition (for example, it is less than or equal to a preset loss threshold, or it no longer decreases), model training is determined to be complete.
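The weighted summation of step 3 and the threshold-based stopping condition can be sketched as follows (the particular weights, loss values, and threshold are illustrative assumptions; the disclosure only requires that each loss value have a preset weight):

```python
def total_loss(losses, weights):
    # Weighted sum of the four loss terms from steps 1 and 2.
    return sum(w * l for w, l in zip(weights, losses))

# Hypothetical per-term values: gen loss 1, gen loss 2, disc loss 1, disc loss 2.
losses = [0.25, 0.30, 0.60, 0.55]
weights = [10.0, 10.0, 1.0, 1.0]   # reconstruction terms weighted higher
total = total_loss(losses, weights)

threshold = 7.0                    # hypothetical preset loss threshold
done = total <= threshold          # one form of the stopping condition
print(total, done)
```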
In some optional implementations of this embodiment, the training step may be performed as follows:
Using a machine learning method, the sample face image included in a training sample in the training sample set is used as the input of the caricature head portrait generation network; the caricature head portrait output by the caricature head portrait generation network is used as the input of the face image generation network; the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network; the face image output by the face image generation network and the corresponding sample face image are used as the inputs of the face image discrimination network; and the initial generative adversarial network is trained. In addition, the sample caricature head portrait included in the training sample is used as the input of the face image generation network; the face image output by the face image generation network is used as the input of the caricature head portrait generation network; the face image output by the face image generation network and the corresponding sample face image are used as the inputs of the face image discrimination network; the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network; and the initial generative adversarial network is trained. The trained caricature head portrait generation network is determined to be the caricature head portrait generation model.
Specifically, as shown in Fig. 3A and Fig. 3B, for a given training sample, the sample face image is first used as the input of G1 in the manner shown in Fig. 3A, and G2 finally outputs a face image, whereby G1, G2, D1, and D2 are trained. The same training sample is then processed in the manner shown in Fig. 3B: the sample caricature head portrait is used as the input of G2, and G1 finally outputs a caricature head portrait, whereby G1, G2, D1, and D2 are trained again. As can be seen from Fig. 3A and Fig. 3B, this implementation trains on each training sample twice, so that the parameters of the caricature head portrait generation network and the face image generation network are optimized alternately. This helps to synchronously improve the accuracy with which the two networks generate images, so that the finally obtained caricature head portrait generation model can generate caricature head portraits with higher similarity to the input face image.
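The two training passes of Figs. 3A and 3B can be sketched with toy stand-ins for G1 and G2 (the tanh maps, vector shapes, and L1 reconstruction objective here are illustrative assumptions; actual implementations would use the convolutional networks described elsewhere in the disclosure):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = 0.1 * rng.normal(size=(4, 4))   # toy parameters of G1 (face -> caricature)
W2 = 0.1 * rng.normal(size=(4, 4))   # toy parameters of G2 (caricature -> face)

def g1(x):  # stand-in for the caricature head portrait generation network
    return np.tanh(x @ W1)

def g2(x):  # stand-in for the face image generation network
    return np.tanh(x @ W2)

face = rng.normal(size=4)        # one sample face image (flattened)
caricature = rng.normal(size=4)  # its corresponding sample caricature head portrait

# First pass (Fig. 3A): face -> G1 -> caricature -> G2 -> face.
loss_a = float(np.mean(np.abs(g2(g1(face)) - face)))
# Second pass (Fig. 3B): caricature -> G2 -> face -> G1 -> caricature.
loss_b = float(np.mean(np.abs(g1(g2(caricature)) - caricature)))
print(loss_a > 0.0, loss_b > 0.0)
```

Each sample thus contributes a gradient signal to both generators in both directions, which is the alternating optimization the paragraph above describes.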
With continued reference to Fig. 4, Fig. 4 is a schematic diagram of an application scenario of the method for generating a model for generating caricature head portraits according to this embodiment. In the application scenario of Fig. 4, an electronic device 401 first obtains a preset training sample set 402 locally. Each training sample in the training sample set 402 includes a sample face image and a corresponding sample caricature head portrait. Then, the electronic device 401 obtains a pre-established initial generative adversarial network 403 locally, which includes a caricature head portrait generation network G1, a face image generation network G2, a caricature head portrait discrimination network D1, and a face image discrimination network D2.
Subsequently, the electronic device 401 performs the following steps: using a machine learning method, the sample face image included in a training sample in the training sample set 402 is used as the input of the caricature head portrait generation network G1; the caricature head portrait output by G1 is used as the input of the face image generation network G2; the caricature head portrait output by G1 and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network D1; the face image output by G2 and the corresponding sample face image are used as the inputs of the face image discrimination network D2; and G1, G2, D1, and D2 are trained. When the caricature head portrait discrimination network D1 can no longer accurately distinguish the caricature head portrait output by G1 from the sample caricature head portrait (i.e., its discrimination accuracy is 50%), and the face image discrimination network D2 can no longer accurately distinguish the face image output by G2 from the sample face image (i.e., its discrimination accuracy is 50%), the caricature head portrait generation network G1 at this point is determined to be the caricature head portrait generation model 404.
The method provided by the above embodiment of the disclosure first obtains a preset training sample set and a pre-established initial generative adversarial network, where the initial generative adversarial network includes a caricature head portrait generation network, a face image generation network, a caricature head portrait discrimination network, and a face image discrimination network. Using a machine learning method, the sample face image included in a training sample in the training sample set is used as the input of the caricature head portrait generation network; the caricature head portrait output by the caricature head portrait generation network is used as the input of the face image generation network; the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network; the face image output by the face image generation network and the corresponding sample face image are used as the inputs of the face image discrimination network; and the initial generative adversarial network is trained. The trained caricature head portrait generation network is determined to be the caricature head portrait generation model. Training is thereby realized in a bidirectional manner: the input face image is fed into the generative adversarial network to obtain a caricature head portrait, and the caricature head portrait is then converted back into a face image; the higher the similarity between the input face image and the output face image, the more the resulting caricature head portrait generation model tends to generate caricature head portraits with high similarity to the input face image.
With further reference to Fig. 5, a flow 500 of one embodiment of the method for generating a caricature head portrait according to the disclosure is illustrated. The flow 500 of the method for generating a caricature head portrait includes the following steps:
Step 501: obtain a target face image.
In this embodiment, the executing body of the method for generating a caricature head portrait (e.g., the server or terminal device shown in Fig. 1) may obtain the target face image remotely or locally through a wired or wireless connection. The target face image is the face image from which a caricature head portrait is to be generated. For example, the target face image may be a face image of a target person captured by a camera included in the executing body, or by a camera included in an electronic device communicatively connected to the executing body; the target person may be a user within the shooting range of the camera.
Step 502: input the target face image into a pre-trained caricature head portrait generation model to obtain and output a caricature head portrait.
In this embodiment, the executing body may input the target face image into the pre-trained caricature head portrait generation model to obtain and output a caricature head portrait. The caricature head portrait generation model is generated according to the method described in the embodiment corresponding to Fig. 2 above.
The executing body may output the generated head portrait in various ways. For example, the generated head portrait may be displayed on a display screen included in the executing body, or sent to another electronic device communicatively connected to the executing body.
The method provided by the above embodiment of the disclosure obtains a target face image and inputs it into a caricature head portrait generation model trained in advance according to the method described in the embodiment corresponding to Fig. 2, thereby obtaining and outputting a caricature head portrait. Since the caricature head portrait generation model is trained in a bidirectional manner, using it improves the similarity between the input face image and the output caricature head portrait, which helps to provide personalized caricature head portraits for different users.
With further reference to Fig. 6, as an implementation of the method shown in Fig. 2 above, the disclosure provides one embodiment of an apparatus for generating a model for generating caricature head portraits. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied to various electronic devices.
As shown in Fig. 6, the apparatus 600 for generating a model for generating caricature head portraits of this embodiment includes: a first acquisition unit 601, configured to obtain a preset training sample set, where a training sample includes a sample face image and a corresponding sample caricature head portrait; a second acquisition unit 602, configured to obtain a pre-established initial generative adversarial network, which includes a caricature head portrait generation network, a face image generation network, a caricature head portrait discrimination network, and a face image discrimination network; and a training unit 603, configured to perform the following training step: using a machine learning method, the sample face image included in a training sample in the training sample set is used as the input of the caricature head portrait generation network; the caricature head portrait output by the caricature head portrait generation network is used as the input of the face image generation network; the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network; the face image output by the face image generation network and the corresponding sample face image are used as the inputs of the face image discrimination network; the initial generative adversarial network is trained; and the trained caricature head portrait generation network is determined to be the caricature head portrait generation model.
In this embodiment, the first acquisition unit 601 may obtain the preset training sample set remotely or locally through a wired or wireless connection. A training sample includes a sample face image and a corresponding sample caricature head portrait, and the correspondence between them is established in advance. For example, technicians may manually screen a large number of sample face images and sample caricature head portraits, selecting pairs with a high degree of similarity and setting them as training samples.
In this embodiment, the second acquisition unit 602 may obtain a pre-established initial generative adversarial network, which may include a caricature head portrait generation network, a face image generation network, a caricature head portrait discrimination network, and a face image discrimination network. The caricature head portrait generation network generates a caricature head portrait from an input face image; the caricature head portrait discrimination network distinguishes the caricature head portrait output by the caricature head portrait generation network from the sample caricature head portrait input to the discrimination network. The face image generation network generates a face image from an input caricature head portrait; the face image discrimination network distinguishes the face image output by the face image generation network from the sample face image input to the discrimination network.
It should be appreciated that the initial generative adversarial network may be an untrained generative adversarial network with initialized parameters, or a generative adversarial network that has already been trained.
It should be noted that the caricature head portrait generation network and the face image generation network, which perform image processing, may be convolutional neural networks (e.g., convolutional neural networks of various structures comprising convolutional layers, pooling layers, unpooling layers, and deconvolution layers). The caricature head portrait discrimination network and the face image discrimination network may be convolutional neural networks (e.g., convolutional neural networks of various structures comprising fully connected layers, where the fully connected layers implement the classification function). Alternatively, a discrimination network may be another model that implements classification, such as a support vector machine (Support Vector Machine, SVM). Here, the caricature head portrait discrimination network and the face image discrimination network may each output a discrimination result. For example, if the caricature head portrait discrimination network determines that the image input to it is a caricature head portrait output by the caricature head portrait generation network, it may output the label 1 (or 0) corresponding to the image; if it determines that the image is not one output by the caricature head portrait generation network, it may output the label 0 (or 1). Likewise, if the face image discrimination network determines that the image input to it is a face image output by the face image generation network, it may output the label 1 (or 0) corresponding to the image; if it determines that the image is not one output by the face image generation network, it may output the label 0 (or 1). It should be noted that a discrimination network may also output other preset information, not limited to the numerical values 1 and 0.
In this embodiment, the training unit 603 may perform the following training step: using a machine learning method, the sample face image included in a training sample in the training sample set is used as the input of the caricature head portrait generation network; the caricature head portrait output by the caricature head portrait generation network is used as the input of the face image generation network; the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network; the face image output by the face image generation network and the corresponding sample face image are used as the inputs of the face image discrimination network; the initial generative adversarial network is trained; and the trained caricature head portrait generation network is determined to be the caricature head portrait generation model.
Specifically, the training unit 603 may first fix the parameters of either the generation networks (the caricature head portrait generation network and the face image generation network) or the discrimination networks (the caricature head portrait discrimination network and the face image discrimination network), referred to as the first network, and optimize the network whose parameters are not fixed (referred to as the second network); then fix the parameters of the second network and optimize the first network. This iteration continues until the caricature head portrait discrimination network cannot distinguish whether an input image was generated by the caricature head portrait generation network, and the face image discrimination network cannot distinguish whether an input image was generated by the face image generation network. At this point, the caricature head portrait generated by the caricature head portrait generation network is close to the sample caricature head portrait, and the caricature head portrait discrimination network cannot accurately distinguish the two (i.e., its discrimination accuracy is 50%); likewise, the face image generated by the face image generation network is close to the sample face image, and the face image discrimination network cannot accurately distinguish the two (i.e., its discrimination accuracy is 50%). The caricature head portrait generation network at this point may be determined to be the caricature head portrait generation model. In general, the training unit 603 may train the generation networks and the discrimination networks using existing back-propagation and gradient descent algorithms. The parameters of the generation and discrimination networks may be adjusted after each round of training, and the networks obtained after each parameter adjustment serve as the initial generative adversarial network for the next round of training. During training, a loss function may be used to determine loss values, and the generation and discrimination networks are trained iteratively according to those loss values, so that the loss value determined at each iteration is minimized.
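The alternating scheme of freezing one set of parameters while updating the other can be illustrated on a toy one-dimensional objective (the quadratic objective, learning rate, and iteration count are illustrative assumptions; in the actual networks each update would be a back-propagation/gradient-descent step on the adversarial losses):

```python
def step(p, q, lr=0.2):
    # One gradient-descent step on the toy objective
    # f(p) = (p - q)^2 + (p - 1)^2, with q held fixed.
    grad = 2 * (p - q) + 2 * (p - 1)
    return p - lr * grad

g, d = 0.0, 0.0          # stand-in parameters for the two network groups
for _ in range(100):
    g = step(g, d)       # "discriminator" parameters frozen, update g
    d = step(d, g)       # "generator" parameters frozen, update d
print(round(g, 6), round(d, 6))   # both converge toward 1.0
```

The structure, not the objective, is the point: each phase holds one parameter set constant while descending on the other, exactly as described for the first and second networks above.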
In some optional implementations of this embodiment, the training unit 603 may be further configured to: use a machine learning method to take the sample face image included in a training sample in the training sample set as the input of the caricature head portrait generation network, take the caricature head portrait output by the caricature head portrait generation network as the input of the face image generation network, take the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as the inputs of the caricature head portrait discrimination network, take the face image output by the face image generation network and the corresponding sample face image as the inputs of the face image discrimination network, and train the initial generative adversarial network; and further take the sample caricature head portrait included in the training sample as the input of the face image generation network, take the face image output by the face image generation network as the input of the caricature head portrait generation network, take the face image output by the face image generation network and the corresponding sample face image as the inputs of the face image discrimination network, take the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as the inputs of the caricature head portrait discrimination network, train the initial generative adversarial network, and determine the trained caricature head portrait generation network to be the caricature head portrait generation model.
In some optional implementations of this embodiment, for each training sample in the training sample set, the similarity between the feature vectors of the sample face image and the sample caricature head portrait included in the training sample is greater than or equal to a preset similarity threshold.
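One common choice of similarity measure (an assumption here; the disclosure does not fix the measure, the feature vectors, or the threshold) is cosine similarity between the two feature vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

face_feat = np.array([1.0, 2.0, 3.0])    # hypothetical feature vector of a
caric_feat = np.array([1.0, 2.0, 2.5])   # sample face image / caricature pair
threshold = 0.9                          # hypothetical preset similarity threshold

sim = cosine_similarity(face_feat, caric_feat)
keep = sim >= threshold                  # retain the pair as a training sample
print(keep)
```

Pairs falling below the threshold would simply be excluded from the training sample set.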
In some optional implementations of this embodiment, the training unit 603 may include: a first determining module (not shown in the figure), configured to determine a first generation loss value characterizing the difference between the sample face image and the face image output by the face image generation network, and a second generation loss value characterizing the difference between the sample caricature head portrait and the caricature head portrait output by the caricature head portrait generation network; a second determining module (not shown), configured to determine a first discrimination loss value corresponding to the caricature head portrait discrimination network, characterizing the difference between the sample caricature head portrait input to that network and the caricature head portrait output by the caricature head portrait generation network, and a second discrimination loss value corresponding to the face image discrimination network, characterizing the difference between the sample face image input to that network and the face image output by the face image generation network; and a training module (not shown), configured to train the initial generative adversarial network based on the determined first generation loss value, second generation loss value, first discrimination loss value, and second discrimination loss value.
In some optional implementations of this embodiment, a generation loss value is determined by either of the following loss functions: the L1 norm loss function or the L2 norm loss function.
The apparatus 600 provided by the above embodiment of the disclosure obtains a preset training sample set and a pre-established initial generative adversarial network, where the initial generative adversarial network includes a caricature head portrait generation network, a face image generation network, a caricature head portrait discrimination network, and a face image discrimination network. Using a machine learning method, the sample face image included in a training sample in the training sample set is used as the input of the caricature head portrait generation network; the caricature head portrait output by the caricature head portrait generation network is used as the input of the face image generation network; the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait are used as the inputs of the caricature head portrait discrimination network; the face image output by the face image generation network and the corresponding sample face image are used as the inputs of the face image discrimination network; and the initial generative adversarial network is trained. The trained caricature head portrait generation network is determined to be the caricature head portrait generation model. Training is thereby realized in a bidirectional manner: the input face image is fed into the generative adversarial network to obtain a caricature head portrait, which is then converted back into a face image; the higher the similarity between the input and output face images, the more the resulting caricature head portrait generation model tends to generate caricature head portraits with high similarity to the input face image.
With further reference to Fig. 7, as an implementation of the method shown in Fig. 5 above, the disclosure provides one embodiment of an apparatus for generating a caricature head portrait. This apparatus embodiment corresponds to the method embodiment shown in Fig. 5, and the apparatus may specifically be applied to various electronic devices.
As shown in Fig. 7, the apparatus 700 for generating a caricature head portrait of this embodiment includes: a face image acquisition unit 701, configured to obtain a target face image; and a caricature head portrait generation unit 702, configured to input the target face image into a pre-trained caricature head portrait generation model to obtain and output a caricature head portrait. The caricature head portrait generation model is generated according to the method described in the embodiment corresponding to Fig. 2 above.
In this embodiment, the face image acquisition unit 701 may obtain the target face image remotely or locally through a wired or wireless connection. The target face image is the face image from which a caricature head portrait is to be generated. For example, the target face image may be a face image of a target person captured by a camera included in the apparatus 700, or by a camera included in an electronic device communicatively connected to the apparatus 700; the target person may be a user within the shooting range of the camera.
In this embodiment, the caricature head portrait generation unit 702 may input the target face image into the pre-trained caricature head portrait generation model to obtain and output a caricature head portrait. The caricature head portrait generation model is generated according to the method described in the embodiment corresponding to Fig. 2 above.
The caricature head portrait generation unit 702 may output the generated head portrait in various ways. For example, the generated head portrait may be displayed on a display screen included in the apparatus 700, or sent to another electronic device communicatively connected to the apparatus 700.
The apparatus 700 provided by the above embodiment of the disclosure obtains a target face image and inputs it into a caricature head portrait generation model trained in advance according to the method described in the embodiment corresponding to Fig. 2, thereby obtaining and outputting a caricature head portrait. Since the caricature head portrait generation model is trained in a bidirectional manner, using it improves the similarity between the input face image and the output caricature head portrait, which helps to provide personalized caricature head portraits for different users.
Referring now to Fig. 8, a structural schematic diagram of an electronic device 800 (e.g., the server or terminal device shown in Fig. 1) suitable for implementing embodiments of the disclosure is shown. Terminal devices in embodiments of the disclosure may include, but are not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 8 is only an example and should not impose any limitation on the functions and scope of use of embodiments of the disclosure.
As shown in Fig. 8, the electronic device 800 may include a processing device 801 (e.g., a central processing unit, a graphics processor, etc.), which may perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage device 808 into a random access memory (RAM) 803. The RAM 803 also stores various programs and data required for the operation of the electronic device 800. The processing device 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
In general, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 807 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 808 including, for example, a magnetic tape and hard disk; and a communication device 809. The communication device 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 8 shows an electronic device 800 with various devices, it should be understood that implementing or possessing all of the devices shown is not required; more or fewer devices may alternatively be implemented or possessed. Each box shown in Fig. 8 may represent one device or, as needed, multiple devices.
In particular, according to embodiments of the disclosure, the processes described above with reference to the flowcharts may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 809, installed from the storage device 808, or installed from the ROM 802. When the computer program is executed by the processing device 801, the above-described functions defined in the methods of the embodiments of the disclosure are performed.
It should be noted that the computer-readable medium described in embodiments of the disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device. In embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to a wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or may exist alone without being assembled into the electronic device. The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: acquire a preset training sample set, where a training sample includes a sample face image and a sample caricature head portrait corresponding to the sample face image; acquire a pre-established initial generative adversarial network, where the initial generative adversarial network includes a caricature head portrait generation network, a face image generation network, a caricature head portrait discrimination network, and a face image discrimination network; and execute the following training step: using a machine learning method, taking the sample face image included in a training sample in the training sample set as an input of the caricature head portrait generation network, taking the caricature head portrait output by the caricature head portrait generation network as an input of the face image generation network, taking the caricature head portrait output by the caricature head portrait generation network and the corresponding sample caricature head portrait as inputs of the caricature head portrait discrimination network, taking the face image output by the face image generation network and the corresponding sample face image as inputs of the face image discrimination network, training the initial generative adversarial network, and determining the trained caricature head portrait generation network as a caricature head portrait generation model.
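The training step above chains two generators (face to caricature, caricature back to face) and feeds their outputs, together with the paired samples, to two discriminators. The following is a minimal structural sketch of that forward data flow only, under stated assumptions: the toy stand-in networks (`tanh`/`arctanh` maps, a mean-squared-error "discriminator") and all function names are illustrative, not the patent's implementation, and no parameter update is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def caricature_generation_network(face):
    # Toy stand-in: maps a face image to a "caricature".
    return np.tanh(face)

def face_generation_network(caricature):
    # Toy stand-in: maps a caricature back to a reconstructed face.
    return np.arctanh(np.clip(caricature, -0.99, 0.99))

def discrimination_network(generated, sample):
    # Crude score in (0, 1]: higher means closer to the paired sample.
    return 1.0 / (1.0 + np.mean((generated - sample) ** 2))

# One training sample: a sample face image and its paired sample caricature.
sample_face = rng.uniform(-2.0, 2.0, size=(8, 8))
sample_caricature = np.tanh(sample_face)

# Forward pass mirroring the training step described in the text.
generated_caricature = caricature_generation_network(sample_face)
reconstructed_face = face_generation_network(generated_caricature)

d_caricature = discrimination_network(generated_caricature, sample_caricature)
d_face = discrimination_network(reconstructed_face, sample_face)
```

In an actual adversarial training loop, the generator and discriminator losses derived from `d_caricature` and `d_face` would be backpropagated in alternation; the sketch only illustrates how the four networks are wired.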
In addition, when the one or more programs are executed by the electronic device, the electronic device may further be caused to: acquire a target face image; and input the target face image into a pre-trained caricature head portrait generation model to obtain and output a caricature head portrait.
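Once trained, applying the caricature head portrait generation model is a single forward pass. A minimal sketch, assuming the trained model is any callable mapping a face image array to a caricature array (the function name and the `np.tanh` stand-in model are illustrative assumptions):

```python
import numpy as np

def generate_caricature(model, target_face):
    # Apply the pre-trained caricature generation model to a target face image.
    return model(target_face)

# Illustrative stand-in for a trained model (not the disclosed network).
trained_model = np.tanh

target_face = np.zeros((8, 8))
caricature = generate_caricature(trained_model, target_face)
```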
The computer program code for executing the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages, such as Java, Smalltalk, and C++, and also include conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may be executed entirely on a user computer, partially on a user computer, as a stand-alone software package, partially on a user computer and partially on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to the various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, and the module, program segment, or portion of code contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that executes the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including a first acquisition unit, a second acquisition unit, and a training unit. The names of these units do not, in some cases, constitute a limitation on the units themselves. For example, the first acquisition unit may also be described as "a unit for acquiring a preset training sample set".
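The unit decomposition named above (a first acquisition unit, a second acquisition unit, and a training unit) can be sketched as a grouping of three responsibilities in one processor object. The class, method names, and placeholder return values below are illustrative assumptions, not part of the disclosure:

```python
class CaricatureModelProcessor:
    # Illustrative grouping of the three units named in the text.

    def first_acquisition_unit(self):
        # Acquire the preset training sample set (face, caricature pairs).
        return [("sample_face", "sample_caricature")]

    def second_acquisition_unit(self):
        # Acquire the pre-established initial generative adversarial network.
        return {
            "caricature_generation_network": None,
            "face_generation_network": None,
            "caricature_discrimination_network": None,
            "face_discrimination_network": None,
        }

    def training_unit(self, samples, network):
        # Train the network on the samples; a placeholder standing in for
        # the adversarial training step described earlier.
        return "caricature_head_portrait_generation_model"

processor = CaricatureModelProcessor()
model = processor.training_unit(
    processor.first_acquisition_unit(),
    processor.second_acquisition_unit(),
)
```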
The above description is merely a description of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to the technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with the technical features having similar functions disclosed in (but not limited to) the embodiments of the present disclosure.