CN111223164B - Face simple drawing generation method and device

Face simple drawing generation method and device

Info

Publication number
CN111223164B
CN111223164B (application CN202010016526.4A)
Authority
CN
China
Prior art keywords
face
model
simple drawing
attribute
image
Prior art date
Legal status
Active
Application number
CN202010016526.4A
Other languages
Chinese (zh)
Other versions
CN111223164A (en)
Inventor
Gao Fei (高飞)
Zhu Jingjie (朱静洁)
Li Peng (李鹏)
Yu Zeyuan (俞泽远)
Wang Tao (王韬)
Current Assignee
Advanced Institute of Information Technology AIIT of Peking University
Hangzhou Weiming Information Technology Co Ltd
Original Assignee
Advanced Institute of Information Technology AIIT of Peking University
Hangzhou Weiming Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Advanced Institute of Information Technology AIIT of Peking University, Hangzhou Weiming Information Technology Co Ltd filed Critical Advanced Institute of Information Technology AIIT of Peking University
Priority to CN202010016526.4A priority Critical patent/CN111223164B/en
Publication of CN111223164A publication Critical patent/CN111223164A/en
Application granted granted Critical
Publication of CN111223164B publication Critical patent/CN111223164B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06T 11/00: 2D [Two Dimensional] image generation (G: Physics; G06: Computing; Calculating or Counting; G06T: Image Data Processing or Generation, in General)
    • G06F 18/25: Pattern recognition; Analysing; Fusion techniques (G06F: Electric Digital Data Processing)
    • G06V 40/168: Feature extraction; Face representation (G06V: Image or Video Recognition or Understanding; G06V 40/16: Human faces, e.g. facial parts, sketches or expressions)
    • G06V 40/172: Classification, e.g. identification (G06V 40/16: Human faces, e.g. facial parts, sketches or expressions)


Abstract

The invention discloses a face simple drawing generation method and device, comprising: cutting out a face image from a received image and predicting the attribute category of the face; inputting the face image into a general face portrait synthesis model so that it synthesizes a first face simple drawing; determining the special face portrait synthesis model to be used for the attribute category and inputting the face image into it so that it synthesizes a second face simple drawing; and fusing the first face simple drawing and the second face simple drawing. While the first face simple drawing is synthesized with the general face portrait synthesis model, the second face simple drawing is synthesized with different special face portrait synthesis models for different face attribute categories, so as to overcome the influence of face attribute variation on the synthesis quality. The face simple drawing obtained by fusing the first and second face simple drawings is therefore more accurate, and the personalized requirements of different face attributes are met.

Description

Face simple drawing generation method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for generating a face simple drawing.
Background
Converting a face image into a simple drawing has important application value in the public safety field and the digital entertainment field.
With conventional image processing methods, generating an effective simple drawing requires high computational complexity, which makes it difficult to meet real-time requirements. With the development of machine learning, image processing techniques based on machine learning achieve higher computation speed and higher accuracy than conventional image processing techniques, and many machine learning models for generating simple drawings from face images have therefore been derived.
However, the simple drawings generated from face images by these machine learning models are strongly affected by variations in face attributes (such as attributes of the facial texture characteristics), and the generation effect is poor.
Disclosure of Invention
The invention aims to provide a face simple drawing generation method and device that address the defects of the prior art; this aim is achieved through the following technical scheme.
The first aspect of the present invention proposes a face simple drawing generation method, the method comprising:
cutting out a face image from the received image, and predicting the attribute category of the face in the face image;
inputting the face image into a trained general face portrait synthesis model so that the general face portrait synthesis model synthesizes a first face simple drawing of the face image;
determining the trained special face portrait synthesis model to be used for the attribute category, and inputting the face image into the special face portrait synthesis model so that the special face portrait synthesis model synthesizes a second face simple drawing of the face image;
and fusing the first face simple drawing and the second face simple drawing to obtain a third face simple drawing.
A second aspect of the present invention proposes a face simple drawing generation apparatus, the apparatus comprising:
the attribute prediction module is used for cutting out a face image from the received image and predicting the attribute category of the face in the face image;
the general synthesis module is used for inputting the face image into a trained general face portrait synthesis model so that the general face portrait synthesis model synthesizes a first face simple drawing of the face image;
the special synthesis module is used for determining the trained special face portrait synthesis model to be used for the attribute category, and inputting the face image into the special face portrait synthesis model so that the special face portrait synthesis model synthesizes a second face simple drawing of the face image;
and the fusion module is used for fusing the first face simple drawing and the second face simple drawing to obtain a third face simple drawing.
In the embodiment of the invention, while the first face simple drawing is synthesized using the general face portrait synthesis model, the second face simple drawing is synthesized using different special face portrait synthesis models for different face attribute categories, so as to overcome the influence of face attribute variation on the synthesis quality of the portrait simple drawing. The third face simple drawing obtained by fusing the first and second face simple drawings is therefore more accurate, and the personalized requirements of different face attributes are met.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
Fig. 1 is a flowchart of an embodiment of a face simple drawing generation method according to an exemplary embodiment of the present invention;
Fig. 2 is a schematic diagram illustrating the segmentation of different face regions according to the present invention;
Fig. 3 is a schematic diagram of a face simple drawing generation system according to the present invention;
Fig. 4 is a hardware architecture diagram of an electronic device according to an exemplary embodiment of the present invention;
Fig. 5 is a schematic diagram of an embodiment of a face simple drawing generation apparatus according to an exemplary embodiment of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention; rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may also be referred to as first information, without departing from the scope of the invention. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to determining".
Because existing machine learning models for synthesizing face simple drawings are single, general-purpose models, it is difficult for them to meet the different requirements that users with different face attributes (such as age and gender) place on portrait simple drawings. The face simple drawing generation method provided by the invention therefore aims to overcome the influence of face attribute variation on the synthesis quality of portrait simple drawings, so that face images with any attributes can be synthesized into high-quality portrait simple drawings that are clear, attractive and identity-consistent, meeting the personalized requirements of different face attributes.
The face simple drawing generation method of the present invention will be described in detail below with reference to specific embodiments.
Fig. 1 is a flowchart of an embodiment of a face simple drawing generation method according to an exemplary embodiment of the present invention; the method may be applied to an electronic device (such as a PC, a terminal or a server). As shown in Fig. 1, the face simple drawing generation method comprises the following steps:
step 101: and cutting out a face image from the received image, and predicting the attribute category of the face in the face image.
In an embodiment, for the process of clipping the face image from the received image, the image may be input into a trained face detection model, so that the face detection model detects the face in the image and predicts the positions of the face key points; affine transformation is then performed on the image according to the key point positions to correct the face, and finally a face image of a set size is clipped from the affine-transformed image.
The face key points can comprise key positions such as the left eye center, the right eye center, the nose tip and the two mouth corners. The face in the image can be corrected by affine transformation; for example, the left eye and the right eye can be brought into a horizontal position by affine transformation, with a set pixel distance between them.
For example, the left eye and the right eye can be adjusted to the horizontal position by affine transformation and the distance between the two eyes adjusted to 120 pixels; when clipping, a face image with a size of 512 x 512 pixels can be clipped according to the eye line and the upper edge of the image, with the center point between the two eyes located on the vertical center line of the face image.
It will be appreciated by those skilled in the art that the face detection model may be implemented using the related art; the present invention does not limit its specific implementation. For example, the MTCNN model may be used to detect the face key points.
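As a concrete illustration of the alignment just described, consider the following sketch (assuming eye centers from a detector such as MTCNN; the function name, the vertical placement of the eyes in the crop and the OpenCV-based implementation are illustrative assumptions, with only the 120-pixel eye distance and the 512 x 512 crop taken from the example above):

```python
import cv2
import numpy as np

def align_and_crop(image, left_eye, right_eye, eye_dist=120, size=512):
    """Rotate and scale the image so the eyes are horizontal and eye_dist
    pixels apart, then crop a size x size patch around the face."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))   # rotation that levels the eye line
    scale = eye_dist / np.hypot(dx, dy)      # scale to the target eye distance
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    # Shift so the eye midpoint lands on the vertical center line of the crop;
    # the 0.35 * size eye height is an assumed layout choice, not from the patent.
    M[0, 2] += size / 2.0 - center[0]
    M[1, 2] += size * 0.35 - center[1]
    return cv2.warpAffine(image, M, (size, size), flags=cv2.INTER_LINEAR)
```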
Before the attribute category of the face in the face image is predicted, operations such as brightness adjustment and skin beautification may be performed on the face image to improve its visual quality.
In one embodiment, for the process of predicting the attribute category of the face in the face image, the face image may be input into a trained prediction model, so that the feature extraction network in the prediction model extracts a feature map of the face image and outputs it to the attribute prediction network in the prediction model, and the attribute prediction network predicts the attribute category of the face based on the feature map.
Illustratively, the attribute categories may include young men, young women, elderly men, elderly women, and the like.
In the present invention, the prediction model is a multi-task learning model; that is, the prediction model further includes a weight prediction network. The training process of the prediction model is described under step 104 below and is not detailed here.
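For concreteness, a minimal sketch of such a multi-task prediction model is given below (shared feature extraction network, an attribute prediction head and a weight prediction head; the patent only specifies several convolutional layers and fully-connected layers, so all layer sizes and names here are illustrative assumptions):

```python
import torch.nn as nn

class PredictionModel(nn.Module):
    """Multi-task model: a shared feature extraction network, one head for
    the attribute category and one head for the fusion weight of step 104."""
    def __init__(self, num_attrs=4):  # young/elderly x male/female, per the example
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.attr_head = nn.Sequential(            # fully-connected layers -> class logits
            nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, num_attrs))
        self.weight_head = nn.Sequential(          # fully-connected layers -> weight in [0, 1]
            nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):
        f = self.features(x)
        return self.attr_head(f), self.weight_head(f)
```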
Step 102: and inputting the face image into a trained general face portrait synthesis model so that the general face portrait synthesis model synthesizes a first face simple drawing of the face image.
The output of the general face portrait synthesis model is a face simple drawing synthesized directly, without considering the face attribute category.
Step 103: and determining a trained special face portrait synthesis model to be used for the attribute category, and inputting the face image into the special face portrait synthesis model so that the special face portrait synthesis model synthesizes a second face simple drawing of the face image.
The output of the special face portrait synthesis model is a face simple drawing synthesized with the face attribute category taken into account; each attribute category corresponds to its own special face portrait synthesis model used for face simple drawing synthesis.
Before performing steps 102 and 103, it is necessary to train the general face portrait synthesis model and the dedicated face portrait synthesis model corresponding to each attribute category, respectively.
During training, both the general face portrait synthesis model and the special face portrait synthesis model adopt the structure of a generative adversarial network, and both are implemented with an encoder-decoder structure.
The encoder is implemented with multiple convolutional layers (for example, a VGGFace feature extractor may be employed), and the decoder is implemented with multiple blocks each consisting of a transposed convolutional layer, a normalization layer and an activation layer.
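A minimal encoder-decoder generator along these lines might look as follows (channel widths and layer counts are illustrative assumptions; the small convolutional encoder stands in for a VGGFace-style feature extractor):

```python
import torch.nn as nn

class SketchGenerator(nn.Module):
    """Encoder-decoder generator: convolutional encoder, then transposed
    convolution + normalization + activation blocks as the decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(128, 256, 4, 2, 1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.InstanceNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.InstanceNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, 2, 1), nn.Tanh())  # single-channel sketch

    def forward(self, x):
        return self.decoder(self.encoder(x))
```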
The training process of the special face portrait synthesis model corresponding to each attribute category can comprise the following steps:
four categories are included by attribute category: for young men, young women, old men and old women, a face sample set is firstly obtained, each face sample in the face sample set is marked with an attribute category, a real face profile corresponding to each face sample is obtained, a corresponding special face portrait synthesis model and a discrimination model are built for each attribute category, and the built special face portrait synthesis model and discrimination model are optimized in an alternating iterative mode by utilizing the face sample marked with the attribute category and the corresponding real face profile.
The special face portrait synthesis model takes a face sample as input and outputs a synthesized face simple drawing. The discrimination model takes a synthesized face simple drawing as input and outputs a discrimination result and the face attribute for it, and likewise takes a real face simple drawing as input and outputs a discrimination result and the face attribute for it.
Based on the above description, the loss value of the discrimination model is obtained from the discrimination result and category of the synthesized face simple drawing and the discrimination result and attribute category of the real face simple drawing, and the loss value of the special face portrait synthesis model is obtained from the content loss value between the synthesized face simple drawing and the face sample, the style loss value between the synthesized face simple drawing and the real face simple drawing, and the loss value of the discrimination model.
Wherein the loss value $L_{ca\text{-}adv}$ of the discrimination model is calculated from the discrimination result and category of the synthesized face simple drawing and from the discrimination result and attribute category of the real face simple drawing; a cross-entropy loss function may be employed.
The content loss between the synthesized face simple drawing and the face sample is calculated as:

$$L_{identity} = \sum_{j} \frac{1}{C_j H_j W_j} \left\lVert \phi_j(x) - \phi_j(G(x)) \right\rVert_1 \qquad \text{(Equation 1)}$$

where $\phi$ denotes the encoder; $\phi_j(x)$ is the feature map output by the $j$-th computation layer after the face sample $x$ is input into the encoder; $\phi_j(G(x))$ is the feature map output by the $j$-th computation layer after the synthesized face simple drawing $G(x)$ is input into the encoder; and $C_j$, $H_j$ and $W_j$ are respectively the number of channels, the height and the width of the feature map output by the $j$-th computation layer.
The style loss between the synthesized face simple drawing and the real face simple drawing is calculated as:

$$L_{style} = \sum_{k} \left\lVert \mathrm{Gram}\left(\phi_k(G(x))\right) - \mathrm{Gram}\left(\phi_k(s)\right) \right\rVert_1 \qquad \text{(Equation 2)}$$

where $\mathrm{Gram}(\cdot)$ denotes the Gram matrix; $\mathrm{Gram}(\phi_k(G(x)))$ is the Gram matrix of the feature map output by the $k$-th computation layer after the synthesized face simple drawing $G(x)$ is input into the encoder; and $\mathrm{Gram}(\phi_k(s))$ is the Gram matrix of the feature map output by the $k$-th computation layer after the real face simple drawing $s$ is input into the encoder.
From the discrimination-model loss $L_{ca\text{-}adv}$, the content loss $L_{identity}$ and the style loss $L_{style}$, the loss value of the special face portrait synthesis model is calculated as:

$$L_{global} = L_{identity} + \lambda L_{style} + \beta L_{ca\text{-}adv} \qquad \text{(Equation 3)}$$

where $\lambda \geq 0$ and $\beta \geq 0$.
During training, the special face portrait synthesis model and the discrimination model are alternately and iteratively optimized, i.e. following the adversarial min-max objective:

$$G^{*} = \arg\min_{G} \max_{D} L_{global} \qquad \text{(Equation 4)}$$

where optimizing over $G$ updates the special face portrait synthesis model and optimizing over $D$ updates the discrimination model.
It should be noted that the training process for the general face portrait synthesis model follows the same principle as that of the special face portrait synthesis model; the difference is that the general face portrait synthesis model is trained and optimized using all face samples in the face sample set. The details are not repeated here.
Alternatively, the general face portrait synthesis model may be trained first, then the special face portrait synthesis model is initialized by using the general face portrait synthesis model, and then fine tuning and optimization are performed on the special face portrait synthesis model, so as to improve model training efficiency.
Step 104: and fusing the first face simple drawing and the second face simple drawing to obtain a third face simple drawing.
The prediction model described in the step 101 further comprises a weight prediction network, the feature extraction network in the prediction model extracts a feature map of the face image and outputs the feature map to the weight prediction network, and the weight prediction network predicts fusion weights based on the feature map.
The first face simple drawing and the second face simple drawing are then fused using the fusion weight, according to the following formula:

$$O_{final} = \beta \cdot G_{k^{*}}(x) + (1-\beta) \cdot G_{u}(x) \qquad \text{(Equation 5)}$$

where $\cdot$ denotes the pixel-wise product, $G_{k^{*}}(x)$ denotes the second face simple drawing, $G_{u}(x)$ denotes the first face simple drawing, and $\beta$ denotes the fusion weight.
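Equation 5 transcribes directly into code; beta may be a scalar or a per-pixel weight map, both of which the element-wise product accommodates:

```python
def fuse(sketch_special, sketch_general, beta):
    """Equation 5: pixel-wise blend of the special-model sketch G_k*(x)
    and the general-model sketch G_u(x) with fusion weight beta."""
    return beta * sketch_special + (1.0 - beta) * sketch_general
```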
For the training process of the prediction model, the feature extraction network and the attribute prediction network in the constructed prediction model may first be optimized using each face sample in the face sample set, until the feature extraction network and the attribute prediction network converge; the weight prediction network in the constructed prediction model is then optimized using the feature map obtained for each face sample in the face sample set through the optimized feature extraction network, until the loss value of the weight prediction network is lower than a preset value.
the feature extraction network comprises a plurality of convolution layers, and the attribute prediction network and the weight prediction network are both realized by a plurality of full-connection layers.
The loss value of the weight prediction network is the content loss value between the face sample and the fused third face simple drawing.
The fused third face simple drawing is obtained by passing the face sample through the general face portrait synthesis model and the corresponding special face portrait synthesis model to obtain a first face simple drawing and a second face simple drawing respectively, and then fusing the two using the fusion weight obtained from the feature map of the face sample through the weight prediction network.
Thus, the attribute category is predicted by the feature extraction network together with the attribute prediction network, and the fusion weight is predicted by the feature extraction network together with the weight prediction network, which is why the prediction model is a multi-task learning model.
In addition, before the weight prediction network is optimized, the optimization training of the general face portrait synthesis model and the special face portrait synthesis models needs to be completed.
After the face image is cut out from the received image, the face image may also be input into a trained face parsing model, so that the face parsing model segments the regions of each part of the face in the face image, and the positions of the face regions output by the face parsing model are acquired.
The face part regions may comprise 11 regions: left eyebrow, right eyebrow, left eye, right eye, nose, mouth, face, hair, neck, torso and background. Fig. 2 shows the parsing result for these 11 regions as output by the face parsing model.
It should be further noted that after the third face simple drawing is obtained, a post-processing operation may be performed on it, the lines within the facial region of the processed third face simple drawing may be adjusted, and a vectorization operation may then be performed on the adjusted third face simple drawing to obtain the final face portrait simple drawing.
The post-processing operation comprises blurring, binarization, dilation and the like, which bridge narrow breaks and thin gaps, eliminate small holes, and fill fractures in the contour lines, so as to smooth the contours.
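A possible post-processing chain with OpenCV is sketched below (the kernel sizes, threshold and iteration count are illustrative assumptions; the snippet assumes white lines on a dark background, so invert first if the sketch is dark-on-white):

```python
import cv2
import numpy as np

def postprocess(sketch, blur_ksize=3, thresh=128, dilate_iter=1):
    """Blur -> binarize -> dilate, to bridge narrow breaks and fill small
    holes in the contour lines."""
    out = cv2.GaussianBlur(sketch, (blur_ksize, blur_ksize), 0)
    _, out = cv2.threshold(out, thresh, 255, cv2.THRESH_BINARY)
    kernel = np.ones((3, 3), np.uint8)
    return cv2.dilate(out, kernel, iterations=dilate_iter)
```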
Optionally, when the lines within the facial region are adjusted, they can be removed entirely for a user of the young female attribute category, to remove dark lines and spots in the facial skin region; for a user of the elderly male attribute category, the left eye, right eye and nose regions are each expanded by a preset pixel distance to obtain an expanded region, and the lines lying in the facial region but not in the expanded region are removed, so that wrinkle lines such as crow's feet in the facial skin region are constrained within a threshold length (see the sketch below).
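The elderly-male adjustment just described could be sketched as follows, using the parsing masks from Fig. 2 (the mask names, the pad value and the dark-on-white convention are assumptions):

```python
import cv2
import numpy as np

def suppress_wrinkles(sketch, face_mask, eye_nose_mask, pad=10):
    """Expand the eye/nose mask by pad pixels, then erase lines that lie
    in the facial skin region but outside the expanded region."""
    kernel = np.ones((2 * pad + 1, 2 * pad + 1), np.uint8)
    expanded = cv2.dilate(eye_nose_mask.astype(np.uint8), kernel)
    erase = (face_mask > 0) & (expanded == 0)
    out = sketch.copy()
    out[erase] = 255   # assuming dark lines on a white background
    return out
```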
Finally, the vectorization operation performed on the adjusted third face simple drawing makes the generated lines smoother, so that the final face portrait simple drawing is more concise and attractive and meets the characteristics and requirements of users of different ages and genders.
For the processes of steps 101 to 104, refer to the system structure shown in Fig. 3: the face photo is input into the general generator to obtain the first face simple drawing; the face photo is input into the prediction model to obtain the attribute category and the fusion weight; the attribute category controls the gating module to select the corresponding special generator, and the face photo is input into the selected special generator to obtain the second face simple drawing; the fusion weight is used to fuse the first face simple drawing and the second face simple drawing into the third face simple drawing; and the adaptive post-processing module performs operations such as blurring, binarization, dilation, adjustment and vectorization on the third face simple drawing to obtain the final face portrait simple drawing as output.
In this embodiment, while the first face simple drawing is synthesized using the general face portrait synthesis model, the second face simple drawing is synthesized using different special face portrait synthesis models for different face attribute categories, so as to overcome the influence of face attribute variation on the synthesis quality of the portrait simple drawing. The third face simple drawing obtained by fusing the first face simple drawing and the second face simple drawing is therefore more accurate, and the personalized requirements of different face attributes are met.
Fig. 4 is a hardware configuration diagram of an electronic device according to an exemplary embodiment of the present invention. The electronic device comprises: a communication interface 401, a processor 402, a machine-readable storage medium 403 and a bus 404, where the communication interface 401, the processor 402 and the machine-readable storage medium 403 communicate with each other via the bus 404. The processor 402 may perform the face simple drawing generation method described above by reading and executing, from the machine-readable storage medium 403, machine-executable instructions corresponding to the control logic of the method; the details of the method are described in the embodiments above and are not repeated here.
The machine-readable storage medium 403 referred to in this disclosure may be any electronic, magnetic, optical or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be volatile memory, non-volatile memory, or a similar storage medium. In particular, the machine-readable storage medium 403 may be RAM (Random Access Memory), flash memory, a storage drive (e.g., a hard drive), any type of storage disk (e.g., an optical disk or DVD), a similar storage medium, or a combination thereof.
The invention also provides embodiments of a face simple drawing generation apparatus corresponding to the embodiments of the face simple drawing generation method.
Fig. 5 is a schematic diagram of an embodiment of a face simple drawing generation apparatus according to an exemplary embodiment of the present invention; the apparatus may be applied to an electronic device. As shown in Fig. 5, the face simple drawing generation apparatus comprises:
the attribute prediction module 510 is configured to cut out a face image from the received image, and predict an attribute category of a face in the face image;
a general synthesis module 520, configured to input the face image into a trained general face portrait synthesis model, so that the general face portrait synthesis model synthesizes a first face simple drawing of the face image;
a special synthesis module 530, configured to determine the trained special face portrait synthesis model to be used for the attribute category, and to input the face image into the special face portrait synthesis model so that the special face portrait synthesis model synthesizes a second face simple drawing of the face image;
and a fusion module 540, configured to fuse the first face simple drawing and the second face simple drawing to obtain a third face simple drawing.
In an optional implementation manner, in predicting the attribute category of the face in the face image, the attribute prediction module 510 is specifically configured to input the face image into a trained prediction model so that a feature extraction network in the prediction model extracts a feature map of the face image and outputs it to an attribute prediction network in the prediction model, the attribute prediction network predicting the attribute category of the face based on the feature map.
In an alternative implementation, the apparatus further comprises (not shown in fig. 5):
the training module is used for acquiring a face sample set, wherein each face sample in the face sample set is marked with an attribute category, and the attribute category comprises young men, young women, old men and old women; acquiring a real face simple drawing corresponding to each face sample in the face sample set; constructing a corresponding special face portrait synthesis model and a corresponding discrimination model aiming at each attribute category, and optimizing the constructed special face portrait synthesis model and discrimination model in an alternate iterative mode by utilizing a face sample marked with the attribute category and a corresponding real face simple stroke; the special face portrait synthesizing model inputs a face sample and outputs a synthesized face simple drawing; the distinguishing model is input into a synthesized human face simple drawing, output into a distinguishing result and human face attribute of the human face simple drawing, input into a real human face simple drawing, and output into a distinguishing result and human face attribute of the real human face simple drawing; the loss value of the distinguishing model is obtained by distinguishing results and types of the synthesized face simple drawing and distinguishing results and attribute types of the real face simple drawing, and the loss value of the special face portrait distinguishing model is obtained by content loss values between the synthesized face simple drawing and the face sample, style loss values between the synthesized face simple drawing and the real face simple drawing and loss values of the distinguishing model.
In an optional implementation manner, the prediction model further includes a weight prediction network, and the attribute prediction module 510 is further configured to extract a feature map of the face image by using a feature extraction network and output the feature map to the weight prediction network, where the weight prediction network predicts a fusion weight based on the feature map;
the fusing module 540 is specifically configured to fuse the first face profile and the second face profile by using the fusion weight.
In an optional implementation manner, the training module is further configured to optimize the feature extraction network and the attribute prediction network in the constructed prediction model using each face sample in the face sample set until the feature extraction network and the attribute prediction network converge, and then to optimize the weight prediction network in the constructed prediction model using the feature map obtained for each face sample in the face sample set through the optimized feature extraction network, until the loss value of the weight prediction network is lower than a preset value; the loss value of the weight prediction network is the content loss value between a face sample and the fused third face simple drawing, the third face simple drawing being obtained by passing the face sample through the general face portrait synthesis model and the corresponding special face portrait synthesis model to obtain the first face simple drawing and the second face simple drawing respectively, and then fusing the two using the fusion weight obtained from the feature map of the face sample through the weight prediction network.
In an alternative implementation, the apparatus further comprises (not shown in fig. 5):
the face analysis module is configured to, after the attribute prediction module 510 cuts out a face image from a received image, input the face image into a trained face analysis model, so that the face analysis model segments each region of a face in the face image, and obtain a position of a face region output by the face analysis model;
a post-processing module, configured to perform post-processing operation on a third face profile after the fusion module 540 fuses the first face profile and the second face profile to obtain the third face profile; and adjusting the drawing of the processed third face drawing, and vectorizing the adjusted third face drawing to obtain the final face drawing.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the apparatus embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the invention. Those of ordinary skill in the art will understand and implement the invention without undue burden.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article or apparatus that comprises the element.
The foregoing descriptions are merely preferred embodiments of the invention and are not intended to limit the invention; any modification, equivalent replacement, improvement or the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (8)

1. A face simple drawing generation method, the method comprising:
cutting out a face image from the received image, and predicting the attribute category of the face in the face image;
inputting the face image into a trained general face portrait synthesis model so that the general face portrait synthesis model synthesizes a first face simple drawing of the face image;
determining the trained special face portrait synthesis model to be used for the attribute category, and inputting the face image into the special face portrait synthesis model so that the special face portrait synthesis model synthesizes a second face simple drawing of the face image;
fusing the first face simple drawing and the second face simple drawing to obtain a third face simple drawing;
the training process of the special face portrait synthesis model comprises the following steps:
acquiring a face sample set, wherein each face sample in the face sample set is labeled with an attribute category, and the attribute categories comprise young men, young women, elderly men and elderly women;
acquiring a real face simple drawing corresponding to each face sample in the face sample set;
constructing a corresponding special face portrait synthesis model and discrimination model for each attribute category, and optimizing the constructed special face portrait synthesis model and discrimination model in an alternating iterative manner using the face samples labeled with the attribute category and the corresponding real face simple drawings;
wherein the special face portrait synthesis model takes a face sample as input and outputs a synthesized face simple drawing; the discrimination model takes a synthesized face simple drawing as input and outputs a discrimination result and the face attribute for it, and takes a real face simple drawing as input and outputs a discrimination result and the face attribute for it;
the loss value of the discrimination model is obtained from the discrimination result and category of the synthesized face simple drawing and the discrimination result and attribute category of the real face simple drawing, and the loss value of the special face portrait synthesis model is obtained from the content loss value between the synthesized face simple drawing and the face sample, the style loss value between the synthesized face simple drawing and the real face simple drawing, and the loss value of the discrimination model.
2. The method of claim 1, wherein predicting the attribute category of the face in the face image comprises:
inputting the face image into a trained prediction model, extracting a feature map of the face image by a feature extraction network in the prediction model, and outputting the feature map to an attribute prediction network in the prediction model, the attribute prediction network predicting the attribute category of the face based on the feature map.
3. The method according to claim 2, wherein the prediction model further comprises a weight prediction network, the feature extraction network extracts a feature map of the face image and outputs it to the weight prediction network, and the weight prediction network predicts a fusion weight based on the feature map;
fusing the first face simple drawing and the second face simple drawing to obtain a third face simple drawing comprises:
fusing the first face simple drawing and the second face simple drawing using the fusion weight.
4. The method according to claim 3, wherein the training process of the prediction model comprises:
optimizing a feature extraction network and an attribute prediction network in the constructed prediction model by utilizing each face sample in the face sample set until the feature extraction network and the attribute prediction network converge;
optimizing a weight prediction network in the constructed prediction model by using a feature map obtained by each face sample in the face sample set through an optimized feature extraction network until the loss value of the weight prediction network is lower than a preset value;
wherein the loss value of the weight prediction network is the content loss value between a face sample and the fused third face simple drawing, the third face simple drawing being obtained by passing the face sample through the general face portrait synthesis model and the corresponding special face portrait synthesis model to obtain the first face simple drawing and the second face simple drawing respectively, and then fusing the two using the fusion weight obtained from the feature map of the face sample through the weight prediction network.
5. The method of claim 1, wherein after clipping the face image from the received image, the method further comprises:
inputting the face image into a trained face parsing model so that the face parsing model segments the regions of each part of the face in the face image, and acquiring the positions of the face regions output by the face parsing model;
after fusing the first face simple drawing and the second face simple drawing to obtain the third face simple drawing, the method further comprises:
performing a post-processing operation on the third face simple drawing;
and adjusting the lines of the processed third face simple drawing, and vectorizing the adjusted third face simple drawing to obtain the final face portrait simple drawing.
6. A face simple drawing generation apparatus, the apparatus comprising:
the attribute prediction module is used for cutting out a face image from the received image and predicting the attribute category of the face in the face image;
the general synthesis module is used for inputting the face image into a trained general face portrait synthesis model so that the general face portrait synthesis model synthesizes a first face simple drawing of the face image;
the special synthesis module is used for determining the trained special face portrait synthesis model to be used for the attribute category, and inputting the face image into the special face portrait synthesis model so that the special face portrait synthesis model synthesizes a second face simple drawing of the face image;
the fusion module is used for fusing the first face simple drawing and the second face simple drawing to obtain a third face simple drawing;
wherein the apparatus further comprises:
the training module is used for: acquiring a face sample set, each face sample in the face sample set being labeled with an attribute category, the attribute categories including young men, young women, elderly men and elderly women; acquiring the real face simple drawing corresponding to each face sample in the face sample set; constructing a corresponding special face portrait synthesis model and discrimination model for each attribute category, and optimizing the constructed special face portrait synthesis model and discrimination model in an alternating iterative manner using the face samples labeled with the attribute category and the corresponding real face simple drawings; wherein the special face portrait synthesis model takes a face sample as input and outputs a synthesized face simple drawing; the discrimination model takes a synthesized face simple drawing as input and outputs a discrimination result and the face attribute for it, and takes a real face simple drawing as input and outputs a discrimination result and the face attribute for it; the loss value of the discrimination model is obtained from the discrimination result and category of the synthesized face simple drawing and the discrimination result and attribute category of the real face simple drawing, and the loss value of the special face portrait synthesis model is obtained from the content loss value between the synthesized face simple drawing and the face sample, the style loss value between the synthesized face simple drawing and the real face simple drawing, and the loss value of the discrimination model.
7. The apparatus according to claim 6, wherein in predicting the attribute category of the face in the face image, the attribute prediction module is specifically configured to input the face image into a trained prediction model so that a feature extraction network in the prediction model extracts a feature map of the face image and outputs it to an attribute prediction network in the prediction model, the attribute prediction network predicting the attribute category of the face based on the feature map.
8. The apparatus of claim 7, wherein the prediction model further comprises a weight prediction network, the attribute prediction module further configured to extract a feature map of the face image by a feature extraction network and output the feature map to the weight prediction network, the weight prediction network predicting a fusion weight based on the feature map;
the fusion module is specifically configured to fuse the first face simple drawing and the second face simple drawing by using the fusion weight.
CN202010016526.4A 2020-01-08 2020-01-08 Face simple drawing generation method and device Active CN111223164B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010016526.4A CN111223164B (en) 2020-01-08 2020-01-08 Face simple drawing generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010016526.4A CN111223164B (en) 2020-01-08 2020-01-08 Face simple drawing generation method and device

Publications (2)

Publication Number Publication Date
CN111223164A CN111223164A (en) 2020-06-02
CN111223164B (en) 2023-10-24

Family

ID=70828116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010016526.4A Active CN111223164B (en) 2020-01-08 2020-01-08 Face simple drawing generation method and device

Country Status (1)

Country Link
CN (1) CN111223164B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003099779A (en) * 2001-09-21 2003-04-04 Japan Science & Technology Corp Device, method, and program for evaluating person attribute
JP2004102359A (en) * 2002-09-04 2004-04-02 Advanced Telecommunication Research Institute International Image processing device, method and program
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN108596839A (en) * 2018-03-22 2018-09-28 中山大学 A kind of human-face cartoon generation method and its device based on deep learning
CN110023989A (en) * 2017-03-29 2019-07-16 华为技术有限公司 A kind of generation method and device of sketch image
CN110222588A (en) * 2019-05-15 2019-09-10 合肥进毅智能技术有限公司 A kind of human face sketch image aging synthetic method, device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010092199A (en) * 2008-10-07 2010-04-22 Sony Corp Information processor and processing method, program, and recording medium
TWI626610B (en) * 2015-12-21 2018-06-11 財團法人工業技術研究院 Message pushing method and message pushing device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003099779A (en) * 2001-09-21 2003-04-04 Japan Science & Technology Corp Device, method, and program for evaluating person attribute
JP2004102359A (en) * 2002-09-04 2004-04-02 Advanced Telecommunication Research Institute International Image processing device, method and program
CN110023989A (en) * 2017-03-29 2019-07-16 华为技术有限公司 A kind of generation method and device of sketch image
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN108596839A (en) * 2018-03-22 2018-09-28 中山大学 A kind of human-face cartoon generation method and its device based on deep learning
CN110222588A (en) * 2019-05-15 2019-09-10 合肥进毅智能技术有限公司 A kind of human face sketch image aging synthetic method, device and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Mingjin Zhang et al. Dual-Transfer Face Sketch-Photo Synthesis. IEEE Transactions on Image Processing. 2019, (2), pp. 642-657. *
Zhou Renqin; Liu Fuxin. A cartoon face animation system for mobile digital entertainment. Computer Engineering and Applications. 2009, (01), pp. 96-98. *
Wang Nannan; Li Jie; Gao Xinbo. Face portrait synthesis: a survey and comparative analysis. Pattern Recognition and Artificial Intelligence. 2018, (01), pp. 43-54. *
Huang Fei; Gao Fei; Zhu Jingjie; Dai Lingna; Yu Jun. Heterogeneous face image synthesis based on generative adversarial networks: progress and challenges. Journal of Nanjing University of Information Science & Technology (Natural Science Edition). 2019, (06), pp. 660-681. *

Also Published As

Publication number Publication date
CN111223164A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
CN111243626B (en) Method and system for generating speaking video
CN109376582B (en) Interactive face cartoon method based on generation of confrontation network
CN109815826B (en) Method and device for generating face attribute model
CN109191409B (en) Image processing method, network training method, device, electronic equipment and storage medium
CN109858392B (en) Automatic face image identification method before and after makeup
CN112541864A (en) Image restoration method based on multi-scale generation type confrontation network model
CN112950661A (en) Method for generating antithetical network human face cartoon based on attention generation
US11282257B2 (en) Pose selection and animation of characters using video data and training techniques
Liu et al. A 3 GAN: an attribute-aware attentive generative adversarial network for face aging
CN110796593A (en) Image processing method, device, medium and electronic equipment based on artificial intelligence
CN112633191A (en) Method, device and equipment for reconstructing three-dimensional face and storage medium
CN113963409A (en) Training of face attribute editing model and face attribute editing method
CN112102468B (en) Model training method, virtual character image generation device, and storage medium
CN114187624A (en) Image generation method, image generation device, electronic equipment and storage medium
CN115546461A (en) Face attribute editing method based on mask denoising and feature selection
CN114724214A (en) Micro-expression editing method and system based on face action unit
Bian et al. Conditional adversarial consistent identity autoencoder for cross-age face synthesis
CN113409329A (en) Image processing method, image processing apparatus, terminal, and readable storage medium
Liu et al. A3GAN: An attribute-aware attentive generative adversarial network for face aging
CN111223164B (en) Face simple drawing generation method and device
CN111275778B (en) Face simple drawing generation method and device
CN116721008A (en) User-defined expression synthesis method and system
CN117237521A (en) Speech driving face generation model construction method and target person speaking video generation method
CN111191549A (en) Two-stage face anti-counterfeiting detection method
CN116030517A (en) Model training method, face recognition device and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200826

Address after: Room 101, building 1, block C, Qianjiang Century Park, ningwei street, Xiaoshan District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Weiming Information Technology Co.,Ltd.

Applicant after: Institute of Information Technology, Zhejiang Peking University

Address before: Room 288-1, 857 Xinbei Road, Ningwei Town, Xiaoshan District, Hangzhou City, Zhejiang Province

Applicant before: Institute of Information Technology, Zhejiang Peking University

Applicant before: Hangzhou Weiming Information Technology Co.,Ltd.

GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20200602

Assignee: Zhejiang Visual Intelligence Innovation Center Co.,Ltd.

Assignor: Institute of Information Technology, Zhejiang Peking University|Hangzhou Weiming Information Technology Co.,Ltd.

Contract record no.: X2023330000927

Denomination of invention: Method and device for generating simple facial strokes

Granted publication date: 20231024

License type: Common License

Record date: 20231219