CN110334679B - Face point processing method and device

Face point processing method and device

Info

Publication number
CN110334679B
Authority
CN
China
Prior art keywords
face
face point
sampling
sampling unit
network
Prior art date
Legal status
Active
Application number
CN201910627394.6A
Other languages
Chinese (zh)
Other versions
CN110334679A (en)
Inventor
陈良
余清洲
苏晋展
张伟
许清泉
Current Assignee
Xiamen Meitu Technology Co Ltd
Original Assignee
Xiamen Meitu Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Xiamen Meitu Technology Co Ltd filed Critical Xiamen Meitu Technology Co Ltd
Priority to CN201910627394.6A
Publication of CN110334679A
Application granted
Publication of CN110334679B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/11 - Region-based segmentation
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation
    • G06V 40/165 - Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G06V 40/168 - Feature extraction; Face representation


Abstract

The disclosure provides a face point processing method and device, relating to the technical field of image processing. Based on a face point fitting network, a facial feature image to be detected is down-sampled and face point feature information is extracted to obtain face point data; based on a segmentation network, the face point feature information is up-sampled to obtain a facial feature mask. By combining the face point fitting network with the segmentation network, the method and device process a facial feature image and output both the face point data and the facial feature mask at the same time, which saves time cost and improves precision.

Description

Face point processing method and device
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and an apparatus for processing a face point.
Background
Face point alignment has very wide application in production: AR (Augmented Reality) materials and makeup can be added to a face, and it can even assist in building a 3D (three-dimensional) model of the face. In practical application scenarios, face points are mostly obtained by regressing each frame; a mask is then fitted from the differences between the face points of successive frames, and makeup is applied to the facial feature regions according to the mask. However, fitting a mask in this way has a high time cost and low accuracy.
Disclosure of Invention
Based on the research, the present disclosure provides a face point processing method and apparatus.
Embodiments of the present disclosure may be implemented as follows:
in a first aspect, the disclosed embodiments provide a face point processing method, which is applied to an electronic device, where the electronic device stores a face point fitting network and a segmentation network; the method comprises the following steps:
based on the face point fitting network, performing down-sampling on a face facial feature image to be detected, and extracting face point characteristic information to obtain face point data;
and based on the segmentation network, performing up-sampling on the face point characteristic information to obtain a face facial feature mask.
Further, the face point fitting network and the segmentation network are obtained by training through the following steps:
based on a face point fitting network to be trained, performing down-sampling on a facial feature picture, and extracting first face point characteristic information to obtain first face point data;
based on a segmentation network to be trained, performing up-sampling on the first face point feature information to obtain a first face facial feature mask;
and according to the first face facial feature mask and the first face point data, combining sample data obtained in advance, based on a preset loss function, and adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained through a back propagation algorithm until the output of the preset loss function is smaller than a preset threshold value.
Further, the step of adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained through a back propagation algorithm based on a preset loss function according to the first facial feature mask and the first facial point data by combining sample data obtained in advance includes:
adjusting the weight of the face point fitting network to be trained through a back propagation algorithm based on the preset first loss function according to the first face facial features mask, the first face point data and the sample data;
obtaining second face point characteristic information based on the adjusted face point fitting network, and obtaining a second face facial feature mask according to the second face point characteristic information;
and adjusting the weight of the segmentation network to be trained through a back propagation algorithm based on the preset second loss function according to the second face facial feature mask.
Further, the preset second loss function is a cross entropy loss function, and the preset first loss function is:
w_i = 1, if I_{x_i, y_i} > R; w_i = 0, otherwise
Loss(l, L) = (1/N) * Σ_{i=1}^{N} w_i * Euclidean(l_i, L_i)
where l_i denotes the coordinates of each face point in the face point data; L_i denotes the coordinates of each face point in the sample data; I_{x_i, y_i} is the pixel of the facial feature mask at that face point; R is the threshold for judging whether the pixel is visible; N is the number of samples; and i is any one sample of the sample data.
Further, the face point fitting network comprises a plurality of down-sampling units, the output size of each down-sampling unit is 1/C of the output size of the previous down-sampling unit, and C is a positive integer.
Further, the segmentation network comprises a plurality of up-sampling units, each up-sampling unit is arranged symmetrically to a down-sampling unit, the output size of each up-sampling unit is C times the output size of the previous up-sampling unit, and C is a positive integer.
Further, the step of performing upsampling on the face point feature information based on the segmentation network to obtain a face facial feature mask includes:
for each up-sampling unit, fusing the output of the up-sampling unit with the output of the corresponding down-sampling unit to obtain fused feature information;
and for each up-sampling unit, performing up-sampling on the fused feature information output by the previous up-sampling unit.
In a second aspect, the disclosed embodiment provides a face point processing apparatus, which is applied to an electronic device, where the electronic device stores a face point fitting network and a segmentation network; the human face point processing device comprises an extraction module and a segmentation module;
the extraction module is used for performing down-sampling on a facial feature image to be detected based on the facial point fitting network, and extracting facial point characteristic information to obtain facial point data;
and the segmentation module is used for performing up-sampling on the face point characteristic information based on the segmentation network to obtain a face facial feature mask.
Further, the face point processing apparatus further includes a training module, and the training module is configured to:
based on a face point fitting network to be trained, performing down-sampling on a facial feature picture, and extracting first face point characteristic information to obtain first face point data;
based on a segmentation network to be trained, performing up-sampling on the first face point feature information to obtain a first face facial feature mask;
and according to the first face facial feature mask and the first face point data, combining sample data obtained in advance, based on a preset loss function, and adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained through a back propagation algorithm until the output of the preset loss function is smaller than a preset threshold value.
Further, the preset loss function includes a preset first loss function and a preset second loss function, and the training module is further configured to:
adjusting the weight of the face point fitting network to be trained through a back propagation algorithm based on the preset first loss function according to the first face facial features mask, the first face point data and the sample data;
obtaining second face point characteristic information based on the adjusted face point fitting network, and obtaining a second face facial feature mask according to the second face point characteristic information;
and adjusting the weight of the segmentation network to be trained through a back propagation algorithm based on the preset second loss function according to the second face facial feature mask.
According to the face point processing method and device, the face point fitting network and the segmentation network are combined to process the facial feature image, so that the face point data and the facial feature mask are output together and time cost is saved. Compared with the prior art, in which the mask is fitted from the per-frame face point differences, using the face point fitting network together with the segmentation network improves the accuracy of the facial feature mask.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present disclosure and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings may be obtained from the drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device provided in the present disclosure.
Fig. 2 is a schematic flow chart of a face point processing method provided by the present disclosure.
Fig. 3 is another schematic flow chart of the face point processing method provided by the present disclosure.
Fig. 4 is a schematic diagram illustrating a principle of the face point processing method according to the present disclosure.
Fig. 5 is a schematic flow chart of a face point processing method provided by the present disclosure.
Fig. 6 is a schematic diagram of a network training process of the face point processing method provided by the present disclosure.
Fig. 7 is a schematic flow chart of a face point processing method according to the present disclosure.
Fig. 8 is a block diagram of a face point processing apparatus provided in the present disclosure.
Reference numerals: 100-an electronic device; 10-a face point processing device; 11-an extraction module; 12-a segmentation module; 13-a training module; 20-a memory; 30-a processor; 40-a communication module.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features in the embodiments of the present disclosure may be combined with each other without conflict.
Face point alignment has very wide application in production: AR materials and makeup can be added to a face, and it can even assist in building a 3D model of the face. Nowadays, convolutional neural networks are widely applied in all kinds of task scenarios, and face alignment is no exception.
In real-time scenarios, face points are mostly obtained by regressing each frame; the face points are then provided for cropping the next frame so as to achieve tracking, and at the same time a mask is fitted from the differences between the face points of successive frames and used for processing such as applying makeup to the facial feature regions. However, because face points are discrete by nature, this mask fitting method has limitations in practical applications such as makeup; moreover, a mask fitted in this way cannot correctly handle the case where the facial features are partially occluded, its accuracy is low, and its time and computation costs are high.
Based on the above research, the present disclosure provides a method and an apparatus for processing a face point, so as to improve the above problem.
Referring to fig. 1, the face point processing method provided by the present disclosure is applied to an electronic device 100, and the electronic device 100 executes the face point processing method provided by the present disclosure.
The electronic device 100 includes the human face point processing apparatus 10, the memory 20, the processor 30 and the communication module 40 shown in fig. 1, and the respective elements of the memory 20, the processor 30 and the communication module 40 are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, the components may be directly electrically connected to each other via one or more communication buses or signal lines. The face point processing device 10 includes at least one software functional module which can be stored in the memory 20 in the form of software or Firmware (Firmware), and the processor 30 executes various functional applications and data processing by running software programs and modules stored in the memory 20.
The Memory 20 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 30 may be an integrated circuit chip having signal processing capabilities. The Processor 30 may be a general-purpose Processor including a Central Processing Unit (CPU), a Network Processor (NP), and the like.
The communication module 40 is configured to establish a communication connection between the electronic device 100 and another external device through a network, and perform data transmission through the network.
It is to be understood that the configuration shown in fig. 1 is merely exemplary, and that the electronic device 100 may include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
In the present disclosure, the electronic device 100 may be, but is not limited to, a device having a processing capability, such as a Personal Computer (PC), a notebook Computer, a Personal Digital Assistant (PDA), or a server.
Referring to fig. 2, the present disclosure provides a face point processing method applicable to the electronic device 100. Wherein the method steps defined by the method related flows may be implemented by the processor 30. The specific process shown in fig. 2 will be described in detail below.
Step S10: and based on the face point fitting network, performing down-sampling on the facial features image to be detected, extracting the facial point characteristic information, and obtaining the facial point data.
Step S20: and based on the segmentation network, performing up-sampling on the face point characteristic information to obtain a face facial feature mask.
The facial feature image to be detected is obtained from a face-alignment facial feature cascade model: a full-face picture is input into the cascade model to obtain the full-face face points, namely the facial key points, and the facial feature images, such as a mouth image or an eye image, are then located and cropped according to these points. A facial feature image cropped according to the full-face face point locations is an upright, well-aligned image.
After the facial feature image to be detected is obtained, it is input into the face point fitting network, which down-samples the image and extracts the face point feature information to obtain the face point data. At the same time, based on the segmentation network, the face point feature information output by the down-sampling is up-sampled to recover the image information and obtain the facial feature mask, whose size is the same as that of the facial feature image to be detected.
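As an illustrative sketch only (PyTorch is assumed here; the patent does not prescribe an implementation, and the module names FitNet/SegNet-style wrappers below are hypothetical), the joint forward pass described above might be organized as follows:

    import torch.nn as nn

    class FacePointModel(nn.Module):
        """Joint model: the fitting network down-samples and regresses face
        points; the segmentation network up-samples the same features into a
        facial feature mask of the input size."""
        def __init__(self, fit_net: nn.Module, seg_net: nn.Module):
            super().__init__()
            self.fit_net = fit_net  # face point fitting network (down-sampling)
            self.seg_net = seg_net  # segmentation network (up-sampling)

        def forward(self, image):
            # Down-sample and extract the face point feature information; the
            # deepest feature maps are regressed to face point coordinates.
            feature_maps, face_points = self.fit_net(image)
            # Up-sample the feature information to recover the mask.
            mask = self.seg_net(feature_maps)
            return face_points, mask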
Further, in the present disclosure, the face point fitting network includes a plurality of down-sampling units, the output size of each down-sampling unit is 1/C of the output size of the previous down-sampling unit, and C is a positive integer.
Following mainstream network design practice, the down-sampling part of the face point fitting network may adopt a ShortCut (residual) structure to form a plurality of down-sampling units, where the output of each down-sampling unit is obtained by down-sampling the output of the previous down-sampling unit. For example, after a facial feature picture is input into down-sampling unit A1 and down-sampled, first feature information is output; the first feature information is taken as the input of down-sampling unit A2 and down-sampled to output second feature information; the second feature information is taken as the input of down-sampling unit A3 and down-sampled; and so on. After all the down-sampling units have been applied, the face point feature information of the facial feature picture is obtained.
In order to obtain both the face point data and the facial feature mask, the present disclosure requires the output size of each down-sampling unit to be 1/C of the output size of the previous down-sampling unit. For example, during down-sampling in the face point fitting network, a facial feature picture is down-sampled by down-sampling unit A1 to obtain first feature information, and the first feature information is down-sampled by down-sampling unit A2 to obtain second feature information; because the output size of down-sampling unit A2 is designed to be 1/C of the output size of down-sampling unit A1, the size of the second feature information is reduced to 1/C of that of the first feature information. Optionally, in the present disclosure, C is 2.
In the present disclosure, a ShortCut structure is adopted to form the plurality of down-sampling units, and each down-sampling unit completes its down-sampling through convolution operations. The number and size of the convolution kernels can be set according to the actual situation, and the disclosure is not limited in this respect. After the facial feature image to be detected has passed through the convolution operations of each down-sampling unit, the face point data is regressed in the fully connected layer of the face point fitting network.
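A minimal sketch of one such down-sampling unit, assuming PyTorch, C = 2 (a stride-2 convolution) and illustrative channel counts; the ShortCut branch is the residual connection named above:

    import torch.nn as nn

    class DownSamplingUnit(nn.Module):
        """One down-sampling unit with a ShortCut (residual) branch; C = 2."""
        def __init__(self, in_ch: int, out_ch: int):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm2d(out_ch),
            )
            # ShortCut branch: 1x1 stride-2 convolution so both branches match.
            self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=2)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            # Output spatial size is 1/2 of the input size (1/C with C = 2).
            return self.relu(self.body(x) + self.shortcut(x))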
Further, the segmentation network comprises a plurality of up-sampling units, each up-sampling unit is arranged symmetrically to a down-sampling unit, the output size of each up-sampling unit is C times the output size of the previous up-sampling unit, and C is a positive integer.
The segmentation network also adopts a ShortCut structure to form the plurality of up-sampling units; each up-sampling unit is arranged symmetrically to a down-sampling unit, and the output size of each up-sampling unit is C times that of the previous up-sampling unit. If the face point fitting network performs down-sampling n times, the segmentation network performs up-sampling n times; and if the output size of each down-sampling unit is 1/2 of the output size of the previous down-sampling unit, then the output size of each up-sampling unit is 2 times the output size of the previous up-sampling unit.
Further, referring to fig. 3, the step of upsampling the facial point feature information based on the segmentation network to obtain a facial feature mask includes steps S21 to S22.
Step S21: and for each up-sampling unit, fusing the output of the up-sampling unit with the output of the corresponding down-sampling unit to obtain fused feature information.
Step S22: and for each up-sampling unit, performing up-sampling on the fused feature information output by the previous up-sampling unit.
After the facial feature image has been down-sampled repeatedly, much of its shallow information is lost, which is not conducive to the learning of the up-sampling part. Therefore, feature fusion needs to be carried out between each down-sampling unit and the corresponding up-sampling unit, so that the shallow information can flow through to the up-sampling part.
For each up-sampling unit, the output of the corresponding down-sampling unit is passed through a size-invariant convolution and then fused with the output of that up-sampling unit to obtain fused feature information. The fused feature information is then taken as the input of the next up-sampling unit and up-sampled. For example, in fig. 4, up-sampling unit B3 corresponds to down-sampling unit A3, up-sampling unit B2 corresponds to down-sampling unit A2, and up-sampling unit B1 corresponds to down-sampling unit A1. The output of down-sampling unit A3 is taken as the input of up-sampling unit B3 and up-sampled by B3; the output of A3 is passed through a size-invariant convolution and fused with the output of B3 to obtain fused feature information b1. The fused feature information b1 is taken as the input of up-sampling unit B2 and up-sampled by B2; the output of A2 is passed through a size-invariant convolution and fused with the output of B2 to obtain fused feature information b2, which is taken as the input of up-sampling unit B1. By analogy, after up-sampling by the last up-sampling unit, the features fused by the last up-sampling unit are input into the softmax layer of the segmentation network for classification, and the facial feature mask is obtained.
Optionally, the feature fusion may be performed by Concat fusion or Alpha fusion.
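Under the assumption of Concat fusion and C = 2, one up-sampling unit with its fusion step could be sketched as follows (channel counts are illustrative; the skip branch carries the size-invariant convolution described above):

    import torch
    import torch.nn as nn

    class UpSamplingUnit(nn.Module):
        """One up-sampling unit (C = 2) with Concat fusion of the mirrored
        down-sampling unit's output."""
        def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
            super().__init__()
            self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
            # Size-invariant convolution applied to the down-sampling output.
            self.skip_conv = nn.Conv2d(skip_ch, out_ch, kernel_size=3, padding=1)
            self.fuse = nn.Conv2d(out_ch * 2, out_ch, kernel_size=3, padding=1)

        def forward(self, x, skip):
            x = self.up(x)                       # up-sample: output size is 2x
            skip = self.skip_conv(skip)          # size stays unchanged
            fused = torch.cat([x, skip], dim=1)  # Concat fusion
            return self.fuse(fused)              # fused feature information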
According to the face point processing method and device, on the basis of the face-alignment facial feature cascade model and without changing the structure of the face point fitting network, the segmentation network is combined with the face point fitting network, so that when a facial feature image is processed, the face point data and the facial feature mask can be output simultaneously; this saves time and computation cost and greatly improves real-time segmentation performance.
Further, please refer to fig. 5 and fig. 6 in combination. Fig. 6 is a schematic diagram of the network training process provided by the present disclosure; it only illustrates the training process for the mouth part of a face, and it can be understood that the training processes for the other facial features may also refer to the training process shown in fig. 6. The face point fitting network and the segmentation network are obtained by training through the following steps:
step S30: and performing down-sampling on the facial features picture based on a facial point fitting network to be trained, and extracting first facial point characteristic information to obtain first facial point data.
Step S40: and performing up-sampling on the first face point characteristic information based on a segmentation network to be trained to obtain a first face facial feature mask.
The process of obtaining the first face point data and the first face facial mask may refer to the processes of step S10 to step S22.
Step S50: and according to the first face facial feature mask and the first face point data, combining sample data obtained in advance, based on a preset loss function, and adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained through a back propagation algorithm until the output of the preset loss function is smaller than a preset threshold value.
Further, please refer to fig. 7 in combination, where the preset loss function includes a preset first loss function and a preset second loss function, and the step of adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained by using a back propagation algorithm based on the preset loss function by combining the sample data obtained in advance according to the first face facial feature mask and the first face point data includes steps S51 to S53.
Step S51: and adjusting the weight of the face point fitting network to be trained through a back propagation algorithm based on the preset first loss function according to the first face facial features mask, the first face point data and the sample data.
Wherein, the preset first loss function is:
w_i = 1, if I_{x_i, y_i} > R; w_i = 0, otherwise
Loss(l, L) = (1/N) * Σ_{i=1}^{N} w_i * Euclidean(l_i, L_i)
where l_i denotes the coordinates of each face point in the face point data; L_i denotes the coordinates of each face point in the sample data; I_{x_i, y_i} is the pixel of the facial feature mask at that face point; R is the threshold for judging whether the pixel is visible; N is the number of samples; i is any one sample of the sample data; Loss(l, L) is the loss between the sample data and the face point data; and Euclidean is the Euclidean distance function.
According to whether each pixel in the first facial feature mask is visible, a corresponding weight is assigned in the preset first loss function: if the pixel I_{x_i, y_i} in the first facial feature mask is below the visibility threshold, the assigned weight is 0; if it is above the visibility threshold, the assigned weight is 1.
After the weights of the preset first loss function have been assigned through the first facial feature mask, the loss value between the sample data and the face point data is calculated with the Euclidean distance function, the loss value is propagated backwards through the back propagation algorithm, and the weights of the face point fitting network to be trained are thereby adjusted.
In the present disclosure, the coordinates of each face point in the sample data are obtained by pre-labeling; the sample data is the Ground Truth in fig. 6, the first face point data is the Prediction in fig. 6, and the Loss Func in fig. 6 is the preset first loss function. Because the first facial feature mask assigns the corresponding weights of the preset first loss function, the loss produced by the occluded part of the face points is masked out to a certain extent when the loss between the sample data and the face point data is calculated, which improves the precision of the face points and the robustness under occlusion.
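A sketch of this visibility-weighted point loss, assuming the reconstruction of the preset first loss function given above (0/1 weights read from the mask at each point, Euclidean distance per point); the function name and tensor layout are assumptions for illustration:

    import torch

    def weighted_face_point_loss(pred_points, gt_points, vis_mask, R):
        """pred_points, gt_points: (N, K, 2) point coordinates in pixels;
        vis_mask: (N, H, W) facial feature mask; R: visibility threshold."""
        # Read the mask value at each ground-truth point location.
        xs = gt_points[..., 0].long().clamp(0, vis_mask.shape[-1] - 1)  # (N, K)
        ys = gt_points[..., 1].long().clamp(0, vis_mask.shape[-2] - 1)
        batch = torch.arange(vis_mask.shape[0], device=vis_mask.device)[:, None]
        w = (vis_mask[batch, ys, xs] > R).float()  # weight 1 if visible, else 0
        # Euclidean distance per point, masked out where the point is occluded.
        dist = torch.linalg.norm(pred_points - gt_points, dim=-1)       # (N, K)
        return (w * dist).mean()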
Step S52: and obtaining second face point characteristic information based on the adjusted face point fitting network, and obtaining a second face facial feature mask according to the second face point characteristic information.
Step S53: and adjusting the weight of the segmentation network to be trained through a back propagation algorithm based on the preset second loss function according to the second face facial feature mask.
After the weights of the face point fitting network have been adjusted, the extracted face point feature information changes accordingly, and so does the facial feature mask obtained from it. Therefore, second face point feature information is extracted based on the adjusted face point fitting network, and a second facial feature mask, i.e. the adjusted facial feature mask, is obtained according to the second face point feature information.
After the second facial feature mask is obtained, the weights of the segmentation network to be trained are adjusted through the back propagation algorithm based on the preset second loss function, until the outputs of the preset first loss function and the preset second loss function are both smaller than the preset threshold value.
Optionally, in this disclosure, the preset second loss function is a cross entropy loss function.
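Putting the two phases together, one hedged sketch of a joint training iteration (the optimizers, the foreground-probability read-out, and weighted_face_point_loss from the sketch above are all assumptions, not the patent's prescription):

    import torch.nn.functional as F

    def train_step(model, image, gt_points, gt_mask, opt_fit, opt_seg, R):
        # Phase 1: adjust the face point fitting network with the preset first
        # (visibility-weighted) loss; the mask only gates the loss here.
        points, mask_logits = model(image)
        vis_map = mask_logits.softmax(dim=1)[:, 1].detach()  # foreground prob.
        loss1 = weighted_face_point_loss(points, gt_points, vis_map, R)
        opt_fit.zero_grad()
        loss1.backward()
        opt_fit.step()

        # Phase 2: forward again with the adjusted fitting network, then adjust
        # the segmentation network with the preset second (cross entropy) loss.
        _, mask_logits = model(image)
        loss2 = F.cross_entropy(mask_logits, gt_mask)  # gt_mask: (N, H, W) long
        opt_seg.zero_grad()
        loss2.backward()
        opt_seg.step()
        return loss1.item(), loss2.item()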
In the present disclosure, the face point fitting network and the segmentation network are trained jointly: the first facial feature mask assigns the corresponding weights of the preset first loss function, masking out the loss produced by the occluded part of the face points, which improves the precision and the robustness under occlusion of both the face points and the facial feature mask. After training, the face point fitting network and the segmentation network are combined to process facial feature images, and the face point data and the facial feature mask can be output simultaneously, which saves time and computation cost and greatly improves real-time segmentation performance.
On the basis, please refer to fig. 8 in combination, an embodiment of the present disclosure provides a face point processing apparatus 10, which is applied to an electronic device 100, wherein the electronic device 100 stores a face point fitting network and a segmentation network; the face point processing device 10 includes an extraction module 11 and a segmentation module 12.
The extraction module 11 is configured to perform downsampling on a facial feature image to be detected based on the facial point fitting network, and extract facial point feature information to obtain facial point data.
The segmentation module 12 is configured to perform upsampling on the face point feature information based on the segmentation network to obtain a face facial feature mask.
Further, the face point processing apparatus 10 further includes a training module 13, and the training module 13 is configured to:
and performing down-sampling on the facial features picture based on a facial point fitting network to be trained, and extracting first facial point characteristic information to obtain first facial point data.
And performing up-sampling on the first face point characteristic information based on a segmentation network to be trained to obtain a first face facial feature mask.
And according to the first face facial feature mask and the first face point data, combining sample data obtained in advance, based on a preset loss function, and adjusting the weight values of the face point fitting network to be trained and the segmentation network to be trained through a back propagation algorithm until the output of the preset loss function is smaller than a preset threshold value.
Further, the preset loss function includes a preset first loss function and a preset second loss function, and the training module 13 is further configured to:
and adjusting the weight of the face point fitting network to be trained through a back propagation algorithm based on the preset first loss function according to the first face facial features mask, the first face point data and the sample data.
And obtaining second face point characteristic information based on the adjusted face point fitting network, and obtaining a second face facial feature mask according to the second face point characteristic information.
And adjusting the weight of the segmentation network to be trained through a back propagation algorithm based on the preset second loss function according to the second face facial feature mask.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the above-described face point processing apparatus 10 may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.
In summary, the face point processing method and device provided by the disclosure down-sample the facial feature image to be detected based on the face point fitting network and extract the face point feature information to obtain the face point data, and up-sample the face point feature information based on the segmentation network to obtain the facial feature mask, so that the face point data and the facial feature mask are output simultaneously, which saves time cost and improves precision.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (7)

1. A face point processing method is applied to electronic equipment, wherein the electronic equipment stores a face point fitting network and a segmentation network; the method comprises the following steps:
based on the face point fitting network, performing down-sampling on a face facial feature image to be detected, and extracting face point characteristic information to obtain face point data;
based on the segmentation network, up-sampling the face point characteristic information to obtain a face facial feature mask;
the face point fitting network comprises a plurality of down-sampling units, the output size of each down-sampling unit is 1/C of the output size of the previous down-sampling unit, and C is a positive integer;
the segmentation network comprises a plurality of up-sampling units, each up-sampling unit is arranged symmetrically to a down-sampling unit, the output size of each up-sampling unit is C times the output size of the previous up-sampling unit, and C is a positive integer;
the step of up-sampling the face point feature information based on the segmentation network to obtain a face facial features mask comprises:
for each up-sampling unit, fusing the output of the up-sampling unit with the output of the corresponding down-sampling unit to obtain fused feature information;
for each up-sampling unit, performing up-sampling on the fused feature information output by the previous up-sampling unit;
the face points are face key points.
2. The method of claim 1, wherein the face point fitting network and the segmentation network are trained by:
based on a face point fitting network to be trained, performing down-sampling on a facial feature picture, and extracting first face point characteristic information to obtain first face point data;
based on a segmentation network to be trained, performing up-sampling on the first face point feature information to obtain a first face facial feature mask;
and according to the first face facial feature mask and the first face point data, combining sample data obtained in advance, based on a preset loss function, and adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained through a back propagation algorithm until the output of the preset loss function is smaller than a preset threshold value.
3. The method according to claim 2, wherein the preset loss function includes a preset first loss function and a preset second loss function, and the step of adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained by using a back propagation algorithm based on the preset loss function according to the first face facial feature mask and the first face point data in combination with pre-obtained sample data comprises:
adjusting the weight of the face point fitting network to be trained through a back propagation algorithm based on the preset first loss function according to the first face facial features mask, the first face point data and the sample data;
obtaining second face point characteristic information based on the adjusted face point fitting network, and obtaining a second face facial feature mask according to the second face point characteristic information;
and adjusting the weight of the segmentation network to be trained through a back propagation algorithm based on the preset second loss function according to the second face facial feature mask.
4. The method according to claim 3, wherein the preset second loss function is a cross entropy loss function, and the preset first loss function is:
w_i = 1, if I_{x_i, y_i} > R; w_i = 0, otherwise
Loss(l, L) = (1/N) * Σ_{i=1}^{N} w_i * Euclidean(l_i, L_i)
wherein l_i denotes the coordinates of each face point in the face point data; L_i denotes the coordinates of each face point in the sample data; I_{x_i, y_i} is the pixel of the facial feature mask at that face point; R is the threshold for judging whether the pixel is visible; N is the number of samples; and i is any one sample of the sample data.
5. A face point processing device is applied to electronic equipment, wherein the electronic equipment stores a face point fitting network and a segmentation network; the human face point processing device comprises an extraction module and a segmentation module;
the extraction module is used for performing down-sampling on a facial feature image to be detected based on the facial point fitting network, and extracting facial point characteristic information to obtain facial point data;
the segmentation module is used for performing up-sampling on the face point characteristic information based on the segmentation network to obtain a face facial feature mask;
the face point fitting network comprises a plurality of down-sampling units, the output size of each down-sampling unit is 1/C of the output size of the previous down-sampling unit, and C is a positive integer;
the segmentation network comprises a plurality of up-sampling units, each up-sampling unit is arranged symmetrically to a down-sampling unit, the output size of each up-sampling unit is C times the output size of the previous up-sampling unit, and C is a positive integer;
the segmentation module is specifically configured to:
for each up-sampling unit, fusing the output of the up-sampling unit with the output of the corresponding down-sampling unit to obtain fused feature information;
for each up-sampling unit, performing up-sampling on the fused feature information output by the previous up-sampling unit;
the face points are face key points.
6. The apparatus according to claim 5, further comprising a training module, said training module being configured to:
based on a face point fitting network to be trained, performing down-sampling on a facial feature picture, and extracting first face point characteristic information to obtain first face point data;
based on a segmentation network to be trained, performing up-sampling on the first face point feature information to obtain a first face facial feature mask;
and according to the first face facial feature mask and the first face point data, combining sample data obtained in advance, based on a preset loss function, and adjusting the weight of the face point fitting network to be trained and the weight of the segmentation network to be trained through a back propagation algorithm until the output of the preset loss function is smaller than a preset threshold value.
7. The apparatus according to claim 6, wherein the preset loss function comprises a preset first loss function and a preset second loss function, and the training module is configured to:
adjusting the weight of the face point fitting network to be trained through a back propagation algorithm based on the preset first loss function according to the first face facial features mask, the first face point data and the sample data;
obtaining second face point characteristic information based on the adjusted face point fitting network, and obtaining a second face facial feature mask according to the second face point characteristic information;
and adjusting the weight of the segmentation network to be trained through a back propagation algorithm based on the preset second loss function according to the second face facial feature mask.
CN201910627394.6A 2019-07-11 2019-07-11 Face point processing method and device Active CN110334679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910627394.6A CN110334679B (en) 2019-07-11 2019-07-11 Face point processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910627394.6A CN110334679B (en) 2019-07-11 2019-07-11 Face point processing method and device

Publications (2)

Publication Number Publication Date
CN110334679A CN110334679A (en) 2019-10-15
CN110334679B true CN110334679B (en) 2021-11-26

Family

ID=68146483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910627394.6A Active CN110334679B (en) 2019-07-11 2019-07-11 Face point processing method and device

Country Status (1)

Country Link
CN (1) CN110334679B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827371B (en) * 2019-11-05 2023-04-28 厦门美图之家科技有限公司 Certificate generation method and device, electronic equipment and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106485214A (en) * 2016-09-28 2017-03-08 天津工业大学 A kind of eyes based on convolutional neural networks and mouth state identification method
CN108062543A (en) * 2018-01-16 2018-05-22 中车工业研究院有限公司 A kind of face recognition method and device
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN109410240A (en) * 2018-10-09 2019-03-01 电子科技大学中山学院 Method and device for positioning volume characteristic points and storage medium thereof
CN109410220A (en) * 2018-10-16 2019-03-01 腾讯科技(深圳)有限公司 Image partition method, device, computer equipment and storage medium
CN109727240A (en) * 2018-12-27 2019-05-07 深圳开立生物医疗科技股份有限公司 A kind of three-dimensional ultrasound pattern blocks tissue stripping means and relevant apparatus
CN109598261A (en) * 2019-01-18 2019-04-09 嘉兴学院 A kind of three-dimensional face identification method based on region segmentation
CN109978930A (en) * 2019-03-27 2019-07-05 杭州相芯科技有限公司 A kind of stylized human face three-dimensional model automatic generation method based on single image

Also Published As

Publication number Publication date
CN110334679A (en) 2019-10-15

Similar Documents

Publication Publication Date Title
CN109508688B (en) Skeleton-based behavior detection method, terminal equipment and computer storage medium
CN110826519A (en) Face occlusion detection method and device, computer equipment and storage medium
CN111583097A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113971751A (en) Training feature extraction model, and method and device for detecting similar images
US11354883B2 (en) Image processing method and apparatus, and electronic device
CN113792526B (en) Training method of character generation model, character generation method, device, equipment and medium
JP2023525462A (en) Methods, apparatus, electronics, storage media and computer programs for extracting features
AU2020294190B2 (en) Image processing method and apparatus, and electronic device
CN113591675A (en) Method, device and equipment for constructing image recognition model and storage medium
CN110334679B (en) Face point processing method and device
CN112991274A (en) Crowd counting method and device, computer equipment and storage medium
WO2020119058A1 (en) Micro-expression description method and device, computer device and readable storage medium
CN115937537A (en) Intelligent identification method, device and equipment for target image and storage medium
CN114140320B (en) Image migration method and training method and device of image migration model
CN111489289A (en) Image processing method, image processing device and terminal equipment
CN114694209A (en) Video processing method and device, electronic equipment and computer storage medium
CN114429661A (en) Face three-dimensional cartoon method, device, equipment and medium based on local area supervision
CN115147434A (en) Image processing method, device, terminal equipment and computer readable storage medium
CN112069885A (en) Face attribute identification method and device and mobile terminal
CN107633498B (en) Image dark state enhancement method and device and electronic equipment
CN113283318A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112084874A (en) Object detection method and device and terminal equipment
CN113033263A (en) Face image age feature recognition method
CN116012873B (en) Pedestrian re-identification method and device, electronic equipment and storage medium
CN115147681B (en) Training of clothing generation model and method and device for generating clothing image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant