CN112541484A - Face matting method, system, electronic device and storage medium - Google Patents
Face matting method, system, electronic device and storage medium
- Publication number
- CN112541484A (application CN202011581223.3A)
- Authority
- CN
- China
- Prior art keywords
- face
- image
- rotation angle
- matting
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 46
- 238000001514 detection method Methods 0.000 claims abstract description 72
- 230000009466 transformation Effects 0.000 claims abstract description 35
- 238000012549 training Methods 0.000 claims description 30
- 239000011159 matrix material Substances 0.000 claims description 18
- 230000008569 process Effects 0.000 claims description 11
- 230000006870 function Effects 0.000 claims description 10
- 238000007781 pre-processing Methods 0.000 claims description 6
- 238000006243 chemical reaction Methods 0.000 claims description 4
- 238000010276 construction Methods 0.000 claims description 3
- 230000004807 localization Effects 0.000 claims description 3
- 230000001629 suppression Effects 0.000 claims description 3
- 238000012937 correction Methods 0.000 abstract description 9
- 238000005516 engineering process Methods 0.000 abstract description 3
- 238000013473 artificial intelligence Methods 0.000 abstract description 2
- 238000010586 diagram Methods 0.000 description 10
- 238000004364 calculation method Methods 0.000 description 6
- 230000003321 amplification Effects 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- 230000002349 favourable effect Effects 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000012805 post-processing Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000008094 contradictory effect Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G06T3/02—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/187—Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Abstract
The invention relates to artificial intelligence technology and discloses a face matting method, which comprises the following steps: performing face detection on a target image through a trained face positioning network model to obtain a face rotation angle and face coordinates; constructing an original triangle and a target triangle according to the face rotation angle and the face coordinates; and realizing face matting from the target image based on affine transformation according to the original triangle and the target triangle to obtain a face image. The invention also provides a face matting system, an electronic device and a computer-readable storage medium. The face matting method, system, electronic device and computer-readable storage medium provided by the invention can perform 360-degree face detection and face correction on the target image and return a face image meeting the specification requirements.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a face matting method, a face matting system, an electronic device and a computer-readable storage medium.
Background
Identity card face recognition is a common image recognition application scenario. However, because the identity card images uploaded by users vary widely and the identity card can appear in many positions and orientations, identity card face recognition is made difficult and wasteful of resources. It is therefore necessary to perform standardized face matting on the identity card image.
Currently, the face matting methods commonly used in the industry include:
The method based on rotating the identity card multiple times for face detection and matting rotates the identity card repeatedly and performs face key point detection after each rotation until a face is detected. The disadvantages of such methods are: when a stronger face detector is selected, faces that are upside down or facing left or right are easily returned, which hinders further face key point detection and face correction; when a weaker detector is selected, the face on the identity card can easily go undetected.
The method based on face detection and face key points for matting and correction calculates, through face detection and face key point detection, the angle between the line connecting the centers of the two eyes and the horizontal, and rotates the face by this angle to achieve correction, then crops the face image according to the face coordinates. The disadvantages of such methods are: when the identity card image uploaded by the user is upside down or rotated by 90 degrees, the detected face key points are inaccurate, so face correction cannot be achieved accurately.
In summary, how to accurately extract a face image meeting the specification requirement from the identity card images at various placement positions has become a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, the present invention provides a method, a system, an electronic device and a computer-readable storage medium for face matting, so as to solve at least one of the above technical problems.
Firstly, in order to achieve the above object, the present invention provides a face matting method, which comprises the steps of:
carrying out face detection on the target image through the trained face positioning network model to obtain a face rotation angle and face coordinates;
constructing an original triangle and a target triangle according to the face rotation angle and the face coordinates; and
and realizing face matting from the target image based on affine transformation according to the original triangle and the target triangle to obtain a face image.
Optionally, the performing face detection on the target image through the trained face positioning network model to obtain a face rotation angle and a face coordinate includes:
inputting the target image into the trained face positioning network model, and outputting a face heat map, a face scale map, a face center offset map and a face rotation angle map;
obtaining face coordinates in the target image according to the face heat image, the face scale image, the face center offset image and a face frame rule;
and acquiring a face rotation angle from the face rotation angle image.
Optionally, the training process of the face positioning network model includes:
preprocessing a face training image;
generating a predicted face heat image, a face scale image, a face center offset image and a face rotation angle image according to the preprocessed face training image;
respectively calculating loss values of the face heat map, the face scale map, the face center offset map and the face rotation angle map, and combining the loss values according to different preset weights to obtain a total loss value;
and reversely transmitting the total loss value, and continuously acquiring a next face training image for training until the parameters of the face positioning network model are converged.
Optionally, the face training image includes a true value of a face rotation angle that is artificially labeled, and the generating a predicted face rotation angle map according to the preprocessed face training image includes:
and marking the characteristic value corresponding to the position of the face center point of the face rotation angle image as the true value, and marking the characteristic values corresponding to the positions of other pixel points of the face rotation angle image as 0.
Optionally, the obtaining of the face coordinates in the target image according to the face heat map, the face scale map, the face center offset map, and the face frame rule includes:
determining pixel points with characteristic values larger than a preset threshold value in the face heat map as face regions;
acquiring face coordinate offset at a position corresponding to the face area on the face central offset map, and adding the face coordinate offset and the coordinates of the face heat map to obtain a face central point position;
and calculating the width and height of the face on the face scale map through exponential conversion to obtain a face detection frame, rejecting repeated face detection frames through non-maximum suppression, and determining the position coordinates of the face detection frame as the face coordinates in the target image.
Optionally, the constructing of the original triangle and the target triangle according to the face rotation angle and the face coordinates includes:
carrying out external expansion on the face detection frame to the periphery according to a preset proportion to obtain an external expansion face frame;
constructing three rotation points according to the coordinates of the external expansion face frame and the center point thereof and the face rotation angle;
constructing an original triangle according to the three rotation points;
and constructing a target triangle according to the width and the height of the external extended face frame.
Optionally, the obtaining of the face image from the face matting in the target image based on affine transformation according to the original triangle and the target triangle includes:
obtaining an affine transformation matrix by obtaining an affine transformation function according to the original triangle and the target triangle;
and carrying out face matting from the target image through an affine transformation function according to the affine transformation matrix to obtain a corrected face image.
In addition, to achieve the above object, the present invention further provides a face matting system, comprising:
the detection module is used for carrying out face detection on the target image through the trained face positioning network model to obtain a face rotation angle and face coordinates;
the construction module is used for constructing an original triangle and a target triangle according to the face rotation angle and the face coordinates; and
and the matting module is used for realizing face matting from the target image based on affine transformation according to the original triangle and the target triangle to obtain a face image.
Further, to achieve the above object, the present invention further provides an electronic device, which includes a memory and a processor, where the memory stores a face matting program operable on the processor, and the face matting program, when executed by the processor, implements the steps of the face matting method as described above.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a face matting program, which is executable by at least one processor to cause the at least one processor to perform the steps of the face matting method as described above.
Compared with the prior art, the face matting method, the face matting system, the electronic device and the computer-readable storage medium provided by the invention can perform 360-degree face detection and face correction on target images such as identity card images and return face images meeting specification requirements.
Drawings
FIG. 1 is a schematic flow chart diagram of a preferred embodiment of a face matting method according to the invention;
FIG. 2 is a schematic flow chart of a training process of the face localization network model in the present invention;
- FIG. 3 is a detailed flowchart of step S100 in FIG. 1;
- FIG. 4 is a detailed flowchart of step S102 in FIG. 1;
- FIG. 5 is a detailed flowchart of step S104 in FIG. 1;
FIG. 6 is a schematic diagram of face matting from a target image to obtain a face image according to the present invention;
FIG. 7 is a diagram of an alternative hardware architecture of the electronic device of the present invention;
FIG. 8 is a block diagram of a preferred embodiment of a face matting system according to the invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a preferred embodiment of the face matting method according to the present invention. In this embodiment, the execution order of the steps in the flowchart shown in fig. 1 may be changed and some steps may be omitted according to different requirements. The method comprises the following steps:
and S100, carrying out face detection on the target image through the trained face positioning network model to obtain a face rotation angle and face coordinates.
In this embodiment, a face positioning network model obtained by improving a Centerface (anchor-free face detection model) is used for face detection. Unlike the original Centerface model, which adopts MobileNetV3 as a backbone network, the face positioning network model of the present embodiment adopts MobileNetV2 as a backbone network, and adopts UNet structure as the neck for subsequent detection, and adopts a top-down transverse connection architecture to construct a feature pyramid from a single scale input.
Fig. 2 is a schematic flow chart of the training process of the face location network model. In this embodiment, the training process of the face positioning network model may include:
and step S200, preprocessing the face training image.
The preprocessing refers to data amplification, including random cropping, random color dithering, random brightness dithering, random saturation dithering, random contrast dithering and the like.
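As an illustrative sketch only, the jitter-style data amplification described above might look like the following; the factor ranges, and the restriction to brightness and contrast jitter, are assumptions of this example:

```python
import random
import numpy as np

def augment(img, rng=random):
    # Illustrative data-amplification sketch: random brightness and
    # contrast jitter on a float image in [0, 1].  The factor ranges
    # (and the omission of cropping / colour / saturation jitter) are
    # assumptions; the text only names the augmentation types.
    out = img.astype(np.float32)
    out = out * rng.uniform(0.8, 1.2)                    # brightness jitter
    mean = out.mean()
    out = (out - mean) * rng.uniform(0.8, 1.2) + mean    # contrast jitter
    return np.clip(out, 0.0, 1.0)
```

In practice each jitter would be applied with some probability, and random cropping would be done jointly with the box and angle labels.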
Step S202, generating a predicted face heat map, face scale map, face center offset map and face rotation angle map according to the preprocessed face training image.
The preprocessed face training image is input into the face positioning network model, which outputs a plurality of feature maps. In this embodiment, four kinds of feature maps are included: the face heat map, the face scale map, the face center offset map and the face rotation angle map. The face heat map, the face scale map and the face center offset map are generated in a manner similar to the original Centerface model. It should be noted that this embodiment additionally generates a face rotation angle map, which is produced from the manually labelled face rotation angle and the position of the face center point. The true value of the face rotation angle (a single value) may be manually labelled in advance in the face training image. In the generated face rotation angle map, each pixel point has a corresponding characteristic value: the characteristic value (face rotation angle) at the position of the face center point is the true value, the range of the rotation angle being (-180, 180), and the characteristic values at all other positions are 0. The size of the face rotation angle map is 1/n of that of the original image (the face training image), where n is the step size, e.g., n = 4.
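The construction of the ground-truth rotation-angle map can be sketched as follows; the image sizes and the cell-indexing convention are assumptions of this illustration:

```python
import numpy as np

def make_angle_map(img_h, img_w, center_xy, angle_deg, n=4):
    # Ground-truth rotation-angle map at 1/n of the training-image size:
    # the cell containing the face centre holds the labelled angle in
    # (-180, 180); every other cell is 0.  Layout details are assumptions.
    amap = np.zeros((img_h // n, img_w // n), dtype=np.float32)
    cx, cy = center_xy
    amap[cy // n, cx // n] = angle_deg   # mark only the face-centre cell
    return amap
```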
Step S204, calculating loss values of the face heat map, the face scale map, the face center offset map and the face rotation angle map respectively, and combining them according to different preset weights to obtain a total loss value.
A loss value is calculated between each feature map predicted by the face positioning network model and the corresponding ground-truth map. The loss values of the face heat map, the face scale map (split into width and height) and the face center offset map are L_c, L_w, L_h and L_off respectively; their specific calculation is similar to the original Centerface model and is not repeated here. The loss value L_angle of the face rotation angle map can be calculated, for example, as an L1 loss over the pixel points of the map:

L_angle = Σ_k |â_k - a_k|

where â_k represents the rotation angle corresponding to pixel point k in the face rotation angle map predicted by the network model, and a_k represents the manually labelled rotation angle corresponding to pixel point k.
The total loss value obtained for the final concatenation is calculated as follows:
L = L_c + λ_off·L_off + λ_w·L_w + λ_h·L_h + λ_a·L_angle
where λ_off, λ_w, λ_h and λ_a are the weights corresponding to the loss values L_off, L_w, L_h and L_angle of the face center offset map, the face scale map (split into width and height) and the face rotation angle map; the values of the corresponding weights can be 1, 0.5 and 0.5 respectively.
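The angle loss and the weighted total loss above can be sketched as follows; the L1 form of L_angle and the default value of λ_a are assumptions, since the text only names the loss and lists three example weight values:

```python
import numpy as np

def angle_loss(pred_map, gt_map):
    # L1 penalty on the rotation-angle map, averaged over the cells that
    # carry a labelled angle (non-zero ground truth).  The L1 form is an
    # assumption; the text only names the loss L_angle.
    mask = gt_map != 0
    if not mask.any():
        return 0.0
    return float(np.abs(pred_map[mask] - gt_map[mask]).mean())

def total_loss(lc, loff, lw, lh, langle,
               lam_off=1.0, lam_w=0.5, lam_h=0.5, lam_a=0.5):
    # L = L_c + λ_off·L_off + λ_w·L_w + λ_h·L_h + λ_a·L_angle.
    # λ_a's default is an assumption (only three weight values are listed).
    return lc + lam_off * loff + lam_w * lw + lam_h * lh + lam_a * langle
```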
Step S206, back-propagating the total loss value and repeating the iteration until the parameters of the face positioning network model converge.
In this embodiment, the learning rate may be set to 5e-4. The total loss value is back-propagated, the next face training image is obtained, and the iteration is repeated 120 times until the parameters of the face positioning network converge.
After the face positioning network model is trained, the face positioning network model can be adopted to carry out face detection on the target image, and a face rotation angle is output on the basis of positioning face coordinates. The target image mainly refers to an identity card image. The face positioning network model can realize the detection of 360-degree rotating faces and is suitable for various face detection scenes of identity cards.
Specifically, further refer to fig. 3, which is a schematic view of the detailed flow of step S100. It is to be understood that the flow chart is not intended to limit the order in which the steps are performed. Some steps in the flowchart may be added or deleted as desired. In this embodiment, the step S100 specifically includes:
and S1000, inputting the target image into the trained face positioning network model, and outputting a face heat image, a face scale image, a face center offset image and a face rotation angle image.
Step S1002, obtaining the face coordinates in the target image according to the face heat map, the face scale map, the face center offset map and the face frame rule.
In this embodiment, this step specifically includes: pixel points whose characteristic value in the face heat map exceeds a preset threshold (e.g., 0.3) are regarded as face regions; the face coordinate offset is taken from the corresponding position on the face center offset map and added to the coordinates on the face heat map to obtain the face center point position; finally, the width and height of the face are calculated on the face scale map through exponential conversion, so as to obtain a face detection frame (from the face center point position and the face width and height), and repeated face detection frames are removed through non-maximum suppression (NMS). The position coordinates of the face detection frames are the face coordinates in the target image.
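The decoding procedure just described can be sketched as follows; the map layouts and the omission of the NMS step are assumptions of this illustration:

```python
import numpy as np

def decode_faces(heat, offset, scale, thr=0.3, n=4):
    # One box per above-threshold heat-map cell.  Assumed layouts:
    # heat (H, W); offset (2, H, W) as (dx, dy); scale (2, H, W) holding
    # log-width / log-height, recovered with exp() ("exponential
    # conversion").  NMS is applied afterwards and is omitted here.
    boxes = []
    for y, x in zip(*np.where(heat > thr)):
        cx = (x + offset[0, y, x]) * n          # face centre, image coords
        cy = (y + offset[1, y, x]) * n
        w = np.exp(scale[0, y, x]) * n          # face width and height
        h = np.exp(scale[1, y, x]) * n
        boxes.append((cx - w / 2, cy - h / 2,
                      cx + w / 2, cy + h / 2, float(heat[y, x])))
    return boxes
```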
Step S1004, acquiring the face rotation angle θ from the face rotation angle map.
The face rotation angle is used to rotate the picture later; with this rotation angle, 360-degree face detection can be achieved. The original Centerface model cannot achieve this: for example, it cannot detect an upside-down face. The present embodiment, by contrast, can not only detect a face at any angle but also output the face rotation angle, and the original image can be rotated back to the normal state (an upright face) according to this angle.
Returning to fig. 1, step S102, an original triangle and a target triangle are constructed according to the face rotation angle and the face coordinates.
Specifically, further refer to fig. 4, which is a schematic view of the detailed flow of step S102. It is to be understood that the flow chart is not intended to limit the order in which the steps are performed. Some steps in the flowchart may be added or deleted as desired. In this embodiment, the step S102 specifically includes:
and S1020, performing outward expansion on the face detection frame to the periphery according to a preset proportion to obtain an outward expansion face frame.
Because the face matting is mainly used for subsequent face recognition, in practical application the face detection frame needs to be expanded to a certain extent so that some background information is included; otherwise the matted face image would contain only the bare face region, which is not conducive to subsequent face recognition.
Assuming that the width and height of the face image desired to be output (i.e., the width and height of the expanded face frame) are w′ and h′ respectively, the aspect ratio is h′/w′.
The coordinates of the face detection frame are [x1, y1, x2, y2], with width w = x2 - x1 and height h = y2 - y1. In this embodiment, x1 and x2 are each expanded outward by one quarter of the detection frame width w, i.e., x1′ = x1 - w/4 and x2′ = x2 + w/4; y1 and y2 are expanded outward in a corresponding manner so that the expanded frame matches the aspect ratio above. The coordinates of the expanded face frame obtained after expansion are [x1′, y1′, x2′, y2′], and its center point is ((x1′ + x2′)/2, (y1′ + y2′)/2).
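Under stated assumptions (quarter-width horizontal expansion as described; vertical expansion chosen symmetrically so the result matches the output aspect ratio h′/w′, whose exact formula appears only in the patent's figures), the expansion step can be sketched as:

```python
def expand_box(x1, y1, x2, y2, out_w, out_h):
    # Expand the detection box by a quarter of its width on each side,
    # then pad vertically so the result matches the requested output
    # aspect ratio out_h / out_w.  The vertical rule is an assumption.
    w = x2 - x1
    ex1, ex2 = x1 - w / 4.0, x2 + w / 4.0
    ew = ex2 - ex1                      # expanded width = 1.5 * w
    eh = ew * out_h / float(out_w)      # height matching h'/w'
    cy = (y1 + y2) / 2.0
    ey1, ey2 = cy - eh / 2.0, cy + eh / 2.0
    cx = (ex1 + ex2) / 2.0
    return (ex1, ey1, ex2, ey2), (cx, cy)
```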
S1022, three rotation points are constructed according to the coordinates of the external extended face frame and the center point thereof and the face rotation angle theta.
Specifically, the three rotation points are obtained by rotating points of the expanded face frame about its center point by the face rotation angle θ.
and S1024, constructing an original triangle according to the three rotation points.
Specifically, the original triangle srcTriangle is constructed in the following manner:
and S1026, constructing a target triangle according to the width and the height of the external extended face frame.
Specifically, the target triangle dstTriangle is constructed in the following manner:
dstTriangle=((0,0),(0,h′),(w′,h′))
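The triangle construction can be sketched as follows; the choice of which three corners of the expanded frame are rotated is an assumption made to match the vertex order of dstTriangle = ((0,0), (0,h′), (w′,h′)):

```python
import math

def rotate_point(px, py, cx, cy, theta_deg):
    # Rotate (px, py) about the centre (cx, cy) by theta_deg degrees.
    t = math.radians(theta_deg)
    dx, dy = px - cx, py - cy
    return (cx + dx * math.cos(t) - dy * math.sin(t),
            cy + dx * math.sin(t) + dy * math.cos(t))

def build_triangles(ex1, ey1, ex2, ey2, theta_deg, out_w, out_h):
    # srcTriangle: top-left, bottom-left and bottom-right corners of the
    # expanded face frame, rotated by the face angle about the frame
    # centre (the corner choice is an assumption).  dstTriangle: the
    # same three corners of the upright output image.
    cx, cy = (ex1 + ex2) / 2.0, (ey1 + ey2) / 2.0
    src = (rotate_point(ex1, ey1, cx, cy, theta_deg),
           rotate_point(ex1, ey2, cx, cy, theta_deg),
           rotate_point(ex2, ey2, cx, cy, theta_deg))
    dst = ((0.0, 0.0), (0.0, float(out_h)), (float(out_w), float(out_h)))
    return src, dst
```

With θ = 0 the source triangle is simply the three frame corners; a non-zero θ rotates them about the frame centre, which is what lets the subsequent affine transform undo the face rotation.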
Returning to fig. 1, in step S104, face matting is implemented based on affine transformation according to the original triangle and the target triangle.
In this embodiment, affine transformation is performed based on an opencv library function to realize face matting according to the original triangle and the target triangle, so as to obtain a corrected face image.
Specifically, further refer to fig. 5, which is a schematic view of the detailed flow of step S104. It is to be understood that the flow chart is not intended to limit the order in which the steps are performed. Some steps in the flowchart may be added or deleted as desired. In this embodiment, the step S104 specifically includes:
and S1040, obtaining an affine transformation matrix M according to the original triangle and the target triangle.
According to the original triangle srcTriangle and the target triangle dstTriangle, the affine transformation function getAffineTransform of the opencv library is used to obtain the affine transformation matrix M. The affine transformation matrix M obtained in this embodiment is a 2×3 matrix whose parameters correspond to translation, rotation, scaling and similar information.
Step S1042, performing face matting from the target image according to the affine transformation matrix M to obtain a corrected face image.
Specifically, according to the affine transformation matrix M, the desired output width and height w′ and h′ of the face image, and the target image (identity card image), the corrected face image is extracted through the opencv library affine transformation function warpAffine.
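For illustration, the getAffineTransform / warpAffine steps can be reproduced without OpenCV; the nearest-neighbour sampling below is a simplification of warpAffine's default bilinear interpolation:

```python
import numpy as np

def get_affine_matrix(src_tri, dst_tri):
    # Solve for the 2x3 matrix M with [x', y']^T = M @ [x, y, 1]^T that
    # maps each srcTriangle vertex onto its dstTriangle vertex -- the
    # same matrix cv2.getAffineTransform returns for these triangles.
    A, b = [], []
    for (sx, sy), (dx, dy) in zip(src_tri, dst_tri):
        A.append([sx, sy, 1.0, 0.0, 0.0, 0.0]); b.append(dx)
        A.append([0.0, 0.0, 0.0, sx, sy, 1.0]); b.append(dy)
    m = np.linalg.solve(np.array(A, dtype=np.float64),
                        np.array(b, dtype=np.float64))
    return m.reshape(2, 3)

def warp_affine(img, M, out_w, out_h):
    # Minimal nearest-neighbour stand-in for cv2.warpAffine: each output
    # pixel is pulled from the source image via the inverse mapping.
    Minv = np.linalg.inv(np.vstack([M, [0.0, 0.0, 1.0]]))[:2]
    out = np.zeros((out_h, out_w) + img.shape[2:], dtype=img.dtype)
    for y in range(out_h):
        for x in range(out_w):
            sx, sy = Minv @ np.array([x, y, 1.0])
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < img.shape[1] and 0 <= iy < img.shape[0]:
                out[y, x] = img[iy, ix]
    return out
```

Feeding the rotated source triangle and the upright target triangle into get_affine_matrix and then warping yields the upright, cropped face in one step, which is why the method needs no separate rotation pass.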
The face image obtained finally is a face image in a normal state (namely, a forward face) obtained after reverse rotation is performed according to the face rotation angle. For example, refer to fig. 6, which is a schematic diagram of obtaining a face image by performing face matting from a target image according to the present invention. According to the above-mentioned flow of this embodiment, the face matting is performed from the target image on the upper side of fig. 6, so that the face image on the lower side can be obtained.
The face matting method provided by this embodiment can perform 360-degree face detection and face correction on target images such as identity card images and return a face image meeting the specification requirements. The method is improved on the basis of anchor-free face detection technology, avoiding complex anchor post-processing; it is fast and efficient, with a high recall rate and a low false detection rate. It supports face detection at any rotation angle in [0°, 360°], solving the pain point that anchor-free face detection methods fail to detect inverted faces; it accommodates various types of identity card face detection and outputs a high-precision face rotation angle, creating favorable conditions for face correction of the identity card image. Moreover, by expanding the face detection frame, face images of any size specification can be supported. In addition, by constructing two triangles and performing an affine transformation, face matting from the identity card image is realized without deformation.
Fig. 7 is a schematic diagram of an alternative hardware architecture of the electronic device 2 according to the present invention.
In this embodiment, the electronic device 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus. It is noted that fig. 7 only shows the electronic device 2 with components 11-13, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The electronic device 2 may be a server or other electronic equipment with computing capability. The server may be a rack server, a blade server, a tower server, a cabinet server, or other computing devices, may be an independent server, or may be a server cluster composed of a plurality of servers.
The memory 11 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the storage 11 may be an internal storage unit of the electronic device 2, such as a hard disk or a memory of the electronic device 2. In other embodiments, the memory 11 may also be an external storage device of the electronic apparatus 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, provided on the electronic apparatus 2. Of course, the memory 11 may also comprise both an internal memory unit of the electronic apparatus 2 and an external memory device thereof. In this embodiment, the memory 11 is generally used for storing an operating system installed in the electronic device 2 and various types of application software, such as a program code of the face matting system 200. Furthermore, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the electronic device 2. In this embodiment, the processor 12 is configured to run the program code stored in the memory 11 or process data, such as running the face matting system 200.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is generally used for establishing a communication connection between the electronic apparatus 2 and other electronic devices.
Referring to fig. 8, a block diagram of a preferred embodiment of the face matting system 200 according to the invention is shown.
In this embodiment, the face matting system 200 comprises a series of computer program instructions stored on a memory 11 that, when executed by a processor 12, can implement the face matting operations of embodiments of the invention. In some embodiments, the face matting system 200 may be divided into one or more modules based on the particular operations implemented by the portions of the computer program instructions. For example, in fig. 2, the face matting system 200 can be divided into a detection module 201, a construction module 202, and a matting module 203. Wherein:
the detection module 201 is configured to perform face detection on the target image through the trained face positioning network model to obtain a face rotation angle and a face coordinate.
In this embodiment, a face positioning network model obtained by improving on the Centerface model is adopted for face detection. Unlike the original Centerface model, which adopts MobileNetV3 as its backbone network, the face positioning network model of the present embodiment adopts MobileNetV2 as the backbone, uses a UNet structure as the neck for subsequent detection, and employs a top-down lateral-connection architecture to construct a feature pyramid from a single-scale input.
In this embodiment, the training process of the face positioning network model may include:
(1) Preprocess the face training image.
The preprocessing refers to data augmentation, including random cropping, random color jitter, random brightness jitter, random saturation jitter, random contrast jitter, and the like.
(2) Generate a predicted face heat map, face scale map, face center offset map, and face rotation angle map from the preprocessed face training image.
The preprocessed face training image is input into the face positioning network model, which outputs a plurality of feature maps. The present embodiment uses four kinds of feature maps: a face heat map, a face scale map, a face center offset map, and a face rotation angle map. The face heat map, face scale map, and face center offset map are generated in a manner similar to the original Centerface model. Note that this embodiment additionally generates a face rotation angle map, produced from the manually labeled face rotation angle and the position of the face center point. The true value of the face rotation angle (a single value per face) is manually labeled in the face training image in advance. In the generated face rotation angle map, each pixel point has a corresponding feature value: the feature value (face rotation angle) at the face center point position is the labeled true value, with rotation angles ranging over (-180, 180), and the feature values at all other positions are 0. The size of the face rotation angle map is 1/n of the original face training image, where n is the stride, e.g., n = 4.
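As an illustration, the construction of the face rotation angle map described above can be sketched in a few lines of Python (the function name `make_angle_map` and the plain nested-list representation are our own; a real implementation would use tensors):

```python
def make_angle_map(img_w, img_h, cx, cy, angle_deg, stride=4):
    """Build a face rotation angle map at 1/stride of the image size."""
    fw, fh = img_w // stride, img_h // stride
    amap = [[0.0] * fw for _ in range(fh)]
    # Only the cell holding the face center carries the labeled angle,
    # a single value in (-180, 180); every other cell stays 0.
    amap[int(cy / stride)][int(cx / stride)] = angle_deg
    return amap
```

For a 64x64 training image with the face center at (32, 32) and a labeled angle of 90 degrees, the map is 16x16 with a single non-zero entry.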
(3) Respectively compute the loss values of the face heat map, the face scale map, the face center offset map, and the face rotation angle map, and combine them according to different preset weights to obtain a total loss value.
Loss values are computed between the feature maps predicted by the face positioning network model and the ground-truth maps. The loss values of the face heat map, the face scale map (split into width and height), and the face center offset map are Lc, Lw, Lh, and Loff respectively; their calculation is similar to the original Centerface model and is not repeated here. The loss value Langle of the face rotation angle map is computed from âk, the rotation angle predicted by the network model at pixel point k of the face rotation angle map, and ak, the manually labeled rotation angle at pixel point k.
The total loss value obtained by the final weighted combination is calculated as follows:

L = Lc + λoff·Loff + λw·Lw + λh·Lh + λa·Langle

where λoff, λw, λh, and λa are the weights of the face center offset map loss Loff, the face scale map losses Lw and Lh (width and height), and the face rotation angle map loss Langle; example values are 1, 0.5, and 0.5 respectively.
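The weighted combination itself is a one-liner; the sketch below fixes λoff = 1 and the remaining weights at 0.5, which is one reading of the example values listed above (the function name and defaults are our own):

```python
def total_loss(l_c, l_off, l_w, l_h, l_angle,
               lam_off=1.0, lam_w=0.5, lam_h=0.5, lam_a=0.5):
    """Weighted combination of the four per-map losses into one total."""
    return l_c + lam_off * l_off + lam_w * l_w + lam_h * l_h + lam_a * l_angle
```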
(4) Back-propagate the total loss value and iterate repeatedly until the parameters of the face positioning network model converge.
In the present embodiment, the learning rate may be set to 5e-4. The total loss value is back-propagated, the next face training image is fetched, and the iteration is repeated for 120 epochs until the parameters of the face positioning network converge.
After the face positioning network model is trained, it can be used to perform face detection on the target image, outputting a face rotation angle in addition to the located face coordinates. The target image mainly refers to an identity card image. The face positioning network model can detect faces at any rotation angle within 360 degrees and is suitable for various identity card face detection scenarios.
Specifically, the process of performing face detection on the target image through the face positioning network model may include:
(1) Input the target image into the trained face positioning network model and output a face heat map, a face scale map, a face center offset map, and a face rotation angle map.
(2) Obtain the face coordinates in the target image according to the face heat map, the face scale map, the face center offset map, and the face-frame rule.
In this embodiment, this step specifically includes: pixel points in the face heat map whose values exceed a preset threshold (e.g., 0.3) are treated as face regions; the face coordinate offset is then read from the corresponding position in the face center offset map and added to the face heat map coordinates to obtain the face center point position; finally, the face width and height are recovered from the face scale map by exponential conversion, yielding the face detection frame (from the face center point position and the face width and height); duplicate face detection frames are then removed by non-maximum suppression. The position coordinates of the face detection frame are the face coordinates in the target image.
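This decoding step can be sketched in pure Python as follows (the helper name `decode_detections` and the nested-list maps are ours; the non-maximum-suppression pass is left out):

```python
import math

def decode_detections(heat, off_x, off_y, scale_w, scale_h, thresh=0.3):
    """Turn the four predicted feature maps into face detection boxes."""
    boxes = []
    for y, row in enumerate(heat):
        for x, score in enumerate(row):
            if score <= thresh:  # keep only cells treated as face regions
                continue
            # Face center = heat-map coordinate plus the predicted offset.
            cx = x + off_x[y][x]
            cy = y + off_y[y][x]
            # Width/height are stored as logs; recover by exponentiation.
            w = math.exp(scale_w[y][x])
            h = math.exp(scale_h[y][x])
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score))
    return boxes  # duplicates would then be removed by NMS
```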
(3) Acquire the face rotation angle θ from the face rotation angle map.
The face rotation angle is used later to rotate the image; with this angle, 360-degree face detection can be achieved. The original Centerface model cannot do this; for example, it cannot detect an upside-down face. The present embodiment, by contrast, not only detects faces at any angle but also outputs the face rotation angle, so the original image can be rotated back to the normal state (an upright, forward-facing face).
The constructing module 202 is configured to construct an original triangle and a target triangle according to the face rotation angle and the face coordinates.
In this embodiment, the process specifically includes:
(1) Expand the face detection frame outward by a preset proportion to obtain an expanded face frame.
Because the cropped face is mainly used for subsequent face recognition, in practical applications the face detection frame needs to be expanded to a certain extent so that some background information is included in the cropped image; a cropped image containing nothing but the bare face region is unfavorable for subsequent face recognition.
Assuming that the width and height of the face image desired to be output (i.e., the width and height of the expanded face frame) are w′ and h′ respectively, the target aspect ratio is w′/h′.
The coordinates of the face detection frame are [x1, y1, x2, y2], with width w = x2 − x1 and height h = y2 − y1. In this embodiment, x1 and x2 are each expanded outward by one quarter of the detection-frame width w, i.e., x1′ = x1 − w/4 and x2′ = x2 + w/4; y1 and y2 are expanded outward in a corresponding manner, with the expansion amounts chosen to match the desired aspect ratio w′/h′. The coordinates of the expanded face frame obtained after the expansion are [x1′, y1′, x2′, y2′], and its center point is ((x1′ + x2′)/2, (y1′ + y2′)/2).
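The expansion can be sketched as follows. The horizontal quarter-width expansion is stated in the text; the vertical expansion formula is not reproduced here, so this sketch assumes the height is chosen symmetrically about the box center to hit the output aspect ratio w′/h′ (the function name `expand_box` is ours):

```python
def expand_box(x1, y1, x2, y2, out_w, out_h):
    """Expand a face detection frame; return the new frame and its center."""
    w = x2 - x1
    # Horizontal: push each side out by a quarter of the detection width.
    nx1, nx2 = x1 - w / 4, x2 + w / 4
    # Vertical (assumption): expand symmetrically about the box center so
    # the expanded frame matches the desired aspect ratio out_w / out_h.
    new_h = (nx2 - nx1) * out_h / out_w
    cy = (y1 + y2) / 2
    ny1, ny2 = cy - new_h / 2, cy + new_h / 2
    cx = (nx1 + nx2) / 2
    return (nx1, ny1, nx2, ny2), (cx, cy)
```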
(2) Construct three rotation points from the coordinates of the expanded face frame, its center point, and the face rotation angle θ.
In particular, the three rotation points are determined by the expanded-frame coordinates, its center point, and the rotation angle θ.
(3) Construct an original triangle from the three rotation points.
Specifically, the original triangle srcTriangle takes the three rotation points as its vertices.
(4) Construct a target triangle from the width and height of the expanded face frame.
Specifically, the target triangle dstTriangle is constructed in the following manner:
dstTriangle=((0,0),(0,h′),(w′,h′))
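Since only dstTriangle is reproduced above, the sketch below reconstructs one plausible srcTriangle: the top-left, bottom-left, and bottom-right corners of the expanded frame, rotated about its center by θ, so that they pair with dstTriangle's ((0,0), (0,h′), (w′,h′)). The function name and this corner choice are our assumptions:

```python
import math

def build_triangles(frame, center, theta_deg, out_w, out_h):
    """Construct srcTriangle/dstTriangle for the affine crop (a sketch)."""
    x1, y1, x2, y2 = frame
    cx, cy = center
    t = math.radians(theta_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)

    def rot(px, py):
        # Rotate (px, py) about the frame center by theta.
        dx, dy = px - cx, py - cy
        return (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)

    # Corners chosen to pair with dstTriangle = ((0,0), (0,h'), (w',h')).
    src = (rot(x1, y1), rot(x1, y2), rot(x2, y2))
    dst = ((0, 0), (0, out_h), (out_w, out_h))
    return src, dst
```

With θ = 0 the source triangle is simply the three unrotated corners of the expanded frame.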
the matting module 203 is configured to implement face matting based on affine transformation according to the original triangle and the target triangle.
In this embodiment, affine transformation is performed using OpenCV library functions to realize face matting according to the original triangle and the target triangle, obtaining a corrected face image.
Specifically, the process may include:
(1) Obtain an affine transformation matrix M from the original triangle and the target triangle.
According to the original triangle srcTriangle and the target triangle dstTriangle, the affine transformation matrix M is obtained through the OpenCV library (e.g., cv2.getAffineTransform). The affine transformation matrix M obtained in this embodiment is a 2 × 3 matrix whose parameters encode translation, rotation, scaling, and similar information.
(2) Perform face matting from the target image according to the affine transformation matrix M to obtain a corrected face image.
Specifically, the corrected face image is cropped out by the OpenCV affine transformation function (e.g., cv2.warpAffine) using the affine transformation matrix M, the desired output width and height w′ and h′, and the target image (the identity card image).
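Outside OpenCV, the 2 × 3 matrix returned by getAffineTransform can be derived by solving the three point correspondences directly; the pure-Python sketch below (function name ours) does this with Cramer's rule:

```python
def affine_from_triangles(src, dst):
    """Solve the 2x3 affine matrix M mapping srcTriangle onto dstTriangle
    (the same matrix cv2.getAffineTransform would return)."""
    (x0, y0), (x1, y1), (x2, y2) = src
    # Determinant of [[x0,y0,1],[x1,y1,1],[x2,y2,1]];
    # non-zero for a non-degenerate triangle.
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    rows = []
    for i in range(2):  # row 0 -> output x, row 1 -> output y
        d0, d1, d2 = dst[0][i], dst[1][i], dst[2][i]
        a = (d0 * (y1 - y2) - y0 * (d1 - d2) + (d1 * y2 - d2 * y1)) / det
        b = (x0 * (d1 - d2) - d0 * (x1 - x2) + (x1 * d2 - x2 * d1)) / det
        c = (x0 * (y1 * d2 - y2 * d1) - y0 * (x1 * d2 - x2 * d1)
             + d0 * (x1 * y2 - x2 * y1)) / det
        rows.append((a, b, c))
    return rows
```

A warp routine (e.g., cv2.warpAffine) then samples the target image through M to produce the w′ × h′ corrected face image.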
The face image finally obtained is in the normal state (i.e., an upright, forward-facing face), produced by reverse rotation according to the face rotation angle. For example, performing face matting on the target image at the top of fig. 6 by the above procedure yields the face image at the bottom.
The face matting system provided by this embodiment can perform 360-degree face detection and face correction on target images such as identity card images and return face images that meet specification requirements. The method improves on anchor-free face detection technology, avoiding costly anchor post-processing, and is fast and efficient with a high recall rate and a low false detection rate. It supports face detection at any rotation angle in [0°, 360°], solving the pain point that anchor-free face detection methods fail on inverted faces; it accommodates various types of identity card face detection and outputs a high-precision face rotation angle, creating favorable conditions for face correction of identity card images. By expanding the face detection frame, output face images of any size specification are supported. Finally, by constructing two triangles and applying an affine transformation, the face is cropped from the identity card image without deformation.
The present invention also provides another embodiment, which is a computer-readable storage medium storing a face matting program executable by at least one processor to cause the at least one processor to perform the steps of the face matting method as described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A method of face matting, the method comprising:
carrying out face detection on the target image through the trained face positioning network model to obtain a face rotation angle and face coordinates;
constructing an original triangle and a target triangle according to the face rotation angle and the face coordinates; and
and realizing face matting from the target image based on affine transformation according to the original triangle and the target triangle to obtain a face image.
2. The method of claim 1, wherein the performing face detection on the target image through the trained face localization network model to obtain the face rotation angle and the face coordinates comprises:
inputting the target image into the trained face positioning network model, and outputting a face heat map, a face scale map, a face center offset map and a face rotation angle map;
obtaining face coordinates in the target image according to the face heat image, the face scale image, the face center offset image and a face frame rule;
and acquiring a face rotation angle from the face rotation angle image.
3. The method of claim 1 or 2, wherein the training process of the face localization network model comprises:
preprocessing a face training image;
generating a predicted face heat image, a face scale image, a face center offset image and a face rotation angle image according to the preprocessed face training image;
respectively calculating loss values of the face heat map, the face scale map, the face center offset map and the face rotation angle map, and combining the loss values according to preset different weights to obtain a total loss value;
and reversely transmitting the total loss value, and continuously acquiring a next face training image for training until the parameters of the face positioning network model are converged.
4. The method according to claim 3, wherein the face training image includes a true value of a face rotation angle that is artificially labeled, and the generating a predicted face rotation angle map according to the preprocessed face training image includes:
and marking the characteristic value corresponding to the position of the face center point of the face rotation angle image as the true value, and marking the characteristic values corresponding to the positions of other pixel points of the face rotation angle image as 0.
5. The method of claim 2, wherein the obtaining the face coordinates in the target image according to the face heat map, the face scale map, the face center offset map, and the face frame rule comprises:
determining pixel points with characteristic values larger than a preset threshold value in the face heat map as face regions;
acquiring face coordinate offset at a position corresponding to the face area on the face central offset map, and adding the face coordinate offset and the coordinates of the face heat map to obtain a face central point position;
and calculating the width and height of the face on the face scale image through index conversion to obtain a face detection frame, rejecting repeated face detection frames through non-maximum value suppression, and determining the position coordinates of the face detection frame as face coordinates in the target image.
6. The method of claim 5, wherein said constructing an original triangle and a target triangle from said face rotation angles and said face coordinates comprises:
expanding the face detection frame outward according to a preset proportion to obtain an expanded face frame;
constructing three rotation points according to the coordinates of the expanded face frame and a center point thereof and the face rotation angle;
constructing an original triangle according to the three rotation points;
and constructing a target triangle according to the width and the height of the expanded face frame.
7. The method as claimed in claim 1 or 6, wherein said realizing face matting from the target image based on affine transformation according to the original triangle and the target triangle to obtain a face image comprises:
obtaining an affine transformation matrix by obtaining an affine transformation function according to the original triangle and the target triangle;
and carrying out face matting from the target image through an affine transformation function according to the affine transformation matrix to obtain a corrected face image.
8. A face matting system, characterized in that the system comprises:
the detection module is used for carrying out face detection on the target image through the trained face positioning network model to obtain a face rotation angle and face coordinates;
the construction module is used for constructing an original triangle and a target triangle according to the face rotation angle and the face coordinates; and
and the matting module is used for realizing face matting from the target image based on affine transformation according to the original triangle and the target triangle to obtain a face image.
9. An electronic device comprising a memory, a processor, the memory having stored thereon a face matting program executable on the processor, the face matting program when executed by the processor implementing the steps of the face matting method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a face matting program executable by at least one processor to cause the at least one processor to perform the steps of the face matting method as recited in any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011581223.3A CN112541484B (en) | 2020-12-28 | 2020-12-28 | Face matting method, system, electronic device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112541484A true CN112541484A (en) | 2021-03-23 |
CN112541484B CN112541484B (en) | 2024-03-19 |
Family
ID=75017708
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011581223.3A Active CN112541484B (en) | 2020-12-28 | 2020-12-28 | Face matting method, system, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112541484B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295533A (en) * | 2016-08-01 | 2017-01-04 | 厦门美图之家科技有限公司 | Optimization method, device and the camera terminal of a kind of image of autodyning |
CN107358207A (en) * | 2017-07-14 | 2017-11-17 | 重庆大学 | A kind of method for correcting facial image |
CN109359575A (en) * | 2018-09-30 | 2019-02-19 | 腾讯科技(深圳)有限公司 | Method for detecting human face, method for processing business, device, terminal and medium |
CN109948397A (en) * | 2017-12-20 | 2019-06-28 | Tcl集团股份有限公司 | A kind of face image correcting method, system and terminal device |
CN110826395A (en) * | 2019-09-18 | 2020-02-21 | 平安科技(深圳)有限公司 | Method and device for generating face rotation model, computer equipment and storage medium |
CN111160108A (en) * | 2019-12-06 | 2020-05-15 | 华侨大学 | Anchor-free face detection method and system |
CN111428579A (en) * | 2020-03-03 | 2020-07-17 | 平安科技(深圳)有限公司 | Face image acquisition method and system |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113705460A (en) * | 2021-08-30 | 2021-11-26 | 平安科技(深圳)有限公司 | Method, device and equipment for detecting opening and closing of eyes of human face in image and storage medium |
CN113705460B (en) * | 2021-08-30 | 2024-03-15 | 平安科技(深圳)有限公司 | Method, device, equipment and storage medium for detecting open and closed eyes of face in image |
CN113961746A (en) * | 2021-09-29 | 2022-01-21 | 北京百度网讯科技有限公司 | Video generation method and device, electronic equipment and readable storage medium |
CN113961746B (en) * | 2021-09-29 | 2023-11-21 | 北京百度网讯科技有限公司 | Video generation method, device, electronic equipment and readable storage medium |
CN114650453A (en) * | 2022-04-02 | 2022-06-21 | 北京中庆现代技术股份有限公司 | Target tracking method, device, equipment and medium applied to classroom recording and broadcasting |
CN114650453B (en) * | 2022-04-02 | 2023-08-15 | 北京中庆现代技术股份有限公司 | Target tracking method, device, equipment and medium applied to classroom recording and broadcasting |
CN115294320A (en) * | 2022-10-08 | 2022-11-04 | 平安银行股份有限公司 | Method and device for determining image rotation angle, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||