CN112669197A - Image processing method, image processing device, mobile terminal and storage medium

Image processing method, image processing device, mobile terminal and storage medium

Info

Publication number: CN112669197A
Application number: CN201910983237.9A
Authority: CN (China)
Prior art keywords: image, area, face, skin, mask
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 王鹏飞, 王向鸿, 李京, 王治金, 熊君君
Assignee: SF Technology Co Ltd; SF Express Co Ltd
Application filed by SF Technology Co Ltd and SF Express Co Ltd; priority to CN201910983237.9A; publication of CN112669197A

Abstract

The embodiment of the application discloses an image processing method, an image processing device, a mobile terminal and a storage medium. An image to be processed is acquired, and face detection is performed on the image to obtain a face region; when the face region meets a preset compliance condition, a portrait area mask and a skin area mask are extracted from the image; a target region conforming to a preset proportion is extracted from the image; and the skin area within the target region is preprocessed according to the skin area mask, while the color of the non-portrait area within the target region is adjusted according to the portrait area mask, to obtain the target image. The method can perform compliance detection on the face region in the image, and can also preprocess the skin area within the target region and adjust the color of the non-portrait area based on the portrait area mask and the skin area mask, so as to obtain a target image that meets the requirements and to improve the efficiency and reliability of image processing.

Description

Image processing method, image processing device, mobile terminal and storage medium
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to an image processing method and apparatus, a mobile terminal, and a storage medium.
Background
With the continuous development of Internet technology, many registration systems, such as examination registration websites, visa agencies, and internal enterprise management platforms, require users to submit personal ID photos. Because the required sizes and background colors differ from platform to platform, users often have to pay professional photographers repeatedly to shoot and retouch photos, while platform administrators have to check manually whether each uploaded photo is compliant. In complex scenes, factors such as drastic changes in ambient illumination, varied and complicated body postures, and occlusion of the face by glasses or hair easily cause missed and false detections, so such checks are not robust. Alternatively, the portrait can be manually cut out of a picture in Photoshop and placed on a suitable background layer, but manually matted portrait edges are rough, the process is inefficient, and complicated manual adjustment is needed under different illumination conditions or for particular skin tones, so the composite result is poor and the reliability is low.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, a mobile terminal and a storage medium, which can improve the efficiency and reliability of image processing.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring an image to be processed, and carrying out face detection on the image to obtain a face area;
when the face region meets a preset compliance condition, extracting a portrait area mask and a skin area mask from the image;
extracting a target area which accords with a preset proportion from the image;
and preprocessing the skin area in the target area according to the skin area mask, and performing color adjustment on the non-portrait area in the target area according to the portrait area mask to obtain a target image.
In some embodiments, the preprocessing the skin area in the target region according to the skin area mask, and performing color adjustment on the non-portrait area in the target region according to the portrait area mask to obtain a target image includes:
performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area;
replacing the non-portrait area in the ID photo area with a preset color according to the portrait area mask to obtain a replaced background area;
and fusing the foreground area and the background area to obtain a target image.
In some embodiments, the performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area includes:
extracting the skin area from the ID photo area according to the skin area mask;
smoothing the skin area through a bilateral filtering algorithm to obtain a smoothed skin area;
and whitening the smoothed skin area through a three-dimensional lookup table to obtain the beautified foreground area.
In some embodiments, before the extracting the portrait area mask and the skin area mask from the image, the method further comprises:
detecting the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region;
and when the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region all satisfy their conditions, determining that the face region meets the preset compliance condition.
In some embodiments, detecting the sharpness in the face region comprises:
converting the face region into a grayscale image, and convolving the grayscale image with a preset Laplacian kernel to obtain a response image;
and calculating the variance of the response image, and determining that the sharpness in the face region satisfies the condition when the variance is greater than a preset threshold.
In some embodiments, detecting the exposure level within the face region comprises:
counting the proportion of pixels below a first pixel threshold in the face region, relative to the total number of pixels in the face region, to obtain a first proportion value;
counting the proportion of pixels above a second pixel threshold in the face region, relative to the total number of pixels in the face region, to obtain a second proportion value, wherein the first pixel threshold is smaller than the second pixel threshold;
and when the first proportion value is smaller than a first preset value and the second proportion value is smaller than a second preset value, determining that the exposure in the face region satisfies the condition, wherein the first preset value is smaller than the second preset value.
In some embodiments, detecting the face pose within the face region comprises:
and calculating the angle between the line through the two eyes in the face region and the horizontal, and determining that the face pose in the face region satisfies the condition when the angle is smaller than a preset angle.
In some embodiments, the performing face detection on the image to obtain a face region includes:
performing size normalization on the image to obtain a size-normalized image;
performing face detection on the size-normalized image through a convolutional-neural-network face detector;
and when a face is detected, extracting a face region of a preset size, centered on the face, from the size-normalized image.
In some embodiments, the extracting the portrait area mask and the skin area mask from the image comprises:
acquiring the mean value and the standard deviation of pixels in the image;
performing pixel normalization processing on the image according to the mean value and the standard deviation to obtain an image after pixel normalization;
and carrying out segmentation processing on the image after the pixel normalization through an image segmentation network to obtain a portrait area mask and a skin area mask.
In some embodiments, the segmenting the pixel-normalized image through the image segmentation network to obtain the portrait area mask and the skin area mask includes:
extracting local features of the pixel-normalized image through an initial network module of the image segmentation network to obtain a first feature image;
extracting shared features of the first feature image through a shared feature module of the image segmentation network to obtain a second feature image;
extracting portrait features and skin features of the second feature image through a branch task module of the image segmentation network to obtain a portrait feature image and a skin feature image;
and extracting the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image through an integration module of the image segmentation network.
In some embodiments, the extracting a target region conforming to a preset proportion from the image includes:
acquiring a portrait box according to the portrait area mask;
acquiring a face detection box, and calculating the face height according to the face detection box;
determining the region boundary according to the portrait box, the face detection box, and the face height;
and extracting a target region conforming to the preset proportion from the image according to the region boundary.
In a second aspect, an embodiment of the present application further provides an image processing apparatus, including:
the first detection module is used for acquiring an image to be processed and carrying out face detection on the image to obtain a face area;
the mask extraction module is used for extracting a portrait area mask and a skin area mask from the image when the face region meets a preset compliance condition;
the region extraction module is used for extracting a target region which accords with a preset proportion from the image;
and the processing module is used for preprocessing the skin area in the target area according to the skin area mask and adjusting the color of the non-portrait area in the target area according to the portrait area mask to obtain a target image.
In some embodiments, the processing module comprises:
the beautifying submodule is used for performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area;
the replacing submodule is used for replacing the non-portrait area in the ID photo area with a preset color according to the portrait area mask to obtain a replaced background area;
and the fusion submodule is used for fusing the foreground area and the background area to obtain a target image.
In some embodiments, the beautifying submodule is specifically configured to:
extract the skin area from the ID photo area according to the skin area mask;
smooth the skin area through a bilateral filtering algorithm to obtain a smoothed skin area;
and whiten the smoothed skin area through a three-dimensional lookup table to obtain the beautified foreground area.
In some embodiments, the image processing apparatus further comprises:
the second detection module is used for detecting the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region;
and the determining module is used for determining that the face region meets the preset compliance condition when the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region all satisfy their conditions.
In some embodiments, the second detection module is specifically configured to:
convert the face region into a grayscale image, and convolve the grayscale image with a preset Laplacian kernel to obtain a response image;
and calculate the variance of the response image, and determine that the sharpness in the face region satisfies the condition when the variance is greater than a preset threshold.
In some embodiments, the second detection module is specifically configured to:
count the proportion of pixels below a first pixel threshold in the face region, relative to the total number of pixels in the face region, to obtain a first proportion value;
count the proportion of pixels above a second pixel threshold in the face region, relative to the total number of pixels in the face region, to obtain a second proportion value, wherein the first pixel threshold is smaller than the second pixel threshold;
and when the first proportion value is smaller than a first preset value and the second proportion value is smaller than a second preset value, determine that the exposure in the face region satisfies the condition, wherein the first preset value is smaller than the second preset value.
In some embodiments, the second detection module is specifically configured to:
and calculate the angle between the line through the two eyes in the face region and the horizontal, and determine that the face pose in the face region satisfies the condition when the angle is smaller than a preset angle.
In some embodiments, the first detection module is specifically configured to:
perform size normalization on the image to obtain a size-normalized image;
perform face detection on the size-normalized image through a convolutional-neural-network face detector;
and when a face is detected, extract a face region of a preset size, centered on the face, from the size-normalized image.
In some embodiments, the mask extraction module comprises:
the acquisition submodule is used for acquiring the mean value and the standard deviation of pixels in the image;
the processing submodule is used for carrying out pixel normalization processing on the image according to the mean value and the standard deviation to obtain an image after pixel normalization;
and the segmentation submodule is used for segmenting the pixel-normalized image through an image segmentation network to obtain a portrait area mask and a skin area mask.
In some embodiments, the segmentation submodule is specifically configured to:
extract local features of the pixel-normalized image through an initial network module of the image segmentation network to obtain a first feature image;
extract shared features of the first feature image through a shared feature module of the image segmentation network to obtain a second feature image;
extract portrait features and skin features of the second feature image through a branch task module of the image segmentation network to obtain a portrait feature image and a skin feature image;
and extract the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image through an integration module of the image segmentation network.
In some embodiments, the region extraction module is specifically configured to:
acquire a portrait box according to the portrait area mask;
acquire a face detection box, and calculate the face height according to the face detection box;
determine the region boundary according to the portrait box, the face detection box, and the face height;
and extract a target region conforming to the preset proportion from the image according to the region boundary.
In a third aspect, an embodiment of the present application further provides a mobile terminal, including a memory and a processor, where the memory stores a computer program, and the processor executes any one of the image processing methods provided in the embodiment of the present application when calling the computer program in the memory.
In a fourth aspect, the present application further provides a storage medium, where the storage medium is used to store a computer program, and the computer program is loaded by a processor to execute any one of the image processing methods provided by the embodiments of the present application.
The method and the device can acquire an image to be processed and perform face detection on the image to obtain a face region; when the face region meets a preset compliance condition, a portrait area mask and a skin area mask are extracted from the image, and a target region conforming to a preset proportion is extracted from the image; then, the skin area within the target region can be preprocessed according to the skin area mask, and the color of the non-portrait area within the target region adjusted according to the portrait area mask, to obtain the target image. The scheme can perform compliance detection on the face region extracted from the image, and can also preprocess the skin area within the target region and adjust the color of the non-portrait area based on the extracted portrait area mask and skin area mask, so as to obtain a target image (such as an ID photo) that meets the requirements. It eliminates re-shooting costs and manual compliance review costs, avoids manual retouching, achieves a good composite effect, and improves the efficiency and reliability of image processing.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart of an image processing method provided in an embodiment of the present application;
fig. 2 is another schematic flow chart of an image processing method provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of image processing provided by an embodiment of the present application;
FIG. 4 is another schematic diagram of image processing provided by embodiments of the present application;
FIG. 5 is a schematic diagram of a visualization interface for image processing provided by an embodiment of the present application;
fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a mobile terminal according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application. The image processing method may be executed by the image processing apparatus provided in the embodiment of the present application, or by a mobile terminal integrated with that apparatus; the apparatus may be implemented in hardware or software, and the mobile terminal may be a smartphone, a tablet computer, a handheld computer, a notebook computer, or the like. The image processing method may include:
s101, obtaining an image to be processed, and carrying out face detection on the image to obtain a face area.
The image to be processed may be an image captured of a user or an image downloaded from a server, and it may contain the user as well as other objects. After the image is obtained, face detection can be performed on it.
In some embodiments, performing face detection on the image to obtain the face region may include: performing size normalization on the image to obtain a size-normalized image; performing face detection on the size-normalized image through a convolutional-neural-network face detector; and when a face is detected, extracting a face region of a preset size, centered on the face, from the size-normalized image.
Specifically, to improve the accuracy of face detection, faces may be detected through a convolutional neural network, and to improve the efficiency and accuracy of that detection, the image may first be size-normalized, that is, its size corrected. For example, to maintain the ratio between the height h and width w of the image, the image may be normalized so that one side (e.g., the long side) is 256 pixels and the other side (e.g., the short side) is 256 × (min(h, w)/max(h, w)) pixels, yielding a size-normalized image.
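As an illustration, the following is a minimal sketch of this aspect-preserving resize, assuming OpenCV; the function name is illustrative and the 256-pixel long side follows the example above.

```python
# Minimal sketch of the aspect-preserving size normalization described above.
import cv2

def normalize_size(image, long_side=256):
    h, w = image.shape[:2]
    short_side = round(long_side * min(h, w) / max(h, w))
    if h >= w:
        new_h, new_w = long_side, short_side   # portrait: height is the long side
    else:
        new_h, new_w = short_side, long_side   # landscape: width is the long side
    return cv2.resize(image, (new_w, new_h))   # cv2.resize expects (width, height)
```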
After the size-normalized image is obtained, face detection may be performed on it by a Multi-task Cascaded Convolutional Networks (MTCNN) face detector. The MTCNN may include three cascaded sub-networks: (1) a proposal sub-network P-Net (Proposal Network), which can be used to generate candidate boxes; (2) a refinement sub-network R-Net (Refine Network), which can be used to remove a large number of non-face boxes from the candidate boxes generated by P-Net to obtain face boxes; and (3) an output sub-network O-Net (Output Network), which can be used to determine, from the face boxes screened by R-Net, the final candidate region (i.e., the face region) and facial feature information, where the feature information may include the eyes, the mouth, and the like, for later use in computing the face pose. In addition, to speed up detection, the depthwise group convolution used in the lightweight network ShuffleNet may replace the conventional convolution layers in MTCNN. The three cascaded stages detect faces in a coarse-to-fine manner, and a feature pyramid is used to handle faces at multiple scales.
It should be noted that the MTCNN is pre-trained on training sample images. For example, a plurality of training sample images containing faces, each labeled with its true face region, may be obtained; each training sample image is input to the MTCNN, which computes a predicted face region; a preset loss function is then constructed to converge the true face region and the predicted face region, and the MTCNN parameters are adjusted until the loss is low and no longer decreasing, reducing the error between the true and predicted face regions until the prediction approaches the ground truth, i.e., the loss function approaches zero, at which point the trained MTCNN is obtained. The trained MTCNN can detect the face in any image to obtain a face region.
During detection, the trained MTCNN can output a prompt when no face is detected; when a face is detected, a face region of a preset size is extracted from the size-normalized image, centered on the face. The preset size can be set flexibly according to actual needs and is not limited here. For example, a quadrilateral face region B1(l1, t1, r1, b1) may be extracted from the size-normalized image, where (l1, t1) are the coordinates of the top-left vertex of the face region and (r1, b1) are the coordinates of the bottom-right vertex. The shape of the face region can also be set flexibly according to actual needs; for example, it may be rectangular, square, or circular.
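For illustration only, a sketch of running an off-the-shelf MTCNN implementation and cropping a face-centered region of a preset size. It assumes the third-party `mtcnn` Python package rather than the modified detector described above, and the 256 × 256 crop size is an arbitrary choice.

```python
# Minimal sketch: detect a face with an off-the-shelf MTCNN and crop a
# face-centered region of a preset size (assumptions noted above).
import cv2
from mtcnn import MTCNN

detector = MTCNN()
img = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
faces = detector.detect_faces(img)   # each result carries a 'box' and 'keypoints'
if faces:
    x, y, w, h = faces[0]["box"]
    cx, cy, half = x + w // 2, y + h // 2, 128   # preset 256x256 crop
    crop = img[max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
```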
And S102, when the face region meets a preset compliance condition, extracting a portrait area mask and a skin area mask from the image.
To avoid subsequent manual compliance checks, after the face region is obtained it can be judged whether the face region meets a preset compliance condition, realizing automatic compliance detection on the face region; the preset compliance condition can be set flexibly according to actual needs.
In some embodiments, before extracting the portrait area mask and the skin area mask from the image, the image processing method may further include: detecting the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region; and when these all satisfy their conditions, determining that the face region meets the preset compliance condition.
Specifically, for example, as shown in fig. 2, after the face region is obtained, the number of faces, the face area, the sharpness, the exposure, the face pose, and so on within the face region may be detected in various ways, that is, by counting faces, computing the face area, evaluating sharpness, evaluating exposure, estimating the face pose, and so on, in order to assess the face quality. When the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region all satisfy their conditions, the face region is determined to meet the preset compliance condition; when any one of them does not, details of the failed check can be output.
It should be noted that, besides the above detection scheme, only one or more of the number of faces, the face area, the sharpness, the exposure, the face pose, and so on within the face region may be detected; when the selected checks are satisfied, the face region may be determined to meet the preset compliance condition, and otherwise details of the failed checks may be output.
In some embodiments, detecting the number of faces in the face region may include: counting the faces in the face region; when the count is 1, determining that the number of faces in the face region satisfies the condition, and when the count is greater than 1, determining that it does not.
In some embodiments, when the face region is rectangular, detecting the face area within the face region may include: obtaining the length and width of the face region and computing the face area from them, then judging whether the face area is greater than a preset area threshold, which can be set flexibly according to actual needs without limitation here. When the face area is greater than the threshold, the face area within the face region satisfies the condition; when it is less than or equal to the threshold, it does not.
In some embodiments, when the face region is circular, detecting the face area within the face region may include: obtaining the diameter or radius of the face region and computing the face area from it, then judging whether the face area is greater than the preset area threshold as above. When the face area is greater than the threshold, the face area within the face region satisfies the condition; when it is less than or equal to the threshold, it does not.
In some embodiments, detecting the sharpness within the face region may comprise: converting the face region into a grayscale image, and convolving the grayscale image with a preset Laplacian kernel to obtain a response image; and calculating the variance of the response image, and determining that the sharpness in the face region satisfies the condition when the variance is greater than a preset threshold.
For example, to improve the accuracy and effect of the subsequently generated target image, the sharpness of the face region may be assessed in advance, so that only sufficiently sharp face regions are processed further. First, the face region may be converted from a three-primary-color (RGB, Red Green Blue) image into a grayscale image. Then a Laplacian transform is applied to the grayscale image: the grayscale image may be convolved with a preset Laplacian kernel to obtain a response image, where the kernel can be set flexibly according to actual needs, for example a 3 × 3 Laplacian kernel.
At this point, the variance of the response image may be calculated and compared with a preset threshold: when the variance is greater than the threshold, the sharpness of the face region satisfies the condition, and when it is less than or equal to the threshold, it does not. The preset threshold can be set flexibly according to actual needs, for example to 100. A variance at or above the threshold indicates a sharp face region, while a variance below it indicates a blurry one, so blurry face regions with variance below the threshold are rejected.
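A minimal sketch of this variance-of-Laplacian sharpness check, assuming OpenCV; the threshold of 100 follows the example above.

```python
# Minimal sketch of the variance-of-Laplacian sharpness check.
import cv2

def is_sharp(face_bgr, threshold=100.0):
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    response = cv2.Laplacian(gray, cv2.CV_64F)   # default aperture is the 3x3 kernel
    return response.var() > threshold
```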
In some embodiments, detecting the exposure within the face region may include: counting the proportion of pixels below a first pixel threshold in the face region, relative to the total number of pixels in the face region, to obtain a first proportion value; counting the proportion of pixels above a second pixel threshold, relative to the total number of pixels in the face region, to obtain a second proportion value, wherein the first pixel threshold is smaller than the second pixel threshold; and when the first proportion value is smaller than a first preset value and the second proportion value is smaller than a second preset value, determining that the exposure in the face region satisfies the condition, wherein the first preset value is smaller than the second preset value.
For example, the proportion of pixels in the face region below a first pixel threshold, relative to the total number of pixels in the face region, may be counted to obtain a first proportion value; the first pixel threshold can be set flexibly according to actual needs, for example to a value between 0 and 30, and pixels below it may be called extremely low pixel values. Likewise, the proportion of pixels above a second pixel threshold, relative to the total number of pixels in the face region, may be counted to obtain a second proportion value; the second pixel threshold can be set flexibly, for example to a value between 220 and 255, and pixels above it may be called extremely high pixel values. It is then judged whether the first proportion value is below a first preset value and the second proportion value below a second preset value, both of which can be set flexibly according to actual needs. For example, when the first proportion value is below 40% and the second proportion value is below 60%, the exposure of the face region is determined to satisfy the condition; when the first proportion value is 40% or more, the face region is too dark (underexposed) and the exposure does not satisfy the condition; and when the second proportion value is 60% or more, the face region is overexposed and the exposure does not satisfy the condition.
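A minimal sketch of this exposure check, assuming NumPy and OpenCV; the pixel thresholds (30, 220) and ratio limits (40%, 60%) follow the examples above.

```python
# Minimal sketch of the dark/bright pixel-ratio exposure check.
import cv2
import numpy as np

def exposure_ok(face_bgr, low=30, high=220, dark_limit=0.4, bright_limit=0.6):
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    total = gray.size
    dark_ratio = np.count_nonzero(gray < low) / total     # first proportion value
    bright_ratio = np.count_nonzero(gray > high) / total  # second proportion value
    return dark_ratio < dark_limit and bright_ratio < bright_limit
```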
In some embodiments, detecting the face pose within the face region may include: calculating the angle between the line through the two eyes in the face region and the horizontal, and determining that the face pose satisfies the condition when the angle is smaller than a preset angle.
For example, the positions of the two eyes in the face region may be detected, and the line through the two eyes determined from them; for instance, the line connecting the center points of the two eye positions is the eye line. The angle between this line and the horizontal may then be calculated. If the center of the left eye is p1(x1, y1) and the center of the right eye is p2(x2, y2), the angle between the eye line and the horizontal is arctan(|y2 - y1| / |x2 - x1|). It is then judged whether this angle is smaller than a preset angle, which can be set flexibly according to actual needs, for example 15 degrees. When the angle is smaller than the preset angle, the face is frontal and the face pose in the face region satisfies the condition; when the angle is greater than or equal to the preset angle, the face is not frontal and the face pose does not satisfy the condition.
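A minimal sketch of this roll-angle check; the eye centers are assumed to come from the detector's landmarks (e.g., MTCNN keypoints), and the 15-degree limit follows the example above.

```python
# Minimal sketch of the eye-line angle check for the face pose.
import math

def pose_ok(left_eye, right_eye, max_degrees=15.0):
    (x1, y1), (x2, y2) = left_eye, right_eye
    angle = math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))
    return angle < max_degrees
```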
When the face region does not meet the preset compliance condition, information about the non-compliance can be output, for example that the face region is not sharp or that the face pose is not standard. When the face region meets the preset compliance condition, a portrait area mask and a skin area mask may be extracted from the image; for example, as shown in fig. 2, they may be extracted through an image segmentation network.
In some embodiments, extracting the portrait area mask and the skin area mask from the image may include: acquiring the mean and standard deviation of the pixels in the image; performing pixel normalization on the image according to the mean and standard deviation to obtain a pixel-normalized image; and segmenting the pixel-normalized image through an image segmentation network to obtain the portrait area mask and the skin area mask.
To improve the efficiency and reliability of processing by the image segmentation network, the image may be pixel-normalized in advance: the mean and standard deviation of the pixels in the image may be obtained, and each pixel normalized according to them (subtracting the mean and dividing by the standard deviation). The pixel-normalized image may then be input to the image segmentation network, which segments it to obtain the portrait area mask (p) and the skin area mask (s).
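A minimal sketch of this per-image pixel normalization, assuming NumPy; the small epsilon guard is an added safety measure not mentioned in the text.

```python
# Minimal sketch of mean/standard-deviation pixel normalization.
import numpy as np

def normalize_pixels(image):
    img = image.astype(np.float32)
    mean, std = img.mean(), img.std()
    return (img - mean) / (std + 1e-8)   # epsilon avoids division by zero
```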
In some embodiments, the segmenting the pixel-normalized image through the image segmentation network to obtain the portrait area mask and the skin area mask may include:
extracting local features of the pixel-normalized image through an initial network module of the image segmentation network to obtain a first feature image; extracting shared features of the first feature image through a shared feature module of the image segmentation network to obtain a second feature image; extracting portrait features and skin features of the second feature image through a branch task module of the image segmentation network to obtain a portrait feature image and a skin feature image; and extracting the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image through an integration module of the image segmentation network.
The image segmentation network is pre-trained and multitask-based, and may include an initial network module (initial block), a shared feature module (shared features), a branch task module (factored module), and an integration module (aggregation module). Specifically, the strong prior that the skin region lies inside the portrait region is built into the network design. A multitask segmentation network is adopted, and the segmentation sub-modules in each module use an encoder-decoder structure. For example, as shown in fig. 3, the initial network module learns shallow local features in the image (edges, textures, and the like); the shared feature module learns the features shared by the two segmentation tasks, portrait region segmentation and skin region segmentation; the branch task module learns features specific to the portrait segmentation task (including portrait region features); and the integration module fuses the learned portrait region features to learn the skin region, finally yielding the portrait area mask and the skin area mask. In fig. 3, the facial features in the input image have been masked to protect user privacy.
To improve the detection accuracy of the image segmentation network, it may be trained as follows: obtain a plurality of training sample images containing faces, together with the true portrait area mask and true skin area mask for each; input each training sample image into the image segmentation network, which predicts a portrait area mask and a skin area mask; construct a first loss function to converge the true and predicted portrait area masks, and a second loss function to converge the true and predicted skin area masks; and adjust the network parameters until the loss is low and no longer decreasing, reducing the error between the true and predicted portrait area masks and between the true and predicted skin area masks until the predictions approach the ground truth. The trained image segmentation network can then extract the portrait area mask and the skin area mask from the image.
For example, in the mask-extraction process of fig. 3, the initial network module of the image segmentation network first extracts local features of the pixel-normalized image to obtain a first feature image, where the local features are shallow features such as edges and textures. The shared feature module then extracts shared features of the first feature image to obtain a second feature image, where the shared features may include portrait region features. Next, the branch task module extracts portrait features and skin features of the second feature image to obtain a portrait feature image and a skin feature image. Finally, the integration module extracts the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image.
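For illustration only, a rough PyTorch skeleton of this four-module multitask layout (initial block, shared features, branch tasks, integration). The layer widths, depths, and exact wiring are assumptions made for the sketch, since the text does not specify them, and a real implementation would use encoder-decoder submodules as described above.

```python
# Rough skeleton of the multitask portrait/skin segmentation layout
# (illustrative sizes; not the patent's actual architecture).
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class MultiTaskSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.initial = conv_block(3, 16)           # shallow local features
        self.shared = conv_block(16, 32)           # features shared by both tasks
        self.portrait_branch = conv_block(32, 32)  # portrait-specific features
        self.skin_branch = conv_block(32, 32)      # skin-specific features
        self.portrait_head = nn.Conv2d(64, 1, 1)   # integration: shared + portrait
        self.skin_head = nn.Conv2d(96, 1, 1)       # integration: shared + both branches

    def forward(self, x):
        f1 = self.initial(x)
        f2 = self.shared(f1)
        fp = self.portrait_branch(f2)
        fs = self.skin_branch(f2)
        portrait_mask = torch.sigmoid(self.portrait_head(torch.cat([f2, fp], 1)))
        skin_mask = torch.sigmoid(self.skin_head(torch.cat([f2, fp, fs], 1)))
        return portrait_mask, skin_mask
```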
And S103, extracting a target area according with a preset proportion from the image.
The preset proportion can be set flexibly according to actual needs, and the target region may be an area larger than the face region; for example, it may cover the portrait from the head to the chest, i.e., the ID photo area in fig. 2.
In some embodiments, extracting a target region conforming to the preset proportion from the image may include: acquiring a portrait box according to the portrait area mask; acquiring a face detection box, and calculating the face height according to the face detection box; determining the region boundary according to the portrait box, the face detection box, and the face height; and extracting a target region conforming to the preset proportion from the image according to the region boundary.
For example, as shown in fig. 4, a portrait box B2(l2, t2, r2, b2) may be obtained from the portrait area mask p, and with the face detection box B1(l1, t1, r1, b1), the face height may be calculated as h = b1 - t2. The region boundary is then determined from the portrait box, the face detection box, and the face height, and a target region conforming to the preset proportion, i.e., the ID photo area (IDPhoto Area), is extracted from the image according to this boundary: the upper boundary of the target region is t2, the lower boundary is min(b1 + h, b2), the left boundary is l2, and the right boundary is r2. In fig. 4, the facial features in the input image have been masked to protect user privacy.
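A minimal sketch of this boundary computation, following the formulas above; the (left, top, right, bottom) box convention mirrors the notation B(l, t, r, b).

```python
# Minimal sketch of the ID-photo region boundary computation.
def id_photo_region(face_box, portrait_box):
    l1, t1, r1, b1 = face_box       # from the face detector
    l2, t2, r2, b2 = portrait_box   # from the portrait area mask
    h = b1 - t2                     # face height: top of head to chin
    top, bottom = t2, min(b1 + h, b2)
    left, right = l2, r2
    return left, top, right, bottom
```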
S104, preprocessing the skin area in the target area according to the skin area mask, and adjusting the color of the non-portrait area in the target area according to the portrait area mask to obtain the target image.
After the target region is obtained, the skin area within it can be preprocessed, for example beautified, and the background area (i.e., the non-portrait area) color-adjusted, that is, the foreground portrait beautification and background color replacement in fig. 2, to obtain a target image with the required background color.
In some embodiments, preprocessing the skin area within the target region according to the skin area mask, and performing color adjustment on the non-portrait area within the target region according to the portrait area mask, to obtain the target image may include: performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area; replacing the non-portrait area in the ID photo area with a preset color according to the portrait area mask to obtain a replaced background area; and fusing the foreground area and the background area to obtain the target image.
Specifically, as shown in fig. 4, the skin area within the ID photo area may be determined according to the skin area mask, and a beautifying operation performed on it to obtain the beautified foreground area (i.e., the beautified skin area); the beautifying operation may include skin smoothing, whitening, and the like. The non-portrait area within the ID photo area is determined according to the portrait area mask and replaced with a preset color to obtain the replaced background area; the preset color can be set flexibly according to actual needs, for example white, blue, or red. At this point, the foreground area A and the background area B may be fused (Fusion) to obtain the target image, which may be an ID photo image, for example as alpha × A + (1 − alpha) × B, where the value of alpha can be set flexibly according to actual needs.
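A minimal sketch of the background replacement and fusion step, assuming NumPy; using the portrait area mask itself as the per-pixel alpha is an illustrative choice, since the text leaves alpha configurable.

```python
# Minimal sketch of mask-guided background replacement and alpha fusion.
import numpy as np

def compose_id_photo(foreground_bgr, portrait_mask, bg_color=(255, 255, 255)):
    alpha = portrait_mask.astype(np.float32)[..., None]  # HxW mask -> HxWx1 in [0, 1]
    background = np.empty_like(foreground_bgr)
    background[:] = bg_color                             # solid preset color
    fused = alpha * foreground_bgr + (1.0 - alpha) * background
    return fused.astype(np.uint8)
```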
In some embodiments, performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area may include: extracting the skin area from the ID photo area according to the skin area mask; smoothing the skin area through a bilateral filtering algorithm to obtain a smoothed skin area; and whitening the smoothed skin area through a three-dimensional lookup table to obtain the beautified foreground area.
For example, in the skin beautification process (Portrait Make-up), the skin area can be extracted from the ID photo area according to the skin area mask. The skin area can then be smoothed through a bilateral filtering algorithm to obtain a smoothed skin area; the parameters of the bilateral filter can be set flexibly according to actual needs, and as a nonlinear filter it smooths and denoises the skin area while preserving its edges. Finally, the smoothed skin area can be whitened through a three-dimensional lookup table (3D-LUT, Look-up table) to obtain the beautified foreground area. The 3D LUT can be configured flexibly according to actual needs, for example by pre-storing the mapping between the RGB pixel values before whitening and those after whitening, so that the RGB pixel value of each smoothed skin pixel is looked up and transformed into the corresponding R1G1B1 pixel value, i.e., the pixels of the beautified foreground area take the R1G1B1 values.
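A minimal sketch of the smoothing and whitening step, assuming OpenCV; the bilateral-filter parameters are illustrative, and a simple per-channel brightness curve stands in for the 3D LUT, whose actual entries the text does not give.

```python
# Minimal sketch of bilateral smoothing plus LUT-based whitening on the skin area.
import cv2
import numpy as np

def beautify_skin(image_bgr, skin_mask):
    smoothed = cv2.bilateralFilter(image_bgr, d=9, sigmaColor=75, sigmaSpace=75)
    # Illustrative 1D whitening curve applied per channel, standing in for a 3D LUT.
    lut = np.clip(np.arange(256) * 1.1 + 10, 0, 255).astype(np.uint8)
    whitened = cv2.LUT(smoothed, lut)
    out = image_bgr.copy()
    out[skin_mask > 0] = whitened[skin_mask > 0]   # restrict changes to the skin area
    return out
```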
In this embodiment, face detection may be performed on the image through MTCNN to obtain the face region; compliance checks such as the number of faces, the face area, the sharpness, the exposure, and the face pose may be performed on the face region; when compliance is satisfied, the portrait area mask and the skin area mask are obtained through the image segmentation network; the proportion-conforming ID photo area is extracted based on these masks; and a skin beautifying operation is performed on the skin in the ID photo area to obtain an ID photo image that meets the requirements.
For example, as shown in fig. 5, an application (or applet) implementing the image processing of this embodiment may be installed on the mobile terminal. Opening the application (or applet) brings up the ID photo visualization interface, in which the related information can be entered: for example, an image can be uploaded into the ID photo area of the collection interface, and the ID photo is then generated automatically based on the photo requirements and the selected background color (green in this example); the photo requirements can be modified according to actual needs. A compliant ID photo can thus be generated simply by uploading an image taken with the mobile terminal, without traveling to a dedicated shooting location. This not only saves considerable personal time but also reduces the cost of manual compliance review. The method has good robustness and real-time performance, detects faces promptly even in complex conditions, and generates the required ID photo quickly; the skin beautification looks natural, and the skin area mask confines it to the skin, preventing non-skin areas from becoming blurred, overexposed, and so on. It should be noted that in fig. 5 the facial features in the input image have been masked to protect user privacy, and that the layout of the ID photo interface can be set flexibly according to actual needs beyond the layout in fig. 5, without limitation here.
The method and the device can acquire an image to be processed and perform face detection on the image to obtain a face region; when the face region meets a preset compliance condition, a portrait area mask and a skin area mask are extracted from the image, and a target region conforming to a preset proportion is extracted from the image; then, the skin area within the target region can be preprocessed according to the skin area mask, and the color of the non-portrait area within the target region adjusted according to the portrait area mask, to obtain the target image. The scheme can perform compliance detection on the face region extracted from the image, and can also preprocess the skin area within the target region and adjust the color of the non-portrait area based on the extracted portrait area mask and skin area mask, so as to obtain a target image (such as an ID photo) that meets the requirements. It eliminates re-shooting costs and manual compliance review costs, avoids manual retouching, achieves a good composite effect, and improves the efficiency and reliability of image processing.
In order to better implement the image processing method provided by the embodiment of the present application, an embodiment of the present application further provides an apparatus based on the image processing method. The meanings of the terms are the same as those in the image processing method above, and implementation details can be found in the description of the method embodiments.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure, wherein the image processing apparatus 300 may include a first detection module 301, a mask extraction module 302, a region extraction module 303, a processing module 304, and the like.
The first detection module 301 is configured to acquire an image to be processed, and perform face detection on the image to obtain a face region.
A mask extraction module 302, configured to extract a portrait area mask and a skin area mask from the image when the face region satisfies a preset compliance condition.
The region extraction module 303 is configured to extract a target region that meets a preset ratio from the image.
The processing module 304 is configured to perform preprocessing on a skin area in the target area according to the skin area mask, and perform color adjustment on a non-portrait area in the target area according to the portrait area mask to obtain a target image.
In some embodiments, the processing module 304 may include a beautifying submodule, a replacing submodule, a fusion submodule, and the like, which may specifically be as follows:
the beautifying submodule is used for performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area;
the replacing submodule is used for replacing the non-portrait area in the ID photo area with a preset color according to the portrait area mask to obtain a replaced background area;
and the fusion submodule is used for fusing the foreground area and the background area to obtain a target image.
In some embodiments, the beautifying submodule is specifically configured to: extract the skin area from the ID photo area according to the skin area mask; smooth the skin area through a bilateral filtering algorithm to obtain a smoothed skin area; and whiten the smoothed skin area through a three-dimensional lookup table to obtain the beautified foreground area.
In some embodiments, the image processing apparatus may further include a second detecting module, a determining module, and the like, which may specifically be as follows:
the second detection module is used for detecting the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region;
and the determining module is used for determining that the face region meets the preset compliance condition when the number of faces, the face area, the sharpness, the exposure, and the face pose within the face region all satisfy their conditions.
In some embodiments, the second detection module is specifically configured to: convert the face region into a grayscale image, and convolve the grayscale image with a preset Laplacian kernel to obtain a response image; and calculate the variance of the response image, and determine that the sharpness in the face region satisfies the condition when the variance is greater than a preset threshold.
In some embodiments, the second detection module is specifically configured to: count the proportion of pixels below a first pixel threshold in the face region, relative to the total number of pixels in the face region, to obtain a first proportion value; count the proportion of pixels above a second pixel threshold, relative to the total number of pixels in the face region, to obtain a second proportion value, wherein the first pixel threshold is smaller than the second pixel threshold; and when the first proportion value is smaller than a first preset value and the second proportion value is smaller than a second preset value, determine that the exposure in the face region satisfies the condition, wherein the first preset value is smaller than the second preset value.
In some embodiments, the second detection module is specifically configured to: calculate the angle between the line through the two eyes in the face region and the horizontal, and determine that the face pose satisfies the condition when the angle is smaller than a preset angle.
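The pose check reduces to the angle of the eye line, as in the sketch below; the eye coordinates are assumed to come from some facial-landmark detector, and the 10-degree limit is an assumed value.

```python
import math

def pose_ok(left_eye, right_eye, max_angle_deg=10.0):
    """Pose check: angle between the line through both eyes and the horizontal.

    left_eye / right_eye are (x, y) landmark coordinates.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = abs(math.degrees(math.atan2(dy, dx)))
    return angle < max_angle_deg
```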
In some embodiments, the first detection module 301 is specifically configured to: perform size normalization on the image to obtain a size-normalized image; perform face detection on the size-normalized image with a convolutional-neural-network face detector; and, when a face is detected, extract a face region of a preset size from the size-normalized image, centered on the face.
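The detect-and-crop flow might be sketched as follows. `detect_faces` is a hypothetical stand-in for the convolutional-neural-network face detector (the patent does not name a specific one), and both sizes are assumptions.

```python
import cv2
import numpy as np

def extract_face_region(image, detect_faces, norm_size=(512, 512), crop_size=(224, 224)):
    """Normalize the image size, run a face detector, and crop a fixed-size
    face region centered on the detected face.

    detect_faces: hypothetical callable taking a BGR image and returning a
                  list of (x, y, w, h) face boxes.
    """
    resized = cv2.resize(image, norm_size)           # size normalization
    boxes = detect_faces(resized)
    if not boxes:
        return None
    x, y, w, h = boxes[0]
    cx, cy = x + w // 2, y + h // 2                  # face center
    cw, ch = crop_size
    # Clamp the crop window so it stays inside the normalized image.
    x0 = int(np.clip(cx - cw // 2, 0, norm_size[0] - cw))
    y0 = int(np.clip(cy - ch // 2, 0, norm_size[1] - ch))
    return resized[y0:y0 + ch, x0:x0 + cw]
```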
In some embodiments, the mask extraction module 302 may include an acquisition submodule, a processing submodule, a segmentation submodule, and the like, which may specifically be as follows:
the acquisition submodule is used for acquiring the mean value and standard deviation of the pixels in the image;
the processing submodule is used for performing pixel normalization on the image according to the mean value and standard deviation to obtain a pixel-normalized image;
and the segmentation submodule is used for segmenting the pixel-normalized image through an image segmentation network to obtain a portrait area mask and a skin area mask.
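The acquisition and processing submodules together amount to a few lines; the sketch below uses per-image statistics as described (a per-channel variant with dataset statistics would look the same).

```python
import numpy as np

def normalize_pixels(image):
    """Pixel normalization with the image's own mean and standard deviation."""
    img = image.astype(np.float32)
    mean = img.mean()
    std = img.std() + 1e-6   # avoid division by zero on flat images
    return (img - mean) / std
```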
In some embodiments, the segmentation submodule is specifically configured to: extract local features from the pixel-normalized image through an initial network module of the image segmentation network to obtain a first feature image; extract shared features from the first feature image through a shared feature module of the image segmentation network to obtain a second feature image; extract portrait features and skin features from the second feature image through a branch task module of the image segmentation network to obtain a portrait feature image and a skin feature image; and extract the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image through an integration module of the image segmentation network.
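The text names these four modules without specifying their internals, so the following PyTorch sketch only illustrates how such a shared-trunk, two-branch layout could be wired; every layer choice (channel widths, kernel sizes, sigmoid output) is an assumption.

```python
import torch
import torch.nn as nn

class TwoBranchSegNet(nn.Module):
    """Schematic layout of the four named modules; layer choices are illustrative."""

    def __init__(self):
        super().__init__()
        self.initial = nn.Sequential(          # local feature extraction
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True))
        self.shared = nn.Sequential(           # shared features for both tasks
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.portrait_branch = nn.Conv2d(64, 16, 3, padding=1)  # portrait features
        self.skin_branch = nn.Conv2d(64, 16, 3, padding=1)      # skin features
        self.integrate = nn.Conv2d(64 + 16 + 16, 2, 1)          # two mask channels

    def forward(self, x):
        f1 = self.initial(x)                   # first feature image
        f2 = self.shared(f1)                   # second feature image
        fp = self.portrait_branch(f2)          # portrait feature image
        fs = self.skin_branch(f2)              # skin feature image
        masks = torch.sigmoid(self.integrate(torch.cat([f2, fp, fs], dim=1)))
        return masks[:, 0], masks[:, 1]        # portrait mask, skin mask
```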
In some embodiments, the region extraction module is specifically configured to: acquire a portrait frame according to the portrait area mask; acquire the face detection frame, and calculate the face height from it; determine the region boundary according to the portrait frame, the face detection frame, and the face height; and extract the target region at the preset ratio from the image according to the region boundary.
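A sketch of this boundary computation follows; the headroom and body margins, and the 295x413-pixel one-inch ID-photo ratio, are assumed examples rather than values from the patent.

```python
import numpy as np

def extract_target_region(image, portrait_mask, face_box, ratio=(295, 413)):
    """Determine the region boundary from the portrait frame, the face
    detection frame, and the face height, then crop at a preset ratio.
    """
    ys, xs = np.nonzero(portrait_mask > 0.5)
    top_of_portrait = ys.min()                 # portrait frame (top edge)
    fx, fy, fw, fh = face_box                  # face detection frame
    face_height = fh

    # Region boundary: a little headroom above the portrait, and roughly
    # two face heights of body below the face box (margins are assumptions).
    top = max(0, top_of_portrait - face_height // 4)
    bottom = min(image.shape[0], fy + fh + 2 * face_height)

    # Derive the width from the preset ratio and center it on the face.
    height = bottom - top
    width = min(int(height * ratio[0] / ratio[1]), image.shape[1])
    cx = fx + fw // 2
    left = int(np.clip(cx - width // 2, 0, image.shape[1] - width))
    return image[top:bottom, left:left + width]
```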
The above operations can be implemented as described in the foregoing embodiments and are not repeated here.
The embodiments of the present application can acquire an image to be processed and perform face detection on it to obtain a face region; when the face region meets a preset compliance condition, extract a portrait area mask and a skin area mask from the image, and extract a target region at a preset ratio from the image; then preprocess the skin area in the target region according to the skin area mask and color-adjust the non-portrait area in the target region according to the portrait area mask to obtain the target image. This scheme can not only perform compliance detection on the face region extracted from the image, but can also, based on the extracted portrait area mask and skin area mask, preprocess the skin area in the target region and color-adjust the non-portrait area, so as to obtain a target image (such as an ID photo) that meets the requirements. It removes the cost of reshooting and of manual compliance review, avoids manual retouching, produces a good composite, and improves the efficiency and reliability of image processing.
Accordingly, as shown in fig. 7, the mobile terminal may include a radio frequency (RF) circuit 601, a memory 602 including one or more computer-readable storage media, an input unit 603, a display unit 604, a sensor 605, an audio circuit 606, a wireless fidelity (WiFi) module 607, a processor 608 including one or more processing cores, and a power supply 609. Those skilled in the art will appreciate that the architecture shown in fig. 7 does not limit the mobile terminal, which may include more or fewer components than shown, combine some components, or arrange the components differently. Wherein:
the RF circuit 601 may be used to receive and transmit signals during message transmission or a call; in particular, it receives downlink information from a base station and hands it to the one or more processors 608 for processing, and it transmits uplink data to the base station. In general, the RF circuit 601 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 601 may also communicate with networks and other devices via wireless communication, which may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Message Service (SMS), and the like.
The memory 602 may be used to store software programs and modules, and the processor 608 executes various functional applications and data processing by running the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area: the program storage area may store the operating system, application programs required by at least one function (such as a sound playing function and an image playing function), and the like; the data storage area may store data created according to the use of the mobile terminal (such as audio data and a phonebook). Further, the memory 602 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 608 and the input unit 603 with access to the memory 602.
The input unit 603 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. In particular, in one specific embodiment, the input unit 603 may include a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or touch pad, may collect touch operations by the user on or near it (e.g., operations performed with a finger, a stylus, or any other suitable object or attachment) and drive the corresponding connection device according to a preset program. Optionally, the touch-sensitive surface may comprise two parts: a touch detection device and a touch controller. The touch detection device detects the position touched by the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 608, and receives and executes commands sent by the processor 608. In addition, the touch-sensitive surface may be implemented using resistive, capacitive, infrared, or surface acoustic wave technologies. Besides the touch-sensitive surface, the input unit 603 may include other input devices, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 604 may be used to display information input by or provided to the user, as well as various graphical user interfaces of the mobile terminal, which may be made up of graphics, text, icons, video, and any combination thereof. The display unit 604 may include a display panel, which may optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch-sensitive surface may overlay the display panel; when a touch operation is detected on or near the touch-sensitive surface, it is transmitted to the processor 608 to determine the type of touch event, and the processor 608 then provides a corresponding visual output on the display panel according to the type of touch event. Although in fig. 7 the touch-sensitive surface and the display panel implement input and output as two separate components, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement both input and output functions.
The mobile terminal may also include at least one sensor 605, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which can adjust the brightness of the display panel according to the brightness of the ambient light, and a proximity sensor, which can turn off the display panel and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and the magnitude and direction of gravity when the terminal is stationary, and can be used in applications that recognize the attitude of the mobile terminal (such as portrait/landscape switching, related games, and magnetometer attitude calibration), vibration-recognition functions (such as a pedometer and tap detection), and the like. Other sensors that may be configured on the mobile terminal, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described in detail here.
The audio circuit 606, a speaker, and a microphone may provide an audio interface between the user and the mobile terminal. The audio circuit 606 may convert received audio data into an electrical signal and transmit it to the speaker, where it is converted into a sound signal for output; conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 606 receives and converts into audio data. After the audio data is processed by the processor 608, it may be sent via the RF circuit 601 to, for example, another mobile terminal, or output to the memory 602 for further processing. The audio circuit 606 may also include an earphone jack to provide communication between a peripheral headset and the mobile terminal.
WiFi is a short-range wireless transmission technology. Through the WiFi module 607, the mobile terminal can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although fig. 7 shows the WiFi module 607, it is understood that it is not an essential part of the mobile terminal and may be omitted as needed within a scope that does not change the essence of the invention.
The processor 608 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602, thereby performing overall monitoring of the mobile terminal. Optionally, processor 608 may include one or more processing cores; preferably, the processor 608 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 608.
The mobile terminal also includes a power supply 609 (e.g., a battery) for powering the various components. Preferably, the power supply is logically coupled to the processor 608 via a power management system, so that charging, discharging, and power consumption are managed through the power management system. The power supply 609 may also include one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other such components.
Although not shown, the mobile terminal may further include a camera, a bluetooth module, and the like, which will not be described herein. Specifically, in this embodiment, the processor 608 in the mobile terminal loads the executable file corresponding to the process of one or more application programs into the memory 602 according to the following instructions, and the processor 608 runs the application program stored in the memory 602, so as to perform the following functions:
acquiring an image to be processed, and performing face detection on the image to obtain a face area; when the face region meets a preset compliance condition, extracting a face region mask and a skin region mask from the image; extracting a target area which accords with a preset proportion from the image; and preprocessing the skin area in the target area according to the skin area mask, and performing color adjustment on the non-portrait area in the target area according to the portrait area mask to obtain the target image.
In some embodiments, when preprocessing the skin area in the target area according to the skin area mask and performing color adjustment on the non-portrait area in the target area according to the portrait area mask to obtain the target image, the processor 608 further performs: performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area; replacing the non-portrait area in the ID photo area with a preset color according to the portrait area mask to obtain a replaced background area; and fusing the foreground area and the background area to obtain the target image.
In some embodiments, when performing the beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain the beautified foreground area, the processor 608 further performs: extracting the skin area from the ID photo area according to the skin area mask; smoothing (buffing) the skin area with a bilateral filtering algorithm to obtain a smoothed skin area; and whitening the smoothed skin area with a three-dimensional lookup table to obtain the beautified foreground area.
In some embodiments, before extracting the portrait area mask and the skin area mask from the image, the processor 608 further performs: detecting the number of faces, the face size, the sharpness, the exposure, and the face pose in the face region; and determining that the face region meets the preset compliance condition when all of them satisfy their respective conditions.
In some embodiments, when detecting the sharpness of the face region, the processor 608 further performs: converting the face region into a grayscale image, and convolving the grayscale image with a preset Laplacian kernel to obtain a response image; then calculating the variance of the response image, and determining that the sharpness satisfies the condition when the variance is greater than a preset threshold.
In some embodiments, when detecting the exposure of the face region, the processor 608 further performs: counting the proportion of pixels in the face region below a first pixel threshold relative to the total number of pixels in the face region to obtain a first proportion value; counting the proportion of pixels above a second pixel threshold relative to the total number of pixels in the face region to obtain a second proportion value, the first pixel threshold being smaller than the second pixel threshold; and determining that the exposure satisfies the condition when the first proportion value is smaller than a first preset value and the second proportion value is smaller than a second preset value, the first preset value being smaller than the second preset value.
In some embodiments, when detecting the face pose in the face region, the processor 608 further performs: calculating the angle between the line through the two eyes in the face region and the horizontal, and determining that the face pose satisfies the condition when the angle is smaller than a preset angle.
In some embodiments, when performing face detection on the image to obtain the face region, the processor 608 further performs: performing size normalization on the image to obtain a size-normalized image; performing face detection on the size-normalized image with a convolutional-neural-network face detector; and, when a face is detected, extracting a face region of a preset size from the size-normalized image, centered on the face.
In some embodiments, when extracting the portrait area mask and the skin area mask from the image, the processor 608 further performs: acquiring the mean value and standard deviation of the pixels in the image; performing pixel normalization on the image according to the mean value and standard deviation to obtain a pixel-normalized image; and segmenting the pixel-normalized image through an image segmentation network to obtain the portrait area mask and the skin area mask.
In some embodiments, when segmenting the pixel-normalized image through the image segmentation network to obtain the portrait area mask and the skin area mask, the processor 608 further performs: extracting local features from the pixel-normalized image through an initial network module of the image segmentation network to obtain a first feature image; extracting shared features from the first feature image through a shared feature module to obtain a second feature image; extracting portrait features and skin features from the second feature image through a branch task module to obtain a portrait feature image and a skin feature image; and extracting the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image through an integration module.
In some embodiments, when extracting the target region at the preset ratio from the image, the processor 608 further performs: acquiring a portrait frame according to the portrait area mask; acquiring a face detection frame, and calculating the face height from the face detection frame; determining the region boundary according to the portrait frame, the face detection frame, and the face height; and extracting the target region at the preset ratio from the image according to the region boundary.
Each of the above embodiments has its own emphasis; for parts not described in detail in one embodiment, refer to the detailed description of the image processing method above, which is not repeated here.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be performed by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium, and loaded and executed by a processor.
To this end, the present application provides a storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute any one of the image processing methods provided by the present application. For example, the computer program is loaded by a processor and may perform the following steps:
acquiring an image to be processed, and performing face detection on the image to obtain a face area; when the face region meets a preset compliance condition, extracting a face region mask and a skin region mask from the image; extracting a target area which accords with a preset proportion from the image; and preprocessing the skin area in the target area according to the skin area mask, and performing color adjustment on the non-portrait area in the target area according to the portrait area mask to obtain the target image.
The above operations can be implemented as described in the foregoing embodiments and are not repeated here.
The storage medium may include: read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disk, and the like.
Since the computer program stored in the storage medium can execute any of the image processing methods provided in the embodiments of the present application, it can achieve the beneficial effects of any of those methods; see the foregoing embodiments for details, which are not repeated here.
The image processing method, apparatus, mobile terminal, and storage medium provided in the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may make changes to the specific implementations and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (12)

1. An image processing method, comprising:
acquiring an image to be processed, and carrying out face detection on the image to obtain a face area;
when the face region meets a preset compliance condition, extracting a portrait area mask and a skin area mask from the image;
extracting a target area which accords with a preset proportion from the image;
and preprocessing the skin area in the target area according to the skin area mask, and performing color adjustment on the non-portrait area in the target area according to the portrait area mask to obtain a target image.
2. The image processing method according to claim 1, wherein preprocessing the skin area in the target area according to the skin area mask and performing color adjustment on the non-portrait area in the target area according to the portrait area mask to obtain a target image comprises:
performing a beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain a beautified foreground area;
replacing the non-portrait area in the ID photo area with a preset color according to the portrait area mask to obtain a replaced background area;
and fusing the foreground area and the background area to obtain the target image.
3. The image processing method according to claim 2, wherein performing the beautifying operation on the skin area in the ID photo area according to the skin area mask to obtain the beautified foreground area comprises:
extracting the skin area from the ID photo area according to the skin area mask;
smoothing (buffing) the skin area with a bilateral filtering algorithm to obtain a smoothed skin area;
and whitening the smoothed skin area with a three-dimensional lookup table to obtain the beautified foreground area.
4. The image processing method according to claim 1, wherein before extracting the portrait area mask and the skin area mask from the image, the method further comprises:
detecting the number of faces, the face size, the sharpness, the exposure, and the face pose in the face region;
and determining that the face region meets the preset compliance condition when the number of faces, face size, sharpness, exposure, and face pose in the face region all satisfy their respective conditions.
5. The image processing method according to claim 4, wherein detecting the sharpness in the face region comprises:
converting the face region into a grayscale image, and convolving the grayscale image with a preset Laplacian kernel to obtain a response image;
calculating the variance of the response image, and determining that the sharpness in the face region satisfies the condition when the variance is greater than a preset threshold;
detecting the exposure in the face region comprises:
counting the proportion of pixels in the face region smaller than a first pixel threshold relative to the total pixels of the face region to obtain a first proportion value;
counting the proportion of pixels in the face region larger than a second pixel threshold relative to the total pixels of the face region to obtain a second proportion value, wherein the first pixel threshold is smaller than the second pixel threshold;
determining that the exposure in the face region satisfies the condition when the first proportion value is smaller than a first preset value and the second proportion value is smaller than a second preset value, wherein the first preset value is smaller than the second preset value;
and detecting the face pose in the face region comprises:
calculating the angle between the line through the two eyes in the face region and the horizontal, and determining that the face pose satisfies the condition when the angle is smaller than a preset angle.
6. The image processing method according to claim 1, wherein performing face detection on the image to obtain a face region comprises:
performing size normalization on the image to obtain a size-normalized image;
performing face detection on the size-normalized image with a convolutional-neural-network face detector;
and, when a face is detected, extracting a face region of a preset size from the size-normalized image, centered on the face.
7. The image processing method according to any one of claims 1 to 6, wherein extracting the portrait area mask and the skin area mask from the image comprises:
acquiring the mean value and standard deviation of the pixels in the image;
performing pixel normalization on the image according to the mean value and standard deviation to obtain a pixel-normalized image;
and segmenting the pixel-normalized image through an image segmentation network to obtain the portrait area mask and the skin area mask.
8. The image processing method according to claim 7, wherein segmenting the pixel-normalized image through the image segmentation network to obtain the portrait area mask and the skin area mask comprises:
extracting local features from the pixel-normalized image through an initial network module of the image segmentation network to obtain a first feature image;
extracting shared features from the first feature image through a shared feature module of the image segmentation network to obtain a second feature image;
extracting portrait features and skin features from the second feature image through a branch task module of the image segmentation network to obtain a portrait feature image and a skin feature image;
and extracting the portrait area mask and the skin area mask based on the second feature image, the portrait feature image, and the skin feature image through an integration module of the image segmentation network.
9. The image processing method according to any one of claims 1 to 6, wherein extracting the target region at the preset ratio from the image comprises:
acquiring a portrait frame according to the portrait area mask;
acquiring a face detection frame, and calculating the face height from the face detection frame;
determining the region boundary according to the portrait frame, the face detection frame, and the face height;
and extracting the target region at the preset ratio from the image according to the region boundary.
10. An image processing apparatus, comprising:
a first detection module, configured to acquire an image to be processed and perform face detection on the image to obtain a face region;
a mask extraction module, configured to extract a portrait area mask and a skin area mask from the image when the face region meets a preset compliance condition;
a region extraction module, configured to extract a target region at a preset ratio from the image;
and a processing module, configured to preprocess the skin area in the target region according to the skin area mask and perform color adjustment on the non-portrait area in the target region according to the portrait area mask to obtain a target image.
11. A mobile terminal, comprising a processor and a memory, wherein the memory stores a computer program, and the processor, when invoking the computer program in the memory, executes the image processing method according to any one of claims 1 to 9.
12. A storage medium for storing a computer program which is loaded by a processor to execute the image processing method according to any one of claims 1 to 9.
CN201910983237.9A 2019-10-16 2019-10-16 Image processing method, image processing device, mobile terminal and storage medium Pending CN112669197A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910983237.9A CN112669197A (en) 2019-10-16 2019-10-16 Image processing method, image processing device, mobile terminal and storage medium

Publications (1)

Publication Number Publication Date
CN112669197A (en) 2021-04-16

Family

ID=75400329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910983237.9A Pending CN112669197A (en) 2019-10-16 2019-10-16 Image processing method, image processing device, mobile terminal and storage medium

Country Status (1)

Country Link
CN (1) CN112669197A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210295015A1 (en) * 2020-03-23 2021-09-23 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for processing information, device, and medium
CN113192101A (en) * 2021-05-06 2021-07-30 影石创新科技股份有限公司 Image processing method, image processing device, computer equipment and storage medium
CN113192101B (en) * 2021-05-06 2024-03-29 影石创新科技股份有限公司 Image processing method, device, computer equipment and storage medium
CN115701129A (en) * 2021-07-31 2023-02-07 荣耀终端有限公司 Image processing method and electronic equipment
CN113724276A (en) * 2021-08-04 2021-11-30 香港中文大学(深圳) Polyp image segmentation method and device
CN115187819A (en) * 2022-08-23 2022-10-14 北京医准智能科技有限公司 Training method and device for image classification model, electronic equipment and storage medium
CN115601811A (en) * 2022-10-17 2023-01-13 北京京东拓先科技有限公司(Cn) Facial acne detection method and device
CN115482308A (en) * 2022-11-04 2022-12-16 平安银行股份有限公司 Image processing method, computer device, and storage medium

Similar Documents

Publication Publication Date Title
CN112669197A (en) Image processing method, image processing device, mobile terminal and storage medium
US10783353B2 (en) Method for detecting skin region and apparatus for detecting skin region
CN109191410B (en) Face image fusion method and device and storage medium
US10497097B2 (en) Image processing method and device, computer readable storage medium and electronic device
CN107172364B (en) Image exposure compensation method and device and computer readable storage medium
CN108594997B (en) Gesture skeleton construction method, device, equipment and storage medium
CN106558025B (en) Picture processing method and device
CN107767333B (en) Method and equipment for beautifying and photographing and computer storage medium
CN110689500B (en) Face image processing method and device, electronic equipment and storage medium
CN107679482B (en) Unlocking control method and related product
KR20210149848A (en) Skin quality detection method, skin quality classification method, skin quality detection device, electronic device and storage medium
CN108038836B (en) Image processing method and device and mobile terminal
CN108076290B (en) Image processing method and mobile terminal
CN108259758B (en) Image processing method, image processing apparatus, storage medium, and electronic device
CN111325699B (en) Image restoration method and training method of image restoration model
CN108346175B (en) Face image restoration method, device and storage medium
CN113205568A (en) Image processing method, image processing device, electronic equipment and storage medium
WO2019011098A1 (en) Unlocking control method and relevant product
CN110933334B (en) Video noise reduction method, device, terminal and storage medium
CN111723803A (en) Image processing method, device, equipment and storage medium
CN110991457A (en) Two-dimensional code processing method and device, electronic equipment and storage medium
CN110807769B (en) Image display control method and device
CN109639981B (en) Image shooting method and mobile terminal
CN112711971A (en) Terminal message processing method, image recognition method, device, medium, and system thereof
CN110991325A (en) Model training method, image recognition method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination