CN114862729A - Image processing method, image processing device, computer equipment and storage medium - Google Patents

Image processing method, image processing device, computer equipment and storage medium

Info

Publication number
CN114862729A
Authority
CN
China
Prior art keywords
image
target
original image
mask
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110153055.6A
Other languages
Chinese (zh)
Inventor
赵远远
郑青青
刘浩
裴廷宽
李琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110153055.6A priority Critical patent/CN114862729A/en
Publication of CN114862729A publication Critical patent/CN114862729A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to the fields of artificial intelligence and computer vision, and provides an image processing method, apparatus, computer device and storage medium. The method comprises the following steps: smoothing an original image to obtain a smoothed image; determining an initial skin region in the smoothed image; determining pixel points in the original image whose pixel values change significantly, to obtain texture edge information; removing the texture edge region from the initial skin region based on the texture edge information, to obtain a target skin region; and fusing the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image, to obtain a target image. The method can improve image quality.

Description

Image processing method, image processing device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technology and image processing technology, and in particular, to an image processing method, an image processing apparatus, a computer device, and a storage medium.
Background
With the development of artificial intelligence, and in particular of image processing technology in the field of computer vision, techniques for beautifying images have emerged in great variety and are widely applied, for example, to beautify face images while taking selfies, or during video calls or live streaming. One important technical direction is smoothing the skin in an image (often called skin buffing) to make it appear smooth.
In the conventional method, an edge-preserving filter, such as a bilateral filter or a guided filter, is generally used to filter the image and obtain a smoothed, skin-buffed image, so that the whole image is buffed while the contour lines of the image are kept as sharp as possible. However, buffing the image as a whole reduces the clarity of the image and degrades image quality.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide an image processing method, apparatus, computer device, and storage medium capable of improving image quality.
A method of image processing, the method comprising:
smoothing an original image to obtain a smoothed image;
determining an initial skin region in the smoothed image;
determining pixel points in the original image whose pixel values change significantly, to obtain texture edge information;
removing a texture edge region from the initial skin region based on the texture edge information, to obtain a target skin region; and
fusing the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image, to obtain a target image.
In one embodiment, smoothing the original image to obtain a smoothed image includes:
determining a down-sampling multiple according to the resolution of the original image;
down-sampling the original image according to the down-sampling multiple; and
smoothing the down-sampled image to obtain the smoothed image.
In one embodiment, the method further comprises:
determining an edge mask according to the texture edge information.
In this embodiment, removing the texture edge region from the initial skin region based on the texture edge information to obtain the target skin region includes:
fusing an initial mask map used for determining the initial skin region with the edge mask to obtain a target mask map, the target mask map being used for determining the target skin region.
In this embodiment, fusing the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image to obtain the target image includes:
fusing, according to the target mask map, the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image, to obtain the target image.
In one embodiment, the method further comprises:
acquiring a binary mask map for determining the initial skin region of the smoothed image;
acquiring, from a single color channel of the smoothed image, the pixel values within the initial skin region corresponding to the binary mask map; and
generating a non-binary initial mask map for determining the initial skin region from the acquired pixel values, where the pixel values of the points in the initial mask map corresponding to the initial skin region are determined from the acquired pixel values.
In one embodiment, fusing, according to the target mask map, the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image to obtain the target image includes:
superimposing and fusing a first image and a second image to obtain the target image;
where the first image is obtained by multiplying the pixel value of each point in the smoothed image by the pixel value of the corresponding point in the target mask map;
and the second image is obtained by multiplying the pixel value of each point in the original image by the pixel value of the corresponding point in the reverse mask map of the target mask map.
In one embodiment, the method further comprises:
determining the pixel value interval range to which the pixel value of each point in the target mask map belongs; and
weighting the pixel value of the corresponding point in the target mask map by the saliency coefficient corresponding to that pixel value interval range, to obtain a saliency-weighted target mask map.
In one embodiment, determining the edge mask according to the texture edge information comprises:
performing an inverse operation on the texture edge information through an inversion function to obtain the edge mask.
In one embodiment, the texture edge information is a texture edge information map, and determining the pixel points in the original image whose pixel values change significantly to obtain the texture edge information includes:
determining difference information between the original image and the smoothed image; and
obtaining the texture edge information map according to the difference information.
In one embodiment, the method further comprises:
extracting high-frequency information of the original image according to the difference information between the original image and the smoothed image; and
adding the high-frequency information to the target image to obtain a sharpened target image.
In one embodiment, the original image contains a human face, and the method further comprises:
determining the facial feature (five sense organ) regions of the human face in the original image; and
fusing the target image with the original image according to the facial feature regions, to obtain a target image in which the facial feature details are preserved.
In one embodiment, the original image contains a human face, and the method further comprises:
acquiring the face region corresponding to the human face in the original image; and
fusing the target image with the original image according to the face region, to obtain a target image with an optimized face region.
In one embodiment, the method further comprises:
acquiring an input smoothing parameter;
determining, according to the smoothing parameter, the superposition weights corresponding to the target image and the original image respectively;
superimposing and fusing the target image and the original image according to the corresponding superposition weights; and
outputting the superimposed and fused image.
An image processing apparatus, the apparatus comprising:
a smoothing module, configured to smooth an original image to obtain a smoothed image;
an initial region determination module, configured to determine an initial skin region in the smoothed image;
an edge determination module, configured to determine pixel points in the original image whose pixel values change significantly, to obtain texture edge information;
a target region determination module, configured to remove a texture edge region from the initial skin region based on the texture edge information, to obtain a target skin region; and
a fusion module, configured to fuse the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image, to obtain a target image.
A computer device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of the image processing method according to embodiments of the present application.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, causes the processor to perform the steps of an image processing method according to embodiments of the present application.
A computer program product or computer program comprising computer instructions stored in a computer readable storage medium; the processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to enable the computer device to execute the steps in the image processing method according to the embodiments of the application.
According to the image processing method, apparatus, computer device and storage medium, the original image is smoothed to obtain a smoothed image; an initial skin region in the smoothed image is determined; pixel points in the original image whose pixel values change significantly are determined to obtain texture edge information; the texture edge region is removed from the initial skin region based on the texture edge information to obtain a target skin region; and the image content of the smoothed image located within the target skin region is fused into the corresponding position of the target skin region in the original image to obtain the target image. As a result, the content of the fused target image within the target skin region is smoothed, while the content outside the target skin region retains the sharpness of the original image. This avoids the loss of image clarity caused by smoothing the original image as a whole, and thus improves image quality. Moreover, because the fused content comes from a target skin region from which the texture edge region has been removed, the texture edge region of the original image is not smoothed, texture edge details remain sharp, and image quality is further improved.
Drawings
FIG. 1 is a diagram of an application environment of an image processing method in one embodiment;
FIG. 2 is a diagram showing an application environment of an image processing method in another embodiment;
FIG. 3 is a flow diagram illustrating a method for image processing according to one embodiment;
FIG. 4 is a graph of an inversion function in one embodiment;
FIG. 5 is a diagram of a non-binary facial mask in one embodiment;
FIG. 6 is a flowchart illustrating an overall image processing method according to an embodiment;
FIG. 7 is a block diagram showing the configuration of an image processing apparatus according to an embodiment;
FIG. 8 is a block diagram showing the construction of an image processing apparatus according to another embodiment;
FIG. 9 is a diagram showing an internal structure of a computer device in one embodiment;
fig. 10 is an internal structural view of a computer device in another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image processing method provided by the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, a cloud database, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN, and big data and artificial intelligence platforms.
Specifically, the user 106 may capture an original video or an original image through the terminal 102, or input a pre-stored original video or original image. The terminal 102 may send the original video or image to the server 104. The server 104 may execute the steps of the image processing method in the embodiments of the present application to obtain a target image from each frame of original image in the received original video, or from the received original image. The server 104 may then send the obtained target image to the terminal 102, which displays it, and the user 106 may view the displayed target image through the terminal 102.
In other embodiments, the terminal 102 may not send the original video or the original image to the server 104 for image processing, but the terminal 102 itself directly performs image processing and then directly displays the target image obtained by the image processing.
It can be understood that the application environment in the above embodiment may be applied to scenarios such as real-time skin smoothing and beautification while taking a selfie, skin smoothing and beautification of a finished selfie image, and skin smoothing and beautification of an input local video or image.
The image processing method provided by the application can also be applied to the application environment shown in fig. 2. Wherein the first terminal 202 and the second terminal 204 communicate with the server 206 through the network, respectively. The first terminal 202 and the second terminal 204 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 206 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, and a big data and artificial intelligence platform.
Specifically, the first user 208 may capture an original video in real time through the first terminal 202, the first terminal 202 may send the original video or each frame of original image in the original video to the server 206 in real time, the server 206 may execute the steps in the image processing method in embodiments of the present application to obtain a target image according to each frame of original image in the original video, the server 206 may send the obtained target image to the first terminal 202 and the second terminal 204, the first terminal 202 and the second terminal 204 may display the target image in real time, the first user 208 may see the displayed target image through the used first terminal 202, and the second user 210 may see the displayed target image through the used second terminal 204. The number of the second terminals 204 and the number of the second users 210 are at least one, and the second users 210 correspond to the second terminals 204 one to one.
In other embodiments, instead of sending the original video to the server 206 for image processing, the first terminal 202 may directly perform image processing, present the target image, and send the target image to the second terminal 204, so as to implement end-to-end communication between the first terminal 202 and the second terminal 204.
It is understood that the application environment in the above embodiments may be applied to a video call or a live broadcast and other scenes. In a video call scenario, the user 208 is one party in the video call, and the user 210 is the other party in the video call with the user 208. In a live scenario, user 208 is the anchor who is playing the live broadcast, and user 210 is the viewer who is watching the live broadcast.
It can be understood that the image processing method in the embodiments of the present application may adopt a computer vision technology in an artificial intelligence technology, and can effectively improve the image quality of the target image.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Computer Vision (CV) is a science that studies how to make machines "see"; it uses cameras and computers, instead of human eyes, to identify, track and measure targets, and further processes the images so that they become more suitable for human eyes to observe or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, as well as common biometric technologies such as face recognition and fingerprint recognition.
In an embodiment, as shown in fig. 3, an image processing method is provided, where the image processing method may be executed by a server or a terminal, or may be executed by both the terminal and the server, and the embodiment of the present application is described by taking an example where the method is applied to the server in fig. 1 or fig. 2, and includes the following steps:
step 302, performing smoothing processing on the original image to obtain a smoothed image.
The original image is an unprocessed image including a skin region. Smoothing processing (also referred to as blurring processing) is processing for reducing noise or distortion in an image. The smoothed image is an image obtained by smoothing.
In one embodiment, the original image may be a pre-stored existing image or an image frame in a pre-stored existing video. In another embodiment, the original image may be an image captured in real time during a self-timer shooting, a video call, or a live broadcast.
Specifically, the server may perform filtering processing on the original image to obtain a smooth image.
In one embodiment, the filtering process may be spatial filtering or frequency-domain filtering.
In one embodiment, the server may use a sparse template as the convolution kernel to spatially filter the original image; specifically, the server may convolve the original image with the sparse template as the convolution kernel to obtain the smoothed image. A sparse template is a template in which the non-zero pixel values are sparsely distributed, that is, the non-zero values are not adjacent to one another but are separated by zeros. Using a sparse template for spatial filtering reduces the amount of computation, lowers the power consumption of image processing, and improves processing efficiency.
In one embodiment, the server may use an X-shaped sparse template for spatial filtering. The X-shaped sparse template refers to a template having a pixel value distribution in the shape of the letter "X".
In one embodiment, the number of layers of the X-shaped sparse template may be set arbitrarily; for example, a four-layer X-shaped sparse template kernel may take the following form:
[Kernel matrix not recoverable from the source: a four-layer X-shaped sparse template whose entries of 1 lie along the two diagonals, separated by entries of 0.]
As the kernel shows, the entries of 1 in the four-layer X-shaped sparse template are distributed like the letter "X" and are sparsely separated by entries of 0, so it is a sparse template.
In one embodiment, the server may perform downsampling on the original image, and then perform smoothing on the downsampled image to obtain a smoothed image, so as to improve the processing efficiency and improve the upper limit of the smoothing degree. In one embodiment, the smoothing process on the downsampled image can be represented by the following formula:
S_ds = kernel * I_ds;
where I_ds is the image obtained by down-sampling the original image, kernel is the convolution kernel used to spatially filter the image, * denotes convolution, and S_ds is the smoothed image obtained after the smoothing processing.
In other embodiments, the server may also perform smoothing processing by using filtering methods such as gaussian filtering or bilateral filtering.
At step 304, an initial skin region in the smoothed image is determined.
The initial skin area is a skin area containing texture edge information in the smoothed image.
In one embodiment, the server may determine an initial skin region in the smoothed image according to a color distribution of the smoothed image.
In one embodiment, the server may determine, as the initial skin region, a region in the smoothed image, which has a color of a skin color, according to a color distribution of the smoothed image.
In one embodiment, the server may count the distribution of pixel values of the smoothed image in each color channel associated with the skin color, and determine the region of the smoothed image where the color is the skin color.
In one embodiment, the server may determine the region of the smoothed image whose color is the skin color according to the pixel value distribution of the smoothed image in the R channel (i.e., the color channel of red) and the B channel (i.e., the color channel of blue) in the RGB color space.
In one embodiment, the server may determine the area of the smoothed image whose color is skin color according to the pixel values of the points of the smoothed image in the R channel and the magnitude relationship between the pixel values of the points in the R channel and the pixel values in the B channel.
In one embodiment, the server may determine, as the initial skin region, a region composed of points in the smoothed image, where the pixel value in the R channel belongs to a preset pixel value range and the pixel value in the R channel is greater than the pixel value in the B channel.
In one embodiment, the preset pixel value range may be set empirically. For example, the preset pixel value range may be set to [75, 255]; that is, the server may determine, as the skin-colored region, the region made up of the points of the smoothed image whose pixel value in the R channel is greater than or equal to 75 and less than or equal to 255 and whose pixel value in the R channel is greater than the pixel value in the B channel.
In one embodiment, the initial skin region may be determined by the following formula:
S_ds[R] > 75 ∩ S_ds[B] < S_ds[R];
where S_ds is the smoothed image, S_ds[R] is the R channel of the smoothed image, and S_ds[B] is the B channel of the smoothed image.
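A minimal sketch of this skin-colour rule follows, assuming an 8-bit image in OpenCV's BGR channel order (the channel order and the function name are assumptions):

# Sketch of the rule S_ds[R] > 75 AND S_ds[B] < S_ds[R] on an 8-bit BGR image.
import numpy as np

def binary_skin_mask(smooth_bgr: np.ndarray) -> np.ndarray:
    b = smooth_bgr[..., 0].astype(np.int32)   # B channel (OpenCV order assumed)
    r = smooth_bgr[..., 2].astype(np.int32)   # R channel
    mask = (r >= 75) & (r <= 255) & (b < r)   # preset pixel value range [75, 255]
    return mask.astype(np.uint8)              # 1 inside the initial skin region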
In one embodiment, the server may down-sample the original image, smooth the down-sampled image to obtain a smooth image, and then determine the initial skin region from the smooth image.
Step 306, determining pixel points with significantly changed pixel values in the original image to obtain texture edge information.
Texture edge information indicates the positions in an image where the pixel values change significantly. It can be understood that, because the texture edge information covers the pixel points of the original image whose pixel values change significantly, it includes not only the contour information of the original image but also its texture information.
In one embodiment, the server may determine, according to a difference between the original image and the smoothed image, a pixel point in the original image where a pixel value significantly changes, to obtain texture edge information.
In another embodiment, the server may determine a pixel point with a significantly changed pixel value in the original image by using an edge detection operator, so as to obtain texture edge information.
In other embodiments, the server may also determine a pixel point with a significantly changed pixel value in the original image by using other edge detection methods, to obtain texture edge information, without limitation.
And 308, removing the texture edge area in the initial skin area based on the texture edge information to obtain a target skin area.
The target skin area is a skin area that does not include texture edge information. The texture edge region is a region including only a position corresponding to the texture edge information.
Specifically, the server may locate a texture edge area corresponding to the texture edge information in the initial skin area according to the texture edge information, and then remove the texture edge area in the initial skin area to obtain the target skin area.
In one embodiment, the server may determine an edge mask according to the texture edge information, and the server may mask the texture edge area in the initial skin area according to the edge mask to remove the texture edge area in the initial skin area to obtain the target skin area. The edge mask is used for covering texture edge information in the original image.
In one embodiment, after the initial skin region is determined, the initial skin region may be represented by a mask map, and then a target skin region that does not include texture edge information is determined from the mask map and edge mask representing the initial skin region.
In one embodiment, the mask map used to represent the initial skin region may be binary or non-binary.
In one embodiment, the server may multiply and fuse the mask map representing the initial skin region with the edge mask to obtain a target mask map for determining a target skin region that does not include texture edge information.
And step 310, fusing the image content in the target skin area in the smooth image into the corresponding position of the target skin area in the original image to obtain the target image.
In an embodiment, if the smoothed image is obtained by directly smoothing the original image, the server may directly fuse the image content in the smoothed image, which is located in the target skin region, into the corresponding position of the target skin region in the original image, so as to obtain the target image.
In another embodiment, if the smooth image is obtained by smoothing the downsampled image, the server may upsample the smooth image to obtain an upsampled smooth image with a resolution consistent with that of the original image, and then fuse image content in the upsampled smooth image, which is located in the target skin region, into a corresponding position of the target skin region in the original image to obtain the target image.
In one embodiment, the fusion may be any one of normal fusion, color filter fusion, and highlight fusion.
In one embodiment, the server may locate the target skin region according to a target mask map used to determine the target skin region that does not include texture edge information, and blend image content in the smooth image that is located in the target skin region into a corresponding position of the target skin region in the original image to obtain the target image.
In one embodiment, the server may fuse the smoothed image with the original image as follows: multiply and fuse the smoothed image with a target mask map used for determining the target skin region that does not include texture edge information, to obtain a first image; multiply and fuse the original image with the reverse mask map of the target mask map, to obtain a second image; and then superimpose and fuse the first image and the second image. It can be understood that the fusion processing in this embodiment is normal fusion processing.
In one embodiment, normal fusion may be represented by the following formula:
SR = Blend(I_in, S↑, M) = S↑ · M + I_in · (1 - M);
where SR is the fused target image, Blend() is the fusion function, I_in is the original image, S↑ is the up-sampled smoothed image, and M is the target mask map used for determining the target skin region.
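A minimal sketch of this normal fusion, assuming float images in [0, 1] and a single-channel mask M (the names are illustrative):

# Sketch of SR = S_up * M + I_in * (1 - M), with float arrays in [0, 1].
import numpy as np

def normal_blend(i_in: np.ndarray, s_up: np.ndarray, m: np.ndarray) -> np.ndarray:
    m3 = m[..., None] if m.ndim == 2 else m   # broadcast mask over channels
    return s_up * m3 + i_in * (1.0 - m3)      # smoothed inside mask, original outside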
In one embodiment, the color filtering fusion may be to perform color filtering processing on the smooth image, and then perform normal fusion on the color-filtered smooth image and the original image according to the target skin region. The color filtering process can be expressed by the following formula:
SR' = 1 - (1 - I_in)(1 - SR);
where SR' is the color-filtered smoothed image, I_in is the original image, and SR is the smoothed image.
In one embodiment, the highlight fusion may be to perform highlight processing on the smooth image, and then perform normal fusion on the highlight-processed smooth image and the original image according to the target skin area. The highlight treatment can be expressed by the following formula:
SR' = 2 · I_in · SR, for I_in ≤ 0.5;
SR' = 1 - 2(1 - I_in)(1 - SR), for I_in > 0.5;
where SR' is the highlight-processed smoothed image, I_in is the original image, and SR is the smoothed image.
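Minimal sketches of the colour-filter and highlight operations above, assuming float images in [0, 1]:

# Sketches of the colour-filter and highlight formulas, images as floats in [0, 1].
import numpy as np

def color_filter(i_in: np.ndarray, sr: np.ndarray) -> np.ndarray:
    return 1.0 - (1.0 - i_in) * (1.0 - sr)                 # SR' = 1 - (1-I_in)(1-SR)

def highlight(i_in: np.ndarray, sr: np.ndarray) -> np.ndarray:
    return np.where(i_in <= 0.5,
                    2.0 * i_in * sr,                        # darker pixels
                    1.0 - 2.0 * (1.0 - i_in) * (1.0 - sr))  # brighter pixels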
It can be understood that, because the image content of the smoothed image located within the target skin region is fused into the corresponding position of the target skin region in the original image, the content of the fused target image within the target skin region is determined mainly by the corresponding content of the smoothed image, so the target skin region of the target image is smoothed; the content of the target image outside the target skin region is determined mainly by the corresponding content of the original image, so the region outside the target skin region retains the sharpness of the original image.
In the image processing method, the original image is smoothed to obtain a smoothed image; an initial skin region in the smoothed image is determined; pixel points in the original image whose pixel values change significantly are determined to obtain texture edge information; the texture edge region is then removed from the initial skin region based on the texture edge information to obtain a target skin region; and the image content of the smoothed image located within the target skin region is fused into the corresponding position of the target skin region in the original image to obtain the target image. In this way, the content of the fused target image within the target skin region is smoothed while the content outside the target skin region retains the sharpness of the original image, which avoids the loss of image clarity caused by smoothing the original image as a whole and improves image quality. Furthermore, because the fused content comes from a target skin region from which the texture edge region has been removed, the texture edge region of the original image is not smoothed, texture edge details remain sharp, and image quality is further improved.
In one embodiment, smoothing the original image to obtain a smoothed image comprises: determining a down-sampling multiple according to the resolution of the original image; down-sampling the original image according to the down-sampling multiple; and smoothing the down-sampled image to obtain the smoothed image.
In one embodiment, the downsampling of the original image according to the downsampling multiple can be represented by the following formula:
I_ds = Downsampling(I_in, r);
where I_ds is the down-sampled image, Downsampling() is the down-sampling operation, I_in is the original image, and r is the down-sampling multiple.
In one embodiment, the downsampling multiple may be positively correlated with the resolution of the original image. The down-sampling multiples corresponding to the long side and the short side of the original image may be consistent or inconsistent.
In one embodiment, the downsampling multiple may be linear with the resolution of the original image. In one embodiment, the downsampling factor may be linear with the short edge resolution of the original image. The downsampling multiple in this embodiment can be expressed by the following formula:
r = min(H, W) / L;
where r is the down-sampling multiple, H and W are respectively the height and width of the original image, min(H, W) denotes the short-edge resolution of the original image, and L is a constant. For example, L may be set to 180, or to another value, without limitation.
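A minimal sketch of resolution-adaptive down-sampling with r = min(H, W) / L; L = 180 follows the example above, and the interpolation choice is an assumption:

# Sketch of r = min(H, W) / L followed by down-sampling (L = 180 as in the text).
import cv2

def adaptive_downsample(img, L: int = 180):
    H, W = img.shape[:2]
    r = max(min(H, W) / L, 1.0)           # down-sampling multiple, at least 1
    return cv2.resize(img, (int(W / r), int(H / r)),
                      interpolation=cv2.INTER_AREA)   # interpolation assumed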
In another embodiment, the down-sampling multiple may be determined according to the resolution interval range to which the resolution of the original image belongs: the higher the resolution interval range, the larger the down-sampling multiple. For example, the down-sampling multiple is 3 when the resolution is between 720P and 1080P, and 6 when the resolution is between 1080P and 4K.
In other embodiments, the downsampling multiple may be exponential or power with the resolution of the original image, and the like, which is not limited.
In the above embodiment, the original image is first down-sampled and then smoothed. On one hand, this reduces the amount of computation and improves efficiency; on the other hand, it yields a larger receptive field and a greater degree of smoothing during the smoothing processing, raising the upper limit of the smoothing degree. In addition, dynamically adjusting the down-sampling multiple according to the resolution of the original image improves the applicability of the scheme, achieving a good smoothing effect for original images of different resolutions.
In one embodiment, the method further comprises determining an edge mask according to the texture edge information. In this embodiment, removing the texture edge region from the initial skin region based on the texture edge information to obtain the target skin region includes: fusing an initial mask map used for determining the initial skin region with the edge mask to obtain a target mask map, the target mask map being used for determining the target skin region. In this embodiment, fusing the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image to obtain the target image includes: fusing, according to the target mask map, the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image, to obtain the target image.
Wherein, the initial mask map is used for determining the initial skin area. In one embodiment, the server may determine an area outside the texture edge area based on the texture edge information and generate an edge mask based on the area outside the texture edge area.
In one embodiment, the edge mask may be binary. Such as: the pixel value of the point at the edge position is 0, and the pixel value of the point other than the edge is 1.
In another embodiment, the edge mask may be non-binary. Such as: the closer to the edge position, the closer to 0 the pixel value is, and the farther from the edge, the closer to 1 the pixel value is.
In one embodiment, the server may determine an initial mask map for determining the initial skin region according to the smoothed image, and then multiply and fuse the initial mask map and the edge mask to obtain a target mask map.
In one embodiment, the initial mask map may be a binary initial mask map. A binary initial mask map refers to an initial mask map consisting of only two pixel values. Such as: the binary initial mask map may be an initial mask map composed of points having a pixel value of 0 and points having a pixel value of 1. Such as: in the binary initial mask map, the pixel values of points inside the initial skin region may be 1, and the pixel values of points outside the initial skin region may be 0. The reverse is also possible, i.e. 0 in the initial skin area and 1 outside.
In another embodiment, the initial mask map may be a non-binary initial mask map, that is, an initial mask map containing more than two pixel values. Using a non-binary initial mask map to determine the initial skin region prevents abrupt changes of pixel values in the mask map from producing hard edges in the target image finally obtained by fusion, improving the natural appearance of the target image and thus its image quality.
In one embodiment, the server may first determine a binary initial mask map for determining the initial skin region from the smoothed image, and then generate a non-binary initial mask map from the binary initial mask map and the smoothed image.
In one embodiment, the server may perform a blurring process on the binary initial mask map to obtain a non-binary initial mask map. In another embodiment, the server may perform the dilation process on the binary initial mask map to obtain a non-binary initial mask map. In another embodiment, the server may derive the non-binary initial mask map by a hardware auto-interpolation process. In other embodiments, the server may also obtain pixel values within the initial skin region in the smoothed image from the binary initial mask map, and then generate a non-binary initial mask map from the obtained pixel values.
In one embodiment, the server may multiply the pixel values of each point in the initial mask map and the edge mask respectively to obtain the target mask map. In one embodiment, if the initial mask map is obtained from a smoothed image obtained by smoothing the downsampled original image, the initial mask map is upsampled to obtain an upsampled initial mask map with a resolution consistent with that of the original image, and then the upsampled initial mask map is multiplied by the edge mask for fusion.
In one embodiment, the server may multiply and fuse the smoothed image and the target mask map to obtain a first image, multiply and fuse the original image and a reverse mask map of the target mask map to obtain a second image, and perform overlay fusion on the first image and the second image to fuse image content in the smoothed image, which is located in the target skin region, into a corresponding position of the target skin region in the original image, so as to obtain the target image.
In the above embodiment, the target mask image is obtained by fusing the initial mask image and the edge mask, and then the smooth image and the original image are fused according to the target mask image to obtain the target image.
In one embodiment, the method further comprises: acquiring a binary mask map for determining an initial skin region of the smoothed image; acquiring pixel values in an initial skin area corresponding to the binary mask image from a single-color channel of the smooth image; generating a non-binary initial mask map for determining an initial skin region according to the acquired pixel values; the pixel values of the points in the initial mask map corresponding to the initial skin area are determined from the acquired pixel values.
The single color channel refers to a certain color channel in the image. Such as: for an RGB image, the R, G, and B channels are each a single color channel, i.e., the single color channel may be any one of the R, G, and B channels, etc.
Specifically, the server may first determine a binary initial mask map (i.e., a binary mask map for determining an initial skin region of the smoothed image) from the smoothed image, then obtain pixel values located within the initial skin region corresponding to the binary mask map from a single color channel of the smoothed image, and then generate a non-binary initial mask map for determining the initial skin region according to the obtained pixel values.
In one embodiment, the server may take a single color channel of the smoothed image and multiply and fuse the binary mask map with that channel, obtaining a version of the channel in which only the pixel values within the initial skin region are retained: the pixel values within the initial skin region remain unchanged while the pixel values outside it become 0. In this way, the server obtains the pixel values within the initial skin region corresponding to the binary mask map.
In one embodiment, the server may amplify the pixel values within the acquired initial skin region, generating a non-binary initial mask map. That is, the pixel values in the initial skin region in the generated initial mask map are amplified values of the acquired pixel values, and the pixel values outside the initial skin region are 0.
In one embodiment, the server may exponentially amplify the pixel values within the acquired initial skin region. Such as: amplifying to 5 th power.
In another embodiment, the server may multiply up the pixel values within the acquired initial skin region. Such as: the amplification is 2 times of the original amplification.
In other embodiments, the server may also perform both exponential and multiple amplification of the pixel values within the acquired initial skin region. Such as: firstly amplifying to 5 th power, and then amplifying by 2 times.
In one embodiment, the server may limit the scaled-up pixel values to between 0 and 255 to satisfy the pixel value range of the image.
In one embodiment, the processing steps of obtaining pixel values located within the initial skin region corresponding to the binary mask map from the single color channel of the smoothed image, and then generating the non-binary initial mask map for determining the initial skin region based on the obtained pixel values may be represented by the following formula:
M_coarse = clamp((S_ds[R] · M_color)^β · 2, 0, 255);
where M_coarse is the non-binary initial mask map; clamp is a clipping function that, in this embodiment, limits (S_ds[R] · M_color)^β · 2 to between 0 and 255, setting values greater than 255 to 255 and values less than 0 to 0; S_ds[R] denotes the R channel of the smoothed image S_ds; M_color is the binary mask map; and β is an exponential enhancement coefficient, a constant.
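A minimal sketch of this mask construction. The text does not state the value range in which the exponent is applied; the sketch assumes the product is normalized to [0, 1] before raising it to the power β (β = 5 echoes the "amplify to the 5th power" example) and rescaled afterwards.

# Sketch of M_coarse = clamp((S_ds[R] * M_color)^beta * 2, 0, 255).
# Normalization to [0, 1] before the exponent is an assumption.
import numpy as np

def non_binary_mask(smooth_r: np.ndarray, m_color: np.ndarray, beta: float = 5.0):
    x = (smooth_r.astype(np.float64) / 255.0) * m_color  # zero outside skin region
    m = np.clip((x ** beta) * 2.0 * 255.0, 0.0, 255.0)   # amplify, then clamp
    return m.astype(np.uint8)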
In the above embodiment, the pixel values within the initial skin region corresponding to the binary mask map are acquired from a single color channel of the smoothed image and used to generate a non-binary initial mask map; subsequent processing can then be performed according to this non-binary initial mask map, which prevents abrupt changes of pixel values in the mask map from producing hard edges in the final target image, improving the natural appearance of the target image and thus its image quality.
In one embodiment, fusing, according to the target mask map, the image content of the smoothed image located within the target skin region into the corresponding position of the target skin region in the original image to obtain the target image includes: superimposing and fusing a first image and a second image to obtain the target image; where the first image is obtained by multiplying the pixel value of each point in the smoothed image by the pixel value of the corresponding point in the target mask map, and the second image is obtained by multiplying the pixel value of each point in the original image by the pixel value of the corresponding point in the reverse mask map of the target mask map.
The reverse mask map is a mask map in which the pixel value of each point is the complement of the pixel value of the corresponding point in the target mask map. For example, if a point in the target mask map has a pixel value of 50, the corresponding point in the reverse mask map has a pixel value of 205 (i.e., 255 - 50).
Specifically, the server may multiply the pixel values corresponding to each point in the smoothed image by the pixel values corresponding to each point in the target mask map to obtain a first image, multiply the pixel values corresponding to each point in the original image by the pixel values corresponding to each point in the reverse mask map of the target mask map to obtain a second image, and superimpose and fuse the first image and the second image to obtain the target image.
In an embodiment, the processing step of superimposing and fusing the first image and the second image to obtain the target image in the above embodiment may be represented by the following formula:
SR = Blend(I_in, S↑, M) = S↑ · M + I_in · (1 - M);
where SR is the target image, Blend() is the fusion function, I_in is the original image, S↑ is the up-sampled smoothed image, and M is the target mask map.
In the above embodiment, the smoothed image and the original image are fused according to the target mask map to obtain the target image, so that the content of the fused target image within the target skin region is smoothed while the content outside the target skin region retains the sharpness of the original image. This avoids the loss of image clarity caused by smoothing the original image as a whole and improves image quality. Because the smoothed image and the original image are fused according to a target skin region that contains no texture edge information, the edge positions of the original image are also kept from being smoothed, keeping edge details sharp and further improving image quality.
In one embodiment, the method further comprises: determining the pixel value interval range to which the pixel value of each point in the target mask map belongs; and weighting the pixel value of the corresponding point in the target mask map by the saliency coefficient corresponding to that pixel value interval range, to obtain a saliency-weighted target mask map.
The saliency coefficient is used to apply saliency weighting to the pixel values in the target mask map.
Specifically, different saliency coefficients may be set in advance for different pixel value interval ranges. The server may determine the pixel value interval range to which the pixel value of each point in the target mask map belongs, and then weight the pixel value of the corresponding point by the saliency coefficient preset for that range; after every point in the target mask map has been weighted, the saliency-weighted target mask map is obtained.
In one embodiment, the larger the pixel values in the range of pixel value intervals, the larger the corresponding saliency coefficient.
In one embodiment, the saliency coefficient may take a value around 1. For example, the saliency coefficient for a pixel value interval range that needs to be darkened may be greater than 0 and less than 1, with smaller values giving stronger darkening; the saliency coefficient for a range that needs to be brightened may be greater than 1, with larger values giving stronger brightening.
In one embodiment, the saliency weighted pixel values may be limited to a [0,255] range.
In one embodiment, the step of saliency weighting the pixel values of each point in the target mask map to obtain a saliency-weighted target mask map may be implemented by a piecewise function as follows:
M = R_1 · M_skin, for T_1 ≤ M_skin < T_2;
M = R_2 · M_skin, for T_2 ≤ M_skin < T_3;
M = R_3 · M_skin, for T_3 ≤ M_skin < T_4;
where T_1, T_2, T_3 and T_4 are the endpoints of the pixel value interval ranges, with T_1 < T_2 < T_3 < T_4; R_1 is the saliency coefficient for the range T_1 ≤ M_skin < T_2, R_2 the coefficient for T_2 ≤ M_skin < T_3, and R_3 the coefficient for T_3 ≤ M_skin < T_4; M_skin is a pixel value in the target mask map; and M is the saliency-weighted target mask map.
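A minimal sketch of this piecewise weighting. The endpoints T_i and coefficients R_i are not disclosed in the text; the defaults below are placeholders chosen only to satisfy T_1 < T_2 < T_3 < T_4 and the darken/brighten convention described above.

# Sketch of piecewise saliency weighting; endpoints and coefficients are placeholders.
import numpy as np

def saliency_weight(m_skin: np.ndarray,
                    T=(0.0, 64.0, 128.0, 256.0),   # assumed interval endpoints
                    R=(0.8, 1.0, 1.2)):            # assumed saliency coefficients
    m = m_skin.astype(np.float64)
    out = m.copy()
    for i in range(3):
        sel = (m >= T[i]) & (m < T[i + 1])
        out[sel] = m[sel] * R[i]                   # weight each interval range
    return np.clip(out, 0.0, 255.0)                # keep values in [0, 255]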
In the above embodiment, saliency weighting the pixel values in different pixel value interval ranges of the target mask map makes the smoothing effect of the final target image more finely graded, giving different degrees of brightness to different pixel value interval ranges and improving image quality.
In one embodiment, determining an edge mask from the texture edge information comprises: and performing inverse operation on the texture edge information through an inverse function to obtain an edge mask.
And the inversion function is used for carrying out inverse operation on the texture edge information. And the inverse operation is used for acquiring the region except the texture edge region corresponding to the texture edge information.
Specifically, the server may perform inverse operation on the texture edge information through an inverse function to obtain an area outside the texture edge area corresponding to the texture edge information, and generate an edge mask according to the area outside the texture edge area.
In one embodiment, the server may obtain texture edge information in the original image according to difference information between the original image and the smoothed image.
In one embodiment, the texture edge information may be a texture edge information map. The server can perform inverse operation on the texture edge information graph through an inversion function to obtain the edge mask.
In one embodiment, the texture edge information map is inversely computed to obtain an edge mask, which can be expressed by the following formula:
M_edge = F(|E|);
where M_edge is the edge mask, F is the inversion function, and E is the texture edge information map.
In one embodiment, if the texture edge information is a texture edge information map and the edge mask is binary, the inversion function may be a function that sets the pixel value of the edge location in the texture edge information map to 0 and sets the pixel value of the region outside the edge to 1.
In one embodiment, if the texture edge information is a texture edge information map and the edge mask is non-binary, the inversion function may be a function that inverts the pixel values in the texture edge information map; that is, for pixel values ranging from 0 to 1, larger values are mapped to smaller values and smaller values to larger values, so that values close to 1 become close to 0 and values close to 0 become close to 1. For example, the three functions plotted in FIG. 4, y(x) = 1 - x^0.2, y(x) = 1 - x^0.5, and a third inversion function (whose formula is not recoverable from the source), can all serve as inversion functions; as their curves show, each maps values close to 1 to values close to 0 and values close to 0 to values close to 1.
In one embodiment, the inversion function may be in the form of the following equation:
F(E) = 1.0 − E^γ
where F(E) is the inversion function, E is the argument of the inversion function, and γ is a constant not exceeding 1.0. For example, the first two inversion functions shown in fig. 4 are of this form, with γ = 0.2 and γ = 0.5 respectively.
In the above embodiment, the texture edge information is inverted through the inversion function to obtain the edge mask, so that the target skin region excludes the edge region; smoothing of the edge region is thereby avoided, and the image quality of the target image is improved. In particular, when a non-binary edge mask is used, hard edges in the target image are avoided, which improves the natural appearance of the target image and further improves the image quality.
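For concreteness, a non-binary inversion of the form F(E) = 1.0 − E^γ can be sketched as follows, assuming the texture edge information map is normalized to [0, 1]; γ = 0.5 is one of the example exponents above.

import numpy as np

def edge_mask_from_edges(edge_map, gamma=0.5):
    """Invert a texture edge information map into a non-binary edge mask."""
    e = np.clip(np.abs(edge_map), 0.0, 1.0)  # |E|, normalized to [0, 1]
    return 1.0 - e ** gamma  # strong edges -> near 0, flat skin -> near 1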
In one embodiment, the texture edge information is a texture edge information map. In this embodiment, determining the pixel points with significantly changed pixel values in the original image to obtain the texture edge information includes: determining difference information between the original image and the smoothed image; and obtaining the texture edge information map according to the difference information.
In one embodiment, the server may subtract the original image from the smoothed image to determine difference information between the original image and the smoothed image.
In one embodiment, the server may average difference information of each color channel obtained by subtracting the smooth image from the original image to obtain a texture edge information map. It will be appreciated that the resulting texture edge information map is a single-channel grayscale map.
In one embodiment, if the smooth image is obtained by smoothing the downsampled original image, the smooth image is upsampled to obtain an upsampled smooth image with a resolution consistent with that of the original image, and then the texture edge information map is obtained according to the difference information between the original image and the upsampled smooth image.
In one embodiment, the texture edge information map may be obtained by the following formula:
E = [(I_in − S↑)_R + (I_in − S↑)_G + (I_in − S↑)_B] / 3
where E is the texture edge information map, I_in is the original image, S↑ is the upsampled smoothed image, and R, G and B denote the R, G and B channels respectively. That is, the per-channel difference information between the original image and the upsampled smoothed image is averaged over the R, G and B channels to obtain the texture edge information map.
In the above embodiment, the texture edge information map is obtained according to the difference information between the original image and the smoothed image, and the texture edge information in the image can be accurately determined.
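A minimal sketch of this computation, assuming the original image and the upsampled smoothed image are float RGB arrays of shape (H, W, 3):

import numpy as np

def texture_edge_map(original, smoothed_up):
    """Average the per-channel difference into a single-channel edge map E."""
    diff = original.astype(np.float32) - smoothed_up.astype(np.float32)
    return diff.mean(axis=2)  # take |E| downstream, e.g. in M_edge = F(|E|)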
In one embodiment, the method further comprises: extracting high-frequency information in the original image according to difference information between the original image and the smooth image; and adding the high-frequency information into the target image to obtain the sharpened target image.
Specifically, the server may subtract the original image from the smoothed image to obtain difference information, determine high-frequency information in the original image according to the difference information, and add the high-frequency information to the target image to obtain the sharpened target image.
In one embodiment, the server may extract the high-frequency information in the original image using a high contrast preserving operator. In this embodiment, extracting the high-frequency information with the high contrast preserving operator is more stable than using traditional texture edge extraction operators, and reduces the noise introduced during the sharpening process.
In other embodiments, the server may extract the high frequency information in the original image using an operator such as laplacian for extracting texture edge information. The extraction method is not limited.
In one embodiment, the extraction of high frequency information in the original image using the high contrast preserving operator can be represented by the following formula:
hPass = 2.0 · step(I_in − S↑ + 127, 127) − 255
where hPass denotes the high-frequency information; step(x, e) is a truncation function with step(x, e) = x when x > e and step(x, e) = 0 when x < e; I_in is the original image; and S↑ is the upsampled smoothed image.
In one embodiment, the server may superimpose the extracted high-frequency information with the target image to obtain a sharpened target image.
In one embodiment, the server may superimpose the extracted high frequency information on the target image and limit the pixel values of the superimposed points to the range of [0,255], resulting in a sharpened target image. In one embodiment, the server may set the pixel values greater than 255 after the superimposition to 255 and set the pixel values less than 0 after the superimposition to 0, and the pixel values in the range of [0,255] after the superimposition remain unchanged.
In one embodiment, the server may add high frequency information to the target image by the following formula, and limit the pixel values of the points after superposition to the range of [0,255], resulting in a sharpened target image:
I_enhance = max(min(SR + hPass, 255), 0)
where I_enhance is the sharpened target image, SR is the target image, and hPass denotes the high-frequency information. min(SR + hPass, 255) leaves pixel values of SR + hPass that are less than or equal to 255 unchanged and sets pixel values greater than 255 to 255; max(min(SR + hPass, 255), 0) then leaves pixel values greater than or equal to 0 unchanged and sets pixel values less than 0 to 0.
In the above embodiment, the high-frequency information in the original image is extracted according to the difference information between the original image and the smoothed image and added to the target image to obtain the sharpened target image. This prevents, to a certain extent, the image content of the environment region in the target image from being degraded, further improving the naturalness and realism of the target image and thus the image quality.
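A sketch of this sharpening step, implementing the hPass formula literally with pixel values in [0, 255]; treating the x = e boundary of the truncation function as pass-through is an assumption, since the text leaves that case unspecified.

import numpy as np

def sharpen(original, smoothed_up, target):
    """Add high-contrast-preserved high-frequency detail to the target image."""
    def step(x, e):
        # Truncation function from the text: x when x > e, 0 when x < e.
        # The x == e case is assumed to pass through.
        return np.where(x >= e, x, 0.0)

    diff = original.astype(np.float32) - smoothed_up.astype(np.float32)
    h_pass = 2.0 * step(diff + 127.0, 127.0) - 255.0
    enhanced = target.astype(np.float32) + h_pass
    return np.clip(enhanced, 0.0, 255.0).astype(np.uint8)  # I_enhance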
In one embodiment, the original image contains a human face. In this embodiment, the method further includes: determining five sense organ regions of the face in the original image; and according to the region of the five sense organs, fusing the target image and the original image to obtain a target image with details of the five sense organs.
In one embodiment, the five sense organ regions may include regions where the five sense organs, such as the eyebrows, eyes, nose, and mouth, are located.
In one embodiment, the server may identify key points of the five sense organs in the original image and then determine the five sense organs region of the face in the original image according to the key points of the five sense organs.
In one embodiment, the server may divide the grid according to the identified key points of the five sense organs, and then perform interpolation processing on the grid to obtain the region of the five sense organs. The interpolation method used for the interpolation process is not limited.
In one embodiment, the mesh of partitions may be triangular patches.
In one embodiment, the server may perform the interpolation processing by having a GPU (Graphics Processing Unit) automatically execute a hardware rasterization interpolation algorithm. In this embodiment, having the GPU execute the interpolation algorithm automatically improves processing efficiency.
In one embodiment, the server may generate a facial features mask according to the five sense organ regions, and then fuse the target image with the original image according to the facial features mask to obtain the target image with the details of the five sense organs preserved. The facial features mask is used to separate, during fusion, the five sense organ regions determined above from the regions outside them.
In one embodiment, the server may fuse the target image with the original image according to the facial mask by the following formula:
Res = Blend(I_in, SR, M_contour)
where Res is the fused target image with the details of the five sense organs preserved, I_in is the original image, SR is the target image, and M_contour is the facial features mask.
In one embodiment, the facial features mask may be binary, i.e. consisting of the pixel values 0 and 1. For example, the pixel value of the five sense organ regions in the mask is 0 and that of the regions outside the five sense organs is 1, or vice versa.
In another embodiment, the facial features mask may be non-binary, i.e. a gray-scale map consisting of more than 2 pixel values. For example, the pixel values at the boundary between the five sense organ regions and the regions outside them transition gradually from 0 to 1, as in the non-binary facial features mask shown in fig. 5.
In one embodiment, the server may generate a binary facial feature mask from the facial feature regions and then generate a non-binary facial feature mask from the binary facial feature mask.
In one embodiment, the server may blur the binary facial features mask to obtain a non-binary facial features mask. In another embodiment, the server may perform an extension process on the facial features region in the binary facial features mask to obtain a non-binary facial features mask. In other embodiments, the server may derive the non-binary facial mask through a hardware auto-interpolation process. The method of generating the non-binary facial mask is not limited.
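As one illustration of the blurring option above, a binary mask can be softened with a Gaussian blur; the kernel size here is an illustrative assumption.

import cv2
import numpy as np

def soften_mask(binary_mask, ksize=31):
    """Turn a 0/1 mask into a non-binary mask with soft boundary transitions."""
    m = binary_mask.astype(np.float32)
    return cv2.GaussianBlur(m, (ksize, ksize), 0)  # gradual 0..1 at boundaries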
In one embodiment, the server may multiply the target image by the facial features mask, multiply the original image by the reverse mask of the facial features mask, and then superimpose and fuse the two products to obtain the target image with the details of the five sense organs preserved.
It can be understood that, in the target image with the details of the five sense organs preserved, the image content inside the five sense organ regions is determined mainly by the image content of the original image in those regions, while the image content outside the five sense organ regions is determined mainly by the image content of the target image outside those regions.
In one embodiment, the server may fuse the sharpened target image with the original image according to the region of the five sense organs.
In the above embodiment, the target image is fused with the original image according to the five sense organ regions to obtain the target image with the details of the five sense organs preserved; a fusion of this form is sketched below. This prevents the details of the five sense organs from being lost during smoothing, improves the natural appearance of the target image, and improves the image quality.
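A minimal sketch of the mask-based fusion Res = Blend(I_in, SR, M_contour), under the assumption that the mask is 0 inside the five sense organ regions and 1 elsewhere, so that the feature details come from the original image:

import numpy as np

def blend_with_mask(original, target, mask):
    """Fuse two images with a (H, W) mask broadcast over the color channels."""
    m = mask[..., None]                       # (H, W) -> (H, W, 1)
    return target * m + original * (1.0 - m)  # mask=1 -> target, mask=0 -> original

The face-region fusion described below follows the same pattern with the roles of the two images exchanged, so the same helper can be reused there.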
In one embodiment, the original image contains a human face. In this embodiment, the method further includes: acquiring a face area corresponding to a face in an original image; and according to the face region, fusing the target image and the original image to obtain a target image with the optimized face region.
The face region is a region corresponding to a face in an original image.
In one embodiment, the server may identify face key points in the original image and then determine a face region in the original image based on the face key points.
In an embodiment, the server may divide a mesh according to the identified face key points, and then perform interpolation processing on the mesh to obtain a face region. The interpolation method used for the interpolation process is not limited.
In one embodiment, the mesh of partitions may be triangular patches.
In one embodiment, the server may perform the interpolation processing by having a GPU (Graphics Processing Unit) automatically execute a hardware rasterization interpolation algorithm. In this embodiment, having the GPU execute the interpolation algorithm automatically improves processing efficiency.
In one embodiment, the server may generate a face mask according to the face region, and then fuse the target image with the original image according to the face mask to obtain the target image with the face region optimized. The face mask is used to separate, during fusion, the face region determined above from the regions outside it.
In one embodiment, the face mask may be binary, i.e., consisting of pixel values 0 and 1. For example, the pixel value of the face region in the face mask is 0, and the pixel value of the non-face region is 1, or vice versa, the face region is 1, and the non-face region is 0.
In another embodiment, the face mask may be non-binary, i.e. a grey-scale map consisting of more than 2 pixel values. For example, the pixel value at the boundary between the face region and the non-face region in the face mask gradually transitions from 0 to 1.
In one embodiment, the server may generate a binary face mask according to the face region, and then generate a non-binary face mask according to the binary face mask.
In one embodiment, the server may perform a blurring process on the binary face mask to obtain a non-binary face mask. In another embodiment, the server may perform an extension process on the face region in the binary face mask to obtain a non-binary face mask. In other embodiments, the server may obtain the non-binary face mask through hardware automatic interpolation processing. The method of generating the non-binary face mask is not limited.
In one embodiment, the server may multiply the target image by the reverse mask of the face mask, multiply the original image by the face mask, and then superimpose and fuse the two products to obtain the target image with the face region optimized.
It can be understood that, in the target image after face region optimization, the image content inside the face region is determined mainly by the image content of the target image in the face region, while the image content outside the face region is determined mainly by the image content of the original image outside the face region.
In one embodiment, the server may fuse the target image in which the details of the five sense organs are preserved with the original image according to the face region.
In the above embodiment, the target image is fused with the original image according to the face region to obtain the target image with the face region optimized. This avoids blurring the background when a background close to skin color in the original image is misidentified as skin and smoothed: in the resulting target image, only the face region is smoothed, while the background region outside the face is not, which improves the quality of the target image.
In one embodiment, the method further comprises: acquiring an input smoothing parameter; determining the superposition weight corresponding to the target image and the original image respectively according to the smoothing parameter; superposing and fusing the target image and the original image according to the corresponding superposition weight; and outputting the superposed and fused image.
The smoothing parameter is a parameter for adjusting the degree of smoothing.
In one embodiment, the terminal may present an adjustment entry for the smoothing parameter in the interface, and the user may adjust the smoothing parameter through the adjustment entry.
In one embodiment, the user may select the smoothing parameter through the adjustment entry, input it through the adjustment entry, or set it in other manners; the input manner is not limited.
In other embodiments, if the user does not adjust the smoothing parameters, preset default smoothing parameters may be used.
In one embodiment, the terminal may receive the smoothing parameter input by the user through the adjustment entry and send it to the server. The server may determine the corresponding overlay weights for the target image and the original image according to the smoothing parameter, superimpose and fuse the target image and the original image according to those weights, and send the superimposed and fused image to the terminal, which may then display the fused image.
In another embodiment, the terminal may receive the smoothing parameter input by the user through the adjustment entry, determine the corresponding overlay weights for the target image and the original image according to the smoothing parameter, superimpose and fuse the two images according to those weights, and display the fused image.
In one embodiment, the smoothing parameter may take a value in the range [0, 1].
In one embodiment, the server may use the smoothing parameter as the overlay weight of the target image and 1 minus the smoothing parameter as the overlay weight of the original image, and then superimpose and fuse the target image and the original image according to these weights.
In one embodiment, the server may superimpose and fuse the target image and the original image with their corresponding weights according to the following formula:
P = Blend(I_in, Res, α) = Res · α + I_in · (1 − α)
where P is the superimposed and fused image, Blend() is the fusion function, I_in is the original image, Res is the target image, and α is the smoothing parameter.
It will be appreciated that the larger the value of the smoothing parameter, the greater the contribution of the smoothed result to the fused image, and the smoother the skin in the image appears.
In the above embodiment, the target image and the original image are superimposed and fused with weights determined by the input smoothing parameter, so that the user can flexibly and conveniently adjust the degree of smoothing of the target image, improving operational efficiency.
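A minimal sketch of this user-adjustable blend, assuming float images and a smoothing parameter alpha in [0, 1]:

import numpy as np

def apply_smoothing_strength(original, result, alpha):
    """P = Res * alpha + I_in * (1 - alpha), with alpha clamped to [0, 1]."""
    a = float(np.clip(alpha, 0.0, 1.0))
    return result * a + original * (1.0 - a)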
Fig. 6 is a schematic overall flowchart of an image processing method in an embodiment of the present application. First, the server obtains an original image, downsamples it, and smooths the downsampled image to obtain a smoothed image. The server then determines an initial skin region according to the color distribution in the smoothed image, and determines the pixel points with significantly changed pixel values in the original image to obtain texture edge information. Next, the server removes the texture edge region from the initial skin region based on the texture edge information to obtain a target skin region, and fuses the image content of the smoothed image within the target skin region into the corresponding position of the target skin region in the original image to obtain a target image. The server may then extract high-frequency information from the original image to sharpen the fused target image, preserve the details of the five sense organs in the sharpened target image according to the five sense organ regions, and preserve the content outside the face region according to the face region so that only the image content within the face region is optimized, thereby obtaining the final target image. Finally, the server may adjust the degree of smoothing of the target image according to the input smoothing parameter and output the target image with the adjusted smoothing degree.
The present application also provides an application scenario, namely a video call scenario or a live broadcast scenario, to which the above image processing method is applied. Specifically, the image processing method is applied in this scenario as follows:
During a video call or live broadcast, the terminal used by the user may capture the user's face image in real time and send it to the server. Taking the face image as the original image, the server may execute the image processing method of the embodiments of the present application to obtain a target image, and send the target image to the terminal used by the user as well as to the terminals used by the other users on the video call or by the viewers watching the live broadcast. A terminal receiving the target image may display it in real time.
In other embodiments, instead of sending the face image to the server, the terminal used by the user may itself directly execute the image processing method of the embodiments of the present application to obtain the target image, and send the target image to the terminals used by the other users on the video call or by the viewers watching the live broadcast, thereby implementing end-to-end communication between the terminals.
During the video call or live broadcast, the user may also adjust the smoothing parameter in the interface, so as to adjust the degree of smoothing of the target image displayed by the terminal (i.e. the degree of skin smoothing in the target image).
The present application further provides an application scenario, namely image or video beautification, to which the above image processing method is applied. Specifically, the image processing method is applied in this scenario as follows:
The user may select images or videos stored in the terminal in advance, and the terminal may send the selected images or videos to the server. The server may take the received images, or each frame of the received videos, as original images, execute the image processing method of the embodiments of the present application to obtain the target image corresponding to each image or each frame of target image corresponding to the video, and send the results to the terminal. The terminal may then display the target images, or the target video composed of the frames of target images, thereby smoothing the skin in the images or videos.
In other embodiments, instead of sending the image or video to the server, the terminal may itself directly execute the image processing method of the embodiments of the present application to obtain the target image corresponding to the image, or each frame of target image corresponding to the video, and display the target image or the target video composed of the frames of target images.
During beautification of the selected image or video, the user may also adjust the smoothing parameter to adjust the degree of smoothing of the skin in the beautified image or video.
It should be understood that, although the steps in the flowcharts are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not strictly ordered and may be performed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 7, an image processing apparatus 700 is provided, which may be a part of a computer device using a software module or a hardware module, or a combination of the two modules, and specifically includes: a smoothing module 702, an initial region determination module 704, an edge determination module 706, a target region determination module 708, and a fusion module 710, wherein:
the smoothing module 702 is configured to smooth the original image to obtain a smoothed image.
An initial region determination module 704 for determining an initial skin region in the smoothed image.
An edge determining module 706, configured to determine a pixel point of an original image where a pixel value significantly changes, so as to obtain texture edge information.
A target area determining module 708, configured to remove a texture edge area in the initial skin area based on the texture edge information, to obtain a target skin area.
And the fusion module 710 is configured to fuse the image content in the smooth image, which is located in the target skin region, into a corresponding position of the target skin region in the original image, so as to obtain a target image.
In one embodiment, the smoothing module 702 is further configured to determine a downsampling multiple according to the resolution of the original image; according to the down-sampling multiple, down-sampling the original image; and smoothing the down-sampled image to obtain a smooth image.
In one embodiment, the edge determination module 706 is further configured to determine an edge mask based on the texture edge information. In this embodiment, the target area determining module 708 is further configured to fuse the initial mask map used for determining the initial skin region with the edge mask to obtain a target mask map, the target mask map being used for determining the target skin region. In this embodiment, the fusion module 710 is further configured to fuse, according to the target mask map, the image content located in the target skin region in the smoothed image into the corresponding position of the target skin region in the original image to obtain the target image.
In one embodiment, the initial region determination module 704 is further configured to obtain a binary mask map for determining an initial skin region of the smoothed image; acquiring pixel values in an initial skin area corresponding to the binary mask image from a single-color channel of the smooth image; generating a non-binary initial mask map for determining an initial skin region based on the acquired pixel values; the pixel values of the points in the initial mask map corresponding to the initial skin area are determined from the acquired pixel values.
In one embodiment, the fusion module 710 is further configured to perform overlay fusion on the first image and the second image to obtain a target image; the first image is obtained by multiplying the pixel values corresponding to each point in the smooth image by the pixel values corresponding to each point in the target mask image; the second image is an image obtained by multiplying the pixel values corresponding to the respective points in the original image by the pixel values corresponding to the respective points in the reverse mask image of the target mask image.
In one embodiment, the target area determination module 708 is further configured to determine a range of pixel value intervals to which pixel values for points in the target mask map belong; and according to the significance coefficient corresponding to the pixel value interval range, performing significance weighting on the pixel value of the corresponding point in the target mask image to obtain the target mask image subjected to significance weighting.
In one embodiment, the edge determination module 706 is further configured to perform an inverse operation on the texture edge information through an inverse function to obtain an edge mask.
In one embodiment, the texture edge information is a texture edge information map. In this embodiment, the edge determining module 706 is further configured to determine difference information between the original image and the smoothed image; and obtaining a texture edge information graph according to the difference information.
In one embodiment, the image processing apparatus 700 further includes:
a sharpening module 712, configured to extract high-frequency information in the original image according to difference information between the original image and the smoothed image; and adding the high-frequency information into the target image to obtain the sharpened target image.
In one embodiment, the original image contains a human face. In this embodiment, the image processing apparatus 700 further includes:
a five sense organs processing module 714, configured to determine five sense organs regions of a human face in an original image; and according to the region of the five sense organs, fusing the target image and the original image to obtain a target image with details of the five sense organs.
In one embodiment, the original image contains a human face. In this embodiment, the image processing apparatus 700 further includes:
a face region processing module 716, configured to obtain a face region corresponding to a face in an original image; and according to the face region, fusing the target image and the original image to obtain a target image with the optimized face region.
In one embodiment, as shown in fig. 8, the image processing apparatus 700 further includes:
a smoothing degree adjusting module 718, configured to obtain an input smoothing parameter; determining the superposition weight corresponding to the target image and the original image respectively according to the smoothing parameter; superposing and fusing the target image and the original image according to the corresponding superposition weight; and outputting the superposed and fused image.
In the above image processing apparatus, the original image is smoothed to obtain a smoothed image; an initial skin region in the smoothed image is determined; the pixel points with significantly changed pixel values in the original image are determined to obtain texture edge information; the texture edge region is removed from the initial skin region based on the texture edge information to obtain a target skin region; and the image content of the smoothed image within the target skin region is fused into the corresponding position of the target skin region in the original image to obtain the target image. In the fused target image, the image content within the target skin region is smoothed while the image content outside the target skin region keeps the sharpness of the original image, which avoids the loss of image clarity caused by smoothing the entire original image and improves the image quality. Moreover, because the target skin region excludes the texture edge region, only the image content outside the texture edges is fused from the smoothed image into the original image; the texture edge regions of the original image are not smoothed, so the texture edge details remain sharp and the image quality is further improved.
For specific limitations of the image processing apparatus, reference may be made to the above limitations of the image processing method, which are not described herein again. The respective modules in the image processing apparatus described above may be wholly or partially implemented by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 9. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store raw image data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image processing method.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image processing method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the configurations shown in fig. 9 and 10 are merely block diagrams of some configurations relevant to the present disclosure, and do not constitute a limitation on the computing devices to which the present disclosure may be applied, and that a particular computing device may include more or less components than those shown, or combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and although their description is specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. An image processing method, characterized in that the method comprises:
smoothing the original image to obtain a smooth image;
determining an initial skin region in the smoothed image;
determining pixel points with significantly changed pixel values in the original image to obtain texture edge information;
removing the texture edge area in the initial skin area based on the texture edge information to obtain a target skin area;
and fusing the image content in the smooth image, which is positioned in the target skin area, into the corresponding position of the target skin area in the original image to obtain a target image.
2. The method of claim 1, wherein the smoothing the original image to obtain a smoothed image comprises:
determining down-sampling multiples according to the resolution of an original image;
according to the down-sampling multiple, down-sampling the original image;
and smoothing the down-sampled image to obtain a smooth image.
3. The method of claim 1, further comprising:
determining an edge mask according to the texture edge information;
removing the texture edge region in the initial skin region based on the texture edge information to obtain a target skin region, including:
fusing the initial mask image used for determining the initial skin area and the edge mask to obtain a target mask image; the target mask map is used for determining a target skin area;
the fusing the image content in the smooth image, which is located in the target skin region, into the corresponding position of the target skin region in the original image to obtain the target image, including:
and according to the target mask image, fusing the image content in the smooth image, which is positioned in the target skin area, into the corresponding position of the target skin area in the original image to obtain a target image.
4. The method of claim 3, further comprising:
acquiring a binary mask map for determining an initial skin region of the smoothed image;
acquiring pixel values in an initial skin area corresponding to the binary mask map from a single-color channel of the smooth image;
generating, from the acquired pixel values, the non-binary initial mask map for determining the initial skin region; pixel values of points in the initial mask map corresponding to the initial skin region are determined from the acquired pixel values.
5. The method according to claim 4, wherein said fusing the image content of the smooth image within the target skin area into the corresponding position of the target skin area in the original image according to the target mask map to obtain the target image comprises:
superposing and fusing the first image and the second image to obtain a target image;
the first image is obtained by multiplying the pixel values corresponding to each point in the smooth image by the pixel values corresponding to each point in the target mask image;
the second image is an image obtained by multiplying the pixel values corresponding to the respective points in the original image by the pixel values corresponding to the respective points in the reverse mask image of the target mask image.
6. The method of claim 4, further comprising:
determining the range of pixel value intervals to which the pixel values of each point in the target mask image belong;
and according to the significance coefficient corresponding to the pixel value interval range, performing significance weighting on the pixel value of the corresponding point in the target mask image to obtain the target mask image subjected to significance weighting.
7. The method of claim 3, wherein determining an edge mask from the texture edge information comprises:
and performing inverse operation on the texture edge information through an inverse function to obtain an edge mask.
8. The method of claim 1, wherein the texture edge information is a texture edge information map; the determining pixel points with significantly changed pixel values in the original image to obtain texture edge information includes:
determining difference information between the original image and the smoothed image;
and obtaining the texture edge information graph according to the difference information.
9. The method of claim 1, further comprising:
extracting high-frequency information in the original image according to difference information between the original image and the smooth image;
and adding the high-frequency information into the target image to obtain a sharpened target image.
10. The method according to claim 1, wherein the original image comprises a human face; the method further comprises the following steps:
determining the five sense organ regions of the human face in the original image;
and according to the five sense organ region, fusing the target image and the original image to obtain a target image with details of the five sense organs.
11. The method according to claim 1, wherein the original image comprises a human face; the method further comprises the following steps:
acquiring a face area corresponding to a face in the original image;
and according to the face region, fusing the target image and the original image to obtain a target image with an optimized face region.
12. The method according to any one of claims 1 to 11, further comprising:
acquiring an input smoothing parameter;
determining the superposition weight corresponding to the target image and the original image respectively according to the smoothing parameter;
superposing and fusing the target image and the original image according to corresponding superposition weights;
and outputting the superposed and fused images.
13. An image processing apparatus, characterized in that the apparatus comprises:
the smoothing module is used for smoothing the original image to obtain a smooth image;
an initial region determination module for determining an initial skin region in the smoothed image;
the edge determining module is used for determining pixel points with significantly changed pixel values in the original image to obtain texture edge information;
a target area determining module, configured to remove a texture edge area in the initial skin area based on the texture edge information to obtain a target skin area;
and the fusion module is used for fusing the image content in the smooth image, which is positioned in the target skin area, into the corresponding position of the target skin area in the original image to obtain a target image.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 12.
15. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 12.
CN202110153055.6A 2021-02-04 2021-02-04 Image processing method, image processing device, computer equipment and storage medium Pending CN114862729A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110153055.6A CN114862729A (en) 2021-02-04 2021-02-04 Image processing method, image processing device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110153055.6A CN114862729A (en) 2021-02-04 2021-02-04 Image processing method, image processing device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114862729A true CN114862729A (en) 2022-08-05

Family

ID=82623128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110153055.6A Pending CN114862729A (en) 2021-02-04 2021-02-04 Image processing method, image processing device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114862729A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024046300A1 (en) * 2022-08-31 2024-03-07 北京字跳网络技术有限公司 Image processing method and apparatus, device, and medium
CN115293994A (en) * 2022-09-30 2022-11-04 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer equipment and storage medium
WO2024067461A1 (en) * 2022-09-30 2024-04-04 腾讯科技(深圳)有限公司 Image processing method and apparatus, and computer device and storage medium


Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40073122)
SE01 Entry into force of request for substantive examination