CN117221550A - Image processing apparatus, method, device, and storage medium - Google Patents

Info

Publication number
CN117221550A
CN117221550A (application CN202311215065.3A)
Authority
CN
China
Prior art keywords
pixel
image
skin tone
skin
brightness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311215065.3A
Other languages
Chinese (zh)
Inventor
张驰
卢康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Eswin Computing Technology Co Ltd
Haining Eswin IC Design Co Ltd
Original Assignee
Beijing Eswin Computing Technology Co Ltd
Haining Eswin IC Design Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Eswin Computing Technology Co Ltd, Haining Eswin IC Design Co Ltd filed Critical Beijing Eswin Computing Technology Co Ltd
Priority to CN202311215065.3A
Publication of CN117221550A

Landscapes

  • Image Processing (AREA)

Abstract

The present disclosure provides an image processing apparatus, method, device, and storage medium, which can be applied to the technical field of image processing. The image processing apparatus includes: a preprocessing module configured to preprocess each image frame of a video stream to be processed to obtain a first processed image; a first determining module configured to determine the hue, saturation, and brightness of each pixel in the first processed image; a second determining module configured to determine a skin tone weight for each pixel in the first processed image based on the hue, saturation, and brightness; and a third determining module configured to determine a skin tone region in the first processed image according to the skin tone weight of each pixel and a preset skin tone threshold.

Description

Image processing apparatus, method, device, and storage medium
Technical Field
The present disclosure relates to the field of image processing technology, and in particular, to an image processing apparatus, method, device, storage medium, and program product.
Background
The human eye generally prefers colorful photos, which deliver greater visual impact. However, human memory carries a prior expectation of skin color and is highly sensitive to skin color changes, so skin color can be detected and handled as a special processing target.
Conventional skin tone detection schemes include thresholding, Gaussian modeling, histogram-based methods, neural networks, Bayesian decision, and hybrids of these schemes. Although some of them are easy to implement in hardware, their detection accuracy is low.
Disclosure of Invention
The embodiments of the present disclosure provide an image processing apparatus, a method, a device, a storage medium, and a program product.
According to a first aspect of the embodiments of the present disclosure, there is provided an image processing apparatus including: the preprocessing module is configured to preprocess each image frame of the video stream to be processed to obtain a first processed image; a first determination module configured to determine a hue, saturation, and brightness of each pixel in the first processed image; a second determining module configured to determine a skin tone weight for each pixel in the first processed image based on hue, saturation, and brightness; and a third determining module configured to determine a skin tone region in the first processed image according to the skin tone weight of each pixel and a preset skin tone threshold.
According to an embodiment of the disclosure, the second determining module is further configured to: acquire a three-dimensional lookup table, wherein the three-dimensional lookup table includes multiple groups of first reference data, and each group of first reference data includes a reference hue, a reference saturation, a reference brightness, and a corresponding reference skin tone weight; and interpolate in the three-dimensional lookup table based on the hue, saturation, and brightness to obtain the skin tone weight of each pixel in the first processed image.
According to an embodiment of the disclosure, the second determination module is further configured to: acquiring a plurality of sample images having skin color attributes including ambient light source, illumination intensity, gender, age, race, and skin location; preprocessing each sample image to obtain a second processed image; determining the hue, saturation and brightness of each pixel in the second processed image; and obtaining a three-dimensional lookup table by using the hue, saturation and brightness of each pixel in the second processed image.
According to an embodiment of the disclosure, the second determination module is further configured to: removing non-skin tone regions from the second processed image; and obtaining a three-dimensional lookup table by using the hue, saturation and brightness of each pixel in the second processed image after the non-skin color region is removed.
According to an embodiment of the disclosure, the second determining module is further configured to: convert the three-dimensional lookup table into a first two-dimensional lookup table and a second two-dimensional lookup table, wherein the first two-dimensional lookup table includes multiple groups of second reference data based on a first brightness threshold, the second two-dimensional lookup table includes multiple groups of second reference data based on a second brightness threshold, and each group of second reference data includes a reference hue, a reference saturation, and a corresponding reference skin tone weight; interpolate the hue and saturation in the first and second two-dimensional lookup tables respectively to obtain a first skin tone weight and a second skin tone weight; and determine the skin tone weight of each pixel in the first processed image based on the brightness, the first brightness threshold, the second brightness threshold, the first skin tone weight, and the second skin tone weight.
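The brightness-based combination of the two two-dimensional lookup table results can be sketched as follows. The linear blend between the two brightness thresholds is an illustrative assumption, since the patent does not give the exact combination formula:

```python
def blend_skin_weight(v, v_low, v_high, w_low, w_high):
    """Blend the two 2-D LUT results by the pixel's brightness V.

    v       : brightness of the pixel
    v_low   : first (lower) brightness threshold
    v_high  : second (upper) brightness threshold
    w_low   : first skin tone weight (from the low-brightness 2-D LUT)
    w_high  : second skin tone weight (from the high-brightness 2-D LUT)
    """
    if v <= v_low:
        return w_low
    if v >= v_high:
        return w_high
    # linear interpolation between the two LUT results
    t = (v - v_low) / (v_high - v_low)
    return (1.0 - t) * w_low + t * w_high
```

A pixel whose brightness falls between the two thresholds receives a weight smoothly interpolated between the two table lookups, which avoids a hard seam at either threshold.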
According to an embodiment of the present disclosure, the preprocessing module performs preprocessing on each image frame of the video stream to be processed, including: denoising a plurality of image frames of the video stream to be processed to obtain N denoised images, where N is an integer greater than 1; and for the i-th denoised image among the N denoised images, adjusting the gray value of each pixel of the i-th denoised image according to the gray value of each pixel of the (i-1)-th denoised image, where 1 < i ≤ N and i is an integer.
According to an embodiment of the present disclosure, adjusting the gray value of each pixel of the i-th denoised image according to the gray value of each pixel of the (i-1)-th denoised image includes: performing histogram statistics on the gray values of the pixels in the (i-1)-th denoised image to obtain statistical parameters, where the statistical parameters include a maximum gray value, a minimum gray value, an adaptive adjustment range threshold, and a global adaptive brightness gain; and adjusting each pixel of the i-th denoised image according to the statistical parameters of the (i-1)-th denoised image.
According to an embodiment of the present disclosure, the skin tone region includes at least one skin tone pixel; the third determining module is further configured to: for each pixel in the first processed image, when the skin tone weight of the pixel is determined to be greater than the preset skin tone threshold, determine the pixel to be a skin tone pixel and perform a protection operation on it.
According to an embodiment of the present disclosure, the image processing apparatus further includes an enhancement processing module configured to, for each pixel in the first processed image: when the skin tone weight of the pixel is greater than the preset skin tone threshold, determine a skin tone protection gain according to the skin tone weight of the pixel; when the skin tone weight of the pixel is less than or equal to the preset skin tone threshold, determine a color enhancement gain according to the skin tone weight of the pixel; fuse the skin tone protection gain and the color enhancement gain in the first processed image to obtain a fusion gain; and perform enhancement processing on the first processed image using the fusion gain to obtain an enhanced image.
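One illustrative reading of this gain fusion is to blend the protection and enhancement gains by the skin tone weight so that the per-pixel gain is continuous at the threshold. The specific gain values and the linear blend below are assumptions, not taken from the patent:

```python
def pixel_gain(w, thr, g_protect=1.0, g_enhance=1.3):
    """Per-pixel color gain: protect skin, enhance non-skin.

    w         : skin tone weight in [0, 1]
    thr       : preset skin tone threshold
    g_protect : gain applied to a fully skin-toned pixel (no change)
    g_enhance : gain applied to a fully non-skin pixel
    """
    if w > thr:
        # skin tone pixel: the more skin-like, the closer the gain is
        # to the protection gain (i.e. the less the color is altered)
        return g_protect + (g_enhance - g_protect) * (1.0 - w)
    # non-skin pixel: enhancement gain, reduced as the pixel becomes
    # more skin-like so the two branches meet at the threshold
    return g_enhance - (g_enhance - g_protect) * w
```

Because both branches reduce to g_protect + (g_enhance − g_protect)(1 − w), the fused gain varies smoothly across the skin/non-skin boundary, which matches the patent's goal of avoiding visible seams between protected and enhanced regions.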
According to a second aspect of embodiments of the present disclosure, there is provided an electronic device including the above-described image processing apparatus of the present disclosure.
According to a third aspect of the embodiments of the present disclosure, there is provided an image processing method including: preprocessing each image frame of the video stream to be processed to obtain a first processed image; determining the hue, saturation and brightness of each pixel in the first processed image; determining the skin color weight of each pixel in the first processed image according to the tone, the saturation and the brightness; and determining a skin tone region in the first processed image according to the skin tone weight of each pixel and a preset skin tone threshold value.
According to a fourth aspect of embodiments of the present disclosure, there is provided an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the above-described method.
According to a sixth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the above method.
According to the technical solution of the embodiments of the present disclosure, each pixel of each image frame of the video stream to be processed is converted into the HSV color space, the skin tone weight of the current pixel is calculated from its H, S, V values, whether the current pixel is a skin tone pixel is judged from the skin tone weight and a preset skin tone threshold, and the skin tone region of each image is determined. In this way, while controlling the hardware computation and storage costs of the image quality improvement chip, the accuracy of skin color detection under different ambient light sources, illumination intensities, and races can be improved, making the method suitable for more skin color detection scenarios.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
fig. 1 shows a schematic configuration diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 2 illustrates a schematic diagram of a preprocessing module according to an embodiment of the present disclosure;
FIG. 3 illustrates a structural schematic of an HSV color model according to an embodiment of the present disclosure;
FIG. 4 illustrates a schematic diagram of a three-dimensional look-up table according to an embodiment of the present disclosure;
FIG. 5A illustrates a schematic diagram of interpolation principles of a three-dimensional look-up table according to an embodiment of the present disclosure;
FIG. 5B illustrates a schematic diagram of interpolation principles of a two-dimensional lookup table according to an embodiment of the present disclosure;
fig. 6 illustrates a schematic configuration of an image processing apparatus according to another embodiment of the present disclosure;
FIG. 7 shows a flowchart of an image processing method according to an embodiment of the present disclosure;
FIG. 8A shows a flowchart of an image processing method according to another embodiment of the present disclosure;
fig. 8B illustrates a flowchart of a skin tone protection operation and a color enhancement operation according to another embodiment of the present disclosure;
fig. 9 shows a block diagram of an electronic device adapted to implement an image processing method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C" are used, they should generally be interpreted as commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the data involved (including but not limited to users' personal information) all comply with the relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
In carrying out the disclosed concept, the inventors found that skin tone is essentially an external manifestation of the physical properties of the skin; it is determined by biological characteristics and has its own unique properties. In practical applications, the acquisition of skin color depends on the device and the imaging light source, so skin color also exhibits illumination dependence and imaging light source dependence. In other words, when the same piece of skin is captured by different imaging devices under the same light source, the imaged colors differ to some degree. Likewise, when the same capture device captures the color of the same piece of skin under different light sources and illumination intensities, the skin appearance also differs. In addition, skin color differs across races and ages. All of these factors limit the accuracy of skin color detection.
In view of this, the embodiments of the present disclosure provide an image processing apparatus, a method, a device, a storage medium, and a program product, by which the accuracy of skin color detection in different scenes can be improved; the solution adapts to more people and more skin color detection scenarios, and can be applied to an image quality improvement chip.
Various embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic configuration diagram of an image processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 1, the image processing apparatus 100 of this embodiment includes a preprocessing module 110, a first determination module 120, a second determination module 130, and a third determination module 140.
The preprocessing module 110 is configured to preprocess each image frame of the video stream to be processed, resulting in a first processed image.
The first determination module 120 is configured to determine the hue, saturation, and brightness of each pixel in the first processed image.
The second determination module 130 is configured to determine a flesh tone weight for each pixel in the first processed image based on hue, saturation, and brightness.
The third determination module 140 is configured to determine a skin tone region in the first processed image based on the skin tone weight of each pixel and a preset skin tone threshold.
According to an embodiment of the present disclosure, each pixel of each image frame of the video stream to be processed is converted into the HSV color space, where H represents hue, S represents saturation, and V represents brightness. The skin tone weight of the current pixel is calculated from its H, S, V values. Whether the current pixel is a skin tone pixel is then judged from the skin tone weight and a preset skin tone threshold, determining the skin tone region of each image. In this way, while controlling the hardware computation and storage costs of the image quality improvement chip, the accuracy of skin color detection under different ambient light sources, illumination intensities, and races can be improved, making the method suitable for more skin color detection scenarios.
Fig. 2 shows a schematic diagram of a preprocessing module according to an embodiment of the present disclosure.
As shown in fig. 2, the preprocessing module 210 of this embodiment performs preprocessing on each image frame of the video stream to be processed, including: denoising a plurality of image frames of the video stream to be processed to obtain N denoised images, where N is an integer greater than 1; and for the i-th denoised image among the N denoised images, adjusting the gray value of each pixel of the i-th denoised image according to the gray value of each pixel of the (i-1)-th denoised image, where 1 < i ≤ N and i is an integer.
For example, for the i-th image frame of the video stream to be processed, the i-th image frame is first denoised to obtain the i-th denoised image. Then, each pixel of the i-th denoised image is adjusted according to the statistical parameters of the (i-1)-th denoised image to obtain the i-th first processed image.
Through this embodiment of the disclosure, the preprocessing module first denoises the input image frame to avoid the influence of noise on skin color detection. It then performs adaptive brightness equalization on each denoised image, improving the overall brightness distribution of the video image. For example, in the image quality improvement chip, the gray values of the previous frame can be counted and the gray values of the current frame adjusted accordingly, adjusting the brightness of the whole image and improving the accuracy of skin color detection.
For example, the adaptive luminance equalization process may employ a multi-scale Retinex (retinal cortex) algorithm, gray-scale equalization, histogram adaptive equalization, or the like.
In an embodiment of the present disclosure, adjusting the gray value of each pixel of the i-th denoised image according to the gray value of each pixel of the (i-1)-th denoised image includes: performing histogram statistics on the gray values of the pixels in the (i-1)-th denoised image to obtain statistical parameters, where the statistical parameters include a maximum gray value, a minimum gray value, an adaptive adjustment range threshold, and a global adaptive brightness gain; and adjusting each pixel of the i-th denoised image according to the statistical parameters of the (i-1)-th denoised image.
Through this embodiment of the disclosure, histogram adaptive equalization can be performed on each denoised image to adjust the gray levels of the whole image: the pixel values of the current frame are adjusted using the gray histogram data counted from the previous frame, while the gray histogram data of the current frame are counted for use by the next frame. Meanwhile, to avoid detection differences caused by abrupt brightness changes, the global adaptive brightness gain is adjusted dynamically over multiple frames, preventing a large overall change between consecutive frames from noticeably lowering the skin color detection rate.
It should be noted that the statistics required by histogram adaptive equalization (the maximum gray value, the minimum gray value, the adaptive adjustment range threshold, and the global adaptive brightness gain of the gray histogram) take considerable time to collect. Since the overhead of collecting global statistics over the image is large and affects the hardware cost of the image quality improvement chip, the chip cannot use the current frame's own statistics in the current frame's calculation. The hardware is therefore designed to process the current frame using the statistical parameters of the previous frame: after the input image is denoised, the parameters required for adaptive brightness equalization are counted from the current frame for use in equalizing the next frame, while the current frame is equalized using the parameters counted from the previous frame. In addition, to avoid abrupt picture changes, the global adaptive brightness gain parameter is tracked over multiple frames, so that the adaptive equalization strength of the whole picture can be controlled and flicker in the displayed video avoided.
For example, when the image quality improvement chip performs the histogram adaptive equalization processing, the control logic of the following (1) to (4) may be referred to:
(1) At the end of the nth frame image, the image quality improvement chip logic collects statistical information about the gray level histogram;
(2) The self-adaptive brightness adjustment algorithm predicts the control parameters of the (n+1) th frame according to the statistical data of the (N) th frame;
(3) The firmware writes the self-adaptive brightness adjustment gain control parameters into a hardware register of the image quality improvement chip before the n+1st frame image arrives;
(4) The n+1st frame image is calculated using the new adaptive brightness adjustment parameters.
Specifically, the histogram adaptive equalization process may accumulate the statistical histogram upward from gray value 0; when the accumulated count exceeds h_low × M × N (where M × N is the image resolution and h_low is the lower clipping ratio), the lower stretch limit T_min of the image tone is obtained. Similarly, the histogram is accumulated downward from gray value 255; when the accumulated count exceeds h_high × M × N (h_high being the upper clipping ratio), the upper stretch limit T_max is obtained.
After the gray range [T_min, T_max] is calculated, the input RGB image data are subjected to adaptive gray mapping. The mapping rule may be: gray values smaller than T_min are mapped to M_min; gray values greater than T_max are mapped to M_max; gray values between T_min and T_max are linearly mapped from [T_min, T_max] to [M_min, M_max].
The mapped gray value is then adjusted using the global adaptive brightness gain; the mapping rule can be expressed as the following formula (1):
in the formula (1), P (i, j), P' (i, j) are the original gray value and the adjusted gray value at the pixel (i, j), respectively; m_max, M_min are the upper and lower limits of the adaptive adjustment range threshold, respectively.
And obtaining a first processed image after denoising and self-adaptive brightness equalization processing according to each image frame of the video stream to be processed.
In an embodiment of the present disclosure, the first determining module 120 is further configured to: the RGB (R is red, G is green, and B is blue) values of each pixel in the first processed image are converted to obtain the hue, saturation, and brightness of each pixel.
According to the embodiment of the disclosure, the first determining module converts the first processing data from the RGB color space to the HSV (H is tone, S is saturation, and V is brightness) color space, so that brightness information and color saturated hue information are effectively separated, and the influence of illumination on the determination of the weight of the complexion of the image is avoided.
For example, the RGB values for each pixel are converted to HSV values by an HSV color model for subsequent skin tone weight determinations. Skin tone weights need to be determined synthetically in the HSV color model from H, S, V values.
Fig. 3 shows a schematic structural diagram of an HSV color model according to an embodiment of the present disclosure.
As shown in fig. 3, the HSV color model 301 of this embodiment is a hexagonal-cone model with three dimensions H, S, and V. The brightness V is the lightness of the pixel, ranging from 0 to 1 along the vertical axis; the lowest point 0 represents black, the highest point 1 represents white, and pixels become brighter from bottom to top. The hue H is a rotation angle in the range 0° to 360°, with different angles corresponding to different hues. The saturation S is the radial proportion, ranging from 0 to 1.
Based on the HSV color model 301 described above, for any pixel in the first processed image, let its RGB color-space value be (R, G, B) and its HSV color-space value be (H, S, V). First, the R, G, B values are normalized to [0, 1]: when R, G, and B have a width of 8 bits, R' = R/255, G' = G/255, B' = B/255. Then, let MAX be the maximum of R', G', B' and MIN be the minimum of R', G', B'; the RGB value of the pixel is converted to an HSV value according to the following formulas (2)-(4):
H = 60° × ((G' − B')/(MAX − MIN)) mod 360°, if MAX = R'; H = 60° × ((B' − R')/(MAX − MIN)) + 120°, if MAX = G'; H = 60° × ((R' − G')/(MAX − MIN)) + 240°, if MAX = B' (2)
S = (MAX − MIN)/MAX, if MAX ≠ 0; S = 0 otherwise (3)
V = MAX (4)
In the design of the image quality improvement chip, the bit width and calculation precision of the H, S, V values can be extended and controlled according to specific requirements; the bit width of the H, S, V values calculated by formulas (2)-(4) can be kept at about 10 to 12 bits to ensure the precision of the table lookup.
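The conversion referenced by formulas (2)-(4) follows the standard RGB-to-HSV transform; a floating-point sketch (the fixed-point 10- to 12-bit chip version would quantize these values) might look like:

```python
def rgb_to_hsv(r, g, b):
    """Standard RGB -> HSV conversion for 8-bit inputs.

    Returns hue in degrees [0, 360), and saturation and
    brightness (value) in [0, 1]."""
    r_, g_, b_ = r / 255.0, g / 255.0, b / 255.0   # normalize to [0, 1]
    mx, mn = max(r_, g_, b_), min(r_, g_, b_)
    d = mx - mn
    v = mx                                          # formula (4): V = MAX
    s = 0.0 if mx == 0 else d / mx                  # formula (3)
    if d == 0:                                      # achromatic (gray)
        h = 0.0
    elif mx == r_:
        h = (60.0 * ((g_ - b_) / d)) % 360.0
    elif mx == g_:
        h = 60.0 * ((b_ - r_) / d) + 120.0
    else:
        h = 60.0 * ((r_ - g_) / d) + 240.0
    return h, s, v
```

Separating brightness (V) from hue and saturation in this way is what lets the later lookup stages weight skin tones with reduced sensitivity to illumination.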
In an embodiment of the present disclosure, the second determining module 130 is further configured to: acquiring a three-dimensional Look-Up Table (also called 3D LUT), wherein the three-dimensional Look-Up Table comprises a plurality of groups of first reference data, and each group of first reference data comprises a reference tone, a reference saturation, a reference brightness and a corresponding reference skin color weight; and interpolating in a three-dimensional lookup table based on hue, saturation and brightness to obtain the skin tone weight of each pixel in the first processed image.
Through this embodiment of the disclosure, a three-dimensional lookup table (3D LUT) based on the HSV color space is obtained offline. Any H, S, V value can be interpolated in the 3D LUT to obtain the skin tone weight of the current pixel. In skin color detection, the influence of brightness must be considered to avoid harsh segmentation of the detected picture, which would make the transition between skin tone and non-skin tone regions in the displayed picture uneven; accordingly, corresponding skin tone weights are assigned to related color regions such as skin colors and skin-like colors.
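A minimal sketch of trilinear interpolation in the downsampled 3D LUT follows. The normalized [0, 1] coordinates and the grid layout are assumptions about how the table is indexed:

```python
import numpy as np

def lut_weight(lut, h, s, v):
    """Trilinear interpolation of the skin tone weight in a 3D LUT
    of shape (M, N, O), indexed by (H, S, V) normalized to [0, 1]."""
    M, N, O = lut.shape
    # fractional grid coordinates
    x, y, z = h * (M - 1), s * (N - 1), v * (O - 1)
    x0, y0, z0 = int(x), int(y), int(z)
    x1 = min(x0 + 1, M - 1)
    y1 = min(y0 + 1, N - 1)
    z1 = min(z0 + 1, O - 1)
    fx, fy, fz = x - x0, y - y0, z - z0
    # interpolate along H, then S, then V
    c00 = lut[x0, y0, z0] * (1 - fx) + lut[x1, y0, z0] * fx
    c01 = lut[x0, y0, z1] * (1 - fx) + lut[x1, y0, z1] * fx
    c10 = lut[x0, y1, z0] * (1 - fx) + lut[x1, y1, z0] * fx
    c11 = lut[x0, y1, z1] * (1 - fx) + lut[x1, y1, z1] * fx
    c0 = c00 * (1 - fy) + c10 * fy
    c1 = c01 * (1 - fy) + c11 * fy
    return c0 * (1 - fz) + c1 * fz
```

With the 13×9×9 table mentioned later, each lookup touches the eight surrounding grid points, which keeps both the SRAM footprint and the per-pixel arithmetic small.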
In an embodiment of the present disclosure, the second determining module 130 is further configured to: acquiring a plurality of sample images having skin color attributes including ambient light source, illumination intensity, gender, age, race, and skin location; preprocessing each sample image to obtain a second processed image; determining the hue, saturation and brightness of each pixel in the second processed image; and obtaining a three-dimensional lookup table by using the hue, saturation and brightness of each pixel in the second processed image.
According to this embodiment of the disclosure, since the selection of skin color samples affects the detection effect, face images of various colors must be selected as sample images in order to accurately count the skin color clustering characteristics of the face region; the sample images cover different ambient light sources, illumination intensities, genders, ages, races, and skin parts, so that the counted skin colors are more representative.
The collected sample images may be processed offline to obtain the three-dimensional lookup table, which is stored in a static random access memory (SRAM). Since a larger 3D LUT increases the area and production cost of the image quality improvement chip, the 3D LUT must be downsampled to a size of M×N×O, where M, N, O are preset sizes that determine the sampling resolution of the reference hue, reference saturation, and reference brightness respectively; they can be adjusted according to the actual situation of the image quality improvement chip and are not limited here.
Fig. 4 shows a schematic diagram of a three-dimensional look-up table according to an embodiment of the present disclosure.
For example, in the case where M=13 and N=O=9, a three-dimensional lookup table 401 of 13×9×9 size as shown in fig. 4 can be obtained, reducing the hardware storage area occupied by the 3D LUT.
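The storage saving from downsampling can be illustrated with a short Python sketch (the function name is for illustration only; the 13×9×9 size is the example above, and a dense per-value table over 256 levels per channel is a hypothetical baseline, not something the disclosure specifies):

```python
# Entry counts for a downsampled HSV lookup table versus a dense table.
def lut_entries(m: int, n: int, o: int) -> int:
    """Number of reference skin tone weights stored in an M x N x O LUT."""
    return m * n * o

downsampled = lut_entries(13, 9, 9)    # the 3D LUT of Fig. 4
dense = lut_entries(256, 256, 256)     # hypothetical per-value table
print(downsampled)                     # 1053 entries
```

Even with one byte per weight, the downsampled table fits comfortably in a small on-chip SRAM, which is the point of the M×N×O reduction.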
According to the embodiment of the disclosure, the three-dimensional lookup table obtained offline is queried with the H, S, V values of the current pixel, the reference skin tone weights in the table are indexed, and interpolation is carried out to obtain the final skin tone weight of the current pixel.
After the three-dimensional lookup table is obtained, the skin color weight corresponding to any HSV value is calculated through interpolation so as to be used for subsequent skin color protection and enhancement processing of other colors.
In an embodiment of the present disclosure, the second determining module 130 is further configured to: removing non-skin tone regions from the second processed image; and obtaining a three-dimensional lookup table by using the hue, saturation and brightness of each pixel in the second processed image after the non-skin color region is removed.
According to the embodiment of the disclosure, non-skin color areas such as eyes, eyebrows, mouth and the like can be removed (for example, the non-skin color areas are manually filled to be white), and only the residual skin color area is used as a sample image for skin color acquisition, so that missed detection or false detection of skin color caused by incomplete or inaccurate skin color sample image selection is avoided.
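The removal step above can be sketched in a few lines of Python (a hedged illustration: the patent specifies no code, pixels are represented here as (R, G, B) tuples, and pure white as the mask color follows the "manually filled to be white" example; all names are assumptions):

```python
# Exclude manually whitened non-skin regions (eyes, eyebrows, mouth)
# when collecting skin tone samples for the lookup table.
MASK_COLOR = (255, 255, 255)  # assumed marker for removed regions

def collect_skin_samples(pixels, mask_color=MASK_COLOR):
    """Keep only pixels that were not painted with the mask color."""
    return [p for p in pixels if p != mask_color]
```

For example, `collect_skin_samples([(255, 255, 255), (201, 154, 120)])` drops the whitened pixel and keeps only the skin sample for the statistics.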
In the embodiment of the present disclosure, the hue, saturation and brightness of each pixel to be processed need to be interpolated in the three-dimensional lookup table. Linear interpolation, bilinear interpolation and trilinear interpolation are the most commonly used interpolation algorithms for data in one-dimensional, two-dimensional and three-dimensional spaces, respectively; they are simple to compute, effective, and easy to implement in the hardware of the image quality improvement chip.
Fig. 5A illustrates a schematic diagram of interpolation principles of a three-dimensional lookup table according to an embodiment of the present disclosure.
As shown in fig. 5A, trilinear interpolation is first indexed in the 3D LUT (three-dimensional lookup table) based on the converted HSV values; it requires 8 interpolation reference points A to H, which form a cube 501. Point A is the coordinate minimum point ([0][0][0]) and point G is the coordinate maximum point ([1][1][1]). The interpolated point P lies inside the cube 501, and the reference skin tone weight at point P is the skin tone weight of the current pixel. The interpolation of point P can be decomposed into three steps:
(1) performing three linear interpolations (two in the horizontal direction and one in the vertical direction) on the ABCD plane to obtain a first skin tone weight1 corresponding to point R;
(2) performing three linear interpolations (two in the horizontal direction and one in the vertical direction) on the EFGH plane to obtain a second skin tone weight2 corresponding to point S; and
(3) performing one linear interpolation in the RS direction to obtain the final skin tone weight.
Fig. 5B illustrates a schematic diagram of interpolation principles of a two-dimensional lookup table according to an embodiment of the present disclosure.
As shown in fig. 5B, three linear interpolations are first required in each of the two planes ABCD and EFGH.
Taking the three linear interpolations on the ABCD plane 502 as an example, they comprise two linear interpolations in the horizontal direction and one linear interpolation in the vertical direction. In fig. 5B, blk_h and blk_s are the sampling steps of the H and S values of the current pixel in the HS plane, i.e. the downsampling steps of the current block of the 3D LUT. Given the H value and the S value of the current pixel, the offsets offset_h and offset_s in the HS plane can be calculated according to the following formulas (5) to (6):
offset_h=H-Hue_A (5)
offset_s=S-Sat_A (6)
In formulas (5) to (6), Hue_A and Sat_A are the hue and saturation values indicated by point A, respectively.
Then, linear interpolation in the horizontal direction is performed according to the following formulas (7) to (8) to obtain top_h_val and bottom_h_val:
In the formulas (7) to (8), A, B, C, D is the reference skin color weight indicated by the points a, B, C, and D, respectively.
Then, one linear interpolation in the vertical direction is performed according to the following formula (9) to obtain the first skin tone weight1:
Similarly, the second skin tone weight2 of the EFGH plane can be obtained through the same interpolation calculations as formulas (5) to (9) above.
Then, one linear interpolation is performed in the RS direction, where blk_v is the sampling step of the V value of the current pixel, i.e. the downsampling step of the current block of the 3D LUT. Given the V value of the pixel, the offset offset_v in the vertical direction can be calculated according to the following formula (10):
offset_v=V-Val_A (10)
In formula (10), Val_A is the brightness value indicated by point A.
The skin tone weight for each pixel in the first processed image is determined according to the following equation (11):
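Since the bodies of formulas (7) to (9) and (11) appear as images in the source and are not reproduced here, the sketch below assumes the standard linear interpolation forms; the corner ordering, normalization by the block steps, and all names are illustrative assumptions rather than the patent's exact notation:

```python
def lerp(a, b, t):
    """One linear interpolation: returns a at t=0 and b at t=1."""
    return a + (b - a) * t

def trilinear_weight(corners, h, s, v, hue_a, sat_a, val_a,
                     blk_h, blk_s, blk_v):
    """Skin tone weight of a pixel inside one cell of the 3D LUT.

    corners = (A, B, C, D, E, F, G, H_) are the reference skin tone
    weights at the eight cube corners, with A the coordinate minimum.
    """
    offset_h = h - hue_a          # formula (5)
    offset_s = s - sat_a          # formula (6)
    offset_v = v - val_a          # formula (10)
    th = offset_h / blk_h
    ts = offset_s / blk_s
    tv = offset_v / blk_v

    A, B, C, D, E, F, G, H_ = corners
    # ABCD plane: two horizontal lerps, then one vertical lerp -> weight1
    weight1 = lerp(lerp(A, B, th), lerp(C, D, th), ts)
    # EFGH plane: the same pattern -> weight2
    weight2 = lerp(lerp(E, F, th), lerp(G, H_, th), ts)
    # one lerp along the RS (brightness) direction -> final weight
    return lerp(weight1, weight2, tv)
```

For a pixel at the exact center of a cell whose bottom face holds weight 0 and top face weight 1, this returns 0.5, matching the three-step decomposition described for fig. 5A.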
in an alternative embodiment of the present disclosure, the second determining module 130 is further configured to: converting the three-dimensional lookup table into a first two-dimensional lookup table and a second two-dimensional lookup table, wherein the first two-dimensional lookup table comprises a plurality of groups of second reference data based on a first brightness threshold value, the second two-dimensional lookup table comprises a plurality of groups of second reference data based on a second brightness threshold value, and each group of second reference data comprises a reference tone, a reference saturation and a corresponding reference skin color weight; interpolation is carried out on the hue and the saturation in a first two-dimensional lookup table and a second two-dimensional lookup table respectively, so that a first skin color weight and a second skin color weight are obtained; and determining a flesh tone weight for each pixel in the first processed image based on the luminance, the first luminance threshold, the second luminance threshold, the first flesh tone weight, and the second flesh tone weight.
For example, in a case where the brightness does not change very frequently, that is, where brightness adjustment operations have already been performed at an earlier stage on the input image data, the 3D LUT may be reduced to 2D LUTs (two-dimensional lookup tables) at two specific brightnesses, with thresholds set for the two 2D LUTs at which the skin tone weight changes with brightness, such as a first brightness threshold threshold1 and a second brightness threshold threshold2. For each two-dimensional lookup table, the hue and saturation of each pixel to be processed are interpolated within that table. For example, based on the interpolation principle of the two-dimensional lookup table in fig. 5B, the hue H and saturation S of a pixel index four points ABCD in each of the two 2D LUTs, and the first skin tone weight1 and the second skin tone weight2 corresponding to the two 2D LUTs are calculated by the bilinear interpolation algorithm. Then, according to the following formula (12), the interpolated Weight1 and Weight2 are fused based on threshold1, threshold2 and the original brightness V of the pixel to obtain the skin tone Weight of the pixel:
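Formula (12) is likewise rendered as an image in the source; a clamped linear blend between the two brightness thresholds is a plausible reading and is assumed in this sketch (names are illustrative):

```python
def fuse_weights(v, threshold1, threshold2, weight1, weight2):
    """Blend the two 2D LUT weights by the pixel's brightness V.

    Below threshold1 the first table's weight is used, above
    threshold2 the second table's weight; in between the two weights
    are blended linearly (an assumed form of formula (12)).
    """
    if v <= threshold1:
        return weight1
    if v >= threshold2:
        return weight2
    t = (v - threshold1) / (threshold2 - threshold1)
    return weight1 + (weight2 - weight1) * t
```

With threshold1=0.2, threshold2=0.8, weight1=0.4 and weight2=0.9, a pixel of brightness 0.5 sits midway between the thresholds and receives the midpoint weight 0.65.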
According to the embodiment of the disclosure, instead of the full three-dimensional lookup table 3D LUT, only two two-dimensional lookup tables 2D LUTs associated with two typical brightness values are kept for calculating the skin tone weight, which further reduces the hardware manufacturing cost of the image quality improvement chip. In addition, an applicable three-dimensional or two-dimensional lookup table can be selected according to the specific technical problem, so that the hardware and storage costs of the image quality improvement chip are controlled, the SRAM storage space is reduced, and the accuracy of skin tone detection is improved.
In an embodiment of the present disclosure, the skin tone region includes at least one skin tone pixel. The third determination module 140 is further configured to: for each pixel in the first processed image, determine the pixel as a skin tone pixel and perform a protection operation on it when its skin tone weight is greater than the preset skin tone threshold.
For example, when the flesh tone weight of the current pixel is less than or equal to a preset flesh tone threshold value, the pixel is determined to be a non-flesh tone pixel, and a color enhancement operation is performed on the non-flesh tone pixel.
For example, the skin tone weight may range from 0 to 1 and the preset skin tone threshold may be set to 0.8. When the skin tone weight of the current pixel is greater than 0.8, the protection operation can be performed on the pixel; when it is less than or equal to 0.8, the pixel is a non-skin tone pixel and the color enhancement operation can be performed.
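A minimal sketch of this decision rule (the 0.8 default is the example value above; the function and constant names are illustrative):

```python
PRESET_SKIN_THRESHOLD = 0.8  # example threshold from the text

def classify_pixel(skin_weight, threshold=PRESET_SKIN_THRESHOLD):
    """Return which operation a pixel receives: protect skin tone
    pixels, color-enhance everything else."""
    return "protect" if skin_weight > threshold else "enhance"
```

Note the strict comparison: a weight exactly equal to the threshold is treated as non-skin, matching "less than or equal to 0.8" in the text.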
For example, the protection operation on a skin tone pixel may be to lock the pixel value so that it is protected from modification. The color enhancement operation on non-skin tone pixels may be brightness enhancement or saturation enhancement, making the image more vivid and effectively improving the image effect.
According to the embodiment of the disclosure, whether the current pixel is a skin tone pixel is judged from the calculated skin tone weight and the preset skin tone threshold, clearly distinguishing the skin tone region from the non-skin tone region and thereby realizing protection of the skin tone region and color enhancement of the non-skin tone region. Pixels whose skin tone weight does not exceed the preset skin tone threshold are determined as non-skin tone pixels and subjected to color enhancement. The embodiment of the disclosure can therefore protect skin tones while enhancing other colors (such as green, grass green, red and the like), making the colors more vivid and improving the image quality.
Fig. 6 illustrates a schematic configuration of an image processing apparatus according to another embodiment of the present disclosure.
As shown in fig. 6, the image processing apparatus 600 of this embodiment includes not only the preprocessing module 110, the first determination module 120, the second determination module 130, and the third determination module 140 described above, but also the enhancement processing module 610.
The enhancement processing module 610 is configured to:
for each pixel in the first processed image:
under the condition that the skin tone weight of the pixel is larger than a preset skin tone threshold value, determining skin tone protection gain according to the skin tone weight of the pixel;
under the condition that the skin tone weight of the pixel is less than or equal to a preset skin tone threshold value, determining a color enhancement gain according to the skin tone weight of the pixel;
fusing skin color protection gain and color enhancement gain in the first processed image to obtain fusion gain; and
and performing enhancement processing on the first processed image by using the fusion gain to obtain an enhanced image.
For example, since the first processed image contains a plurality of pixels and each pixel is assigned only one of the skin tone protection gain and the color enhancement gain, one part of the first processed image receives skin tone protection gains and the other part receives color enhancement gains. The two sets of gains over the whole first processed image are then fused to obtain the fusion gain. Finally, the first processed image can be enhanced by using the fusion gain to obtain an enhanced image, effectively improving the image effect.
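The gain fusion can be sketched as a per-pixel gain map. The concrete gain formulas are not given in this excerpt, so unity gain for protected skin pixels and an enhancement gain that grows as the skin tone weight drops are assumptions, as are all names:

```python
def fused_gain_map(weights, threshold=0.8, max_enhance_gain=1.3):
    """One gain per pixel: unity for skin tone pixels (protection),
    a boost approaching max_enhance_gain for clearly non-skin pixels."""
    return [1.0 if w > threshold
            else 1.0 + (max_enhance_gain - 1.0) * (1.0 - w)
            for w in weights]

def apply_gain(channel, gains, v_max=255):
    """Apply the fused gain map to one 8-bit channel, clamped."""
    return [min(round(v * g), v_max) for v, g in zip(channel, gains)]
```

A skin pixel (weight 0.9) keeps its value unchanged, while a pixel with weight 0 receives the full enhancement boost, with clamping preventing overflow of the 8-bit range.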
According to the embodiment of the disclosure, interpolation of the 3D LUT or the 2D LUT is performed on the first processed image, so that the skin color weight of the current pixel is obtained, and whether the current pixel belongs to the skin color pixel is judged according to a preset threshold value. When the skin tone weight of the current pixel is greater than a preset skin tone threshold value, determining that the pixel value belongs to the skin tone pixel, and further calculating skin tone protection gain according to the skin tone weight; and when the skin tone weight of the current pixel is not greater than the preset skin tone threshold value, determining that the pixel value does not belong to the skin tone pixel, and calculating the color enhancement gain according to the skin tone weight. And finally, fusing the skin color protection gain and the color enhancement gain, and performing enhancement processing on the first processed image.
The embodiment of the disclosure also provides an electronic device comprising the image processing apparatus according to any embodiment of the disclosure.
It should be noted that the division of units in the embodiments of the present disclosure is schematic and is merely a division by logical function; other division manners may be adopted in actual implementations. In addition, the functional units in the embodiments of the present disclosure may be integrated in one processing unit, or each unit may exist physically separately, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units.
For example, any of the preprocessing module 110, the first determination module 120, the second determination module 130, the third determination module 140, and the enhancement processing module 610 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the preprocessing module 110, the first determination module 120, the second determination module 130, the third determination module 140, and the enhancement processing module 610 may be implemented at least in part as hardware circuitry, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system-on-chip, a system-on-substrate, a system-in-package, or an application specific integrated circuit (ASIC), or by hardware or firmware in any other reasonable manner of integrating or packaging circuitry, or by any suitable combination of software, hardware, and firmware. Alternatively, at least one of the preprocessing module 110, the first determination module 120, the second determination module 130, the third determination module 140, and the enhancement processing module 610 may be at least partially implemented as a computer program module which, when executed, performs the corresponding functions.
Embodiments of the present disclosure also provide an image processing method performed by an image processing apparatus, which will be described in detail below with reference to fig. 7 to 8B.
Fig. 7 shows a flowchart of an image processing method according to an embodiment of the present disclosure.
As shown in fig. 7, the image processing method of this embodiment can be applied to the image processing apparatus 100 of the above-described embodiment. The image processing method includes operations S710 to S740.
In operation S710, each image frame of the video stream to be processed is preprocessed to obtain a first processed image.
In the embodiment of the present disclosure, the operation S710 is performed by the preprocessing module 110, which corresponds to the operation performed by the preprocessing module 110, and is not described herein for brevity.
In operation S720, the hue, saturation, and brightness of each pixel in the first processed image are determined.
In the embodiment of the present disclosure, the operation S720 is performed by the first determining module 120, which corresponds to the operation performed by the first determining module 120, and is not described herein for brevity.
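The hue, saturation and brightness of operation S720 correspond to a standard RGB-to-HSV conversion; the patent's own conversion formulas are not reproduced in this excerpt, so a sketch using Python's standard library is given here for illustration (function name assumed):

```python
import colorsys

def pixel_hsv(r, g, b):
    """Convert one 8-bit RGB pixel to (H, S, V), each in [0.0, 1.0]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
```

For example, pure red (255, 0, 0) maps to hue 0 with full saturation and full brightness, which then indexes the skin tone lookup table of operation S730.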
In operation S730, a flesh tone weight of each pixel in the first processed image is determined according to hue, saturation, and brightness.
In the embodiment of the present disclosure, the operation S730 is performed by the second determining module 130, which corresponds to the operation performed by the second determining module 130, and is not described herein for brevity.
In operation S740, a skin tone region in the first processed image is determined according to the skin tone weight of each pixel and a preset skin tone threshold.
In the embodiment of the present disclosure, the operation S740 is performed by the third determining module 140, which corresponds to the operation performed by the third determining module 140, and is not described herein for brevity.
Fig. 8A shows a flowchart of an image processing method according to another embodiment of the present disclosure.
As shown in fig. 8A, the image processing method of this embodiment can be applied to the image processing apparatus 600 of the above-described embodiment. The image processing method comprises the following steps:
operation S811, preprocessing each image frame of the video stream to be processed to obtain a first processed image;
operation S822, performing color space conversion on each pixel in the first processed image, and converting from RGB color space to HSV color space to obtain hue, saturation and brightness of each pixel in the first processed image;
operation S831, determining a skin tone weight of each pixel in the first processed image according to the hue, saturation, and brightness;
operation S841, determining a skin tone region in the first processed image according to the skin tone weight of each pixel and the preset skin tone threshold;
In operation S850, skin tone protection or color enhancement processing is performed on each pixel in the first processed image.
In the embodiment of the present disclosure, operations S811 to S841 correspond to operations S710 to S740, respectively, and are not described herein again. Operation S850 is performed by the enhancement processing module 610 and corresponds to the operation performed by the enhancement processing module 610, which is not described herein for brevity.
With continued reference to fig. 8A, in an embodiment of the present disclosure, the operation S811 may include operations S811a to S811b.
In operation S811a, a plurality of image frames of a video stream to be processed are denoised to obtain N denoised images.
In operation S811b, adaptive luminance equalization processing is performed on each of the N denoised images to obtain a first processed image.
Fig. 8B illustrates a flow chart of skin tone protection operations and color enhancement operations according to another embodiment of the present disclosure.
As shown in fig. 8B, in the embodiment of the present disclosure, the above-described operation S850 may include operations S851 to S853.
In operation S851, for each pixel in the first processed image: under the condition that the skin tone weight of the pixel is larger than a preset skin tone threshold value, determining skin tone protection gain according to the skin tone weight of the pixel; and under the condition that the skin tone weight of the pixel is less than or equal to a preset skin tone threshold value, determining the color enhancement gain according to the skin tone weight of the pixel.
In operation S852, the skin color protection gain and the color enhancement gain in the first processed image are fused to obtain a fused gain.
In operation S853, enhancement processing is performed on the first processed image using the fusion gain, resulting in an enhanced image.
It should be noted that, the steps of the above method embodiment correspond to the actions performed by the modules/units in the above apparatus embodiment, and specific implementation flow of each step in the method embodiment of the present disclosure may refer to detailed functional descriptions of each module/unit in the above apparatus embodiment, which is not repeated in detail.
Fig. 9 shows a block diagram of an electronic device adapted to implement an image processing method according to an embodiment of the present disclosure.
As shown in fig. 9, an electronic device 900 according to an embodiment of the present disclosure includes a processor 901 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage portion 908 into a Random Access Memory (RAM) 903. The processor 901 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 901 may also include on-board memory for caching purposes. Processor 901 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are stored. The processor 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. The processor 901 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 902 and/or the RAM 903. Note that the program may be stored in one or more memories other than the ROM 902 and the RAM 903. The processor 901 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the disclosure, the electronic device 900 may also include an input/output (I/O) interface 905, the input/output (I/O) interface 905 also being connected to the bus 904. The electronic device 900 may also include one or more of the following components connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output portion 907 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage portion 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 902 and/or RAM 903 and/or one or more memories other than ROM 902 and RAM 903 described above.
Embodiments of the present disclosure also provide a computer program product comprising a computer program containing program code for performing the method shown in the flowcharts. When the computer program product runs on a computer system, the program code causes the computer system to carry out the image processing method provided by the embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 901. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, and downloaded and installed via the communication portion 909 and/or installed from the removable medium 911. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to wireless and wired media, or any suitable combination of the foregoing.
According to embodiments of the present disclosure, program code for carrying out the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, "C" or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be provided in a variety of combinations and/or combinations, even if such combinations or combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or combined without departing from the spirit and teachings of the present disclosure. All such combinations and/or combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (13)

1. An image processing apparatus comprising:
the preprocessing module is configured to preprocess each image frame of the video stream to be processed to obtain a first processed image;
a first determination module configured to determine a hue, saturation, and brightness of each pixel in the first processed image;
a second determining module configured to determine a skin tone weight for each pixel in the first processed image based on the hue, saturation, and brightness; and
and a third determining module configured to determine a skin tone region in the first processed image according to the skin tone weight of each pixel and a preset skin tone threshold.
2. The apparatus of claim 1, wherein the second determination module is further configured to:
acquiring a three-dimensional lookup table, wherein the three-dimensional lookup table comprises a plurality of groups of first reference data, and each group of first reference data comprises a reference hue, a reference saturation, a reference brightness and a corresponding reference skin color weight; and
and interpolating in the three-dimensional lookup table based on the hue, the saturation and the brightness to obtain the skin color weight of each pixel in the first processed image.
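The interpolation step of claim 2 can be sketched as trilinear interpolation in an HSV-indexed table. The grid layout, the normalization of hue/saturation/brightness to [0, 1], and the function name below are illustrative assumptions, not taken from the claims:

```python
import numpy as np

def skin_weight_trilinear(h, s, v, lut):
    """Trilinear interpolation in an HSV-indexed 3D lookup table.

    h, s, v are normalized to [0, 1]; `lut` has shape (NH, NS, NV) and
    stores a reference skin tone weight at each grid node. Layout and
    normalization are illustrative assumptions.
    """
    nh, ns, nv = lut.shape
    # Fractional grid coordinates of the query point.
    x, y, z = h * (nh - 1), s * (ns - 1), v * (nv - 1)
    i0, j0, k0 = int(x), int(y), int(z)
    i1 = min(i0 + 1, nh - 1)
    j1 = min(j0 + 1, ns - 1)
    k1 = min(k0 + 1, nv - 1)
    dx, dy, dz = x - i0, y - j0, z - k0
    # Blend the eight surrounding grid nodes.
    c00 = lut[i0, j0, k0] * (1 - dx) + lut[i1, j0, k0] * dx
    c01 = lut[i0, j0, k1] * (1 - dx) + lut[i1, j0, k1] * dx
    c10 = lut[i0, j1, k0] * (1 - dx) + lut[i1, j1, k0] * dx
    c11 = lut[i0, j1, k1] * (1 - dx) + lut[i1, j1, k1] * dx
    c0 = c00 * (1 - dy) + c10 * dy
    c1 = c01 * (1 - dy) + c11 * dy
    return c0 * (1 - dz) + c1 * dz
```

A query at the center of a cell blends all eight node weights equally, so the table can stay coarse while still yielding a smooth weight field.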
3. The apparatus of claim 2, wherein the second determination module is further configured to:
obtaining a plurality of sample images having skin tone attributes, the skin tone attributes including ambient light source, illumination intensity, gender, age, race, and skin location;
preprocessing each sample image to obtain a second processed image;
determining the hue, saturation, and brightness of each pixel in the second processed image; and
obtaining the three-dimensional lookup table using the hue, the saturation, and the brightness of each pixel in the second processed image.
4. The apparatus of claim 3, wherein the second determination module is further configured to:
removing a non-skin tone region from the second processed image; and
obtaining the three-dimensional lookup table using the hue, saturation, and brightness of each pixel in the second processed image after the non-skin tone region is removed.
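One plausible realization of claims 3 and 4 — building the table from the hue, saturation, and brightness of sample pixels after non-skin regions are removed — is a peak-normalized 3D histogram. The bin count and normalization scheme here are assumptions, not from the claims:

```python
import numpy as np

def build_skin_lut(h, s, v, bins=16):
    """Build a 3D skin tone weight table from sample skin pixels.

    h, s, v are flat arrays of HSV values in [0, 1] taken from the skin
    regions of the sample images (non-skin areas already removed).
    Bins that many skin samples fall into get weights near 1; empty
    bins get weight 0.
    """
    hist, _ = np.histogramdd(np.stack([h, s, v], axis=1),
                             bins=bins, range=[(0, 1)] * 3)
    peak = hist.max()
    # Normalize by the densest bin so weights lie in [0, 1].
    return hist / peak if peak > 0 else hist
```

Because the samples span different light sources, intensities, and demographics, the resulting table encodes skin likelihood rather than a single fixed skin color.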
5. The apparatus of claim 2, wherein the second determination module is further configured to:
converting the three-dimensional lookup table into a first two-dimensional lookup table and a second two-dimensional lookup table, wherein the first two-dimensional lookup table comprises a plurality of groups of second reference data based on a first brightness threshold, the second two-dimensional lookup table comprises a plurality of groups of second reference data based on a second brightness threshold, and each group of second reference data comprises a reference hue, a reference saturation, and a corresponding reference skin tone weight;
interpolating the hue and the saturation in the first two-dimensional lookup table and the second two-dimensional lookup table, respectively, to obtain a first skin tone weight and a second skin tone weight; and
determining the skin tone weight of each pixel in the first processed image according to the brightness, the first brightness threshold, the second brightness threshold, the first skin tone weight, and the second skin tone weight.
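The two-table scheme of claim 5 can be sketched as bilinear interpolation in each hue–saturation table followed by a blend along brightness. The linear blend between the two brightness thresholds is an illustrative assumption; the claims do not specify the combining rule:

```python
import numpy as np

def bilinear(lut2d, h, s):
    """Bilinear interpolation of an (NH, NS) table at normalized (h, s)."""
    nh, ns = lut2d.shape
    x, y = h * (nh - 1), s * (ns - 1)
    i0, j0 = int(x), int(y)
    i1, j1 = min(i0 + 1, nh - 1), min(j0 + 1, ns - 1)
    dx, dy = x - i0, y - j0
    top = lut2d[i0, j0] * (1 - dx) + lut2d[i1, j0] * dx
    bot = lut2d[i0, j1] * (1 - dx) + lut2d[i1, j1] * dx
    return top * (1 - dy) + bot * dy

def skin_weight_2d(h, s, v, lut_low, lut_high, v_low, v_high):
    """Blend two 2D tables along brightness.

    lut_low/lut_high hold reference skin tone weights at brightness
    thresholds v_low and v_high; names and the linear blend are
    illustrative assumptions.
    """
    w_low = bilinear(lut_low, h, s)    # first skin tone weight
    w_high = bilinear(lut_high, h, s)  # second skin tone weight
    if v <= v_low:
        return w_low
    if v >= v_high:
        return w_high
    t = (v - v_low) / (v_high - v_low)
    return w_low * (1 - t) + w_high * t
```

Replacing one trilinear lookup with two bilinear lookups plus a scalar blend reduces table storage and lookup cost, which matters for per-pixel video processing.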
6. The apparatus of claim 1, wherein the preprocessing module preprocesses each image frame of the video stream to be processed, comprising:
denoising a plurality of image frames of the video stream to be processed to obtain N denoised images, wherein N is an integer greater than 1; and
for an i-th denoised image of the N denoised images, adjusting the gray value of each pixel of the i-th denoised image according to the gray value of each pixel of the (i-1)-th denoised image, wherein 1 < i ≤ N and i is an integer.
7. The apparatus of claim 6, wherein the adjusting the gray value of each pixel of the i-th denoised image according to the gray value of each pixel of the (i-1)-th denoised image comprises:
performing histogram statistics on the gray values of the pixels in the (i-1)-th denoised image to obtain statistical parameters, wherein the statistical parameters comprise a maximum gray value, a minimum gray value, an adaptive adjustment range threshold, and a global adaptive brightness gain; and
adjusting each pixel of the i-th denoised image according to the statistical parameters of the (i-1)-th denoised image.
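The temporal scheme of claims 6 and 7 — computing statistics on frame i-1 and applying them to frame i — can be sketched as below. The claim names four statistical parameters; the percentile-based adjustment range and the mean-based brightness gain are illustrative assumptions:

```python
import numpy as np

def frame_stats(gray):
    """Histogram statistics of one denoised frame.

    Returns the maximum gray value, minimum gray value, an adaptive
    adjustment range (here: 1%/99% percentile clipping, an assumption),
    and a global adaptive brightness gain (here: mean-based, also an
    assumption).
    """
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    g_min, g_max = int(gray.min()), int(gray.max())
    cdf = np.cumsum(hist) / hist.sum()
    # Adaptive adjustment range: clip the darkest and brightest 1%.
    low = int(np.searchsorted(cdf, 0.01))
    high = int(np.searchsorted(cdf, 0.99))
    gain = 128.0 / max(float(gray.mean()), 1.0)  # global brightness gain
    return g_min, g_max, (low, high), gain

def adjust_frame(gray_i, stats_prev):
    """Adjust frame i using frame (i-1)'s statistics; reusing the
    previous frame's statistics avoids a second pass over the current
    frame and exploits temporal coherence of video."""
    _, _, (low, high), gain = stats_prev
    g = gray_i.astype(np.float64)
    g = (g - low) * (255.0 / max(high - low, 1))  # contrast stretch
    return np.clip(g * gain, 0, 255).astype(np.uint8)
```

Consecutive video frames are usually similar enough that the one-frame lag in the statistics is visually negligible.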
8. The apparatus of claim 1, wherein the skin tone region comprises at least one skin tone pixel;
the third determination module is further configured to:
for each pixel in the first processed image, determining the pixel to be a skin tone pixel and performing a protection operation on the skin tone pixel when the skin tone weight of the pixel is greater than the preset skin tone threshold.
9. The apparatus of claim 1, further comprising an enhancement processing module configured to:
for each pixel in the first processed image:
determining a skin tone protection gain according to the skin tone weight of the pixel when the skin tone weight of the pixel is greater than the preset skin tone threshold;
determining a color enhancement gain according to the skin tone weight of the pixel when the skin tone weight of the pixel is less than or equal to the preset skin tone threshold;
fusing the skin tone protection gain and the color enhancement gain in the first processed image to obtain a fusion gain; and
performing enhancement processing on the first processed image using the fusion gain to obtain an enhanced image.
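The gain fusion of claim 9 can be sketched as a per-pixel selection between a protection gain and an enhancement gain. The specific linear forms and the default gain values below are illustrative assumptions; the claims only require that both gains depend on the skin tone weight:

```python
import numpy as np

def fusion_gain(weights, skin_thresh, protect_gain=1.0, enhance_gain=1.4):
    """Per-pixel fusion of skin tone protection and color enhancement.

    `weights` is the per-pixel skin tone weight map in [0, 1]. Above
    the threshold, gain is pulled toward unity as the weight rises
    (protecting skin from over-saturation); elsewhere the full
    enhancement gain applies.
    """
    w = np.asarray(weights, dtype=np.float64)
    # Skin tone protection: the more skin-like, the closer to unity gain.
    g_protect = protect_gain + (enhance_gain - protect_gain) * (1.0 - w)
    # Color enhancement: full enhancement for non-skin pixels.
    g_enhance = np.full_like(w, enhance_gain)
    return np.where(w > skin_thresh, g_protect, g_enhance)
```

Making the protection gain a continuous function of the weight avoids a visible seam at the skin/non-skin boundary, which a hard on/off gain would produce.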
10. An electronic device, comprising:
the image processing apparatus according to any one of claims 1 to 9.
11. An image processing method, comprising:
preprocessing each image frame of the video stream to be processed to obtain a first processed image;
determining the hue, saturation and brightness of each pixel in the first processed image;
determining a skin tone weight of each pixel in the first processed image according to the hue, the saturation, and the brightness; and
determining a skin tone region in the first processed image according to the skin tone weight of each pixel and a preset skin tone threshold.
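End to end, the method of claim 11 amounts to HSV conversion, per-pixel weighting, and thresholding. In this sketch, `weight_fn(h, s, v)` stands in for the lookup-table interpolation step and is an assumption; all channels are normalized to [0, 1]:

```python
import numpy as np

def skin_mask(rgb, weight_fn, skin_thresh):
    """Convert RGB to HSV, compute per-pixel skin tone weights via
    `weight_fn`, and threshold them into a skin tone region mask."""
    img = np.asarray(rgb, dtype=np.float64) / 255.0
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = img.max(axis=-1)                                  # brightness
    c = v - img.min(axis=-1)                              # chroma
    s = np.where(v > 0, c / np.maximum(v, 1e-12), 0.0)    # saturation
    cs = np.maximum(c, 1e-12)
    chromatic = c > 0
    # Piecewise hue formula, selected by the dominant channel.
    h = np.select(
        [chromatic & (v == r), chromatic & (v == g), chromatic & (v == b)],
        [((g - b) / cs) % 6, (b - r) / cs + 2, (r - g) / cs + 4],
        default=0.0,
    ) / 6.0
    weights = weight_fn(h, s, v)
    return weights > skin_thresh
```

For example, passing a trained lookup-table interpolator as `weight_fn` yields the claimed skin tone region directly as a boolean mask.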
12. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of claim 11.
13. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of claim 11.
CN202311215065.3A 2023-09-19 2023-09-19 Image processing apparatus, method, device, and storage medium Pending CN117221550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311215065.3A CN117221550A (en) 2023-09-19 2023-09-19 Image processing apparatus, method, device, and storage medium

Publications (1)

Publication Number Publication Date
CN117221550A 2023-12-12

Family

ID=89040355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311215065.3A Pending CN117221550A (en) 2023-09-19 2023-09-19 Image processing apparatus, method, device, and storage medium

Country Status (1)

Country Link
CN (1) CN117221550A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination