WO2023029233A1

WO2023029233A1 - Face pigment detection model training method and apparatus, device, and storage medium

Info

Publication number: WO2023029233A1
Application number: PCT/CN2021/132558
Authority: WO
Inventors: 李启东; 李志阳; 王喆; 杨小栋
Original assignee: 厦门美图宜肤科技有限公司
Priority date: 2021-08-30
Filing date: 2021-11-23
Publication date: 2023-03-09
Also published as: JP2023546307A; CN113688752B; JP7455234B2; KR20230035225A; CN113688752A

Abstract

The present application relates to the technical field of image processing, and provides a face pigment detection model training method and apparatus, a device, and a storage medium. The method comprises: performing gain processing on an original sample image to obtain a target sample image; inputting the target sample image into an initial face pigment detection model to obtain an actual black-pigment high-definition detail image and an actual red-pigment high-definition detail image; decomposing the original sample image to obtain a supervised black-pigment high-definition detail image and a supervised red-pigment high-definition detail image; and by using the supervised black-pigment high-definition detail image and the supervised red-pigment high-definition detail image as supervision parameters, iteratively correcting the initial face pigment detection model according to the actual black-pigment high-definition detail image and the actual red-pigment high-definition detail image, to obtain a target face pigment detection model. The present solution solves the problem that since the quality of image captured by a low-cost camera is poor, colors of adjacent pixels of the image tend to be consistent, resulting in low decomposition quality of different pigments in the face image.

Description

Human face pigment detection model training method, device, equipment and storage medium

Cross References to Related Applications

This application claims priority to Chinese patent application No. 2021110024638, titled "Human face pigment detection model training method, device, equipment and storage medium" and filed with the Chinese Patent Office on August 30, 2021, the entire content of which is incorporated herein by reference.

Technical Field

The present application relates to the technical field of image processing, in particular, to a human face pigment detection model training method, device, equipment and storage medium.

Background

The skin color of the human face is mainly determined by two pigments: melanin and hemoglobin. These two pigments have relatively fixed spectra for the absorption and reflection of light, and therefore appear with relatively fixed colors in an image; the overall color of the facial skin is determined by the content of the two pigments. Conversely, the content of melanin (yielding a brown image, Brown) and hemoglobin (yielding a red image, Red) can be calculated from the imaging result. The color of a captured face image can therefore be analyzed to obtain the distribution of the different pigments in the face image.

At present, such image analysis and processing methods are often applicable only to high-quality images, such as those collected by professional digital cameras or SLR cameras. They perform poorly on low-quality images such as those from mobile phone cameras: because such images contain more color noise, the denoising step in the phone's imaging algorithm causes the colors of adjacent pixels to tend to be consistent, which in turn degrades the identification and separation of different pigments in the face image.

Therefore, how to solve the problem that the poor image quality of a low-cost camera causes the colors of adjacent pixels to tend to be consistent, resulting in low decomposition quality of different pigments in the face image, is a technical problem to be solved urgently.

Summary of the Invention

This application provides a human face pigment detection model training method, device, equipment and storage medium, which solve the problem that the poor image quality of low-cost cameras causes the colors of adjacent pixels in the image to tend to be consistent, resulting in low decomposition quality of different pigments in the face image.

Some embodiments of the present application provide a human face pigment detection model training method, the method may include:

performing gain processing on an original sample image to obtain a target sample image, where the resolution of the original sample image may be higher than the resolution of the target sample image;

inputting the target sample image into an initial human face pigment detection model to obtain an actual melanin high-definition detail image and an actual red pigment high-definition detail image output by the initial human face pigment detection model;

decomposing the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image;

using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial human face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image, to obtain a target human face pigment detection model.

Optionally, using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters and iteratively correcting the initial human face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image to obtain the target human face pigment detection model may include:

using the brightness information of the supervised melanin high-definition detail image and the brightness information of the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial human face pigment detection model according to the brightness information of the actual melanin high-definition detail image and the brightness information of the actual red pigment high-definition detail image, to obtain the target human face pigment detection model.

Optionally, the initial human face pigment detection model may include: an encoder, a first decoder and a second decoder;

Inputting the target sample image into the initial human face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial human face pigment detection model may include:

encoding the target sample image by the encoder to obtain encoded features;

performing detail decoding on the encoded features by the first decoder to obtain a melanin detail image and a red pigment detail image;

performing color decoding on the encoded features by the second decoder to obtain a melanin color image and a red pigment color image;

superimposing, by the initial human face pigment detection model, the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimposing the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.

Optionally, performing color decoding on the encoded features by the second decoder to obtain a melanin color image and a red pigment color image may include:

performing color decoding on the encoded features by the second decoder to obtain an intermediate melanin coefficient map matrix and an intermediate red pigment coefficient map matrix; multiplying the intermediate melanin coefficient map matrix by the pixel vector at each pixel position in the target sample image to obtain the melanin color image; and multiplying the intermediate red pigment coefficient map matrix by the pixel vector at each pixel position in the target sample image to obtain the red pigment color image.

Optionally, superimposing, by the initial human face pigment detection model, the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimposing the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image, may include:

adding, by the initial human face pigment detection model, the pixel values at the same position and in the same channel of the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and adding the pixel values at the same position and in the same channel of the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.
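As a rough illustration of the color decoding and superposition steps described above, the following Python sketch operates on a single pixel position. It assumes, purely for illustration, that the coefficient map stores one 3×3 matrix per pixel and that pixels are RGB triples. The function names `apply_coefficients` and `superimpose` and all sample values are hypothetical and are not taken from the application.

```python
def apply_coefficients(coeff, pixel):
    """Multiply an assumed 3x3 per-pixel coefficient matrix by an RGB pixel vector."""
    return [sum(coeff[r][c] * pixel[c] for c in range(3)) for r in range(3)]


def superimpose(detail_pixel, color_pixel):
    """Add the values at the same position and in the same channel (detail + color)."""
    return [d + c for d, c in zip(detail_pixel, color_pixel)]


# Hypothetical values for one pixel position of the target sample image.
target_pixel = [0.8, 0.6, 0.5]
melanin_coeff = [[0.5, 0.0, 0.0],
                 [0.0, 0.5, 0.0],
                 [0.0, 0.0, 0.5]]
melanin_detail = [0.25, 0.0, -0.25]

# Color decoding output for this pixel, then superposition with the detail value.
melanin_color = apply_coefficients(melanin_coeff, target_pixel)
melanin_hd = superimpose(melanin_detail, melanin_color)
```

Under the same assumptions, the red pigment branch would be handled identically with its own coefficient matrix and detail image.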

Optionally, the gain processing may include at least one of the following: compression processing, color format conversion processing, and pigment region color adjustment processing.

Optionally, the pigment region color adjustment processing may include: detecting a melanin area and a red pigment area in the original sample image, removing the melanin area and the red pigment area from the original sample image, and fusing the image from which the melanin area and the red pigment area have been removed with the original sample image.

Other embodiments of the present application also provide a method for detecting human face pigment, the method may include:

acquiring a target sample image, where the target sample image may be an image captured by a low-resolution camera;

inputting the target sample image into the target human face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the target human face pigment detection model;

determining the melanin distribution information and the red pigment distribution information in the target sample image according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image.

Some other embodiments of the present application also provide a human face pigment detection model training device, the device may include:

a gain module, which may be configured to perform gain processing on the original sample image to obtain a target sample image, where the resolution of the original sample image may be higher than the resolution of the target sample image;

a processing module, which may be configured to input the target sample image into the initial human face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial human face pigment detection model, and to decompose the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image;

a correction module, which may be configured to use the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters and to iteratively correct the initial human face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image, to obtain the target human face pigment detection model.

Optionally, the correction module can also be configured to:

The processing module can also be configured to:

Encoding the target sample image by the encoder to obtain encoded features;

Optionally, the processing module may also be configured to:

Further embodiments of the present application also provide a human face pigment detection device, which may include:

an acquisition module, which may be configured to acquire a target sample image, where the target sample image is an image captured by a low-resolution camera;

a processing module, which may be configured to input the target sample image into the target human face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the target human face pigment detection model;

a determination module, which may be configured to determine the melanin distribution information and the red pigment distribution information in the target sample image according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image.

Other embodiments of the present application also provide an electronic device. The electronic device may include a processor, a storage medium, and a bus. The storage medium stores machine-readable instructions executable by the processor. When the electronic device runs, the processor communicates with the storage medium through the bus, and the processor executes the machine-readable instructions to perform the steps of the method provided in the first aspect or the second aspect above.

Another embodiment of the present application also provides a computer storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the method provided in the above-mentioned embodiments are executed.

The beneficial effects of the present application include at least the following:

The embodiments of the present application provide a human face pigment detection model training method, device, equipment and storage medium. The method may include: performing gain processing on an original sample image to obtain a target sample image, where the resolution of the original sample image is higher than that of the target sample image; inputting the target sample image into an initial face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial face pigment detection model; decomposing the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image; and, using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image to obtain a target face pigment detection model.
In this scheme, gain processing is performed on an original sample image collected by a professional digital camera or SLR camera to obtain a target sample image that simulates a face image captured by a mobile phone camera. The target sample image is then input into the initial face pigment detection model to obtain an HB image and an HR image, and the TB image and TR image decomposed from the original sample image are used to iteratively correct the initial face pigment detection model, yielding the target face pigment detection model. A target sample image captured by a low-cost camera can then be input into the trained target face pigment detection model to obtain the HB image and HR image it outputs. This enables accurate detection of melanin and red pigment in face images collected by low-cost cameras, solves the problem that the poor image quality of low-cost cameras causes the colors of adjacent pixels to tend to be consistent and thereby lowers the decomposition quality of different pigments in the face image, and better restores the detail information in the actual melanin high-definition detail image and the actual red pigment high-definition detail image.

Brief Description of the Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the accompanying drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and therefore should not be regarded as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without creative work.

FIG. 1 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;

FIG. 2 is a schematic flow chart of a human face pigment detection model training method provided by an embodiment of the present application;

FIG. 3 is a framework diagram of the initial human face pigment detection model in a human face pigment detection model training method provided by an embodiment of the present application;

FIG. 4 is a schematic flow chart of another human face pigment detection model training method provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a human face pigment detection model training device provided in an embodiment of the present application.

Detailed Description

In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the drawings in the embodiments of the present application. It should be understood that the accompanying drawings are only for the purpose of illustration and description, and are not used to limit the protection scope of the present application. Additionally, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in this application illustrate operations implemented in accordance with some embodiments of the application. It should be understood that the operations of the flowcharts may be performed out of order, and steps that have no logical dependency on one another may be performed in reverse order or concurrently. In addition, those skilled in the art may add one or more other operations to the flowchart or remove one or more operations from the flowchart under the guidance of the content of the present application.

In addition, the described embodiments are only some of the embodiments of the application, not all of the embodiments. The components of the embodiments of the application generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments of the application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without making creative efforts belong to the scope of protection of the present application.

It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the existence of the features stated later, but does not exclude the addition of other features.

FIG. 1 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. The electronic device may be a processing device such as a computer or a server, and is used to implement the human face pigment detection model training method provided in the present application. As shown in FIG. 1, the electronic device may include a processor 101 and a memory 102.

The processor 101 and the memory 102 may be directly or indirectly electrically connected to realize data transmission or interaction. For example, electrical connections may be made through one or more communication buses or signal lines.

The processor 101 may be an integrated circuit chip with signal processing capability. The processor 101 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP) and the like, and may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor.

The memory 102 may be, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electric Erasable Programmable Read-Only Memory, EEPROM), etc.

It can be understood that the structure shown in FIG. 1 is only for illustration, and the electronic device 100 may include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1. Each component shown in FIG. 1 may be implemented by hardware, software or a combination thereof.

The memory 102 is used to store programs, and the processor 101 invokes the programs stored in the memory 102 to execute the human face pigment detection model training method provided in the following embodiments.

The human face pigment detection model training method provided by the embodiments of the present application is described in detail below through multiple embodiments.

FIG. 2 is a schematic flow chart of a human face pigment detection model training method provided by an embodiment of the present application. Optionally, the execution subject of the method may be an electronic device with a data processing function, such as a server or a computer. It should be understood that in other embodiments, the order of some steps in the face pigment detection model training method can be exchanged according to actual needs, or some steps can be omitted or deleted. As shown in FIG. 2, the method may include:

S201. Perform gain processing on the original sample image to obtain a target sample image. The resolution of the original sample image may be higher than the resolution of the target sample image.

Here, the original sample image refers to a face image collected by a professional digital camera, a single-lens reflex (SLR) camera, or the like. For example, a sufficient number of original face sample images are taken with an SLR camera under a suitable light source (usually cross-polarized light). Such images clearly distinguish the brown areas of the face (such as spots and pores) and the red areas (such as acne, sensitive skin, and red blood streaks) from normal skin areas. The areas corresponding to melanin ultimately appear as a brown map (denoted the Brown map), and the areas corresponding to hemoglobin appear as a red map (denoted the Red map).

Face images collected by professional digital cameras or SLR cameras are high-definition images. In this embodiment, in order to make the target face pigment detection model obtained by subsequent training better suited to the lower-resolution image quality of mobile phone photos while keeping the brown map and red map corresponding to each face image unchanged, this application proposes to perform gain processing on the original sample image. For example, JPEG compression with a randomly chosen quality level can be applied to the original sample image to reduce its quality, thereby simulating the 3-channel color face image produced by a real mobile phone camera.

S202. Input the target sample image into the initial face pigment detection model, and obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial human face pigment detection model.

Optionally, the initial human face pigment detection model may be an encoder-decoder (Encoder-Decoder) network model, a deep neural network (Deep Neural Network, DNN) model, or another trainable network model; the initial human face pigment detection model is not specifically limited here.

Here, the "actual melanin high-definition detail image" refers to a melanin image containing high-definition details (referred to as the HB image), and the "actual red pigment high-definition detail image" refers to a red pigment image containing high-definition details (referred to as the HR image).

In this embodiment, taking an Encoder-Decoder network model as an example of the initial human face pigment detection model, the target sample image obtained in step S201 is input into the initial human face pigment detection model and processed by the Encoder-Decoder network model to obtain the HB image and HR image output by that model.

S203. Decompose the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image.

In this embodiment, in order to make the details of the HB image and HR image output by the initial human face pigment detection model clearer, it is proposed to use the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image for supervised training of the initial face pigment detection model.

The supervised melanin high-definition detail image is obtained by decomposing the original sample image using a traditional decomposition algorithm. It serves as the ground-truth target image for supervision while the initial face pigment detection model is trained, and is denoted the TB image (Brown map).

Similarly, the supervised red pigment high-definition detail image is also obtained by decomposing the original sample image using the traditional decomposition algorithm. It serves as the ground-truth target image for supervision while the initial face pigment detection model is trained, and is denoted the TR image (Red map).

In this embodiment, the following decomposition method may be used to decompose the original sample image to obtain a TB image and a TR image.

By manual screening, the Brown and Red channel decomposition vectors are selected from the original sample image, denoted $\sigma_B = [\sigma_{B1}, \sigma_{B2}, \sigma_{B3}]^t$ (Brown decomposition vector) and $\sigma_R = [\sigma_{R1}, \sigma_{R2}, \sigma_{R3}]^t$ (Red decomposition vector), so as to extract the melanin (brown) map and the red pigment (red) map from the original sample image.

For an image $C$ expressed in RGB format, the pixel value at position $i$ is $C_i = [R_i, G_i, B_i]^t$, a 3×1 column vector. Define the vector:

$$LC_i = -\log(C_i) = -[\log(R_i), \log(G_i), \log(B_i)]^t$$

Here, $t$ denotes transpose, and $\log(\cdot)$ denotes the natural logarithm.

Construct 2 decomposition vectors and construct the following matrix:

Calculate the 2 decomposition maps as follows:

(1) Determine the fixed constant offset vector $E_0$ of size 3×1, generally taking $E_0 = [0, 0, 0]^t$;

(2) Calculate a new 3-channel map $E$, where $E_i = D^{-1} \times [LC_i - E_0]$;

(3) Calculate the projection of $E$ onto the 2 decomposition vectors to obtain the brown map and the red map, namely:

Brown map Brown:

Red map Red:

Here, $D^{-1}$ represents the inverse matrix of $D$; $E_i \cdot \sigma_B$ and $E_i \cdot \sigma_R$ each represent the product of two 3×1 column vectors taken element by element (written as a dot product), which still yields a 3×1 column vector; the meaning of the exponential operation is

$(x_1, x_2, x_3)^t$ represents a 3×1 column vector.

Here, the brown map Brown is the TB image in this application, and the red map Red is the TR image in this application.
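Steps (1) and (2) above can be sketched in plain Python for a single pixel. This is a minimal sketch under stated assumptions: RGB channels are assumed to be normalized to (0, 1], a small `eps` clamp is added to avoid log(0), and the matrix `D` is passed in as a parameter because its construction is not reproduced in this text. The helper names (`neg_log_pixel`, `solve3`, `e_vector`) and the sample values are hypothetical. The projection in step (3) is not sketched.

```python
import math


def neg_log_pixel(pixel, eps=1e-6):
    """LC_i = -log(C_i) per channel; eps (an assumption) guards against log(0)."""
    return [-math.log(max(c, eps)) for c in pixel]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(D, v):
    """Return D^-1 v for a 3x3 matrix D using Cramer's rule."""
    d = det3(D)
    if abs(d) < 1e-12:
        raise ValueError("D is singular and cannot be inverted")
    result = []
    for col in range(3):
        # Replace one column of D by v and take the determinant ratio.
        m = [[v[r] if c == col else D[r][c] for c in range(3)] for r in range(3)]
        result.append(det3(m) / d)
    return result


def e_vector(pixel, D, E0=(0.0, 0.0, 0.0)):
    """E_i = D^-1 x (LC_i - E0) for one RGB pixel."""
    lc = neg_log_pixel(pixel)
    return solve3(D, [l - e for l, e in zip(lc, E0)])


# Hypothetical check: with D set to the identity matrix, E_i equals LC_i.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = (math.exp(-1.0), math.exp(-2.0), math.exp(-3.0))
E_i = e_vector(pixel, identity)
```

Solving the 3×3 system per pixel keeps the sketch dependency-free; a practical implementation would more likely invert `D` once and apply it to the whole image with an array library.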

It is worth noting that in this application the above decomposition method is used only to obtain the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image for deep learning training; it is not used again once training of the network model is complete.

S204. Using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters, iteratively correct the initial human face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image to obtain the target human face pigment detection model.

For example, the TB image is used to supervise the HB image output by the initial human face pigment detection model, and the TR image is used to supervise the HR image output by that model. The initial human face pigment detection model is trained over multiple iterations until both the difference between the HB image output by the trained model and the TB image and the difference between the HR image and the TR image fall below a preset value, at which point network training can be considered complete and the target human face pigment detection model is obtained. The target human face pigment detection model can then be used to detect the distribution of different pigments in face images captured by a low-cost camera.
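The stopping rule described above can be sketched as follows. The text does not specify how the difference between two images is measured; this sketch assumes a mean absolute (L1) difference over images stored as flat lists of values. The names `mean_abs_diff` and `training_converged`, the sample images, and the threshold are all hypothetical.

```python
def mean_abs_diff(image_a, image_b):
    """Mean absolute difference between two equally sized flat value lists."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)


def training_converged(hb, tb, hr, tr, preset_value):
    """True when both the HB/TB and the HR/TR differences are below preset_value."""
    return (mean_abs_diff(hb, tb) < preset_value
            and mean_abs_diff(hr, tr) < preset_value)


# Hypothetical 4-value "images": HB is already close to TB in both cases,
# while HR is far from TR early in training and close to it later.
tb, tr = [0.2, 0.4, 0.6, 0.8], [0.1, 0.3, 0.5, 0.7]
hb = [0.21, 0.39, 0.6, 0.8]
hr_early = [0.5, 0.3, 0.5, 0.7]
hr_late = [0.11, 0.3, 0.5, 0.69]

early = training_converged(hb, tb, hr_early, tr, preset_value=0.05)
late = training_converged(hb, tb, hr_late, tr, preset_value=0.05)
```

In this sketch training stops only when both pigment branches meet the threshold, which matches the requirement that the HB/TB and HR/TR differences are each reduced below the preset value.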

The following is a brief description of the application of the trained target face pigment detection model.

In this embodiment, a target sample image captured by a low-cost camera is acquired and input into the target human face pigment detection model obtained through the above training, and the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the target human face pigment detection model are obtained. This achieves accurate detection of melanin and red pigment in face images collected by low-cost cameras, solves the problem that the poor image quality of low-cost cameras causes the colors of adjacent pixels to tend to be consistent and thereby lowers the decomposition quality of different pigments in the face image, and better restores the detail information in the actual melanin high-definition detail image and the actual red pigment high-definition detail image.

To sum up, the embodiments of the present application provide a human face pigment detection model training method, which may include: performing gain processing on an original sample image to obtain a target sample image, where the resolution of the original sample image may be higher than that of the target sample image; inputting the target sample image into an initial face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial face pigment detection model; decomposing the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image; and, using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image to obtain a target face pigment detection model. In this scheme, gain processing is performed on an original sample image collected by a professional digital camera or SLR camera to obtain a target sample image that simulates a face image captured by a mobile phone camera. The target sample image is then input into the initial face pigment detection model to obtain an HB image and an HR image, and the TB image and TR image decomposed from the original sample image are used to iteratively correct the initial face pigment detection model, yielding the target face pigment detection model. A target sample image captured by a low-cost camera can then be input into the trained target face pigment detection model to obtain the HB image and HR image it outputs.
This enables accurate detection of melanin and red pigment in face images collected by low-cost cameras, solves the problem that the poor image quality of low-cost cameras causes the colors of adjacent pixels to tend to be consistent and thereby lowers the decomposition quality of different pigments in the face image, and better restores the detail information in the actual melanin high-definition detail image and the actual red pigment high-definition detail image.

The following embodiment explains S204 in detail: how to use the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters and iteratively correct the initial face pigment detection model, according to the brightness information of the actual melanin high-definition detail image and the brightness information of the actual red pigment high-definition detail image, to obtain the target face pigment detection model.

Optionally, the brightness information of the supervised melanin high-definition detail image and the brightness information of the supervised red pigment high-definition detail image are used as supervision parameters, and the initial face pigment detection model is iteratively corrected according to the brightness information of the actual melanin high-definition detail image and the brightness information of the actual red pigment high-definition detail image, to obtain the target face pigment detection model.

In this embodiment, to make the detail information in the HB image and the HR image output by the initial face pigment detection model clearer, this application proposes, in addition to supervision by a conventional loss function such as L1, to use the "aligned brightness details" of the TB image and the TR image as an additional supervision parameter for supervised training of the initial face pigment detection model. The temporary face pigment detection model is updated continuously. Once, after some iteration, the error between the HB image output by the temporary face pigment detection model and the TB image, and the error between the HR image and the TR image, both meet the preset conditions, the iterative process ends and the temporary face pigment detection model obtained at that point is taken as the target face pigment detection model.

Taking the HB image and the TB image as an example: for the 3-channel HB image and TB image, take the maximum value (denoted max) and the minimum value (denoted min) over the 3 channels, and weight max by a ratio c to extract the brightness detail information, as follows:

HB_L = [c × max(HB) + min(HB)] / (1 + c)

TB_L = [c × max(TB) + min(TB)] / (1 + c)

TB_L is used to supervise HB_L so that the detail information of the HB image is better restored during training. HR and TR are supervised in the same way. In comparative training experiments, the training result was relatively good when c was in the range 1.5-2.0.
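The brightness-detail extraction above can be sketched as follows. This is a minimal illustration only, assuming NumPy arrays in 3xHxW layout. The function names `brightness_detail` and `brightness_l1_loss` are hypothetical, and using an L1 distance for the brightness term is an assumption, since the text only states that TB_L supervises HB_L.

```python
import numpy as np


def brightness_detail(img: np.ndarray, c: float = 1.5) -> np.ndarray:
    """Return the HxW brightness-detail map of a 3xHxW image.

    Implements [c * max + min] / (1 + c), where max and min are taken
    over the 3 channels at each pixel position.
    """
    channel_max = img.max(axis=0)
    channel_min = img.min(axis=0)
    return (c * channel_max + channel_min) / (1.0 + c)


def brightness_l1_loss(hb: np.ndarray, tb: np.ndarray, c: float = 1.5) -> float:
    """Mean absolute difference between HB_L and TB_L.

    Assumption: L1 is used for this term; the text does not fix the distance.
    """
    return float(np.mean(np.abs(brightness_detail(hb, c) - brightness_detail(tb, c))))
```

The same two functions would apply unchanged to the HR and TR images.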

The following embodiment explains S202 in detail: how to input the target sample image into the initial face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial face pigment detection model, and the initial face pigment detection model used in this application.

Fig. 3 is a framework diagram of an initial face pigment detection model provided by an embodiment of the present application. As shown in Fig. 3, the initial face pigment detection model may include: an encoder (Encoder), a first decoder (Decoder1) and a second decoder (Decoder2).

An Encoder-Decoder network is selected as the initial face pigment detection model. In this embodiment, the specific layer composition of the Encoder-Decoder network is not considered. The encoder encodes the target sample image input into the initial face pigment detection model to obtain encoded features. Decoding has two branches, Decoder1 and Decoder2: Decoder1 generates image detail information and Decoder2 generates image color information. Adding the detail information and the color information yields the final HB image (Brown image) and HR image (Red image) with high-definition detail information.

How to obtain the actual melanin high-definition detailed image HB and the actual red pigment high-definition detailed image HR output by the initial face pigment detection model will be described in detail below in conjunction with Figures 3-4.

Fig. 4 is a schematic flow chart of another face pigment detection model training method provided by an embodiment of the present application. As shown in Fig. 4, the above step S202, inputting the target sample image into the initial face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial face pigment detection model, may include:

S401. The encoder encodes the target sample image to obtain encoded features.

The target sample image may be a 3-channel color image LI that simulates an image actually captured by a mobile phone. The size of LI is 3xHxW, where H is the height of the image and W is the width of the image.

Optionally, the encoder may encode the target sample image to convert the target sample image into a fixed-length vector and obtain encoded features.

S402. The first decoder performs detail decoding on the encoded features to obtain a melanin detail image and a red pigment detail image.

Optionally, the first decoder Decoder1 performs detail decoding on the encoded features, gradually restoring the spatial detail information of the target sample image, to obtain the DB image and the DR image.

It is worth noting that the DB image and the DR image may both have the same size as the target sample image: both have 3 channels and a size of HxW.

S403. The second decoder performs color decoding on the encoded features to obtain a melanin color image and a red pigment color image.

Optionally, the encoded features are color-decoded by the second decoder to obtain a melanin color image and a red pigment color image, including:

The encoded features are color-decoded by the second decoder to obtain an intermediate melanin coefficient map matrix and an intermediate red pigment coefficient map matrix; the intermediate melanin coefficient map matrix is multiplied by the pixel vector at each pixel position in the target sample image to obtain the melanin color image, and the intermediate red pigment coefficient map matrix is multiplied by the pixel vector at each pixel position in the target sample image to obtain the red pigment color image.

The intermediate melanin coefficient map matrix and the intermediate red pigment coefficient map matrix are the 12-channel Brown coefficient map matrix KB and Red coefficient map matrix KR that the second decoder Decoder2 obtains by color-decoding the encoded features; their height and width match those of the target sample image.

The Brown coefficient map matrix KB and the Red coefficient map matrix KR both have size 12xHxW. The 12 means that each pixel position i has 12 coefficients, which are used to construct a 3x4 coefficient matrix for position i.

In this embodiment, consider a coefficient map matrix of size 12xHxW and a target sample image LI of size 3xHxW. For each pixel position i of LI, the pixel value is denoted I_i = (IP_i1, IP_i2, IP_i3), and the 12 coefficients at position i of the coefficient map are arranged into a 3x4 matrix K_i34. A 1 is appended to the pixel value at position i to form the homogeneous pixel value (IP_i1, IP_i2, IP_i3, 1), which is transposed into a 4x1 homogeneous vector. A matrix-vector multiplication then gives the color result O_i = (OP_i1, OP_i2, OP_i3) at position i, namely:

O_i = K_i34 × (IP_i1, IP_i2, IP_i3, 1)^T

Through the calculation method of the above formula, both the melanin color image OB and the red pigment color image OR can be calculated.
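The per-pixel matrix-vector multiplication can be sketched as below. This is an illustrative sketch, assuming NumPy arrays and assuming that the 12 coefficients at each position fill the 3x4 matrix in row-major order, which the text does not specify. The function name `apply_coefficient_map` is hypothetical.

```python
import numpy as np


def apply_coefficient_map(k: np.ndarray, li: np.ndarray) -> np.ndarray:
    """Apply a 12xHxW coefficient map to a 3xHxW image, giving a 3xHxW color image.

    At each pixel position the 12 coefficients form a 3x4 matrix that is
    multiplied by the homogeneous pixel vector (IP1, IP2, IP3, 1)^T.
    """
    _, h, w = li.shape
    # Assumption: row-major arrangement of the 12 coefficients into 3 rows x 4 columns.
    k34 = k.reshape(3, 4, h, w)
    # Append a channel of ones to form the homogeneous pixel values (4xHxW).
    ones = np.ones((1, h, w), dtype=li.dtype)
    li_h = np.concatenate([li, ones], axis=0)
    # For every pixel: out[r] = sum over c of k34[r, c] * li_h[c].
    return np.einsum("rchw,chw->rhw", k34, li_h)
```

Calling the function with KB gives the melanin color image OB, and calling it with KR gives the red pigment color image OR. Because the fourth column multiplies the constant 1, it acts as a per-pixel color offset.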

In this embodiment, images captured by different types of cameras differ in quality and would otherwise require repeated analysis and screening, and the decomposition results are prone to uneven transitions such as color blocks. The melanin coefficient map matrix and the red pigment coefficient map matrix proposed in this application avoid the problem of uneven transitions, and the detail-learning process restores the detail information of the decomposition maps, so that the face pigment detection results highlight special areas of the skin such as spots, acne and pores.

S404. The initial face pigment detection model superimposes the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimposes the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.

In this embodiment, the above obtained final DB image containing high-definition details and the OB image containing melanin color are superimposed to obtain an HB image. That is, HB=OB+DB.

Similarly, the final DR image containing high-definition details obtained above is superimposed with the OR image containing red pigment color to obtain an HR image. That is, HR=OR+DR.

Optionally, superimposing, by the initial face pigment detection model, the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimposing the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image, may include:

The initial face pigment detection model adds the pixel values at the same position and in the same channel of the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and adds the pixel values at the same position and in the same channel of the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.

In this embodiment, because the detail image and the color image have exactly the same size and both have 3 channels, superimposing the melanin detail image and the melanin color image, and likewise the red pigment detail image and the red pigment color image, means adding them pixel by pixel.

For example, taking the HR image: for each pixel position i, the pixel value of HR is denoted HR_i = (HR_i1, HR_i2, HR_i3), the pixel value of OR is denoted OR_i = (OR_i1, OR_i2, OR_i3), and the pixel value of DR is denoted DR_i = (DR_i1, DR_i2, DR_i3). Then:

HR_i = OR_i + DR_i = (OR_i1 + DR_i1, OR_i2 + DR_i2, OR_i3 + DR_i3)

Similarly, the HB image can be obtained by using the above superposition method.

The following embodiments explain in detail what the gain processing mentioned in the above S201 includes.

Optionally, the gain processing includes at least one of the following: compression processing, color format conversion processing, and pigment region color adjustment processing.

Since the data captured by an SLR camera are high-definition images, in this embodiment, to make the target face pigment detection model better suited to the image quality of mobile phones, the quality of the original sample image captured by the SLR camera needs to be reduced while the Brown and Red images corresponding to each image are kept unchanged. This application therefore proposes additional gain processing of the original sample image captured by the SLR camera, which may include at least one of compression processing, color format conversion processing, and pigment region color adjustment processing, so that in practical use face pigment detection can be performed on the relatively low-quality images captured by mobile phone cameras or other devices.

Optionally, the pigment region color adjustment processing may include: detecting a melanin region and a red pigment region in the original sample image, removing the melanin region and the red pigment region from the original sample image, and fusing the image from which the melanin region and the red pigment region have been removed with the original sample image.

(1) Compression processing. Each input original sample image is compressed using the JPEG (jpg) compression method at a random quality. During training, the compression quality is drawn at random from 80-99, so that the convolutional neural network (Convolutional Neural Network, CNN for short) eliminates the influence of differing compression quality during learning.
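A minimal sketch of this compression step, assuming the Pillow library for JPEG encoding and an HxWx3 uint8 RGB array as input. The function name `random_jpeg_compress` and the in-memory round trip are illustrative choices, not taken from this application.

```python
import io
import random

import numpy as np
from PIL import Image


def random_jpeg_compress(img: np.ndarray, rng: random.Random,
                         q_low: int = 80, q_high: int = 99) -> np.ndarray:
    """Round-trip an HxWx3 uint8 RGB image through JPEG at a random quality.

    The quality is drawn uniformly from [q_low, q_high], both inclusive.
    """
    quality = rng.randint(q_low, q_high)
    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    # Decode again so the compression artifacts are present in the pixel data.
    return np.asarray(Image.open(buf).convert("RGB"))
```

Passing a seeded `random.Random` instance keeps the drawn quality values reproducible across training runs.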

(2) Color format conversion processing. A desaturation algorithm is mainly used to reduce the saturation of the original sample image, weakening the color difference between brown areas (such as spots and pores), red areas (such as acne, sensitive skin and red blood streaks) and other normal areas, because the color difference in images taken by a mobile phone is weaker than that in images taken by an SLR.

The usual desaturation method converts the original sample image into the HSL format, where H represents hue, S represents saturation and L represents brightness, and reduces the saturation of the original sample image by adjusting the S channel. To suit the channel decomposition task during training, a new desaturation method is adopted.

The general calculation process for calculating the S channel can be: convert any 3-channel color image into a color image represented by RGB, and convert the value to 0.0-1.0;

Calculate the maximum value smax=max(R, G, B) and minimum value smin=min(R, G, B) of RGB; calculate the brightness channel as L=(smax+smin)/2, and the difference between the two is Diff= (smax-smin), the calculation formula of saturation is:

Based on the above saturation calculation, a specific way of reducing the saturation S is proposed: keeping L unchanged, reducing the maximum value smax and increasing the minimum value smin reduces the saturation. With cs (0.0 ≤ cs ≤ 1.0) as the coefficient of the degree of saturation reduction, the new smax1 and smin1 are calculated as:

smax1 = (1.0 − 0.5 × cs × Diff²) × smax

smin1 = min(2 × L − smax1, smax1)

Among them, max(·) means to take the maximum value, and min(·) means to take the minimum value.

The new difference Diff1 is Diff1=smax1-smin1, replace the calculation formula of saturation to obtain a new saturation S, namely:

Generally, red acne and pigmentation spots are areas of high saturation, and their Diff values are correspondingly large. The above formula sufficiently reduces the saturation of areas with large Diff values while keeping the saturation of areas with small Diff values as unchanged as possible, thereby reducing the color difference between acne, pigmentation spots and normal skin areas.
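The desaturation procedure can be sketched as follows, under two stated assumptions. First, saturation is taken to follow the standard HSL definition, Diff / (smax + smin) when L ≤ 0.5 and Diff / (2 − smax − smin) otherwise. Second, the channels are remapped linearly from [smin, smax] to [smin1, smax1], a step the text does not describe. The function names are hypothetical.

```python
import numpy as np


def hsl_saturation(smax: np.ndarray, smin: np.ndarray) -> np.ndarray:
    """Standard HSL saturation from per-pixel channel maximum and minimum."""
    total = smax + smin  # equals 2 * L
    diff = smax - smin
    denom = np.where(total <= 1.0, total, 2.0 - total)
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, diff / safe, 0.0)


def desaturate(rgb: np.ndarray, cs: float) -> np.ndarray:
    """Reduce the saturation of a 3xHxW RGB image with values in [0, 1].

    cs in [0, 1] is the coefficient of the degree of saturation reduction.
    """
    smax = rgb.max(axis=0)
    smin = rgb.min(axis=0)
    lum = (smax + smin) / 2.0
    diff = smax - smin
    # New extremes as given in the text; L stays fixed unless smin1 is clipped.
    smax1 = (1.0 - 0.5 * cs * diff ** 2) * smax
    smin1 = np.minimum(2.0 * lum - smax1, smax1)
    diff1 = smax1 - smin1
    # Assumption: remap every channel linearly from [smin, smax] to [smin1, smax1].
    scale = np.where(diff > 0, diff1 / np.where(diff > 0, diff, 1.0), 1.0)
    return smin1 + (rgb - smin) * scale
```

Because the reduction factor depends on Diff², pixels that are already close to gray are left almost unchanged, while strongly colored pixels are pulled toward their brightness value.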

(3) Pigment region color adjustment processing. In images of mobile phone quality, the lighter brown/red areas of the face do not stand out from other skin areas. In this embodiment, so that the original sample image better simulates the image quality of a mobile phone while the Brown image better highlights brown spots, pores and the like and the Red image better highlights red acne, red blood streaks and red sensitive areas, it is proposed to use a detection algorithm to identify these Brown areas and Red areas in the original sample image Origin, and to use an inpainting algorithm to remove them, giving a clean result image denoted Clean. The two images are then merged by alpha fusion, that is, Clean*(alpha)+Origin*(1.0-alpha), where * denotes multiplication and alpha is in the range 0.0-0.5, so that the lighter brown/red areas of the face stand out better from other skin areas.
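The alpha fusion step can be sketched as below, assuming the Clean image has already been produced by some detection and inpainting stage that is not shown here. The function name `alpha_fuse` and the range check are illustrative.

```python
import numpy as np


def alpha_fuse(clean: np.ndarray, origin: np.ndarray, alpha: float) -> np.ndarray:
    """Blend the inpainted Clean image with Origin: Clean*alpha + Origin*(1 - alpha)."""
    if not 0.0 <= alpha <= 0.5:
        # The text restricts alpha to the range 0.0-0.5.
        raise ValueError("alpha must be in the range 0.0-0.5")
    return clean * alpha + origin * (1.0 - alpha)
```

With alpha = 0.0 the original image is returned unchanged. With alpha = 0.5 the brown and red areas are attenuated the most, since half of each output pixel comes from the inpainted image.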

It is worth noting that, when performing gain processing on the original sample image, any one of compression processing, color format conversion processing and pigment region color adjustment processing, a combination of any two, or all three may be selected to simulate images captured by a real mobile phone camera. The target face pigment detection model obtained by subsequent training is then applicable to low-quality images captured by low-cost cameras, which reduces the production cost of equipment such as skin testers and improves the effect of the face pigment detection method on mobile phone photographs.

The following describes the human face pigment detection model training device and storage media provided by this application. The specific implementation process and technical effects refer to the above, and will not be repeated below.

FIG. 5 is a schematic structural diagram of a face pigment detection model training device provided by an embodiment of the present application. As shown in FIG. 5, the device may include: a gain module 501, a processing module 502 and a correction module 503.

The gain module 501 may be configured to perform gain processing on the original sample image to obtain a target sample image, and the resolution of the original sample image is higher than the resolution of the target sample image;

The processing module 502 may be configured to input the target sample image into the initial face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial face pigment detection model, and to decompose the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image;

The correction module 503 may be configured to use the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters and to iteratively correct the initial face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image, to obtain the target face pigment detection model.

Optionally, the correction module 503 may also be configured to:

Use the brightness information of the supervised melanin high-definition detail image and the brightness information of the supervised red pigment high-definition detail image as supervision parameters, and iteratively correct the initial face pigment detection model according to the brightness information of the actual melanin high-definition detail image and the brightness information of the actual red pigment high-definition detail image, to obtain the target face pigment detection model.

The processing module 502 may also be configured to:

Encode the target sample image by the encoder to obtain encoded features;

Perform detail decoding on the encoded features by the first decoder to obtain a melanin detail image and a red pigment detail image;

Perform color decoding on the encoded features by the second decoder to obtain a melanin color image and a red pigment color image;

Superimpose, by the initial face pigment detection model, the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimpose the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.

Optionally, the processing module 502 may also be configured to:

The above-mentioned apparatus is used to execute the methods provided in the foregoing embodiments, and its implementation principles and technical effects are similar, and details are not repeated here.

The above modules may be one or more integrated circuits configured to implement the above method, for example: one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC for short), one or more microprocessors (digital signal processor, DSP for short), or one or more field-programmable gate arrays (Field Programmable Gate Array, FPGA for short). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU for short) or another processor that can call program code. For another example, these modules can be integrated together and implemented in the form of a system-on-a-chip (SOC for short).

Optionally, the present application further provides a program product, such as a computer-readable storage medium, including a program, and the program is used to execute the foregoing method embodiments when executed by a processor.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative. The division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical or in other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or in the form of hardware plus software functional units.

The above-mentioned integrated units implemented in the form of software functional units may be stored in a computer-readable storage medium. The software functional units are stored in a storage medium and include several instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute part of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (Read-Only Memory, ROM for short), random access memory (Random Access Memory, RAM for short), a magnetic disk or an optical disc.

Industrial Applicability

The present application provides a face pigment detection model training method, device, equipment and storage medium. The method includes: performing gain processing on an original sample image to obtain a target sample image; inputting the target sample image into an initial face pigment detection model to obtain an actual melanin high-definition detail image and an actual red pigment high-definition detail image; decomposing the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image; and, with the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image to obtain a target face pigment detection model. This solution solves the problem that the poor image quality of a low-cost camera causes the colors of adjacent pixels to tend toward uniformity, which lowers the decomposition quality of the different pigments in a face image.

In addition, it can be understood that the face pigment detection model training method, device, equipment and storage medium of the present application are reproducible and can be used in various industrial applications. For example, the human face pigment detection model training method, device, equipment and storage medium of the present application can be used in the technical field of image processing.

Claims

A human face pigment detection model training method, characterized by comprising:

performing gain processing on the original sample image to obtain a target sample image, where the resolution of the original sample image is higher than the resolution of the target sample image;

The target sample image is input into the initial human face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial human face pigment detection model;

Decomposing the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image;

Using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial human face pigment detection model according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image, to obtain the target human face pigment detection model.
The method according to claim 1, characterized in that, using the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervisory parameters, according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image, iteratively correcting the initial human face pigment detection model to obtain the target human face pigment detection model, including:

Using the brightness information of the supervised melanin high-definition detail image and the brightness information of the supervised red pigment high-definition detail image as supervision parameters, iteratively correcting the initial human face pigment detection model according to the brightness information of the actual melanin high-definition detail image and the brightness information of the actual red pigment high-definition detail image, to obtain the target human face pigment detection model.
The method according to claim 1 or 2, wherein the initial human face pigment detection model comprises: an encoder, a first decoder and a second decoder;

Said inputting said target sample image into the initial human face pigment detection model to obtain the actual melanin high-definition detail image and actual red pigment high-definition detail image output by said initial human face pigment detection model, including:

Encoding the target sample image by the encoder to obtain encoded features;

performing detail decoding on the encoded features by the first decoder to obtain a melanin detail image and a red pigment detail image;

performing color decoding on the encoded features by the second decoder to obtain a melanin color image and a red pigment color image;

Superimposing, by the initial human face pigment detection model, the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimposing the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.
The method according to claim 3, wherein the second decoder performs color decoding on the encoded features to obtain a melanin color image and a red pigment color image, including:

Performing color decoding on the encoded features by the second decoder to obtain an intermediate melanin coefficient map matrix and an intermediate red pigment coefficient map matrix, multiplying the intermediate melanin coefficient map matrix by the pixel vector at each pixel position in the target sample image to obtain the melanin color image, and multiplying the intermediate red pigment coefficient map matrix by the pixel vector at each pixel position in the target sample image to obtain the red pigment color image.
The method according to claim 3 or 4, wherein the initial human face pigment detection model is used to superimpose the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image , and, superimposing the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image, including:

Adding, by the initial human face pigment detection model, the pixel values at the same position and in the same channel of the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and adding the pixel values at the same position and in the same channel of the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.
The method according to any one of claims 1-5, wherein the gain processing includes at least one of the following: compression processing, color format conversion processing, and pigment region color adjustment processing.
The method according to claim 6, wherein the pigment region color adjustment processing comprises: detecting a melanin region and a red pigment region in the original sample image, removing the melanin region and the red pigment region from the original sample image, and fusing the image from which the melanin region and the red pigment region have been removed with the original sample image.
A human face pigment detection model training device is characterized in that said device comprises:

A gain module configured to perform gain processing on the original sample image to obtain a target sample image, the resolution of the original sample image being higher than the resolution of the target sample image;

A processing module configured to input the target sample image into the initial human face pigment detection model to obtain the actual melanin high-definition detail image and the actual red pigment high-definition detail image output by the initial human face pigment detection model, and to decompose the original sample image to obtain a supervised melanin high-definition detail image and a supervised red pigment high-definition detail image;

The correction module is configured to use the supervised melanin high-definition detail image and the supervised red pigment high-definition detail image as supervisory parameters, according to the actual melanin high-definition detail image and the actual red pigment high-definition detail image, to correct the initial human face color The pixel detection model is iteratively corrected to obtain the target face pigment detection model.
The face pigment detection model training device according to claim 8, characterized in that the correction module is further configured to:

use the brightness information of the supervised melanin high-definition detail image and the brightness information of the supervised red pigment high-definition detail image as supervision parameters and, according to the brightness information of the actual melanin high-definition detail image and the brightness information of the actual red pigment high-definition detail image, iteratively correct the initial face pigment detection model to obtain the target face pigment detection model.
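One way to picture brightness-based supervision is as a loss computed between the brightness of an actual output image and the brightness of its supervised counterpart. The sketch below is an assumption-laden illustration: the claim does not define how brightness is computed or which distance is used, so this example assumes the ITU-R BT.601 luma weights (0.299, 0.587, 0.114) on RGB pixels and a mean absolute difference. The names `brightness` and `brightness_l1_loss` are hypothetical.

```python
# Illustrative only: the luma weights and the L1 distance are assumptions.
def brightness(image):
    """Per-pixel luma of an RGB image stored as image[row][column] = (r, g, b)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in image]


def brightness_l1_loss(actual, supervised):
    """Mean absolute difference between the brightness maps of two images."""
    actual_luma = brightness(actual)
    supervised_luma = brightness(supervised)
    diffs = [abs(a - s)
             for a_row, s_row in zip(actual_luma, supervised_luma)
             for a, s in zip(a_row, s_row)]
    return sum(diffs) / len(diffs)


# One loss term per pigment; an optimizer would reduce their sum at each
# training iteration to correct the model parameters.
actual_melanin = [[(10, 10, 10), (30, 30, 30)]]
supervised_melanin = [[(20, 20, 20), (30, 30, 30)]]
melanin_loss = brightness_l1_loss(actual_melanin, supervised_melanin)
```

Comparing brightness rather than full color makes the supervision signal depend on pigment intensity and not on hue.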
The face pigment detection model training device according to claim 8 or 9, wherein the initial human face pigment detection model comprises: an encoder, a first decoder and a second decoder,

Wherein, the processing module is further configured to: encode the target sample image by the encoder to obtain encoded features;

performing detail decoding on the encoded features by the first decoder to obtain a melanin detail image and a red pigment detail image;

performing color decoding on the encoded features by the second decoder to obtain a melanin color image and a red pigment color image;

superimposing, by the initial human face pigment detection model, the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and superimposing the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.
The face pigment detection model training device according to claim 10, characterized in that the processing module is further configured to:

perform color decoding on the encoded features by the second decoder to obtain an intermediate melanin coefficient map matrix and an intermediate red pigment coefficient map matrix, multiply the intermediate melanin coefficient map matrix by the pixel vector at each pixel position in the target sample image to obtain the melanin color image, and multiply the intermediate red pigment coefficient map matrix by the pixel vector at each pixel position in the target sample image to obtain the red pigment color image.
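The multiplication of a coefficient map by per-pixel vectors can be sketched as follows. This is one possible reading, not a statement of the claimed implementation: the sketch assumes that the coefficient map stores one C x C matrix per pixel position and that each pixel of the target sample image is a length-C vector. The name `apply_coefficient_map` and the data layout are hypothetical.

```python
# Illustrative only: a per-pixel C x C matrix is an assumed interpretation
# of "coefficient map matrix".
def apply_coefficient_map(coeffs, image):
    """Multiply the matrix at each position by the pixel vector at that position.

    coeffs[y][x] is a list of C rows, each of length C;
    image[y][x] is a pixel vector of length C.
    """
    return [[[sum(m * v for m, v in zip(matrix_row, pixel))
              for matrix_row in matrix]
             for matrix, pixel in zip(coeff_row, image_row)]
            for coeff_row, image_row in zip(coeffs, image)]


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
mixing = [[2, 0, 0], [0, 0, 0], [1, 1, 1]]

target_sample = [[[5, 6, 7], [1, 2, 3]]]   # 1x2 image, 3 channels
melanin_coeffs = [[identity, mixing]]      # one matrix per pixel position
melanin_color = apply_coefficient_map(melanin_coeffs, target_sample)
```

Under this reading the second decoder predicts how to recolor each input pixel instead of predicting the output color directly, so the color image stays tied to the content of the target sample image.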
The face pigment detection model training device according to claim 10 or 11, wherein the processing module is also configured to:

add, by the initial human face pigment detection model, the pixel values at the same position and in the same channel of the melanin detail image and the melanin color image to obtain the actual melanin high-definition detail image, and add the pixel values at the same position and in the same channel of the red pigment detail image and the red pigment color image to obtain the actual red pigment high-definition detail image.
An electronic device, characterized in that it comprises: a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, and when the electronic device is running, the processor and the storage medium communicate with each other through the bus, and the processor executes the machine-readable instructions to perform the steps of the method according to any one of claims 1-7.
A computer storage medium, wherein a computer program is stored on the storage medium, and when the computer program is run by a processor, the steps of the method according to any one of claims 1-7 are executed.