WO2024001363A1

WO2024001363A1 - Image processing method and apparatus, and electronic device

Info

Publication number: WO2024001363A1
Application number: PCT/CN2023/085096
Authority: WO
Inventors: 黄硕
Original assignee: 魔门塔(苏州)科技有限公司
Priority date: 2022-06-30
Filing date: 2023-03-30
Publication date: 2024-01-04
Also published as: CN117391924A

Abstract

Provided in the present application are an image processing method and apparatus, and an electronic device. The image processing method comprises: acquiring a first facial image, wherein the first facial image is a facial image where the face is uniformly illuminated; inputting the first facial image into an image processing model; and acquiring a second facial image, which is output by means of the image processing model, wherein the image processing model comprises a first generation model, and the first generation model is used for adding, to the first facial image, a light and shadow effect so that the face is unevenly illuminated, so as to generate the second facial image. According to the image processing method in the embodiments of the present application, during facial recognition, light and shadow migration of a facial image sample where the face is uniformly illuminated is performed on the basis of an image processing model, such that the difficulty in and the cost of acquiring the facial image sample where the face is unevenly illuminated can be greatly reduced.

Description

An image processing method, device and electronic equipment

Technical field

The present application relates to the field of image processing technology, and in particular to an image processing method, device and electronic equipment.

Background technique

Face recognition is based on human facial features. The identity features contained in each face are further extracted through a neural network and compared with known faces to identify the identity of each face.

In actual application scenarios, when collecting human face images (when shooting with human faces as the subject), the human face images can be divided into two states according to the different states of light illuminating the face.

One state is that the illumination of the face is uneven, and there are obvious areas with different illumination intensities on the face in the face image. For example, when light illuminates the left side of the face from the side of the head, the illumination intensity of the left side of the face that is illuminated by the light will be significantly stronger than that of the right side of the face that is not illuminated by the light, resulting in the left half of the face being brighter than the right half of the face. Dark (yin and yang) face images. For another example, when a circular beam of light shines on the face, a circular spot will appear on the face illuminated by the beam. The illumination intensity of the spot on the face will be significantly stronger than that of the other parts of the face, resulting in an obvious brightness difference in the collected images. face images.

The other state is that the face is evenly illuminated, and there are no obvious areas with different illumination intensities on the face in the face image. For example, when light shines from the front of the head to the face, there is no obvious shadow on the face under the light, so that a face image with even face illumination can be collected.

In the application scenario of training face recognition models, face images with uneven facial illumination are an important type of training sample images. Facial image samples with uneven facial illumination can play a role in image amplification for training face recognition models, thereby improving the recognition accuracy of such samples.

However, in the application scenario of taking photos of people, in order to improve the quality of the photos of people, the photographer will try to avoid uneven lighting on the faces of the people in the photos. This results in that face images with uneven illumination are rarer than face images with uniform illumination.

Therefore, in order to obtain enough training sample images, a method of obtaining face images with uneven illumination is needed.

Contents of the invention

In order to solve the problem of how to obtain a face image with uneven illumination under the existing technology, this application provides an image processing method, device and electronic equipment. This application also provides a computer-readable storage medium.

The embodiments of this application adopt the following technical solutions:

In a first aspect, this application provides an image processing method, which method includes:

Obtaining a first face image, where the first face image is a face image with uniform facial illumination;

Input the first face image to the image processing model and obtain the second face image output by the image processing model, where:

The image processing model includes a first generative model for attaching Add uneven light and shadow effects on the face to generate the second face image.

According to the image processing method of the embodiment of the present application, light and shadow migration is performed on face image samples with uniform facial illumination in face recognition based on the image processing model, which can greatly reduce the difficulty and acquisition difficulty of facial image samples with uneven facial illumination. cost. While expanding the number of face recognition samples, it also solves the sample imbalance problem caused by the relatively small number of face image samples with uneven facial illumination, and can improve the training accuracy of the face recognition model.

Furthermore, since the source is a face image sample with uniform facial illumination (real sample), light and shadow migration is used to generate a face image sample with uneven facial illumination. The face image sample with uneven facial illumination is close to Real samples are beneficial to improving the training accuracy of face recognition models.

In an implementation manner of the first aspect, the image processing model is a generative adversarial network model.

In an implementation of the first aspect, the image processing model further includes:

A discriminant model, which is used to determine sample images based on light and shadow in the process of training the image processing model, and analyze the additional light and shadow effects of the first generative model according to the output of the first generative model, so as to determine the light and shadow additional effects of the first generative model according to the discriminant model. The analysis result adjusts the first generation model, wherein the face sample image input to the first generation model is a face image with uniform facial illumination, and the light and shadow determination sample image is a person with uneven facial illumination. face image.

Reduce the difficulty of obtaining training samples required during model training.

Using a generative adversarial network model to implement the image processing model of the embodiment of the present application can reduce the difficulty of obtaining the image processing model of the embodiment of the present application. Furthermore, using a generative adversarial network model and conducting model training through the mutual game between the generative model and the discriminative model can reduce the difficulty of obtaining training samples required in the model training process.

A second generative model. The second generative model is an inverse mapping of the first generative model. The second generative model is used to convert the first generative model into a second generative model during the training of the image processing model. The output is used as input to adjust the first generative model based on a comparison of the output of the second generative model and the input of the first generative model.

Based on the second generative model, the image processing model in the embodiment of the present application adopts the structure of a recurrent generative adversarial network, which improves the training efficiency of model training and improves the processing precision and accuracy of the image processing model.

In a second aspect, the present application provides a model training method. The method is used to train an image processing model. The image processing model is used to add a light and shadow effect of uneven facial illumination to a face image with uniform facial illumination, so as to Generate a face image with uneven facial illumination. The image processing model is a generative adversarial network model. The image processing model includes a first generation model and a discriminant model. The method includes:

Obtaining a first face sample image, wherein the first face sample image is a face image with uniform facial illumination;

Input the first face sample image to the first generation model, and the first generation model generates a second face sample image according to the first face sample image;

Obtain a third face sample image, wherein the third face sample image is a face image with uneven facial illumination;

Using the discriminant model, based on the third face sample image, analyze the light and shadow additional effects of the first generation model based on the second face sample image;

The first generation model is adjusted according to the analysis results of the discriminant model.

Using a generative adversarial network model to implement the image processing model of the embodiment of the present application can reduce the difficulty of obtaining the image processing model of the embodiment of the present application.

In an implementation manner of the second aspect, the third face sample image and the first face sample image are not Pairs of sample images.

Since paired sample images are not required, the difficulty of obtaining training samples required during model training is effectively reduced.

In an implementation of the second aspect, the method further includes:

Obtain a fourth face sample image, wherein the fourth face sample image is a face image with uniform facial illumination;

Input the fourth face sample image to the first generation model, and the first generation model generates a fifth face sample image based on the fourth face sample image;

Using the discriminant model, based on the third face sample image, the light and shadow additional effects of the first generation model are analyzed based on the fifth face sample image.

In an implementation of the second aspect, the method further includes:

Obtaining a sixth face sample image, wherein the sixth face sample image is a face image with uneven facial illumination;

Using the discriminant model, based on the sixth face sample image, the light and shadow additional effects of the first generation model are analyzed based on the second face sample image.

In an implementation manner of the second aspect, the image processing model further includes a second generative model, and the second generative model is an inverse mapping of the first generative model;

The method also includes:

Input the second face sample image to the second generation model, and the second generation model generates a seventh face sample image based on the second face sample image;

Compare the seventh face sample image and the first face sample image, and adjust the first generation model according to the comparison result.

In a third aspect, this application provides an image processing device, which includes:

An image processing model, which is used to obtain a first face image and output a second face image, where:

The first face image is a face image with uniform facial illumination;

The image processing model includes a first generation model, and the first generation model is used to add a light and shadow effect of uneven facial illumination to the first face image to generate the second face image.

In an implementation manner of the third aspect, the image processing model is a generative adversarial network model.

In an implementation of the third aspect, the image processing model further includes:

In a fourth aspect, the present application provides a model training device, which is used to train an image processing model. The image processing model is used to add the light and shadow effect of uneven facial illumination to the face image with uniform facial illumination to generate the face image with uneven facial illumination. The image processing model is a generative adversarial network model. The image processing model includes a first generation model and a discriminant model, and the device includes:

A first sample acquisition module configured to acquire a first face sample image and input the first face sample image to the first generation model, wherein the first face sample image is facial illumination Uniform face images;

a second sample acquisition module configured to acquire a second face sample image generated by the first generation model based on the first face sample image;

A third sample acquisition module configured to acquire a third face sample image and input the third face sample image into the discrimination model, wherein the third face sample image is one with uneven facial illumination. face images;

An analysis result acquisition module, which is used to obtain the analysis results of the discrimination model, wherein the analysis results include: the discrimination model is based on the third face sample image, and the analysis result is based on the second face sample image. The results of the light and shadow additional effects of the first generated model;

A first adjustment module configured to adjust the first generation model according to the analysis results.

In an implementation manner of the fourth aspect, the third face sample image and the first face sample image are not paired sample images.

In an implementation of the fourth aspect:

The first sample acquisition module is also used to: acquire a fourth face sample image, wherein the fourth face sample image is a face image with uniform facial illumination; input the fourth face sample image to the first generative model;

The second sample acquisition module is also used to acquire a fifth face sample image generated by the first generation model based on the fourth face sample image;

The analysis result obtained by the analysis result acquisition module also includes that the discrimination model is based on the third face sample image and analyzes the light and shadow additional effects of the first generation model based on the fifth face sample image. result.

In an implementation of the fourth aspect:

The third sample acquisition module is also used to acquire a sixth face sample image and input the sixth face sample image to the discrimination model, wherein the sixth face sample image is a face with uneven illumination. face images;

The analysis result obtained by the analysis result acquisition module also includes that the discrimination model is based on the sixth face sample image and analyzes the light and shadow additional effects of the first generation model based on the second face sample image. result.

In an implementation manner of the fourth aspect, the image processing model further includes a second generative model, and the second generative model is an inverse mapping of the first generative model;

The second sample acquisition module is also used to input the second face sample image to the second generation model;

The device also includes a second adjustment module, the second adjustment module being configured to: obtain a seventh face sample image generated by the second generation model based on the second face sample image; compare the seventh face sample image with the second face sample image; Face sample image and the first face sample image, and the first generation model is adjusted according to the comparison result.

In a fifth aspect, the present application provides an electronic device, the electronic device comprising a processor for executing computer program instructions, wherein when the computer program instructions stored in the memory are executed by the processor, the electronic device is triggered The method as described in the first aspect is performed.

In a sixth aspect, the present application provides an electronic device, the electronic device comprising a processor for executing computer program instructions, wherein when the computer program instructions stored in the memory are executed by the processor, the electronic device is triggered Perform the method as described in the second aspect.

In a seventh aspect, the present application provides a computer-readable storage medium that stores a computer program that, when run on a computer, causes the computer to execute the method described in the first aspect or the second aspect. .

Description of drawings

Figure 1 shows a method flow chart according to an embodiment of the present application;

Figure 2 shows a schematic structural diagram of an image processing model according to an embodiment of the present application;

Figure 3 shows a model training flow chart according to an embodiment of the present application;

Figure 4 shows a structural block diagram of an image processing device according to an embodiment of the present application;

Figure 5 shows a structural block diagram of a model training device according to an embodiment of the present application;

Figure 6 shows a schematic diagram of an electronic device according to an embodiment of the present application.

Detailed ways

In order to make the purpose, technical solutions and advantages of the present application clearer, the technical solutions of the present application will be clearly and completely described below in conjunction with specific embodiments of the present application and corresponding drawings. Obviously, the described embodiments are only some of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of this application.

The terms used in the embodiments of the present application are only used to explain specific embodiments of the present application and are not intended to limit the present application.

Aiming at the problem of how to obtain face images with uneven illumination, a feasible implementation solution is to perform a dimming operation when taking pictures, take an image of a person's face with uneven illumination, and cut the image to obtain an image with uneven illumination. Face images. The above implementation solution requires special image shooting, which is costly and requires a lot of work.

In order to reduce the difficulty and cost of obtaining samples, an embodiment of the present application proposes an image processing method. Based on the image processing model, a face image with uneven facial illumination is generated from a facial image with uniform facial illumination. In the embodiment of this application, the face image can be any type of image, for example, a color image, a black and white image, a grayscale image, etc. In an application scenario, the face image is an infrared face image. Infrared face images are face images captured by infrared cameras, usually single-channel images.

Figure 1 shows a flow chart of a method according to an embodiment of the present application. The electronic device executes the process shown in Figure 1 to generate a face image with uneven facial illumination.

S100: Obtain a first face image. The first face image is a face image with uniform facial illumination.

S110. Input the first face image to the image processing model.

The image processing model is used to perform light and shadow migration operations. Light and shadow migration refers to converting the facial light and shadow effects of face images with uneven facial illumination (including yin and yang faces) to faces in face images with uniform facial illumination. .

S120. Obtain the second face image output by the image processing model.

The second face image is an image generated by the image processing model based on the first face image, and the second face image is a face image with uneven facial illumination. The difference between the second face image and the first face image lies in the uniformity of illumination on the face. In other features, the second face image is consistent with the first face image.

According to the image processing method of the embodiment of the present application, light and shadow migration is performed on face image samples with uniform facial illumination in face recognition based on the image processing model, which can greatly reduce the difficulty of obtaining facial image samples with uneven facial illumination. and acquisition costs. While expanding the number of face recognition samples, it also solves the sample imbalance problem caused by the relatively small number of face image samples with uneven facial illumination, and can improve the training accuracy of the face recognition model.

One of the keys to the implementation of the embodiments of the present application lies in the image processing model. In order to reduce the difficulty of obtaining the image processing model, in one embodiment of the present application, the image processing model is a deep learning model, and the image processing model is obtained through model training.

During the model training process, training samples need to be provided. In a feasible model training method, pairs of training samples need to be provided. That is, it is necessary to provide the input sample of the image processing model (face image with uniform facial illumination) and the output sample of the image processing model (face image with uneven facial illumination) paired with the input sample. In a pair of input samples/output samples, the difference between the two face images lies in the uniformity of illumination on the face. In other features, the two face images are consistent.

Since face images with uneven facial illumination are relatively rare, it is very difficult to obtain pairs of face images with uneven facial illumination and face images with uniform facial illumination. Therefore, it is difficult to train the image processing model. .

In response to the above problems, in one embodiment of the present application, an image processing model is constructed based on the Generative Adversarial Networks (GAN) model.

The GAN model is a deep learning model that contains at least two modules: a generative model and a discriminative model. It produces good output through mutual game learning between the generative model and the discriminative model.

Specifically, FIG. 2 shows a schematic structural diagram of an image processing model according to an embodiment of the present application.

As shown in FIG. 2 , the image processing model includes a first generation model 210 .

The image 201 roughly represents a facial image. There is no area with different brightness on the face of the image 201. The image 201 refers to a face image (first face image) with uniform facial illumination.

Image 202 roughly represents the facial image. Compared with image 201, the difference between image 202 is that the brightness of the left and right half of the face is different (yin and yang face). Image 202 refers to the same facial image corresponding to image 201 ( Only the illumination status of the face is different), the face image with uneven facial illumination (the second face image).

After the training of the image processing model is completed, the face image (image 201, first face image) with uniform facial illumination is input to the first generation model 210. The first generation model 210 is used to provide input to the first generation model 210. A light and shadow effect with uneven facial illumination is added to the human face image to generate a human face image with uneven facial illumination (image 202, second face image).

Image 203 roughly represents the face image. Compared with image 201, the brightness of the left and right half of the face in image 203 is different (yin and yang face), and, except for the brightness state, the expression of image 203 is different from that of image 201 and image 203. The image 202 is also different. The image 203 refers to the face image corresponding to the images 201 and 202, and the face illumination is uneven. Face image (light and shadow determination sample image). The image processing model also includes a discriminative model 220. In the process of training the image processing model, the first generation model 210 is input with a face sample image with uniform facial illumination, and the discriminant model 220 determines the sample image (image 203, with uneven facial illumination) based on the light and shadow input to the discriminant model 220. face image), analyze the light and shadow additional effects of the first generative model 210 according to the output of the first generative model 210, and adjust the first generative model according to the analysis results of the discriminant model 220, thereby continuously optimizing the additional facial lighting of the first generative model 210. Implementation of uneven light and shadow effects.

Specifically, in one implementation, the image processing model uses Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation, U-GAT -IT) model. The U-GAT-IT model is an unsupervised generative network model, and model training does not require paired data. That is, during the model training process, the light and shadow determination sample image (face image with uneven facial illumination) input to the discriminant model 220 and the face sample image with uniform facial illumination input to the first generation model 210 may not be the same. Right image.

Further, the discriminant model 220 uses an attention mechanism (mainly reflected in weighted feature maps). Based on the attention map obtained by the auxiliary classifier, the discriminant model 220 helps the first generation model 210 determine where in the human image to perform concentrated conversion of light and shadow effects by distinguishing the source domain and the target domain.

The image processing model uses an adaptive hybrid normalization layer (Adaptive Layer-Instance Normalization, AdaLIN), which can automatically adjust the proportion of instance normalization (Instance Normalization, IN) and layer normalization (Layer Normalization, LN), helping The attention-guided model flexibly controls the changes in shape and texture without modifying the model architecture or hyperparameters, and achieves style conversion of light and shadow effects (uneven illumination on the face).

Furthermore, the image processing model also adopts the structure of cycle-generative adversarial network (cycle-gan). Based on the structure of the recurrent generative adversarial network, the training efficiency of model training is improved, and the processing precision and accuracy of the image processing model are improved.

Specifically, the image processing model also includes a second generation model 230.

Image 204 roughly represents the facial image. Compared with images 201 and 202, the difference between image 204 is that the brightness status of the face is different. The brightness of the left and right half of the face in image 204 is different, but the degree of difference is lower than that of the image. 202. That is, compared to the uneven illumination of the face of image 202 (yin and yang face), image 204 is more inclined to uniform facial illumination, but image 204 does not achieve complete uniform illumination of the face (the illumination state of image 204 does not reach the same level as Image 201 consistent degree). Image 202 refers to the face image corresponding to the same facial image as images 201 and 202 (only the lighting state of the face is different). The lighting state of the face is uniform compared with that of image 202, but the uniformity of lighting is not consistent with that of image 201. .

The second generative model 230 is the inverse mapping of the first generative model 210 . That is, after the training of the image processing model is completed, the first generation model 210 can convert the face image with uniform facial illumination into the face image with uneven facial illumination (image 204), and the second generation model 230 can convert the face image into a face image with uneven facial illumination (image 204). A face image with uneven illumination is converted into a face image with even face illumination.

In an ideal state, face image A with uniform facial illumination is input to the first generation model 210, and the first generation model 210 generates face image B; then face image B is input to the second generation model 230, and the second generation model 230 generates face image B. Model 230 will convert face image B into face image A.

However, in actual application scenarios (especially in the model training process), the first generation model 210 is the input person. The execution effect of adding light and shadow effects to face images cannot reach the ideal state. Therefore, the face image A with uniform facial illumination is input to the first generation model 210, and the first generation model 210 generates the face image B; then the face image B is input to the second generation model 230, and the second generation model 230 Convert face image B into face image C. There is a difference between face image C and face image A.

In the process of training the image processing model, the output of the first generation model 210 is used as the input of the second generation model 230, and the first generation model is adjusted according to the comparison result between the output of the second generation model 230 and the input of the first generation model 210. 210, thereby continuously optimizing the execution effect of adding light and shadow effects with uneven facial illumination to the first generation model 210.

Specifically, the comparison between the output of the second generative model 230 and the input of the first generative model 210 is implemented based on L2 norm loss (L2Loss).

Furthermore, based on the structure of the above image processing model, embodiments of this application propose a model training method.

Figure 3 shows a model training flow chart according to an embodiment of the present application. The electronic device executes the process shown in Figure 3 to train the image processing model with the structure shown in Figure 2.

S300: Obtain a sample image, which contains a human face image.

S301, identify and separate human face images from sample images.

S302: Preprocess the separated human face image to obtain a human face sample image.

For example, the human face image is scaled to a preset size; the human face image is cut to retain the preset facial features; and the facial features in the face image are aligned to the preset image position.

S310, classify the face sample image and divide it into two categories: a face image with uniform facial illumination (for example, the first face sample image) and a face image with uneven facial illumination (for example, the third face sample image).

S320: Use face images with uniform facial illumination and face images with uneven facial illumination to train the image processing model with the structure shown in Figure 2.

In S320, the face image with uniform facial illumination is input to the first generation model 210, and the facial image with uneven facial illumination is input to the discriminant model 220. The discriminant model 220 is based on the face image with uneven facial illumination, and analyzes the light and shadow additional effects of the first generation model 210 according to the image output by the first generation model 210 (the first generation model 210 is a face with uniform facial illumination). The execution effect of adding light and shadow effects with uneven lighting on the face to the face image). Finally, the first generation model 210 is adjusted according to the analysis results of the discriminant model 220 .

For example, a first face sample image (a face image with uniform facial illumination) is input to the first generation model 210, and the first generation model 210 generates a second face sample image based on the first face sample image. Ideally, the second face sample image is an image obtained by successfully adding the light and shadow effect of uneven facial illumination to the first face sample image.

The third face sample image (face image with uneven facial illumination) is input to the discriminant model 220 . The discriminant model 220 is based on the third face sample image and analyzes the light and shadow additional effects of the first generation model 210 based on the second face sample image.

Furthermore, the face image with uniform facial illumination input to the first generation model 210 and the facial image with uneven facial illumination input to the discriminant model 220 may not be paired images. That is, the difference between the first face sample image and the third face sample image may not only be the uniformity of facial illumination, but also may be different in other facial features. . For example, the first face sample image may be an image of person A, and the third face sample image may be an image of person B; or, the first face sample image may be an image of person A in the shooting view. The image under angle a, the third face sample image may be the image of person A under shooting angle b.

Furthermore, the face image with uniform facial illumination input to the first generation model 210 and the facial image with uneven facial illumination input to the discriminant model 220 can be combined arbitrarily. For example, when the facial image with uniform facial illumination input to the first generation model 210 remains unchanged, replace the facial image with uneven facial illumination input to the discriminant model 220; or, after inputting to the discriminant model 220 While the facial image with uneven facial illumination remains unchanged, the facial image with uniform facial illumination input to the first generation model 210 is replaced.

The third face sample image (face image with uneven facial illumination) is input to the discriminant model 220 . The discriminant model 220 is based on the third face sample image and analyzes the light and shadow additional effects of the first generation model 210 based on the second face sample image. The first generation model 210 is adjusted according to the analysis results of the discriminant model 220 .

Afterwards, the fourth face sample image (a face image with even facial illumination, an image different from the first face sample image) is input to the first generation model 210, and the first generation model 210 generates the image according to the fourth face sample image. Generate a fifth face sample image.

The discriminant model 220 analyzes the light and shadow additional effects of the first generation model 210 based on the fifth face sample image based on the third face sample image. The first generation model 210 continues to be adjusted according to the analysis results of the discriminant model 220 .

For another example, the first face sample image (face image with uniform facial illumination) is input to the first generation model 210, and the first generation model 210 generates the second face sample image based on the first face sample image. Ideally, the second face sample image is an image obtained by successfully adding the light and shadow effect of uneven facial illumination to the first face sample image.

After that, the sixth face sample image (a face image with uneven facial illumination, an image different from the third face image) is input to the discriminant model 220. The discriminant model 220 analyzes the light and shadow additional effects of the first generation model 210 based on the second face sample image based on the sixth face sample image. The first generation model 210 is adjusted according to the analysis results of the discriminant model 220 .

Further, in S320, the face image with uniform facial illumination is input to the first generation model 210. After the first generation model 210 generates a new image according to the facial image with uniform facial illumination, the first generation model 210 outputs The image is also input to a second generative model 230. The second generation model 230 generates a new image according to the image output by the first generation model 210, and adjusts the first generation model 210 by comparing the image output by the second generation model 230 with the image input to the first generation model 210 (and adjusts it simultaneously. Second generative model 230).

For example, a first face sample image (a face image with uniform facial illumination) is input to the first generation model 210, and the first generation model 210 generates a second face sample image based on the first face sample image. Input the second face sample image to the second generation model 230, and the second generation model 230 generates a seventh face sample image based on the second face sample image;

Compare the seventh face sample image with the first face sample image, and adjust the first generation model 210 according to the comparison result.

Furthermore, based on the image processing method in the embodiment of the present application, the embodiment of the present application also proposes an image processing method. device. Figure 4 shows a structural block diagram of an image processing device according to an embodiment of the present application. As shown in Figure 4, the image processing device 400 includes:

Image processing model 410 (for example, the image processing model shown in Figure 2), which is used to obtain a first face image and output a second face image, where:

The first face image is a face image with uniform facial illumination;

The image processing model includes a first generation model 411 (for example, the first generation model 411 can refer to the first generation model 210 shown in Figure 2). The first generation model is used to add uneven facial illumination to the first face image. Light and shadow effects to generate a second face image.

Further, based on the model training method of the embodiment of the present application, the embodiment of the present application also proposes a model training device, which is used to train an image processing model (for example, the image processing model as shown in Figure 2). The image processing model The model is used to add uneven facial illumination light and shadow effects to facial images with uniform facial illumination to generate facial images with uneven facial illumination. The image processing model is a generative adversarial network model. The image processing model includes the first Generative models and discriminative models.

Figure 5 shows a structural block diagram of a model training device according to an embodiment of the present application. As shown in Figure 5, the model training device 500 includes:

The first sample acquisition module 501 is used to acquire the first face sample image and input the first face sample image to the first generation model (for the first generation model, refer to the first generation model 210 shown in Figure 2) , wherein the first face sample image is a face image with uniform facial illumination (refer to image 201);

The second sample acquisition module 502 is used to acquire the second face sample image generated by the first generation model based on the first face sample image and input the second face sample image into the discriminant model (refer to Figure 2 for the discriminant model) The discriminant model shown 220);

The third sample acquisition module 503 is used to acquire a third face sample image and input the third face sample image into the discriminant model, where the third face sample image is a face image with uneven facial illumination (refer to image203);

The analysis result acquisition module 504 is used to obtain the analysis results of the discriminant model, where the analysis results include the result of the discriminant model analyzing the light and shadow additional effects of the first generation model based on the second face sample image based on the third face sample image. ;

The first adjustment module 505 is used to adjust the first generation model according to the analysis results.

Further, the second sample acquisition module 502 is also used to input the second face sample image to the second generation model 5 (for the second generation model, refer to the second generation model 230 shown in Figure 2).

The model training device 500 also includes a second adjustment module 506, which is used to obtain a fourth face sample image generated by the second generation model based on the second face sample image. According to the fourth face sample image and the first face sample The comparison results of the images adjust the first generative model.

In the description of the embodiments of the present application, for the convenience of description, the device is described by dividing its functions into various modules. The division of each module is only a division of logical functions. When implementing the embodiments of the present application, each module can be divided into The functionality of a module is implemented in the same or more software and/or hardware.

Specifically, during actual implementation, the device proposed in the embodiment of the present application may be fully or partially integrated into a physical entity, or may be physically separated. And these modules can all be implemented in the form of software calling through processing elements; they can also all be implemented in the form of hardware; some modules can also be implemented in the form of software calling through processing elements, and some modules can be implemented in the form of hardware. For example, the determination module can be a separately established processing element, or it can be a collection of It is implemented in a chip of an electronic device. The implementation of other modules is similar. In addition, all or part of these modules can be integrated together or implemented independently. During the implementation process, each step of the above method or each of the above modules can be completed by instructions in the form of hardware integrated logic circuits or software in the processor element.

For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as: one or more specific integrated circuits (Application Specific Integrated Circuit, ASIC), or one or more digital signal processors ( Digital Singnal Processor, DSP), or one or more Field Programmable Gate Array (Field Programmable Gate Array, FPGA), etc. For another example, these modules can be integrated together and implemented in the form of a System-On-a-Chip (SOC).

Furthermore, based on the image processing method proposed in this application, an embodiment of this application also proposes an electronic device. The electronic device includes a memory for storing computer program instructions and a processor for executing the program instructions. When the computer When the program instructions are executed by the processor, the processor controls the electronic device to perform actions in the image processing method shown in the embodiments of the present application.

Furthermore, based on the model training method proposed in this application, an embodiment of this application also proposes an electronic device. The electronic device includes a memory for storing computer program instructions and a processor for executing the program instructions. When the computer When the program instructions are executed by the processor, the processor controls the electronic device to perform actions in the model training method shown in the embodiments of this application.

Figure 6 shows a schematic diagram of an electronic device according to an embodiment of the present application. As shown in FIG. 6 , the electronic device 600 includes a memory 610 and a processor 620 . When specific computer program instructions stored on the memory 610 are executed by the processor 620, the processor 620 controls the electronic device 600 to perform actions in the image processing method or model training method shown in the embodiments of this application.

Further, in actual application scenarios, the method processes of the embodiments shown in this specification can be implemented by electronic chips installed on electronic devices. Therefore, based on the method proposed in this application, an embodiment of this application also proposes an electronic chip. The electronic chip includes a memory for storing computer program instructions and a processor for executing computer program instructions. When the computer program instructions When executed by the processor, the electronic chip is triggered to perform actions in the image processing method shown in the above embodiments of the present application.

Furthermore, based on the method provided by this application, an embodiment of this application also proposes an electronic chip. The electronic chip includes a memory for storing computer program instructions and a processor for executing computer program instructions. When the computer program When the instructions are executed by the processor, the electronic chip is triggered to execute the actions in the model training method shown in the above embodiments of the present application.

Furthermore, the equipment, devices, and modules described in the embodiments of this application may be implemented by computer chips or entities, or by products with certain functions.

Those skilled in the art should understand that embodiments of the present application may be provided as methods, devices, or computer program products. Thus, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product embodied on one or more computer-usable storage media embodying computer-usable program code therein.

In the several embodiments provided in this application, if any function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application is essentially or the part that contributes to the existing technology or the part of the technical solution can be embodied in the form of a software product. The computer software product is stored in a storage medium, including several The instructions are used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in various embodiments of this application.

Specifically, an embodiment of the present application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program that, when run on a computer, causes the computer to execute the method provided by the embodiment of the present application.

An embodiment of the present application also provides a computer program product. The computer program product includes a computer program that, when run on a computer, causes the computer to execute the method provided by the embodiment of the present application.

The embodiments in this application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the application. It will be understood that each process and/or block in the flowchart illustrations and/or block diagrams, and combinations of processes and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a use A device for realizing the functions specified in one process or multiple processes of the flowchart and/or one block or multiple blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction means, the instructions The device implements the functions specified in a process or processes of the flowchart and/or a block or blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, thereby executing on the computer or other programmable device. Instructions provide steps for implementing the functions specified in a process or processes of a flowchart diagram and/or a block or blocks of a block diagram.

It should also be noted that in the embodiments of this application, "at least one" refers to one or more, and "multiple" refers to two or more. "And/or" describes the association of associated objects, indicating that there can be three relationships. For example, A and/or B can represent the existence of A alone, the existence of A and B at the same time, or the existence of B alone. Where A and B can be singular or plural. The character "/" generally indicates that the related objects are in an "or" relationship. “At least one of the following” and similar expressions refers to any combination of these items, including any combination of single or plural items. For example, at least one of a, b and c can mean: a, b, c, a and b, a and c, b and c or a and b and c, where a, b, c can be single, also Can be multiple.

In the embodiments of this application, the terms "comprising", "comprises" or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, commodity or device that includes a series of elements not only includes those elements, but also includes Other elements are not expressly listed or are inherent to the process, method, article or equipment. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method, article, or device that includes the stated element.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. The present application may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including storage devices.

Each embodiment in this application is described in a progressive manner. The same and similar parts between the various embodiments can be referred to each other. Each embodiment focuses on its differences from other embodiments. In particular, for the device embodiment, since it is basically similar to the method embodiment, the description is relatively simple. For relevant details, please refer to the partial description of the method embodiment.

Those of ordinary skill in the art can realize that each unit and algorithm step described in the embodiments of this application can be implemented by a combination of electronic hardware, computer software, and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each specific application, but such implementations should not be considered beyond the scope of this application.

Those skilled in the art can clearly understand that for the convenience and simplicity of description, the specific working processes of the devices, devices and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be described again here.

The above are only specific embodiments of the present application. Any person familiar with the technical field can easily think of changes or substitutions within the technical scope disclosed in the present application, and they should be covered by the protection scope of the present application. The protection scope of this application shall be subject to the protection scope of the claims.

Claims

An image processing method, characterized by including:

Obtaining a first face image, where the first face image is a face image with uniform facial illumination;

Input the first face image to the image processing model and obtain the second face image output by the image processing model, where:

The image processing model includes a first generation model, and the first generation model is used to add a light and shadow effect of uneven facial illumination to the first face image to generate the second face image.
The method of claim 1, wherein the image processing model is a generative adversarial network model.
The method according to claim 2, characterized in that the image processing model further includes:

A discriminant model, which is used to determine sample images based on light and shadow in the process of training the image processing model, and analyze the additional light and shadow effects of the first generative model according to the output of the first generative model, so as to determine the light and shadow additional effects of the first generative model according to the discriminant model. The analysis result adjusts the first generation model, wherein the face sample image input to the first generation model is a face image with uniform facial illumination, and the light and shadow determination sample image is a person with uneven facial illumination. face image.
The method according to claim 3, characterized in that the image processing model further includes:

A second generative model. The second generative model is an inverse mapping of the first generative model. The second generative model is used to convert the first generative model into a second generative model during the training of the image processing model. The output is used as input to adjust the first generative model based on a comparison of the output of the second generative model and the input of the first generative model.
A model training method, characterized in that the method is used to train an image processing model, and the image processing model is used to add a light and shadow effect of uneven facial illumination to a face image with uniform facial illumination to generate a face For face images with uneven illumination, the image processing model is a generative adversarial network model. The image processing model includes a first generative model and a discriminant model. The method includes:

Obtaining a first face sample image, wherein the first face sample image is a face image with uniform facial illumination;

Input the first face sample image to the first generation model, and the first generation model generates a second face sample image according to the first face sample image;

Obtain a third face sample image, wherein the third face sample image is a face image with uneven facial illumination;

Using the discriminant model, based on the third face sample image, analyze the light and shadow additional effects of the first generation model based on the second face sample image;

The first generation model is adjusted according to the analysis results of the discriminant model.
The method according to claim 5, characterized in that the third face sample image and the first face sample image are not paired sample images.
The method of claim 5, further comprising:

Obtain a fourth face sample image, wherein the fourth face sample image is a face image with uniform facial illumination;

Input the fourth face sample image to the first generation model, and the first generation model generates a fifth face sample image based on the fourth face sample image;

Using the discriminant model, based on the third face sample image, the light and shadow additional effects of the first generation model are analyzed based on the fifth face sample image.
The method of claim 5, further comprising:

Obtaining a sixth face sample image, wherein the sixth face sample image is a face image with uneven facial illumination;

Using the discriminant model, based on the sixth face sample image, the light and shadow additional effects of the first generation model are analyzed based on the second face sample image.
The method according to any one of claims 5-8, characterized in that the image processing model further includes a second generative model, and the second generative model is an inverse mapping of the first generative model;

The method also includes:

Input the second face sample image to the second generation model, and the second generation model generates a seventh face sample image based on the second face sample image;

Compare the seventh face sample image and the first face sample image, and adjust the first generation model according to the comparison result.
An image processing device, characterized in that the device includes:

An image processing model, which is used to obtain a first face image and output a second face image, where:

The first face image is a face image with uniform facial illumination;

The image processing model includes a first generation model, and the first generation model is used to add a light and shadow effect of uneven facial illumination to the first face image to generate the second face image.
A model training device, characterized in that the device is used to train an image processing model, and the image processing model is used to add a light and shadow effect of uneven facial illumination to a face image with uniform facial illumination to generate a facial expression. Face images with uneven illumination, the image processing model is a generative adversarial network model, the image processing model includes a first generation model and a discriminant model, the device includes:

A first sample acquisition module configured to acquire a first face sample image and input the first face sample image to the first generation model, wherein the first face sample image is facial illumination Uniform face images;

a second sample acquisition module configured to acquire a second face sample image generated by the first generation model based on the first face sample image;

A third sample acquisition module configured to acquire a third face sample image and input the third face sample image into the discrimination model, wherein the third face sample image is one with uneven facial illumination. face images;

An analysis result acquisition module, which is used to obtain the analysis results of the discrimination model, wherein the analysis results include: the discrimination model is based on the third face sample image, and the analysis result is based on the second face sample image. The results of the light and shadow additional effects of the first generated model;

A first adjustment module configured to adjust the first generation model according to the analysis results.
An electronic device, characterized in that the electronic device includes a processor for executing computer program instructions, wherein when the computer program instructions stored in the memory are executed by the processor, the electronic device is triggered to execute the The method described in any one of claims 1-4.
An electronic device, characterized in that the electronic device includes a processor for executing computer program instructions, wherein when the computer program instructions stored in the memory are executed by the processor, the electronic device is triggered to execute the The method described in any one of claims 5-9.
A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and when it is run on a computer, it causes the computer to execute the method according to any one of claims 1-9. .