CN113421317A - Method and system for generating image and electronic equipment - Google Patents

Method and system for generating image and electronic equipment

Info

Publication number
CN113421317A
Authority
CN
China
Prior art keywords
image
images
distribution
loss value
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110645535.4A
Other languages
Chinese (zh)
Other versions
CN113421317B (en)
Inventor
李永凯
王宁波
朱树磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202110645535.4A priority Critical patent/CN113421317B/en
Publication of CN113421317A publication Critical patent/CN113421317A/en
Application granted granted Critical
Publication of CN113421317B publication Critical patent/CN113421317B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A method, a system and an electronic device for generating images are provided. Feature distributions are generated by extracting features from input live or non-live images, a joint feature distribution is then generated from the obtained feature distributions, and new live or non-live images are finally output by reconstructing the joint feature distribution through deconvolution upsampling. The method effectively generates live or non-live images of both occluded and unoccluded faces, ensures the robustness of the generated images, allows the generated images to be used directly for training a face liveness detection algorithm, and effectively improves the liveness detection effect for occluded faces.

Description

Method and system for generating image and electronic equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and a system for generating an image, and an electronic device.
Background
With the wide application of face recognition technology, increasing attention is being paid to the security of face recognition systems and to their recognition accuracy in complex scenes.
Existing liveness detection techniques for face recognition rely on training with massive numbers of live images of both occluded and unoccluded faces in complex scenes in order to reach acceptable recognition accuracy.
However, such large numbers of live images of occluded and unoccluded faces are usually not available when the algorithm is designed and trained.
Disclosure of Invention
The application provides a method, a system and an electronic device for generating images, so as to enable the generation of large numbers of live or non-live images.
In a first aspect, the present application provides a method of generating an image, the method comprising:
acquiring a first set of images of a first target object, wherein the first set of images comprises: a first image of an unobstructed face of the first target object and a second image of an obstructed face of the first target object;
calculating a first feature distribution of the first image and a second feature distribution of the second image;
generating a joint feature distribution according to the first feature distribution and the second feature distribution;
performing deconvolution upsampling on the joint feature distribution through a first model to generate a second set of images, wherein the second set of images comprises: a third image of an unoccluded face of a second target object and a fourth image of an occluded face of the second target object.
By this method, large numbers of live or non-live images of occluded and unoccluded faces are generated effectively, the robustness of the generated images is ensured, the generated images can be used directly for training a face liveness detection algorithm, and the liveness detection effect for occluded faces is effectively improved.
In one possible design, the calculating a first feature distribution of the first image and a second feature distribution of the second image includes:
extracting a first mean and a first variance of the first image, and extracting a second mean and a second variance of the second image;
obtaining a first feature distribution of the first image according to the first mean value and the first variance;
and obtaining a second feature distribution of the second image according to the second mean value and the second variance.
By calculating the feature distributions of the input images, the extracted features generalize better, which ensures the validity of the generated images.
In one possible design, before the deconvoluting upsampling the joint feature distribution by the first model to generate the second set of images, the method further includes:
acquiring a first Gaussian distribution of the first image and a second Gaussian distribution of the second image;
obtaining the relative entropy of the first group of images according to a first distribution distance between the first Gaussian distribution and the first characteristic distribution and a second distribution distance between the second Gaussian distribution and the second characteristic distribution;
and adjusting model parameters in the first model according to the relative entropy so as to adjust the image characteristics of each image in the second group of images.
The generated image is guaranteed to have similar image characteristics to the input image.
In one possible design, the joint feature distribution is deconvoluted and upsampled by the first model to generate a second group of images, which is specifically obtained by the following formula:
ζout = pθ(x1, x2 | Z3)

wherein ζout represents the second set of images, Z1 represents the first feature distribution, Z2 represents the second feature distribution, x1 represents the first image, x2 represents the second image, and Z3 represents the joint feature distribution.
The generated image is guaranteed to have similar image characteristics to the input image.
In one possible design, after the deconvolution upsampling the joint feature distribution by the first model to generate the second set of images, the method further includes:
obtaining a first loss value according to a first classification loss and a second classification loss generated by classifying the third image and the fourth image;
obtaining a second loss value according to a first living characteristic distance between the first image and the third image and a second living characteristic distance between the second image and the fourth image;
obtaining a third loss value according to a third living body feature distance between the third image and the fourth image;
weighting and summing the first loss value, the second loss value and the third loss value to obtain a model loss value;
and adjusting model parameters in the first model according to the model loss value so as to adjust the image characteristics of each image in the second group of images.
By the method, the classification result of the generated image is kept consistent with the expected classification result, the living body characteristics between the generated images are ensured to be consistent, the living body characteristics between the generated image and the input image are ensured to be consistent, the effectiveness of the generated image is further ensured, the performance of the first model for generating the image is effectively improved, and the performance of the generated image is more robust.
In a second aspect, the present application provides a system for generating an image, the system comprising:
an acquisition module configured to acquire a first set of images of a first target object, wherein the first set of images includes: a first image of an unobstructed face of the first target object and a second image of an obstructed face of the first target object;
a calculation module for calculating a first feature distribution of the first image and a second feature distribution of the second image; generating a joint feature distribution according to the first feature distribution and the second feature distribution;
a generating module, configured to perform deconvolution upsampling on the joint feature distribution through the first model to generate a second set of images, where the second set of images includes: a third image of an unoccluded face of a second target object and a fourth image of an occluded face of the second target object.
In one possible design, the calculation module is specifically configured to extract a first mean and a first variance of the first image, and to extract a second mean and a second variance of the second image; obtain a first feature distribution of the first image according to the first mean and the first variance; and obtain a second feature distribution of the second image according to the second mean and the second variance.
in a possible design, the generating module is further configured to calculate, according to the classification of the third image and the fourth image, a first classification loss and a second classification loss generated by the generating module, so as to obtain a first loss value; obtaining a second loss value according to a first living characteristic distance between the first image and the third image and a second living characteristic distance between the second image and the fourth image; obtaining a third loss value according to a third living body feature distance between the third image and the fourth image; weighting and summing the first loss value, the second loss value and the third loss value to obtain a model loss value; and adjusting model parameters in the first model according to the model loss value so as to adjust the image characteristics of each image in the second group of images.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the above-described method steps of generating an image when executing the computer program stored on the memory.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the above-mentioned method steps of generating an image.
For the technical effects of the second to fourth aspects and of their possible designs, refer to the description above of the first aspect and of its possible solutions; the description is not repeated here.
Drawings
FIG. 1 is a block diagram of a system for generating an image provided herein;
FIG. 2 is a flow chart of a method of generating an image provided herein;
FIG. 3 is a schematic diagram of an unobstructed face image provided by the present application;
FIG. 4 is a schematic diagram of a face image with an occlusion according to the present application;
FIG. 5 is a schematic diagram of a system for generating an image provided herein;
fig. 6 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
To address the limited availability of live sample images for liveness detection of occluded faces, the present application provides a system for generating images, which may be the system shown in fig. 1 and may include an encoding module, a decoding module, and a liveness detection module. The encoding module consists of two encoders: a first encoder and a second encoder. The decoding module consists of a decoder. The liveness detection module consists of a liveness detection model.
The method provided by the present application can be applied to this system. Through the method, live or non-live images of occluded and unoccluded faces can be generated effectively as training data, which addresses the shortage of effective training data in current face liveness detection and effectively improves its detection accuracy.
The method provided by the present application is described in further detail below with reference to the accompanying drawings.
Referring to fig. 2, the present application provides a method for generating an image, which includes the following specific processes:
s201: acquiring a first set of images of a first target object;
the first set of images in this application comprises: the first image of the non-occluded face of the first target object and the second image of the occluded face of the first target object.
The first image and the second image may be chosen to suit different application scenarios. They may be acquired by an infrared device, in which case the first image and the second image are single-channel infrared images. They may also be acquired according to actual requirements, for example through an ordinary camera, in which case they are three-channel color images. The first image and the second image may also be images obtained by color, spatial or modality conversion, for example: HSV, YCrCb, LBP, Fourier spectrograms, etc.
The absence of occlusion in the first image means that there is no occluding object on the face in the image, as shown in fig. 3. The occlusion in the second image means that an occluding object is present and partially covers the face; the occluding object may be a mask or some other object, for example a hat, sunglasses, or a book. In the embodiments of the present disclosure, a mask is taken as the example of the occluding object, that is, the second image is a face image of the first target object occluded by a mask, as shown in fig. 4; other occlusion cases are not described separately in this embodiment.
S202: calculating to obtain a first feature distribution of the first image and a second feature distribution of the second image;
specifically, in the present application, the first encoder and the second encoder may process the first image and the second image to obtain corresponding feature distributions, and the processing method is as follows:
the first image obtains a first mean value and a first variance through a first encoder, calculates a first independent distribution of the first image in a hidden space, and obtains a first feature distribution of the first image according to the first independent distribution. And the second image obtains a second mean value and a second variance through a second encoder, calculates a second independent distribution of the second image in the hidden space, and obtains a second feature distribution of the second image according to the second independent distribution.
Taking the first encoder as an example, first, a first image is obtained, and a first mean and a first variance of the first image are obtained through the first encoder. Here, the data feature of the first image in the hidden space is represented by a first mean value and a first variance.
Then, according to the first mean value and the first variance, a first independent distribution of the first image in the hidden space is obtained through calculation, and a calculation formula is shown as the following formula 1:
z1 = μ1 + σ1 · ε (formula 1)

As shown in the above formula, z1 denotes the first independent distribution, μ1 denotes the first mean, σ1 denotes the first variance, and ε denotes random noise used to maintain randomness.
Finally, obtaining a first feature distribution of the first image according to the first independent distribution, wherein a calculation formula is shown as the following formula 2:
Z1 = qφ1(z1|x1) (formula 2)

As shown in the above formula, Z1 denotes the first feature distribution, z1 denotes the first independent distribution, and x1 denotes the first image.
Taking the second encoder as an example, the second image is obtained first, and the second average and the second variance of the second image are obtained through the second encoder. Here, the data feature of the second image in the hidden space is represented by a second mean and a second variance.
Then, according to the second mean and the second variance, a second independent distribution of the second image in the hidden space is obtained through calculation, as shown in the following formula 3:
z2 = μ2 + σ2 · ε (formula 3)

As shown in the above formula, z2 denotes the second independent distribution, μ2 denotes the second mean, σ2 denotes the second variance, and ε denotes random noise used to maintain randomness.
And finally, obtaining a second feature distribution of the second image according to the second independent distribution, wherein a calculation formula is shown as a formula 4:
Z2 = qφ2(z2|x2) (formula 4)

As shown in the above formula, Z2 denotes the second feature distribution, z2 denotes the second independent distribution, and x2 denotes the second image.
By calculating the feature distributions of the first set of images instead of directly extracting low-level texture information such as eyes and reflectivity, the descriptive power of the features extracted from the occluded face is enhanced and the extracted features generalize better, so the method can cope with complex scenes, for example when the eyes of the face are occluded or the face angle changes greatly.
S203: generating a joint feature distribution according to the first feature distribution and the second feature distribution;
In the present application, the first feature distribution and the second feature distribution are combined to generate a joint feature distribution that carries the living-body features of both the first image and the second image. The joint feature distribution may be obtained, for example, by adding the channels of the first feature distribution and the second feature distribution.
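As an illustration only, the channel-addition step can be read as a channel-wise concatenation of the two latent codes; element-wise addition would be an equally simple alternative. The tensor shapes below are assumptions of the sketch.

import torch

z1 = torch.randn(8, 128)            # samples from the first feature distribution
z2 = torch.randn(8, 128)            # samples from the second feature distribution
z3 = torch.cat([z1, z2], dim=1)     # joint feature distribution Z3, dimension 256
# alternative reading of "adding channels": z3 = z1 + z2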
S204: and performing deconvolution upsampling on the joint feature distribution through the first model to generate a second group of images.
In this application, the second set of images includes: a third image of an unoccluded face of the second target object and a fourth image of an occluded face of the second target object. In the decoding module, deconvolution upsampling is performed on the joint feature distribution through the first model to generate the second set of images, the relative entropy of the first set of images is calculated so that the reconstructed second set of images has living-body features similar to those of the first set of images, and the parameters of the first model are adjusted according to the relative entropy to adjust the image features of each image in the second set of images.
Specifically, the first model may be a convolutional neural network in which deconvolution is performed. Deconvolution is a common upsampling method: according to the features of the input, it maps from a small resolution to a large resolution, thereby performing reconstruction and outputting an image. Thus, in the present application, the output second set of images is reconstructed from the joint feature distribution and has image features similar to those of the first set of images.
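For illustration, a hypothetical "first model" built from transposed convolutions could look as follows; the projection layer, the number of upsampling stages, and producing the third and fourth images as six stacked output channels are assumptions of this sketch.

import torch.nn as nn

class Decoder(nn.Module):
    """Hypothetical first model: deconvolution upsampling of the joint
    feature distribution back to image resolution."""
    def __init__(self, latent_dim=256, out_channels=6):
        super().__init__()
        self.project = nn.Linear(latent_dim, 128 * 4 * 4)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_channels, 4, stride=2, padding=1),
            nn.Sigmoid(),                            # pixel values in [0, 1]
        )

    def forward(self, z3):
        h = self.project(z3).view(-1, 128, 4, 4)
        x = self.net(h)                              # (batch, 6, 32, 32) with these assumed sizes
        x3_hat, x4_hat = x[:, :3], x[:, 3:]          # third image (unoccluded) and fourth image (occluded)
        return x3_hat, x4_hat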
In an embodiment of the present application, the method by which the first model generates the second set of images may be as shown in the following formula 5:
ζout = pθ(x1, x2 | Z3) (formula 5)

In the above formula, ζout denotes the second set of images generated by reconstruction through the first model, Z1 denotes the first feature distribution, Z2 denotes the second feature distribution, x1 denotes the first image, x2 denotes the second image, and Z3 denotes the joint feature distribution obtained from Z1 and Z2.
Here, when the acquired first set of images consists of live images, namely the unoccluded first image and the occluded second image, the generated second set of images also consists of live images, namely the generated unoccluded third image and the generated occluded fourth image, and the images output by the decoding module have living-body features similar to those of the input images.
It should be noted that, because the second set of images has living-body features similar to those of the first set of images, if the first set of images consists of live images, the second set of images also consists of live images; if the first set of images consists of non-live images, the second set of images also consists of non-live images. In the present application, the first set of images may be live images having living-body features or non-live images without living-body features. The embodiments of the present application take the case in which the first set of images consists of live images as an example; the other case is obtained in the same way and is not described separately.
In order to generate a second set of images whose living-body features are similar to those of the first set of images, the relative entropy of the first set of images is calculated, and the parameters of the first model are adjusted according to the relative entropy so as to adjust the image features of each image in the second set of images. In the embodiment of the present application, the relative entropy of the first set of images is calculated as follows:
for a first set of images: calculating a first distribution distance between a first feature distribution of the first image and a first Gaussian distribution, wherein the first Gaussian distribution is a unit Gaussian distribution of the first independent distribution; and calculating a second distribution distance between a second characteristic distribution of the second image and a second Gaussian distribution, wherein the second Gaussian distribution is a unit Gaussian distribution of a second independent distribution, and the relative entropy of the first group of images is obtained according to the first distribution distance and the second distribution distance.
Relative entropy, also known as KL divergence, is an asymmetric measure of the difference between two probability distributions p and q, and is typically used to measure the distance between two random distributions. When the two random distributions are the same, their relative entropy is zero; as the difference between the two random distributions increases, their relative entropy also increases.
In particular, the relative entropy here measures the difference between the feature distribution of each image in the first set of images and a unit Gaussian distribution, which constrains the difference between distributions. For example, if the first image and the second image both have living-body features, then the third image and the fourth image reconstructed from them should also have living-body features; that is, the living-body features extracted from the third image and the fourth image should be consistent with those of the first image and the second image.
Therefore, in order to generate a second set of images with living-body features similar to those of the first set of images, the relative entropy of the first set of images needs to be reduced. The relative entropy of the first set of images can be obtained as shown in the following formula 6:
ζkl = Dkl(qφ1(z1|x1) || p(z1)) + Dkl(qφ2(z2|x2) || p(z2)) (formula 6)

In the above formula, ζkl denotes the relative entropy of the first set of images, i.e. the sum of the first distribution distance Dkl(qφ1(z1|x1) || p(z1)) of the first image and the second distribution distance Dkl(qφ2(z2|x2) || p(z2)) of the second image, where qφ1(z1|x1) denotes the first feature distribution, p(z1) denotes the first Gaussian distribution of the first independent distribution, qφ2(z2|x2) denotes the second feature distribution, and p(z2) denotes the second Gaussian distribution of the second independent distribution.
The model parameters in the first model are adjusted according to the relative entropy so that the living-body features of each image in the reconstructed second set of images are similar to those of the first set of images.
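A short sketch of formula 6, assuming each feature distribution is a diagonal Gaussian parameterized by a mean and a log-variance, so that its KL divergence to a unit Gaussian has the standard closed form:

import torch

def relative_entropy(mu1, logvar1, mu2, logvar2):
    """Formula 6 (sketch): sum of the KL divergences between each image's
    feature distribution N(mu, sigma^2) and a unit Gaussian N(0, I)."""
    def kl_to_unit_gaussian(mu, logvar):
        return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    return (kl_to_unit_gaussian(mu1, logvar1) + kl_to_unit_gaussian(mu2, logvar2)).mean()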
According to the method, feature distributions are first extracted from the input live or non-live images, a joint feature distribution is then generated from the obtained feature distributions, and finally new live or non-live images are output by reconstructing the joint feature distribution through deconvolution upsampling. This alleviates the current shortage of live and non-live sample images and makes it possible to generate large numbers of live or non-live sample images according to actual requirements.
On the basis of the above method, in order to ensure the validity of the generated live or non-live images and to improve the suitability for liveness detection of both occluded and unoccluded faces, the present application further provides the following methods:
In order to reduce the difference between the classification result of each image in the second set of images and the expected result, the difference of living-body features between the second set of images and the first set of images, and the difference of living-body features between the images of the second set, a lightweight liveness detection feature extraction model may be preset in the liveness detection module to extract the living-body features of the images, and a model loss value is introduced as the quantity used to adjust the parameters of the first model and thereby the image features of each image in the second set of images; the validity of the generated second set of images is ensured by reducing the model loss value.
In this application, the model loss value covers the loss between the classification result of each image in the second set of images and the expected result, the loss of living-body features between the second set of images and the first set of images, and the loss of living-body features between the third image and the fourth image.
The model loss value is therefore made up of three parts: in the first part, the difference between the classification results of the third image and the fourth image and the expected results is constrained, and the calculated difference is represented by a first loss value; in the second part, the difference of living-body features between the second set of images and the first set of images is constrained, and the calculated distance between the two sets of images is represented by a second loss value; in the third part, the difference of living-body features between the third image and the fourth image is constrained, and the distance between the third image and the fourth image is represented by a third loss value.
In the present application, the model loss value solving method can be shown in the following formula 7:
ζall = αζcls + βζin-out + χζpair (formula 7)

As shown in the above formula, ζall denotes the model loss value, ζcls denotes the first loss value, ζin-out denotes the second loss value, ζpair denotes the third loss value, and α, β and χ denote the weight coefficients of the first, second and third loss values, respectively. The weight coefficients make it possible, for example, to increase the weight of the pairing loss.
In order to ensure the validity of the second set of images, the model loss value needs to be reduced; as can be seen from formula 7, this requires reducing any one or all of the first loss value, the second loss value and the third loss value. The methods for obtaining the first, second and third loss values are described below in three parts:
in the first section, a method for solving the first loss value will be described in detail.
The first loss value constrains the difference between the classification results of the third image and the fourth image and the expected classification results; by reducing the first loss value, this difference is reduced, which ensures that each image in the second set of images is classified accurately.
In the present application, the first set of images and the second set of images are used together. First, the living-body features of the third image and the fourth image are extracted by the liveness detection feature extraction model. Then, using the labels of the first image and the second image as classification labels and the binary cross-entropy loss function as supervision, supervised classification is performed on the third image and the fourth image. The classification yields the difference between the classification result of each of the third and fourth images and its classification label, namely a first classification loss value for the third image and a second classification loss value for the fourth image. The first classification loss value and the second classification loss value are summed to obtain the first loss value, which represents the difference between the classification results of the third and fourth images and the preset labels, i.e. the expectation.
Specifically, the two-class cross entropy loss function formula is shown in the following formula 8:
Lcls = −(1/C) Σj [ yj·log(pj) + (1 − yj)·log(1 − pj) ] (formula 8)

As shown in the above formula, the sum runs over the C samples, C denotes the number of samples, Lcls is the classification loss over the C samples, pj is the probability that the model predicts the j-th sample to be a positive example, and yj is the label of the j-th sample, taking the value 1 if the j-th sample is a positive example and 0 otherwise. The binary cross-entropy loss function thus represents the difference between the actual classification output and the preset label.
For example, for the third image, the third image is used as the sample and the label of the first image is used as its label; applying the binary cross-entropy loss function gives the first classification loss value Lcls1. Here, the first classification loss value represents the difference between the classification result of the third image as live or non-live and the label of the first image.
For the fourth image, the fourth image is used as the sample and the label of the second image is used as its label; applying the binary cross-entropy loss function gives the second classification loss value Lcls2. Here, the second classification loss value represents the difference between the classification result of the fourth image as live or non-live and the label of the second image.
The method for obtaining the first loss value according to the first classification loss value and the second classification loss value is shown in the following formula 9:
ζcls = Lcls1 + Lcls2 (formula 9)

As shown in the above formula, ζcls denotes the first loss value, Lcls1 denotes the first classification loss value, and Lcls2 denotes the second classification loss value.
The first loss value thus controls the difference between the classification result of each image in the second set of images and the expected result, thereby improving the fit of the model.
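For illustration, formulas 8 and 9 could be computed as below, assuming the classifier produces a single live/non-live logit per generated image and that the labels of the first and second images are given as float tensors; the function name and the logits variant of binary cross-entropy are assumptions of this sketch.

import torch.nn.functional as F

def first_loss(logit_x3_hat, logit_x4_hat, label_x1, label_x2):
    """Formulas 8 and 9 (sketch): binary cross-entropy on the generated third
    and fourth images, supervised by the labels of the corresponding inputs."""
    l_cls1 = F.binary_cross_entropy_with_logits(logit_x3_hat, label_x1)  # first classification loss
    l_cls2 = F.binary_cross_entropy_with_logits(logit_x4_hat, label_x2)  # second classification loss
    return l_cls1 + l_cls2                                               # first loss value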
In the second section, a method for solving the second loss value will be described in detail.
The second loss value is a loss value that limits a difference in living body characteristics between the second set of images and the first set of images, by which uniformity of living body characteristics between the second set of images and the first set of images is ensured.
According to the method, the living-body features of the first set of images and the second set of images are extracted by the liveness detection feature model, a first living-body feature distance between the first image and the third image and a second living-body feature distance between the second image and the fourth image are obtained from these features, and a second loss value representing the difference between the second set of images and the first set of images is then obtained from the first living-body feature distance and the second living-body feature distance.
Specifically, the role of the liveness detection feature model is to extract living-body features from the images input to it. In the following, Flv(·) denotes the output of the predefined lightweight liveness detection feature model. The first living-body feature Flv(x1) denotes the living-body feature extracted from the first image by the feature model, and the second living-body feature Flv(x2) denotes the living-body feature extracted from the second image. The third image is denoted x̂1 and the fourth image is denoted x̂2; the third living-body feature Flv(x̂1) denotes the living-body feature extracted from the third image, and the fourth living-body feature Flv(x̂2) denotes the living-body feature extracted from the fourth image.
After extracting living body features from the first group of images and the second group of images by the living body detection feature model, calculating a first living body feature distance between the first living body feature and the third living body feature and a second living body feature distance between the second living body feature and the fourth living body feature, and obtaining a second loss value according to the first living body feature distance and the second living body feature distance, wherein a method for calculating the second loss value can be shown in the following formula 10:
ζin-out = ‖Flv(x̂1) − Flv(x1)‖ + ‖Flv(x̂2) − Flv(x2)‖ (formula 10)

As shown in the above formula, ζin-out denotes the second loss value, ‖Flv(x̂1) − Flv(x1)‖ denotes the first living-body feature distance, ‖Flv(x̂2) − Flv(x2)‖ denotes the second living-body feature distance, Flv(x̂1) denotes the third living-body feature, Flv(x1) denotes the first living-body feature, Flv(x̂2) denotes the fourth living-body feature, and Flv(x2) denotes the second living-body feature.
The second loss value thus controls the difference of living-body features between the second set of images and the first set of images, improving the robustness of the model.
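A possible sketch of formula 10; the application does not fix the distance metric, so mean squared error between living-body features is assumed here, and flv stands for the predefined liveness detection feature model Flv(·).

import torch.nn.functional as F

def second_loss(flv, x1, x2, x3_hat, x4_hat):
    """Formula 10 (sketch): distances between the living-body features of each
    generated image and the corresponding input image."""
    d1 = F.mse_loss(flv(x3_hat), flv(x1))  # first living-body feature distance
    d2 = F.mse_loss(flv(x4_hat), flv(x2))  # second living-body feature distance
    return d1 + d2                         # second loss value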
In the third section, a method for solving the third loss value will be described in detail.
The third loss value is a loss value that limits a difference in living body characteristics between the third image and the fourth image, and by which uniformity of living body characteristics of the generated third image and the fourth image is ensured.
In this application, a third loss value is obtained by calculating a third living feature distance of a third living feature and a fourth living feature, where the third loss value is the third living feature distance, and the third loss value represents a difference of the living features between the third image and the fourth image.
Specifically, the third loss value, i.e. the third living-body feature distance, may be obtained by calculating the distance between the output and the label. The third and fourth living-body features serve as output and label for each other: when the third living-body feature is taken as the output, the fourth living-body feature serves as the label; when the fourth living-body feature is taken as the output, the third living-body feature serves as the label.
The third living-body feature and the fourth living-body feature can serve as each other's output and label because the second set of images is generated by reconstruction from the first set of images and has living-body features similar to those of the first set of images, so the second set of images carries the same labels as the first set of images. For example, if each image in the first set of images is labeled as live, then each image in the second set of images is also labeled as live, and in the second set of images, if the third image is live, then the fourth image is also live; if each image in the first set of images is labeled as non-live, then each image in the second set of images is labeled as non-live, and if the third image is non-live, then the fourth image is also non-live.
In order to control the difference of the living body characteristics between the third image and the fourth image, a method of calculating the third loss value may be as shown in equation 11 below:
ζpair = ‖Flv(x̂1) − Flv(x̂2)‖ (formula 11)

In the above formula, ζpair denotes the third loss value, Flv(x̂1) denotes the third living-body feature, Flv(x̂2) denotes the fourth living-body feature, and ‖Flv(x̂1) − Flv(x̂2)‖ denotes the third living-body feature distance.
The third loss value thus controls the difference of living-body features between the third image and the fourth image and ensures the effectiveness of the generation framework.
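Formula 11 can be sketched in the same style, again assuming a mean-squared-error distance between living-body features:

import torch.nn.functional as F

def third_loss(flv, x3_hat, x4_hat):
    """Formula 11 (sketch): distance between the living-body features of the
    generated third and fourth images, keeping the pair consistent."""
    return F.mse_loss(flv(x3_hat), flv(x4_hat))  # third living-body feature distance, i.e. third loss value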
The above three parts describe how the first loss value, the second loss value and the third loss value are calculated. The model loss value is obtained as the weighted sum of the first, second and third loss values and is used as the quantity optimized in the first model, so that the live face images generated during training have similar image characteristics.
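Putting formula 7 together as a sketch; the weight values are placeholders, since the application only states that the three loss values are weighted and summed:

def model_loss(loss_cls, loss_in_out, loss_pair, alpha=1.0, beta=1.0, chi=1.0):
    """Formula 7 (sketch): weighted sum of the first, second and third loss values."""
    return alpha * loss_cls + beta * loss_in_out + chi * loss_pair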
In the above manner, the classification results of the third image and the fourth image are kept consistent with the expectation, the living-body features of the second set of images are kept consistent with those of the first set of images, and the living-body features of the third image and the fourth image are kept consistent with each other. The model loss value further ensures the validity of the generated images, effectively improves the image-generation performance of the first model, and makes the performance of the generated images more robust.
Based on the same inventive concept, the present application further provides a system for generating images, which effectively generates live or non-live images of occluded and unoccluded faces, addresses the shortage of effective training data in current face liveness detection, and effectively improves its detection accuracy. As shown in fig. 5, the system includes:
an obtaining module 501, configured to obtain a first group of images of a first target object, where the first group of images includes: a first image of an unobstructed face of the first target object and a second image of an obstructed face of the first target object;
a calculating module 502, configured to calculate a first feature distribution of the first image and a second feature distribution of the second image; generating a joint feature distribution according to the first feature distribution and the second feature distribution;
a generating module 503, configured to perform deconvolution upsampling on the joint feature distribution through the first model to generate a second set of images, where the second set of images includes: a third image of an unoccluded face of the second target object and a fourth image of an occluded face of the second target object.
In one possible design, the calculation module 502 is specifically configured to extract a first mean and a first variance of the first image, and extract a second mean and a second variance of the second image; obtain a first feature distribution of the first image according to the first mean and the first variance; and obtain a second feature distribution of the second image according to the second mean and the second variance.
In a possible design, the generating module 503 is further configured to calculate a first classification loss and a second classification loss according to the classification of the third image and the fourth image, so as to obtain a first loss value; obtaining a second loss value according to a first living characteristic distance between the first image and the third image and a second living characteristic distance between the second image and the fourth image; obtaining a third loss value according to a third living body feature distance between the third image and the fourth image; weighting and summing the first loss value, the second loss value and the third loss value to obtain a model loss value; and adjusting model parameters in the first model according to the model loss value so as to adjust the image characteristics of each image in the second group of images.
Based on this system, large numbers of live or non-live images of occluded and unoccluded faces are generated effectively, the robustness of the generated images is ensured, the generated images can be used directly for training a face liveness detection algorithm, and the liveness detection effect for occluded faces is effectively improved.
Based on the same inventive concept, an embodiment of the present application further provides an electronic device, where the electronic device may implement the function of the foregoing system for generating an image, and with reference to fig. 6, the electronic device includes:
at least one processor 601 and a memory 602 connected to the at least one processor 601, in this embodiment, a specific connection medium between the processor 601 and the memory 602 is not limited, and fig. 6 illustrates an example where the processor 601 and the memory 602 are connected through a bus 600. The bus 600 is shown in fig. 6 by a thick line, and the connection manner between other components is merely illustrative and not limited thereto. The bus 600 may be divided into an address bus, a data bus, a control bus, etc., and is shown with only one thick line in fig. 6 for ease of illustration, but does not represent only one bus or type of bus. Alternatively, the processor 601 may also be referred to as a controller, without limitation to name a few.
In the embodiment of the present application, the memory 602 stores instructions executable by the at least one processor 601, and the at least one processor 601 may execute the method for generating an image discussed above by executing the instructions stored in the memory 602. The processor 601 may implement the functions of the various modules in the system shown in fig. 5.
The processor 601 is a control center of the system, and may connect various parts of the entire control device by using various interfaces and lines, and perform various functions of the system and process data by executing or executing instructions stored in the memory 602 and calling up data stored in the memory 602, thereby performing overall monitoring of the system.
In one possible design, processor 601 may include one or more processing units, and processor 601 may integrate an application processor, which primarily handles operating systems, user interfaces, application programs, and the like, and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601. In some embodiments, the processor 601 and the memory 602 may be implemented on the same chip, or in some embodiments, they may be implemented separately on separate chips.
The processor 601 may be a general-purpose processor, such as a Central Processing Unit (CPU), digital signal processor, application specific integrated circuit, field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, that may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method for generating an image disclosed in the embodiments of the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
The memory 602, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The Memory 602 may include at least one type of storage medium, and may include, for example, a flash Memory, a hard disk, a multimedia card, a card-type Memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), a charge Erasable Programmable Read Only Memory (EEPROM), a magnetic Memory, a magnetic disk, an optical disk, and so on. The memory 602 is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to such. The memory 602 in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function for storing program instructions and/or data.
By programming the processor 601, the code corresponding to the image generation method described in the foregoing embodiments is embedded into the chip, so that the chip can perform the steps of the image generation method of the embodiment shown in fig. 2 when running. How to program the processor 601 is well known to those skilled in the art and is not described here.
Based on the same inventive concept, the present application also provides a storage medium storing computer instructions, which when executed on a computer, cause the computer to perform the method for generating an image discussed above.
In some possible embodiments, the aspects of the image generating method provided by the present application may also be implemented in the form of a program product, which includes program code for causing the control apparatus to perform the steps of the image generating method according to various exemplary embodiments of the present application described above in this specification, when the program product is run on a device.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method of generating an image, the method comprising:
acquiring a first set of images of a first target object, wherein the first set of images comprises: a first image of an unobstructed face of the first target object and a second image of an obstructed face of the first target object;
calculating a first feature distribution of the first image and a second feature distribution of the second image;
generating a joint feature distribution according to the first feature distribution and the second feature distribution;
performing deconvolution upsampling on the joint feature distribution through a first model to generate a second set of images, wherein the second set of images comprises: a third image of an unoccluded face of a second target object and a fourth image of an occluded face of the second target object.
2. The method of claim 1, wherein the calculating a first feature distribution of the first image and a second feature distribution of the second image comprises:
extracting a first mean and a first variance of the first image, and extracting a second mean and a second variance of the second image;
obtaining a first feature distribution of the first image according to the first mean value and the first variance;
and obtaining a second feature distribution of the second image according to the second mean value and the second variance.
3. The method of claim 1, further comprising, prior to said deconvoluting upsampling said joint feature distribution by said first model to generate a second set of images:
acquiring a first Gaussian distribution of the first image and a second Gaussian distribution of the second image;
obtaining the relative entropy of the first group of images according to a first distribution distance between the first Gaussian distribution and the first characteristic distribution and a second distribution distance between the second Gaussian distribution and the second characteristic distribution;
and adjusting model parameters in the first model according to the relative entropy so as to adjust the image characteristics of each image in the second group of images.
4. The method of claim 1, wherein the deconvolution upsampling of the joint feature distribution by the first model generates a second set of images, in particular by the following equation:
ζout = pθ(x1, x2 | Z3)

wherein ζout represents the second set of images, Z1 represents the first feature distribution, Z2 represents the second feature distribution, x1 represents the first image, x2 represents the second image, and Z3 represents the joint feature distribution.
5. The method of claim 1, further comprising, after said deconvoluting upsampling the joint feature distribution through the first model to generate a second set of images:
obtaining a first loss value according to a first classification loss and a second classification loss generated by classifying and calculating the third image and the fourth image;
obtaining a second loss value according to a first living characteristic distance between the first image and the third image and a second living characteristic distance between the second image and the fourth image;
obtaining a third loss value according to a third living body feature distance between the third image and the fourth image;
weighting and summing the first loss value, the second loss value and the third loss value to obtain a model loss value;
and adjusting model parameters in the first model according to the model loss value so as to adjust the image characteristics of each image in the second group of images.
6. A system for generating an image, the system comprising:
an acquisition module configured to acquire a first set of images of a first target object, wherein the first set of images includes: a first image of an unobstructed face of the first target object and a second image of an obstructed face of the first target object;
a calculation module for calculating a first feature distribution of the first image and a second feature distribution of the second image; generating a joint feature distribution according to the first feature distribution and the second feature distribution;
a generating module, configured to perform deconvolution upsampling on the joint feature distribution through the first model to generate a second set of images, where the second set of images includes: a third image of an unoccluded face of the second target object and a fourth image of an occluded face of the second target object.
7. The system of claim 6, wherein the calculation module is specifically configured to extract a first mean and a first variance of the first image and to extract a second mean and a second variance of the second image; obtain a first feature distribution of the first image according to the first mean and the first variance; and obtain a second feature distribution of the second image according to the second mean and the second variance.
8. The system of claim 6, wherein the generating module is further configured to: obtain a first loss value according to a first classification loss and a second classification loss generated by classifying the third image and the fourth image, respectively; obtain a second loss value according to a first living-body feature distance between the first image and the third image and a second living-body feature distance between the second image and the fourth image; obtain a third loss value according to a third living-body feature distance between the third image and the fourth image; obtain a model loss value as a weighted sum of the first loss value, the second loss value and the third loss value; and adjust model parameters in the first model according to the model loss value so as to adjust the image characteristics of each image in the second set of images.
9. An electronic device, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the method steps of any one of claims 1-5 when executing the computer program stored in the memory.
10. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of any one of claims 1-5.
CN202110645535.4A 2021-06-10 2021-06-10 Method and system for generating image and electronic equipment Active CN113421317B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110645535.4A CN113421317B (en) 2021-06-10 2021-06-10 Method and system for generating image and electronic equipment


Publications (2)

Publication Number Publication Date
CN113421317A true CN113421317A (en) 2021-09-21
CN113421317B CN113421317B (en) 2023-04-18

Family

ID=77788196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110645535.4A Active CN113421317B (en) 2021-06-10 2021-06-10 Method and system for generating image and electronic equipment

Country Status (1)

Country Link
CN (1) CN113421317B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020199577A1 (en) * 2019-03-29 2020-10-08 北京市商汤科技开发有限公司 Method and device for living body detection, equipment, and storage medium
CN111353411A (en) * 2020-02-25 2020-06-30 四川翼飞视科技有限公司 Face-shielding identification method based on joint loss function
CN111639545A (en) * 2020-05-08 2020-09-08 浙江大华技术股份有限公司 Face recognition method, device, equipment and medium
CN111626163A (en) * 2020-05-18 2020-09-04 浙江大华技术股份有限公司 Human face living body detection method and device and computer equipment
CN111667403A (en) * 2020-07-02 2020-09-15 北京爱笔科技有限公司 Method and device for generating face image with shielding
CN112528764A (en) * 2020-11-25 2021-03-19 杭州欣禾圣世科技有限公司 Facial expression recognition method, system and device and readable storage medium
CN112329720A (en) * 2020-11-26 2021-02-05 杭州海康威视数字技术股份有限公司 Face living body detection method, device and equipment
CN112926464A (en) * 2021-03-01 2021-06-08 创新奇智(重庆)科技有限公司 Face living body detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUO Zhuohao, et al.: "Automatic Generation of a Frontal Image Database of Occluded Faces", Laser Journal (《激光杂志》) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311553A (en) * 2023-05-17 2023-06-23 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image
CN116311553B (en) * 2023-05-17 2023-08-15 武汉利楚商务服务有限公司 Human face living body detection method and device applied to semi-occlusion image

Also Published As

Publication number Publication date
CN113421317B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN108256562B (en) Salient target detection method and system based on weak supervision time-space cascade neural network
US11734851B2 (en) Face key point detection method and apparatus, storage medium, and electronic device
US20210350168A1 (en) Image segmentation method and image processing apparatus
Byeon et al. Contextvp: Fully context-aware video prediction
KR102113911B1 (en) Feature extraction and matching and template update for biometric authentication
WO2022078041A1 (en) Occlusion detection model training method and facial image beautification method
US20170357847A1 (en) Biologically inspired apparatus and methods for pattern recognition
CN110717851A (en) Image processing method and device, neural network training method and storage medium
US20230021661A1 (en) Forgery detection of face image
CN111695462A (en) Face recognition method, face recognition device, storage medium and server
CN111079764A (en) Low-illumination license plate image recognition method and device based on deep learning
CN111860398A (en) Remote sensing image target detection method and system and terminal equipment
JP7357176B1 (en) Night object detection, training method and device based on self-attention mechanism in frequency domain
CN115631112B (en) Building contour correction method and device based on deep learning
Choi et al. Face detection using haar cascade classifiers based on vertical component calibration
CN112634246A (en) Oral cavity image identification method and related equipment
CN113421317B (en) Method and system for generating image and electronic equipment
Yu et al. FS-GAN: Fuzzy Self-guided structure retention generative adversarial network for medical image enhancement
CN111488811A (en) Face recognition method and device, terminal equipment and computer readable medium
CN111815658B (en) Image recognition method and device
CN114677611A (en) Data identification method, storage medium and device
Ye Learning of dense optical flow, motion and depth, from sparse event cameras
Que et al. Lightweight and Dynamic Deblurring for IoT-Enabled Smart Cameras
CN113269812B (en) Training and application method, device, equipment and storage medium of image prediction model
CN114581463B (en) Multi-phase 4D CT image segmentation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant