CN112132766A - Image restoration method and device, storage medium and electronic device - Google Patents
- Publication number: CN112132766A (application CN202011043049.7A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/00 — Image enhancement or restoration
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]

(All under G — Physics; G06 — Computing, calculating or counting; G06T — Image data processing or generation, in general.)
Abstract
The application discloses an image restoration method and apparatus, a storage medium, and an electronic device, relating to the field of cloud computing. The method includes: acquiring a repair request, where the repair request is used to request restoration of a first image; repairing the first image through a generative adversarial network to obtain a second image that reaches a target quality grade, where the generative adversarial network includes a target discrimination model, a target classification model, and a target generation model, the target discrimination model is used to train the repair capability of the target generation model, and the target classification model is used to determine whether an image generated by the target generation model reaches the target quality grade; and returning the second image. The method and apparatus solve the technical problem of poor quality of repaired images in the related art.
Description
Technical Field
The application relates to the field of artificial intelligence, and in particular to an image restoration method and apparatus, a storage medium, and an electronic device.
Background
With the increasing popularity of photographic equipment, digital photographs have penetrated many aspects of daily life. However, many factors, both man-made and natural, can cause image defects, so repairing defective regions is an important technique. It has wide application in the restoration of artistic works, the production of film and television special effects, the removal of unwanted objects from images, and more.
Traditional image restoration methods can only perform simple texture restoration and cannot achieve semantic restoration. In recent years, the emergence of deep learning has greatly advanced the image restoration field: generative adversarial networks and variational autoencoders have substantially improved semantic restoration performance, but quality problems such as over-smoothed skin, an oil-painting appearance, and unnatural image details remain.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiments of the application provide an image restoration method and apparatus, a storage medium, and an electronic device, aiming at least to solve the technical problem of poor quality of restored images in the related art.

According to one aspect of the embodiments of the present application, an image restoration method is provided, including: acquiring a repair request, where the repair request is used to request restoration of a first image; repairing the first image through a generative adversarial network to obtain a second image that reaches a target quality grade, where the generative adversarial network includes a target discrimination model, a target classification model, and a target generation model, the target discrimination model is used to train the repair capability of the target generation model, and the target classification model is used to determine whether an image generated by the target generation model reaches the target quality grade; and returning the second image.

According to another aspect of the embodiments of the present application, an image restoration apparatus is also provided, including: a first acquisition unit configured to acquire a repair request, where the repair request is used to request restoration of a first image; a restoration unit configured to restore the first image through a generative adversarial network to obtain a second image that reaches a target quality grade, where the generative adversarial network includes a target discrimination model, a target classification model, and a target generation model, the target discrimination model is used to train the repair capability of the target generation model, and the target classification model is used to determine whether an image generated by the target generation model reaches the target quality grade; and a response unit configured to return the second image.
According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program which, when executed, performs the above-described method.
According to another aspect of the embodiments of the present application, there is also provided an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the above method through the computer program.
In the embodiments of the application, when the first image is restored through the generative adversarial network, the target classification model determines whether the image produced by the target generation model reaches the target quality grade, and only an image that reaches the target quality grade is returned as the second image. This solves the technical problem of poor restored-image quality in the related art and achieves the technical effect of improving image quality.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of a hardware environment for a method of image restoration according to an embodiment of the present application;
FIG. 2 is a flow chart of an alternative method of image restoration according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an alternative image restoration result according to an embodiment of the present application;
FIG. 4 is a schematic view of an alternative image restoration apparatus according to an embodiment of the present application;
FIG. 5 is a block diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments that can be derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the nouns and terms appearing in the description of the embodiments of the present application are explained below:
generative Adaptive Networks (GAN) is a deep learning model with at least two modules in the model framework: and generating a Model G (Generative Model) and a discriminant Model D (discriminant Model), and generating better output by mutual game learning between the two models. In the original GAN theory, it is not required that G and D are both neural networks, but only that functions that can be generated and discriminated correspondingly are fitted. Deep neural networks are generally used as G and D in practice. An excellent GAN application requires a good training method, otherwise the output may be unsatisfactory due to the freedom of neural network models.
According to an aspect of embodiments of the present application, there is provided a method embodiment of a method for repairing an image.
Optionally, in this embodiment, the image restoration method may be provided to the user as a service. The service may be deployed locally, in which case the user uses it directly on a local device, or in the cloud (taken as the example below), as in the hardware environment formed by the terminal 101 and the server 103 shown in fig. 1. As shown in fig. 1, the server 103 is connected to the terminal 101 through a network and may provide services for the terminal or for a client installed on it. A database 105 may be provided on the server or separately from it to provide data storage services for the server 103. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network, and the terminal 101 is not limited to a PC, a mobile phone, a tablet computer, or the like.
The image restoration method according to the embodiment of the present application may be executed by the server 103, the terminal 101, or both the server 103 and the terminal 101. The terminal 101 may execute the image restoration method according to the embodiment of the present application by a client installed thereon. For example, when executed by the server 103 and the terminal 101 together, fig. 2 is a flowchart of an optional image restoration method according to an embodiment of the present application, and as shown in fig. 2, the method may include the following steps:
In step S202, the server obtains a repair request from the terminal; through this request the terminal asks the server to repair the first image carried in it. The server is one that provides a repair service (the service may specifically be provided by a generative adversarial network).
The first image is an image with defects such as missing regions, blur, or mosaic, for example a face image in which part of the face is missing.
In step S204, the server repairs the first image using a generative adversarial network to obtain a second image that reaches the target quality grade, where the generative adversarial network includes a target discrimination model, a target classification model, and a target generation model; the target discrimination model is used to train the repair capability of the target generation model, and the target classification model is used to determine whether an image generated by the target generation model reaches the target quality grade.
In step S206, the server returns the second image to the terminal.
Because generative adversarial networks (GAN) have strong generation capability, this application adopts a GAN to repair images. However, if a GAN model is used directly for the repair task, taking face repair as an example, the generated face details have the following problems: owing to the poor controllability of the GAN model, quality problems such as over-smoothed skin, an oil-painting appearance, and unnatural image details can occur. These problems make it difficult for the final repair result to meet image quality requirements, thereby degrading the user's visual experience.
In the technical scheme of the application, when the first image is restored through the generative adversarial network, the target classification model determines whether the image produced by the target generation model reaches the target quality grade, and only an image that reaches the target quality grade is fed back as the second image. This solves the technical problem of poor restored-image quality in the related art and achieves the technical effect of improving image quality. The technical solution of the present application is detailed further below with reference to the steps shown in fig. 2.
In the technical solution provided in step S202, the server obtains a repair request of the terminal.
The repair service can be hosted on a server, and a user can access it through various entry points on a terminal, for example through mini-programs in instant messaging, search, or payment software, through a dedicated application client for the repair service, or by entering the service's website address in a web browser. When the user accesses the service in any of these ways, a repair request is generated on the terminal side and sent to the server together with the first image.
In the technical solution provided in step S204, the server repairs the first image by using the generative countermeasure network, and obtains a second image that reaches the target quality level.
Optionally, the generative adversarial network may be trained by the service provider itself, or acquired or purchased pre-trained from a third party. Training the generative adversarial network mainly includes training the generation model, training the discrimination model, and training the classification model, which are described one by one below.
(1) Training of classification models
Step 1: obtain a first training sample set, which includes a plurality of labeled images, where the labels identify the quality grades of the images. There may be several quality grades, such as high, medium, and low, and finer-grained grades may be used as needed.
Optionally, step 1 comprises steps 11-12:
Step 11: obtain images generated by the target generation model and generate labels for them; also acquire images captured by an image acquisition device (i.e., real images) and generate labels for the acquired real images.

Step 12: use the labeled images generated by the target generation model and the labeled acquired images as the first training sample set.
Step 2: train the initial classification model using the images in the first training sample set to obtain the target classification model.
Optionally, the training of the initial classification model by using the image in the first training sample set in step 2 to obtain the target classification model may be implemented through steps 21 to 24:
Step 21: train the initial classification model using the images in the first training sample set to obtain an intermediate classification model.
Optionally, training the initial classification model using the images in the first training sample set comprises training the initial classification model a plurality of times as follows:
Step 211: if this is the first round of training, initialize the weight parameters of each network layer in the initial classification model (assigning each parameter randomly or from empirical values) and input the images in the first training sample set into the model; otherwise, directly input the images in the first training sample set into the initial classification model.

Step 212: perform quality-grade recognition with the initialized (or parameter-adjusted) classification model, and obtain the quality grades it assigns to the images in the first training sample set.

Step 213: adjust the weight parameters of each network layer in the initial classification model according to the difference between the quality grade recognized by the model and the quality grade indicated by the label.

The initial classification model represents a mapping between input images and output quality grades. Since both the input images and their target quality grades are known during training, the training process is equivalent to solving for the weight parameters from these input/output pairs.
Step 22, when the number of times of training reaches a certain value (for example, 1000 times), the intermediate classification model may be tested by using a first test set, where the first test set includes a plurality of images with labels.
And step 23, taking the intermediate classification model as the target classification model when the identification accuracy of the intermediate classification model on the third image reaches a target threshold value, wherein the third image is the image in the first test set, and the identification accuracy refers to the ratio of the correct identification number of the intermediate classification model to the total test number.
And 24, under the condition that the identification accuracy of the intermediate classification model to the third image does not reach the target threshold, continuing to train the intermediate classification model by using the images in the first training sample set until the identification accuracy of the intermediate classification model to the third image reaches the target threshold.
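The train/test control flow of steps 21-24 can be sketched as follows. The classifier here is a stand-in whose accuracy simply grows with the number of training steps, and the accuracy threshold and round count are assumed placeholder values, not figures from the patent.

```python
# Hypothetical sketch of the loop: train 1000 rounds, test, repeat until
# the test-set accuracy reaches the target threshold (steps 22-24).
TARGET_THRESHOLD = 0.95   # assumed accuracy target
BATCH_ROUNDS = 1000       # "a certain value" from step 22

class MockClassifier:
    """Stand-in for the quality-grade classifier; not a real network."""
    def __init__(self):
        self.steps = 0

    def train_step(self, image, label):
        # step 213: a real model would adjust layer weights here
        self.steps += 1

    def accuracy(self, test_set):
        # stand-in: accuracy grows with training; a real model would compare
        # predicted quality grades against the test labels (step 23)
        return min(0.99, 0.5 + self.steps / 20000.0)

train_set = [("img", "grade")] * 100   # placeholder labeled samples
test_set = [("img", "grade")] * 20     # placeholder first test set

model = MockClassifier()
while model.accuracy(test_set) < TARGET_THRESHOLD:  # step 24: keep training
    for i in range(BATCH_ROUNDS):
        img, lbl = train_set[i % len(train_set)]
        model.train_step(img, lbl)

print(model.accuracy(test_set) >= TARGET_THRESHOLD)  # prints True
```

The same skeleton applies to the generation- and discrimination-model training below; only the acceptance criterion differs.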
(2) Training of generative models
In the embodiments of the present application, training the initial generation model with the help of the target discrimination model to obtain the trained target generation model includes:
step 1, a second training sample set is obtained, wherein the second training sample set comprises a plurality of images to be restored.
Step 2: train the initial generation model using the images in the second training sample set to obtain an intermediate generation model.
Optionally, training the initial generative model using the image in the second set of training samples comprises training the initial generative model a plurality of times as follows:
Step 21: if this is the first round of training, initialize the weight parameters of each network layer in the initial generation model (assigning each parameter randomly or from empirical values) and input the images in the second training sample set into the model; otherwise, directly input the images in the second training sample set into the initial generation model.

Step 22: obtain the fifth images, namely the results of the initial generation model repairing the images in the second training sample set.

Step 23: obtain the recognition results of the target discrimination model on the fifth images; each image may be recognized as a real image or as a generated (non-real) image.

The target discrimination model above is a trained model for judging whether an image is real, and can accurately separate generated pictures from real pictures. In short, it can be a binary classifier that outputs 0 for a generated picture and 1 for a real picture.

Step 24: adjust the weight parameters of each network layer in the initial generation model according to the recognition results of the target discrimination model on the fifth images.

The generation model learns a repair capability from data and then generates similar data: it learns from face images and can then repair face images on its own. The model can be regarded as an encoder plus a decoder: the encoder converts an input image into a code, the decoder converts the code back into an output image, and the mean squared error (MSE) between the output and input images is computed during training. After the model is trained, the decoder (the latter half) can be taken out on its own; feeding it a random code generates an image. The process above trains the parameters of each layer in the encoder and the decoder.
Step 3: when the number of training rounds reaches a certain value (for example, 1000), test the intermediate generation model with a second test set, which includes a plurality of images to be repaired.

Step 4: if the recognition pass rate of the target discrimination model on the fourth images is within a preset range, take the intermediate generation model as the target generation model. The fourth images are the results of the intermediate generation model repairing the images in the second test set; the recognition pass rate is the ratio of the number of images the target discrimination model recognizes as real to the number of all test images; the preset range may be 0.45-0.55.

Step 5: if the recognition pass rate of the target discrimination model on the fourth images is not within the preset range, continue training the intermediate generation model with the images in the second training sample set until the pass rate falls within the preset range.
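The acceptance test in steps 4-5 can be sketched as below: the intermediate generator is promoted to target generator only when the discriminator is fooled about half the time, i.e. its pass rate on repaired images falls within [0.45, 0.55], so it can no longer reliably tell repaired images from real ones. The discriminator here is a hypothetical stand-in taking an integer "image".

```python
# Sketch of the pass-rate acceptance criterion (assumed helper names).
PASS_RANGE = (0.45, 0.55)  # preset range from step 4

def pass_rate(discriminator, repaired_images):
    # fraction of repaired test images the discriminator labels "real" (1)
    votes = [discriminator(img) for img in repaired_images]
    return sum(votes) / len(votes)

def is_target_generator(discriminator, repaired_images):
    lo, hi = PASS_RANGE
    return lo <= pass_rate(discriminator, repaired_images) <= hi

# usage: a stand-in discriminator fooled on exactly half the images passes,
# while one that is never fooled (labels everything real) does not
fooled_half = lambda img: 1 if img % 2 == 0 else 0
always_real = lambda img: 1
print(is_target_generator(fooled_half, list(range(100))))  # prints True
print(is_target_generator(always_real, list(range(100))))  # prints False
```

A pass rate near 0.5 is the equilibrium point of the adversarial game: above the range the discriminator is too weak, below it the generator's repairs are still detectable.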
(3) Discriminant model training
Step 1: obtain a third training sample set, which includes a plurality of labeled images, where the labels indicate whether an image is generated or real.
Step 11, obtaining an image generated by a target generation model, and generating a label for the image generated by the target generation model; an image, i.e., a real image, captured by an image capturing device is acquired, and a label is generated for the captured real image.
And step 12, taking the image generated by the target discrimination model with the label and the acquired image with the label as a third training sample set.
Step 2: train the initial discrimination model using the images in the third training sample set to obtain an intermediate discrimination model.
Optionally, training the initial discrimination model using the image in the third training sample set comprises training the initial discrimination model a plurality of times as follows:
Step 21: if this is the first round of training, initialize the weight parameters of each network layer in the initial discrimination model (assigning each parameter randomly or from empirical values) and input the images in the third training sample set into the model; otherwise, directly input the images in the third training sample set into the initial discrimination model.

Step 22: obtain the recognition results of the initial discrimination model on the images in the third training sample set; each image may be recognized as a real image or as a generated (non-real) image.

Step 23: adjust the weight parameters of each network layer in the initial discrimination model according to the difference between its recognition results and the images' labels.
Step 3: when the number of training rounds reaches a certain value (for example, 1000), test the intermediate discrimination model with a third test set, which includes labeled images.

Step 4: if the recognition accuracy of the intermediate discrimination model on the images in the third test set reaches a specified threshold, take the intermediate discrimination model as the target discrimination model. Recognition accuracy is the ratio of the number of correctly recognized images to the number of all input test images.

Step 5: if the recognition accuracy of the intermediate discrimination model does not reach the specified threshold, continue training the intermediate discrimination model with the images in the third training sample set until the accuracy reaches the specified threshold.
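A minimal stand-in for the discriminator training above is logistic regression on labeled samples: output 1 for "real" samples and 0 for "generated" ones, adjusting weights from the label at each step. Real discriminators operate on images through deep networks; here each sample is a single number, which is purely an illustrative assumption.

```python
import math
import random

# Toy discriminator: a logistic unit trained to separate "real" samples
# (values near 1.0, label 1) from "generated" ones (near 0.0, label 0).
random.seed(1)
real = [1.0 + random.uniform(-0.2, 0.2) for _ in range(50)]  # label 1
fake = [0.0 + random.uniform(-0.2, 0.2) for _ in range(50)]  # label 0
samples = [(x, 1) for x in real] + [(x, 0) for x in fake]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(200):                      # training rounds (step 2)
    for x, y in samples:
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted "real" probability
        w += lr * (y - p) * x             # adjust weights from the label (step 23)
        b += lr * (y - p)

predict = lambda x: 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) > 0.5 else 0
accuracy = sum(predict(x) == y for x, y in samples) / len(samples)
print(accuracy)  # high accuracy on this cleanly separable toy data
```

Once such a classifier reliably separates real from generated data (step 4's accuracy threshold), it can serve as the fixed judge used while training the generator.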
In the technical solution provided in step S206, the server returns the second image to the terminal, either by sending the second image directly or by feeding back a download address for it.
When this scheme is applied to a repair scenario such as faces, a loss function Lqa based on image quality evaluation (corresponding to the classification model) is added to the generator's loss function, and the quality of the images produced by the generator is adjusted by optimizing Lqa, thereby improving, to a certain extent, the quality of images generated in face repair and similar scenarios.
Objective image quality evaluation can be divided into three types: full-reference (FR), reduced-reference (RR), and no-reference (NR, also called blind image quality assessment, BIQA). This scheme adopts a no-reference image quality evaluation index as part of the generator's loss function; the index can be computed with a classification-based deep model.
Step 1: collect a batch of face images of varying quality (including natural and artificial images) for training a VGG classification model.
Step 2: image normalization. Scale the images to the same size (e.g., all to 120 × 120 resolution), then classify and score them.
Scoring may be performed by a machine learning model that has learned automatic scoring criteria, or by a multi-person scoring system in which the natural smoothness of each face image is rated on a 5-point scale: the higher the score, the lower the quality of the face image, and the lower the score, the higher the quality. The artificial face images can be generated by an original GAN model, and the real images and the generated artificial images are scored together.
Step 3: use a VGG classification model to classify the data set into five classes, each representing face images of a different quality grade. This finally yields a VGG classification model for image quality evaluation.
The quality evaluation model obtained above is used to compute the generator's loss term, Lqa. The smaller the Lqa value, the higher the quality of the generated face image; by minimizing this loss, the generator can be steered toward producing higher-quality faces.
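The patent does not give an explicit formula for Lqa. One plausible reading, sketched below as an assumption, is the expected quality score under the five-class probabilities produced by the VGG-style classifier (score 1 = best, 5 = worst, matching the scoring rule above), so that better-looking images yield a smaller loss.

```python
# Hypothetical form of Lqa: expected quality score over the classifier's
# five-class output (grade 1 = highest quality, grade 5 = lowest).
def lqa(class_probs):
    # class_probs[i] = probability of grade i+1
    return sum(p * (i + 1) for i, p in enumerate(class_probs))

good = [0.8, 0.1, 0.05, 0.03, 0.02]   # probability mass on high-quality grades
bad = [0.02, 0.03, 0.05, 0.1, 0.8]    # probability mass on low-quality grades
print(lqa(good) < lqa(bad))  # prints True: higher quality gives smaller Lqa
```

Because this scalar is differentiable in the class probabilities, it can be back-propagated through the classifier to the generator like any other loss term.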
Step 4: complete the face restoration with the GAN model; the repair result is shown in fig. 3.
The image to be repaired is resized to 120 × 120 and fed into the GAN model. During repair, the classification model (i.e. the no-reference quality-assessment loss) acts together with the GAN's existing losses, such as a global loss that evaluates global features and a local loss that evaluates local features. After each loss value is obtained, the loss values are weighted (the weight of each loss can be set empirically) and summed into a total loss, and the result with the smallest total loss is selected. The scheme can flexibly handle masks of various characteristics (e.g. different positions, sizes, and shapes), so that for missing key parts the model can synthesize semantically valid and visually harmonious content from random noise, balancing the harmony of global and local features while also delivering a high-quality sensory experience in the completed result.
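The weighted total loss and minimum-loss selection described above can be sketched as follows. The loss names, weights, and values are placeholder assumptions; the patent only states that the losses are weighted empirically and summed, and that the result with the smallest total is kept.

```python
# Illustrative sketch of combining the global, local, and Lqa losses.
def total_loss(losses, weights):
    """Weighted sum of named loss terms (e.g. global, local, Lqa)."""
    return sum(weights[name] * value for name, value in losses.items())

def pick_best(candidates, weights):
    """Return the candidate repair result with the smallest total loss."""
    return min(candidates, key=lambda c: total_loss(c["losses"], weights))

# Assumed, empirically chosen weights and example loss values.
weights = {"global": 1.0, "local": 0.5, "lqa": 0.2}
candidates = [
    {"id": "a", "losses": {"global": 0.9, "local": 0.4, "lqa": 3.0}},
    {"id": "b", "losses": {"global": 0.7, "local": 0.6, "lqa": 1.5}},
]
best = pick_best(candidates, weights)
```

Candidate "a" totals 1.7 and "b" totals 1.3 under these weights, so "b" is selected, mirroring the minimum-total-loss screening in the text.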
By adopting the technical scheme, the image quality evaluation method is introduced into the GAN network model in a loss function mode, the quality of the generated face is adjusted, and the face quality after completion is optimized.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
According to another aspect of the embodiments of the present application, there is also provided an image restoration apparatus for implementing the above-described image restoration method. Fig. 4 is a schematic diagram of an alternative image restoration apparatus according to an embodiment of the present application, which may include, as shown in fig. 4:
a first obtaining unit 401, configured to obtain a repair request, where the repair request is used to request to repair a first image;
a repairing unit 403, configured to repair the first image through a generative adversarial network to obtain a second image reaching a target quality level, where the generative adversarial network includes a target discrimination model, a target classification model, and a target generation model, the target discrimination model is used to train a repairing function of the target generation model, and the target classification model is used to determine whether an image generated by the target generation model reaches the target quality level;
a response unit 405, configured to return the second image in response to the repair request.
It should be noted that the first obtaining unit 401 in this embodiment may be configured to execute step S202 in this embodiment, the repairing unit 403 in this embodiment may be configured to execute step S204 in this embodiment, and the responding unit 405 in this embodiment may be configured to execute step S206 in this embodiment.
It should be noted here that the modules described above are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the above embodiments. It should be noted that the modules described above as a part of the apparatus may operate in a hardware environment as shown in fig. 1, and may be implemented by software or hardware.
Because the generative adversarial network (GAN) has strong generative capability, this application adopts a GAN to repair images. However, if a GAN model is used directly for the repair task, taking face repair as an example, the generated face details exhibit the following problems: due to the poor controllability of the GAN model, the repair result may suffer from over-smoothed skin, an oil-painting look, unnaturally generated image details, and similar quality issues. These problems make it difficult for the final repair result to meet image quality requirements, thereby degrading the user's visual experience.
In the technical scheme of this application, when the first image is repaired through the generative adversarial network, the target classification model is used to determine whether the image produced by the target generation model reaches the target quality level, and only an image reaching that level is returned as the second image. This solves the technical problem of poor repaired-image quality in the related art and achieves the technical effect of improving image quality.
Optionally, the apparatus further comprises: the second obtaining unit is used for obtaining a first training sample set before obtaining the repair request, wherein the first training sample set comprises a plurality of images with labels, and the labels are used for identifying the quality grades of the images; and the training unit is used for training an initial classification model by using the image in the first training sample set to obtain the target classification model.
Optionally, when the training unit trains an initial classification model by using the image in the first training sample set to obtain the target classification model, the training unit trains the initial classification model by using the image in the first training sample set to obtain an intermediate classification model; taking the intermediate classification model as a target classification model under the condition that the identification accuracy of the intermediate classification model on a third image reaches a target threshold value, wherein the third image is an image in a first test set, and the first test set comprises a plurality of labeled images; and under the condition that the identification accuracy of the intermediate classification model to the third image does not reach the target threshold, continuing to train the intermediate classification model by using the images in the first training sample set until the identification accuracy of the intermediate classification model to the third image reaches the target threshold.
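The accuracy-gated training loop for the classification model described above can be sketched as pure control flow. The `train_one_round` and `evaluate` callables are stand-ins (assumptions) for real model training and test-set evaluation; the toy example uses scalars only to exercise the stopping rule.

```python
# Sketch: keep training the intermediate classification model until its
# recognition accuracy on the first test set reaches the target threshold.
def train_until_accurate(model, train_one_round, evaluate,
                         target_threshold, max_rounds=100):
    for _ in range(max_rounds):
        model = train_one_round(model)           # one pass over the sample set
        if evaluate(model) >= target_threshold:  # accuracy on the test set
            return model                         # accept as target model
    raise RuntimeError("threshold not reached within max_rounds")

# Toy stand-ins: each round adds 0.1 "accuracy" to a scalar model.
final = train_until_accurate(
    model=0.5,
    train_one_round=lambda m: m + 0.1,
    evaluate=lambda m: m,
    target_threshold=0.9,
)
```

Only the loop structure is taken from the description; a real implementation would train a VGG classifier and measure accuracy on the labeled test set.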
Optionally, the training unit, when training the initial classification model by using the image in the first training sample set, comprises training the initial classification model a plurality of times as follows: initializing weight parameters in each layer of network in the initial classification model under the condition that the training is the first training, and inputting images in the first training sample set into the initial classification model; or, under the condition that the training is not the first training, directly inputting the images in the first training sample set into the initial classification model; obtaining the quality grade of the image in the first training sample set identified by the initial classification model; and adjusting the weight parameters in each layer of network in the initial classification model according to the quality grade identified by the initial classification model and the quality grade of the label identification.
Optionally, when acquiring the first training sample set, the training unit acquires an image generated by the target generation model, and generates a label for the image generated by the target generation model; acquiring an image acquired by image acquisition equipment, and generating a label for the acquired image; and taking the image generated by the target generation model with the label and the acquired image with the label as the first training sample set.
Optionally, before acquiring the repair request, the training unit trains the initial discrimination model to obtain the trained target discrimination model; and training an initial generation model by using the target discrimination model to obtain the trained target generation model.
Optionally, the training unit may obtain a second training sample set when the initial generation model is trained by using the target discrimination model to obtain the trained target generation model, where the second training sample set includes a plurality of images to be restored; training the initial generation model by using the image in the second training sample set to obtain an intermediate generation model; under the condition that the recognition passing rate of the target discrimination model to a fourth image is within a preset range, taking the intermediate generation model as the target generation model, wherein the fourth image is obtained by repairing images in a second test set by the intermediate generation model, and the second test set comprises a plurality of images to be repaired; and under the condition that the recognition passing rate of the fourth image by the target discrimination model is not in the preset range, continuing to train the intermediate generation model by using the images in the second training sample set until the recognition passing rate of the fourth image by the target discrimination model is in the preset range.
Optionally, the training unit training the initial generative model using the images in the second training sample set comprises training the initial generative model a plurality of times as follows: initializing weight parameters in each layer of network in the initial generation model under the condition that the training is the first training, and inputting images in the second training sample set into the initial generation model; or, under the condition that the training is not the first training, directly inputting the images in the second training sample set into the initial generation model; acquiring a recognition result of the target discrimination model on a fifth image, wherein the fifth image is obtained by restoring the image in the second training sample set by the initial generation model; and adjusting the weight parameters in each layer of network in the initial generation model according to the recognition result of the target discrimination model on the fifth image.
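The generator's multi-round training, with weight initialization only before the first round and the discriminator's recognition pass rate as the stopping condition, can likewise be sketched as control flow. All callables below are stand-ins, and the accepted pass-rate range is an assumed example.

```python
# Sketch: train the initial generation model until the discriminator's
# pass rate on repaired test images falls inside the preset range.
def train_generator(init_weights, train_round, pass_rate,
                    accept_range, max_rounds=100):
    lo, hi = accept_range
    weights = init_weights()            # only before the first round
    for _ in range(max_rounds):
        weights = train_round(weights)  # adjust weights from feedback
        rate = pass_rate(weights)
        if lo <= rate <= hi:
            return weights              # accept as target generation model
    raise RuntimeError("pass rate never entered the accepted range")

# Toy stand-ins: the pass rate grows with the scalar "weights".
w = train_generator(
    init_weights=lambda: 0.0,
    train_round=lambda w: w + 0.1,
    pass_rate=lambda w: w,
    accept_range=(0.45, 0.55),
)
```

A pass rate near 0.5 is a common heuristic for a generator that the discriminator can no longer reliably distinguish, which is one plausible reading of the "preset range" in the text.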
By adopting the technical scheme, the image quality evaluation method is introduced into the GAN network model in a loss function mode, the quality of the generated face is adjusted, and the face quality after completion is optimized.
It should be noted here that the modules described above are the same as the examples and application scenarios implemented by the corresponding steps, but are not limited to the disclosure of the above embodiments. It should be noted that the modules described above as a part of the apparatus may be operated in a hardware environment as shown in fig. 1, and may be implemented by software, or may be implemented by hardware, where the hardware environment includes a network environment.
According to another aspect of the embodiment of the application, a server or a terminal for implementing the image restoration method is also provided.
Fig. 5 is a block diagram of a terminal according to an embodiment of the present application. As shown in fig. 5, the terminal may include: one or more processors 501 (only one of which is shown in fig. 5), a memory 503, and a transmission device 505. As shown in fig. 5, the terminal may further include an input-output device 507.
The memory 503 may be used to store software programs and modules, such as program instructions/modules corresponding to the image restoration method and apparatus in the embodiments of the present application, and the processor 501 executes various functional applications and data processing by running the software programs and modules stored in the memory 503, that is, implements the image restoration method described above. The memory 503 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 503 may further include memory located remotely from the processor 501, which may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 505 is used for receiving or sending data via a network, and may also be used for data transmission between the processor and the memory. Examples of the network may include wired and wireless networks. In one example, the transmission device 505 includes a Network Interface Card (NIC) that can be connected to a router via a network cable and other network devices to communicate with the internet or a local area network. In another example, the transmission device 505 is a Radio Frequency (RF) module, which communicates with the internet wirelessly.
Among them, the memory 503 is used to store an application program in particular.
The processor 501 may call the application stored in the memory 503 through the transmission means 505 to perform the following steps:
acquiring a repair request, wherein the repair request is used for requesting to repair a first image;
repairing the first image through a generative adversarial network to obtain a second image reaching a target quality level, wherein the generative adversarial network comprises a target discrimination model, a target classification model and a target generation model, the target discrimination model is used for training a repairing function of the target generation model, and the target classification model is used for determining whether the image generated by the target generation model reaches the target quality level;
returning the second image in response to the repair request.
The processor 501 is further configured to perform the following steps:
training the initial classification model by using the image in the first training sample set to obtain an intermediate classification model;
taking the intermediate classification model as a target classification model under the condition that the identification accuracy of the intermediate classification model on a third image reaches a target threshold value, wherein the third image is an image in a first test set, and the first test set comprises a plurality of labeled images;
and under the condition that the identification accuracy of the intermediate classification model to the third image does not reach the target threshold, continuing to train the intermediate classification model by using the images in the first training sample set until the identification accuracy of the intermediate classification model to the third image reaches the target threshold.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.
It can be understood by those skilled in the art that the structure shown in fig. 5 is only an illustration, and the terminal may be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 5 does not limit the structure of the above electronic device. For example, the terminal may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 5, or have a different configuration than shown in fig. 5.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Embodiments of the present application also provide a storage medium. Optionally, in this embodiment, the storage medium may store program code for executing the image restoration method.
Optionally, in this embodiment, the storage medium may be located on at least one of a plurality of network devices in a network shown in the above embodiment.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
acquiring a repair request, wherein the repair request is used for requesting to repair a first image;
repairing the first image through a generative adversarial network to obtain a second image reaching a target quality level, wherein the generative adversarial network comprises a target discrimination model, a target classification model and a target generation model, the target discrimination model is used for training a repairing function of the target generation model, and the target classification model is used for determining whether the image generated by the target generation model reaches the target quality level;
returning the second image in response to the repair request.
Optionally, the storage medium is further arranged to store program code for performing the steps of:
training the initial classification model by using the image in the first training sample set to obtain an intermediate classification model;
taking the intermediate classification model as a target classification model under the condition that the identification accuracy of the intermediate classification model on a third image reaches a target threshold value, wherein the third image is an image in a first test set, and the first test set comprises a plurality of labeled images;
and under the condition that the identification accuracy of the intermediate classification model to the third image does not reach the target threshold, continuing to train the intermediate classification model by using the images in the first training sample set until the identification accuracy of the intermediate classification model to the third image reaches the target threshold.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.
Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in whole or in part in the form of a software product stored in a storage medium, including instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or part of the steps of the method described in the embodiments of the present application.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.
Claims (18)
1. A method for restoring an image, comprising:
acquiring a repair request, wherein the repair request is used for requesting to repair a first image;
repairing the first image through a generative adversarial network to obtain a second image reaching a target quality level, wherein the generative adversarial network comprises a target discrimination model, a target classification model and a target generation model, the target discrimination model is used for training a repairing function of the target generation model, and the target classification model is used for determining whether the image generated by the target generation model reaches the target quality level;
returning the second image.
2. The method of claim 1, wherein prior to obtaining the repair request, the method further comprises:
acquiring a first training sample set, wherein the first training sample set comprises a plurality of images with labels, and the labels are used for identifying the quality grade of the images;
and training an initial classification model by using the image in the first training sample set to obtain the target classification model.
3. The method of claim 2, wherein training an initial classification model using the images in the first training sample set to obtain the target classification model comprises:
training the initial classification model by using the image in the first training sample set to obtain an intermediate classification model;
taking the intermediate classification model as a target classification model under the condition that the identification accuracy of the intermediate classification model on a third image reaches a target threshold value, wherein the third image is an image in a first test set, and the first test set comprises a plurality of labeled images;
and under the condition that the identification accuracy of the intermediate classification model to the third image does not reach the target threshold, continuing to train the intermediate classification model by using the images in the first training sample set until the identification accuracy of the intermediate classification model to the third image reaches the target threshold.
4. The method of claim 3, wherein training the initial classification model using the image in the first set of training samples comprises training the initial classification model a plurality of times as follows:
initializing weight parameters in each layer of network in the initial classification model under the condition that the training is the first training, and inputting images in the first training sample set into the initial classification model; or, under the condition that the training is not the first training, directly inputting the images in the first training sample set into the initial classification model;
obtaining the quality grade of the image in the first training sample set identified by the initial classification model;
and adjusting the weight parameters in each layer of network in the initial classification model according to the quality grade identified by the initial classification model and the quality grade of the label identification.
5. The method of claim 2, wherein obtaining a first set of training samples comprises:
acquiring an image generated by the target generation model, and generating a label for the image generated by the target generation model; acquiring an image acquired by image acquisition equipment, and generating a label for the acquired image;
and taking the image generated by the target generation model with the label and the acquired image with the label as the first training sample set.
6. The method according to any of claims 1 to 5, wherein before obtaining the repair request, the method further comprises:
training an initial discrimination model to obtain the trained target discrimination model;
and training an initial generation model by using the target discrimination model to obtain the trained target generation model.
7. The method of claim 6, wherein training an initial generative model using the target discriminative model to obtain the trained target generative model comprises:
acquiring a second training sample set, wherein the second training sample set comprises a plurality of images to be restored;
training the initial generation model by using the image in the second training sample set to obtain an intermediate generation model;
under the condition that the recognition passing rate of the target discrimination model to a fourth image is within a preset range, taking the intermediate generation model as the target generation model, wherein the fourth image is obtained by repairing images in a second test set by the intermediate generation model, and the second test set comprises a plurality of images to be repaired;
and under the condition that the recognition passing rate of the fourth image by the target discrimination model is not in the preset range, continuing to train the intermediate generation model by using the images in the second training sample set until the recognition passing rate of the fourth image by the target discrimination model is in the preset range.
8. The method of claim 7, wherein training the initial generative model using the image in the second set of training samples comprises training the initial generative model a plurality of times as follows:
initializing weight parameters in each layer of network in the initial generation model under the condition that the training is the first training, and inputting images in the second training sample set into the initial generation model; or, under the condition that the training is not the first training, directly inputting the images in the second training sample set into the initial generation model;
acquiring a recognition result of the target discrimination model on a fifth image, wherein the fifth image is obtained by restoring the image in the second training sample set by the initial generation model;
and adjusting the weight parameters in each layer of network in the initial generation model according to the recognition result of the target discrimination model on the fifth image.
9. An apparatus for restoring an image, comprising:
the device comprises a first acquisition unit, a second acquisition unit and a processing unit, wherein the first acquisition unit is used for acquiring a repair request, and the repair request is used for requesting to repair a first image;
the restoration unit is used for restoring the first image through a generative adversarial network to obtain a second image reaching a target quality grade, wherein the generative adversarial network comprises a target discrimination model, a target classification model and a target generation model, the target discrimination model is used for training the restoration function of the target generation model, and the target classification model is used for determining whether the image generated by the target generation model reaches the target quality grade;
a response unit, configured to return the second image in response to the repair request.
10. The apparatus of claim 9, further comprising:
the second obtaining unit is used for obtaining a first training sample set before obtaining the repair request, wherein the first training sample set comprises a plurality of images with labels, and the labels are used for identifying the quality grades of the images;
and the training unit is used for training an initial classification model by using the image in the first training sample set to obtain the target classification model.
11. The apparatus of claim 10, wherein the training unit is further configured to:
training the initial classification model by using the image in the first training sample set to obtain an intermediate classification model;
taking the intermediate classification model as a target classification model under the condition that the identification accuracy of the intermediate classification model on a third image reaches a target threshold value, wherein the third image is an image in a first test set, and the first test set comprises a plurality of labeled images;
and under the condition that the identification accuracy of the intermediate classification model to the third image does not reach the target threshold, continuing to train the intermediate classification model by using the images in the first training sample set until the identification accuracy of the intermediate classification model to the third image reaches the target threshold.
12. The apparatus of claim 11, wherein the training unit is further configured to train the initial classification model a plurality of times as follows:
initializing the weight parameters of each network layer in the initial classification model in a case where the current training is the first training, and inputting the images in the first training sample set into the initial classification model; or, in a case where the current training is not the first training, directly inputting the images in the first training sample set into the initial classification model;
obtaining the quality grades of the images in the first training sample set as identified by the initial classification model;
and adjusting the weight parameters of each network layer in the initial classification model according to the quality grades identified by the initial classification model and the quality grades identified by the labels.
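Claims 11 and 12 together describe a conventional supervised loop with an accuracy gate on a held-out test set. The following Python sketch illustrates that loop under stated assumptions: the claimed neural network is replaced by a toy linear classifier, and every name (init_weights, train_until_threshold, the grade count) is illustrative rather than taken from the patent.

```python
# Illustrative sketch of claims 11-12 (not the patent's implementation): train a
# quality-grade classifier, initializing weights only on the first pass, and stop
# once recognition accuracy on a labeled test set reaches a target threshold.
# The toy linear model and all names here are assumptions for illustration.
import random

def init_weights(num_grades, dim):
    # Claim 12: weight parameters are initialized only for the first training.
    return [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_grades)]

def predict(weights, image):
    # Score each quality grade; return the highest-scoring grade index.
    scores = [sum(w * x for w, x in zip(grade_w, image)) for grade_w in weights]
    return max(range(len(scores)), key=scores.__getitem__)

def train_epoch(weights, samples, lr=0.1):
    # Claim 12: adjust weights from the gap between the model's grade and the label.
    for image, label in samples:
        pred = predict(weights, image)
        if pred != label:
            for i, x in enumerate(image):
                weights[label][i] += lr * x  # pull the labeled grade's score up
                weights[pred][i] -= lr * x   # push the wrong grade's score down
    return weights

def accuracy(weights, test_set):
    return sum(predict(weights, img) == lbl for img, lbl in test_set) / len(test_set)

def train_until_threshold(train_set, test_set, num_grades, dim,
                          threshold=0.9, max_epochs=100):
    weights = None
    for _ in range(max_epochs):
        if weights is None:                  # first training: initialize weights
            weights = init_weights(num_grades, dim)
        weights = train_epoch(weights, train_set)
        if accuracy(weights, test_set) >= threshold:
            return weights                   # claim 11: accuracy gate reached
    return weights                           # otherwise keep the last intermediate model
```

In this toy, "images" are feature vectors and quality grades are integer indices; the gate between claim 11's intermediate and target models is simply the `threshold` check.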
13. The apparatus of claim 10, wherein the second obtaining unit is further configured to:
acquiring images generated by the target generation model and generating labels for the generated images; acquiring images captured by an image acquisition device and generating labels for the captured images;
and taking the labeled generated images and the labeled captured images as the first training sample set.
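Claim 13 assembles the first training sample set from two sources. A minimal sketch, with a hypothetical function name and labeling rule (the patent does not specify either):

```python
# Minimal sketch of claim 13 (the function name and labeling rule are assumptions):
# combine images generated by the target generation model with images captured by
# an acquisition device, attaching a quality-grade label to each.
def build_first_training_set(generated_images, captured_images, label_fn):
    """Return (image, label) pairs drawn from both image sources."""
    samples = [(img, label_fn(img)) for img in generated_images]  # generated + labels
    samples += [(img, label_fn(img)) for img in captured_images]  # captured + labels
    return samples
```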
14. The apparatus according to any one of claims 9 to 13, wherein the training unit is further configured to:
before the repair request is obtained, training an initial discrimination model to obtain the trained target discrimination model;
and training an initial generation model using the target discrimination model to obtain the trained target generation model.
15. The apparatus of claim 14, wherein the training unit is further configured to:
acquiring a second training sample set, wherein the second training sample set comprises a plurality of images to be repaired;
training the initial generation model using the images in the second training sample set to obtain an intermediate generation model;
taking the intermediate generation model as the target generation model in a case where the recognition pass rate of the target discrimination model on a fourth image is within a preset range, wherein the fourth image is obtained by the intermediate generation model repairing images in a second test set, and the second test set comprises a plurality of images to be repaired;
and, in a case where the recognition pass rate of the target discrimination model on the fourth image is not within the preset range, continuing to train the intermediate generation model using the images in the second training sample set until the recognition pass rate on the fourth image is within the preset range.
16. The apparatus of claim 15, wherein the training unit is further configured to train the initial generative model a plurality of times as follows:
initializing the weight parameters of each network layer in the initial generation model in a case where the current training is the first training, and inputting the images in the second training sample set into the initial generation model; or, in a case where the current training is not the first training, directly inputting the images in the second training sample set into the initial generation model;
acquiring a recognition result of the target discrimination model on a fifth image, wherein the fifth image is obtained by the initial generation model repairing the images in the second training sample set;
and adjusting the weight parameters of each network layer in the initial generation model according to the recognition result of the target discrimination model on the fifth image.
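Claims 14–16 describe GAN-style training: the generation model is updated against an already-trained discrimination model until the discriminator's pass rate on repaired test images falls inside a preset range. A hedged Python sketch follows, in which the toy "models" and every name are illustrative assumptions, not the patent's implementation:

```python
# Illustrative sketch of claims 14-16: images are toy brightness values, the
# discriminator accepts values near 1.0, and the generator "repairs" an image by
# adding a learned offset. All names here are assumptions for illustration.
def pass_rate(discriminator, repair, test_images):
    # Fraction of repaired test images the discriminator accepts as real.
    repaired = [repair(img) for img in test_images]
    return sum(discriminator(r) for r in repaired) / len(repaired)

class ToyGeneratorStep:
    # Stand-in for the generation network's per-round training update.
    def __init__(self, lr=0.5):
        self.lr = lr
    def init(self):
        return 0.0                                # claim 16: initial weight (offset)
    def repair(self, state, img):
        return img + state
    def update(self, state, train_images):
        # Nudge the offset so repaired images move toward "real" brightness 1.0.
        mean = sum(self.repair(state, img) for img in train_images) / len(train_images)
        return state + self.lr * (1.0 - mean)

def train_generator(step, discriminator, train_images, test_images,
                    target_range=(0.9, 1.0), max_rounds=1000):
    state = None
    for _ in range(max_rounds):
        if state is None:
            state = step.init()                   # first training: initialize weights
        state = step.update(state, train_images)  # later rounds reuse the weights
        rate = pass_rate(discriminator,
                         lambda img, s=state: step.repair(s, img), test_images)
        lo, hi = target_range
        if lo <= rate <= hi:                      # claim 15: pass-rate gate
            return state
    return state
```

In a real GAN the update step would backpropagate a loss derived from the discriminator's output; here a fixed brightness target stands in for that signal so the round-by-round structure of claims 15–16 stays visible.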
17. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program, when executed, performs the method of any one of claims 1 to 8.
18. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the method of any one of claims 1 to 8 by means of the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011043049.7A CN112132766A (en) | 2020-09-28 | 2020-09-28 | Image restoration method and device, storage medium and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112132766A true CN112132766A (en) | 2020-12-25 |
Family
ID=73844365
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011043049.7A Pending CN112132766A (en) | 2020-09-28 | 2020-09-28 | Image restoration method and device, storage medium and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112132766A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113177892A (en) * | 2021-04-29 | 2021-07-27 | 北京百度网讯科技有限公司 | Method, apparatus, medium, and program product for generating image inpainting model |
CN113379715A (en) * | 2021-06-24 | 2021-09-10 | 南京信息工程大学 | Underwater image enhancement and data set true value image acquisition method |
CN114339306A (en) * | 2021-12-28 | 2022-04-12 | 广州虎牙科技有限公司 | Live video image processing method and device and server |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110263681B (en) | Facial expression recognition method and device, storage medium and electronic device | |
CN108428227B (en) | No-reference image quality evaluation method based on full convolution neural network | |
CN112132766A (en) | Image restoration method and device, storage medium and electronic device | |
CN107123122B (en) | No-reference image quality evaluation method and device | |
CN112346567B (en) | Virtual interaction model generation method and device based on AI (Artificial Intelligence) and computer equipment | |
CN106897372B (en) | Voice query method and device | |
US20140257995A1 (en) | Method, device, and system for playing video advertisement | |
CN109872305B (en) | No-reference stereo image quality evaluation method based on quality map generation network | |
Shao et al. | Toward a blind quality predictor for screen content images | |
Chen et al. | No-reference screen content image quality assessment with unsupervised domain adaptation | |
CN105979253A (en) | Generalized regression neural network based non-reference stereoscopic image quality evaluation method | |
CN110598019B (en) | Repeated image identification method and device | |
CN110956615B (en) | Image quality evaluation model training method and device, electronic equipment and storage medium | |
CN109919252A (en) | The method for generating classifier using a small number of mark images | |
Zhang et al. | An objective quality of experience (QoE) assessment index for retargeted images | |
CN111062426A (en) | Method, device, electronic equipment and medium for establishing training set | |
Cai et al. | Joint depth and density guided single image de-raining | |
CN113139915A (en) | Portrait restoration model training method and device and electronic equipment | |
CN112819689A (en) | Training method of face attribute editing model, face attribute editing method and equipment | |
CN112560718A (en) | Method and device for acquiring material information, storage medium and electronic device | |
Zhang et al. | Sonar image quality evaluation using deep neural network | |
Hepburn et al. | Enforcing perceptual consistency on generative adversarial networks by using the normalised laplacian pyramid distance | |
CN107292331A (en) | Based on unsupervised feature learning without with reference to screen image quality evaluating method | |
CN113327212B (en) | Face driving method, face driving model training device, electronic equipment and storage medium | |
CN105915883A (en) | Blind reference stereo image quality evaluation method based on extreme learning and binocular fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||