CN116402702A - Old photo restoration method and system based on deep neural network - Google Patents

Old photo restoration method and system based on deep neural network

Info

Publication number
CN116402702A
Authority
CN
China
Prior art keywords
network
repair
encoder
old photo
old
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310152602.8A
Other languages
Chinese (zh)
Inventor
曹烈安
张仁贵
林立
崔科辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Aiqi Information Technology Co ltd
Original Assignee
Shanghai Aiqi Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Aiqi Information Technology Co ltd filed Critical Shanghai Aiqi Information Technology Co ltd
Priority to CN202310152602.8A priority Critical patent/CN116402702A/en
Publication of CN116402702A publication Critical patent/CN116402702A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an old photo restoration method and system based on a deep neural network, comprising the following steps: obtaining an old photo to be repaired; constructing and training a coarse repair network model to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo; constructing a fine repair network model and training it with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired photo and thereby obtaining the repaired old photo. By adopting a coarse-to-fine two-stage repair model, the invention divides the complex damage in old photos into structured and unstructured damage: the coarse repair stage focuses on removing unstructured damage, while the fine repair stage mainly resolves structured damage. A progressive reconstruction strategy, proceeding from global to local, gradually yields a clean, complete and richly detailed repair result.

Description

Old photo restoration method and system based on deep neural network
Technical Field
The invention relates to the technical field of image processing, in particular to an old photo restoration method and system based on a deep neural network.
Background
Old photo restoration is widely popular: many people present restored old photos as gifts to elders, relatives and friends, and restoring old photos has become fashionable. At the same time, old photos genuinely record history and carry documentary value, reflecting the social conditions of their time. Old photos also trace the development of photography itself: as photography evolved, the carriers of photographs gradually changed, and with the popularization of photography many people who have their own photographic experience become interested in the history and development of the medium. In addition, the artistic value contained in old photos offers visual enjoyment; they are a visual art in their own right.
Image completion aims to fill the missing parts of a picture with plausible content, and can be used to repair structured damage such as scratches and stains in old photos; however, it cannot repair the unstructured damage in the picture, which may even degrade the completion and lead to poor repair quality. Existing old photo repair techniques also fall short: for example, Bringing Old Photos Back to Life produces repair results that are too blurred and smooth, generating only blurry, severely detail-deficient content in regions that need repair, while Resolution-robust Large Mask Inpainting with Fourier Convolutions handles only structured damage in old photos and cannot deal with unstructured damage.
Patent document CN115311156A discloses a photo restoration method comprising the following steps. Step S1: extract the photo region from the picture. Step S2: extract broken, creased and scratched areas within the photo region. Step S3: repair the broken, creased and scratched areas in the photo region. Step S4: detect the face region within the photo region. Step S5: remove the various degradations in the face region and add facial details to it.
However, patent document CN115311156A cannot handle the unstructured damage in old photos.
In view of the foregoing, there is a need for an old photo repair method and system based on a deep neural network that can repair both structured and unstructured damage.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide an old photo restoration method and system based on a deep neural network.
The invention provides an old photo restoration method based on a deep neural network, comprising the following steps:
step S1: obtaining an old photo to be repaired;
step S2: constructing and training a coarse repair network model to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo;
step S3: constructing a fine repair network model and training it with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired old photo and thereby obtaining the repaired old photo.
Preferably, the coarse repair network model includes variational autoencoders and a mapping network;
the variational autoencoders comprise a first variational autoencoder and a second variational autoencoder; the first variational autoencoder comprises an encoder, a decoder, a discriminator for the reconstruction result and a discriminator for the encoded vector, while the second variational autoencoder comprises an encoder, a decoder and a discriminator for the reconstruction result;
the fine repair network model comprises a multi-scale densely connected edge repair network and a Fourier-convolution-based completion network;
the edge repair network comprises three layers of the same dense residual network, each layer taking the outputs of all previous layers as additional input through skip connections, so that local features are extracted from the input.
Preferably, step S2 includes:
step S2.1: obtaining the encoded vector through the encoder of the first variational autoencoder, while extracting image detail information from the intermediate feature maps through a detail feature extraction module;
step S2.2: converting the encoded vector from the degraded picture domain to the clean picture domain through the mapping network;
step S2.3: obtaining the reconstructed picture through the decoder of the second variational autoencoder;
the detail feature extraction module comprises three layers of gated convolution.
Preferably, the method comprises training the first variational autoencoder, the second variational autoencoder and the mapping network separately and independently;
the first variational autoencoder is trained to map real old photos and degraded pictures synthesized by applying various degradation kernels into the same latent space;
the second variational autoencoder is trained with clean, undamaged pictures as input, learning the latent space of clean pictures while learning to reconstruct the image.
Preferably, step S3 includes:
step S3.1: concatenating the grayscale map of the coarse repair result, the corresponding extracted edge map and the corresponding mask along the channel dimension as the input of the edge repair network, to further repair the coarsely repaired edge map;
step S3.2: extracting structural information from the further repaired edge map through a structural feature extraction module;
step S3.3: the completion network completing the repair of the old photo from the coarsely repaired old photo and the structural information;
the structural feature extraction module comprises an encoder-decoder structure formed by gated convolutions, takes the outputs of the last three upsampling gated convolutions as the extracted structural information, and adds them with three trainable weights to the inputs of the first three layers of the completion network.
Preferably, the losses in the training of the coarse and fine repair network models include a reconstruction loss, an adversarial loss, and a feature matching loss;

the reconstruction loss is calculated as:

$$\mathcal{L}_{rec} = \lambda_1 \left\lVert \hat{I} - I_{gt} \right\rVert_1 + \lambda_2 \left\lVert \mathcal{T}(z_x) - z_y \right\rVert_1$$

where $\mathcal{T}$ denotes the mapping network, $\hat{I}$ is the reconstruction result, $I_{gt}$ is the real clean picture, $z_x$ is the latent code of the degraded picture, $z_y$ is the latent code of the clean picture, and $\lambda_1$, $\lambda_2$ are weights;

the adversarial loss is calculated as:

$$\mathcal{L}_{adv} = \mathbb{E}_{I_{gt}}\!\left[\log D(I_{gt})\right] + \mathbb{E}_{\hat{I}}\!\left[\log\!\left(1 - D(\hat{I})\right)\right]$$

where $D$ denotes the discriminator of the coarse repair model;

the feature matching loss is calculated as:

$$\mathcal{L}_{FM} = \mathbb{E}\!\left[\sum_i \frac{1}{N_i}\left\lVert D^{(i)}(I_{gt}) - D^{(i)}(\hat{I})\right\rVert_1 + \sum_i \frac{1}{M_i}\left\lVert \phi^{(i)}(I_{gt}) - \phi^{(i)}(\hat{I})\right\rVert_1\right]$$

where $D^{(i)}$ and $\phi^{(i)}$ are the feature maps output by the $i$-th layer of the discriminator and of the VGG network respectively, and $N_i$ and $M_i$ are the numbers of activations in those layers.
Preferably, the losses in the training of the fine repair network model further include a gradient constraint and a high-receptive-field perceptual loss:

the gradient constraint is calculated as:

$$\mathcal{L}_{GP} = \lambda_{GP}\,\mathbb{E}_{I_{gt}}\!\left[\left\lVert \nabla D_{\xi}(I_{gt})\right\rVert^2\right]$$

where $\nabla$ is the gradient operator, $D_{\xi}$ denotes the discriminator, $\lambda_{GP}$ is the weight of the gradient constraint loss, and $I_{gt}$ is the real clean picture;

the perceptual loss evaluates the distance between features extracted from the predicted image and from the target image by a base pre-trained network $\phi(\cdot)$, calculated as:

$$\mathcal{L}_{HRF} = \lambda_{HRF}\,\mathbb{E}\!\left[\left\lVert \phi_{HRF}(\hat{I}_f) - \phi_{HRF}(I_{gt})\right\rVert_2^2\right]$$

where $\phi_{HRF}$ denotes a pre-trained ResNet50 network, $\lambda_{HRF}$ is the weight of the perceptual loss, and $\hat{I}_f$ is the reconstruction result of the fine repair stage.
Preferably, the method further comprises a scratch detection step;

a scratch detection network with a U-Net as the backbone is established and first trained using only synthesized images, the losses in the training process including a cross entropy loss and a Focal loss;

the cross entropy loss is calculated as:

$$\mathcal{L}_{CE} = -\frac{1}{HW}\sum_i\left[\alpha\, y_i \log \hat{y}_i + (1 - y_i)\log\!\left(1 - \hat{y}_i\right)\right]$$

where $\mathcal{L}_{CE}$ denotes the cross entropy loss, $s_i$ and $y_i$ denote a picture with scratches and the corresponding mask respectively, $H$ and $W$ denote the height and width of the picture, and $\hat{y}_i$ denotes the mask predicted by the network from $s_i$; the weight $\alpha$ compensates for the imbalance between positive and negative pixel samples and is determined by the ratio of negative to positive pixels in $y_i$:

$$\alpha = \frac{\sum_i (1 - y_i)}{\sum_i y_i}$$

the Focal loss is calculated as:

$$\mathcal{L}_{focal} = -\frac{1}{HW}\sum_i \left(1 - p_i\right)^{\gamma}\log p_i$$

where $\gamma$ denotes an adjustable hyperparameter, $\mathcal{L}_{focal}$ denotes the Focal loss, and $p_i$ is calculated as:

$$p_i = \begin{cases}\hat{y}_i, & y_i = 1\\ 1 - \hat{y}_i, & \text{otherwise}\end{cases}$$
The invention provides an old photo restoration system based on a deep neural network, comprising:
module M1: obtaining an old photo to be repaired;
module M2: constructing and training a coarse repair network model to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo;
module M3: constructing a fine repair network model and training it with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired old photo and thereby obtaining the repaired old photo.
Preferably, the coarse repair network model includes variational autoencoders and a mapping network;
the variational autoencoders comprise a first variational autoencoder and a second variational autoencoder; the first variational autoencoder comprises an encoder, a decoder, a discriminator for the reconstruction result and a discriminator for the encoded vector, while the second variational autoencoder comprises an encoder, a decoder and a discriminator for the reconstruction result;
the fine repair network model comprises a multi-scale densely connected edge repair network and a Fourier-convolution-based completion network;
the edge repair network comprises three layers of the same dense residual network, each layer taking the outputs of all previous layers as additional input through skip connections, so that local features are extracted from the input.
Compared with the prior art, the invention has the following beneficial effects:
1. By adopting a coarse-to-fine two-stage repair model, the invention divides the complex damage in old photos into structured and unstructured damage: the coarse repair stage focuses on removing unstructured damage, and the fine repair stage mainly resolves structured damage. A progressive, global-then-local reconstruction strategy improves the structural integrity, detail richness and cleanliness of the repair result.
2. The invention converts the old photo restoration problem into a mapping problem among three domains: the degraded picture domain, the real old photo domain and the clean picture domain. Using the visual similarity between degraded pictures and real old photos, both are mapped into the same latent space, which enlarges the paired training data and improves the repair quality of old photos.
3. The invention introduces a detail feature extraction module in the coarse repair stage, exploiting the selective nature of gated convolution to extract the overall detail information of the input picture and assist the reconstruction, solving the problem that the output becomes over-smoothed while unstructured noise is removed.
4. The invention introduces an edge repair network and a completion network in the fine repair stage, further filling large damaged areas in the coarse repair result and improving the completeness and structural plausibility of the repair result.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a schematic of the workflow of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
The degradation of old photos is complex, and the lost information cannot be recovered by a simple single-stage network. We therefore divide old photo repair into a coarse repair and a fine repair, removing, from coarse to fine, the unstructured damage in the old photo (such as noise, fading and film grain) and the structured damage (such as scratches, mildew and stains) respectively. The coarse repair stage restores the unstructured damage by learning the mapping between the degraded picture domain and the clean picture domain; the fine repair stage focuses on filling the missing or damaged areas in the old photo through a multi-scale residual densely connected edge repair network and a completion network based on fast Fourier convolution blocks. Joint training of the two stages lets the network automatically find the balance between repairing structured and unstructured damage, as sketched below.
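In the sketch below (PyTorch), the module and argument names are illustrative assumptions rather than terms from the patent:

```python
import torch
import torch.nn as nn

class TwoStageRestorer(nn.Module):
    """Coarse-to-fine wrapper: stage 1 removes unstructured damage,
    stage 2 fills structured damage inside the given mask."""
    def __init__(self, coarse: nn.Module, fine: nn.Module):
        super().__init__()
        self.coarse = coarse  # e.g. VAEs + mapping network (hypothetical module)
        self.fine = fine      # e.g. edge repair + completion network (hypothetical)

    def forward(self, old_photo: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        coarse_result = self.coarse(old_photo)  # global clean-up of the whole image
        return self.fine(coarse_result, mask)   # local completion of damaged regions
```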
Example 1
The old photo restoration method based on a deep neural network provided by the invention, as shown in FIG. 1, comprises the following steps:
Step S1: an old photo to be repaired is obtained.
Step S2: a coarse repair network model is constructed and trained to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo. The coarse repair network model includes variational autoencoders and a mapping network. The variational autoencoders comprise a first variational autoencoder, which includes an encoder, a decoder, a discriminator for the reconstruction result and a discriminator for the encoded vector, and a second variational autoencoder, which includes an encoder, a decoder and a discriminator for the reconstruction result. Step S2 comprises the following steps:
Step S2.1: the encoded vector is obtained through the encoder of the first variational autoencoder, while a detail feature extraction module extracts image detail information from the intermediate feature maps. The detail feature extraction module comprises three layers of gated convolution; gated convolution can autonomously select and pass on the useful information in the input features, as sketched below.
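In the sketch below, the gated convolution follows the commonly used form (a sigmoid gate modulating a feature branch); the channel widths are assumed for illustration:

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """y = act(conv_f(x)) * sigmoid(conv_g(x)): the gate selects, per position
    and channel, how much of the feature branch to pass through."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, stride: int = 1, pad: int = 1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, k, stride, pad)  # content branch
        self.gate = nn.Conv2d(in_ch, out_ch, k, stride, pad)     # selection branch
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.feature(x)) * torch.sigmoid(self.gate(x))

# Detail feature extraction module: three gated convolution layers stacked.
detail_extractor = nn.Sequential(GatedConv2d(64, 64), GatedConv2d(64, 64), GatedConv2d(64, 64))
```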
Step S2.2: the encoded vector is converted from the degraded picture domain to the clean picture domain through the mapping network.
Step S2.3: the reconstructed picture is obtained through the decoder of the second variational autoencoder.
Specifically, the first variational autoencoder, the second variational autoencoder and the mapping network are trained separately and independently. The first variational autoencoder is trained to map real old photos and degraded pictures synthesized by applying various degradation kernels into the same latent space. The second variational autoencoder is trained with clean, undamaged pictures as input, learning the latent space of clean pictures while learning to reconstruct the image. The mapping network is then trained to learn the mapping between the degraded picture domain and the clean picture domain; the gradients in the two variational autoencoders are frozen, and only the mapping network is set to training mode. Real old photos and synthesized degraded photos are both damaged pictures with similar appearance and are aligned in the latent coding space, so the mapping learned from the degraded picture domain to the clean picture domain also applies to real old photos. Moreover, conversion in a low-dimensional latent space is simpler than conversion in the complex image domain. Since the two variational autoencoders and the mapping network are trained separately without interfering with each other, the final result reconstructed from the clean picture domain will conform to the characteristics of clean pictures. This staged training is sketched below.
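In the sketch below, vae1, vae2, mapping, their encode() method and the data loader are placeholder names, not APIs defined by the patent:

```python
import torch
import torch.nn.functional as F

# Freeze both variational autoencoders; only the mapping network trains.
for p in list(vae1.parameters()) + list(vae2.parameters()):
    p.requires_grad = False

optimizer = torch.optim.Adam(mapping.parameters(), lr=2e-4)

for degraded, clean in loader:           # paired synthetic degraded/clean pictures
    z_x = vae1.encode(degraded)          # latent code of the degraded/old-photo domain
    z_y = vae2.encode(clean)             # latent code of the clean-picture domain
    loss = F.l1_loss(mapping(z_x), z_y)  # align the mapped code with the clean code
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```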
Losses in the training of the coarse repair network model include a reconstruction loss, an adversarial loss and a feature matching loss. The loss function used when training the first and second variational autoencoders additionally includes a KL divergence term. Specifically:

a. The reconstruction loss computes the L1 loss between the reconstruction result and the real clean picture, plus the L1 loss, in the latent space, between the feature vector of the degraded picture converted to the clean picture domain through the mapping network and the feature vector of the real clean picture:

$$\mathcal{L}_{rec} = \lambda_1 \left\lVert \hat{I} - I_{gt} \right\rVert_1 + \lambda_2 \left\lVert \mathcal{T}(z_x) - z_y \right\rVert_1$$

where $\mathcal{T}$ denotes the mapping network, $\hat{I}$ is the reconstruction result, $I_{gt}$ is the real clean picture, $z_x$ is the latent code of the degraded picture, $z_y$ is the latent code of the clean picture, and $\lambda_1$, $\lambda_2$ are weights.

b. To generate natural pictures with more realistic details, an adversarial loss is used:

$$\mathcal{L}_{adv} = \mathbb{E}_{I_{gt}}\!\left[\log D(I_{gt})\right] + \mathbb{E}_{\hat{I}}\!\left[\log\!\left(1 - D(\hat{I})\right)\right]$$

where $D$ denotes the discriminator of the coarse repair model.

c. The feature matching loss is used to stabilize GAN training. It computes the L1 distance between the multi-layer activations obtained by feeding the real sample and the synthesized sample into the discriminator and into a pre-trained VGG network:

$$\mathcal{L}_{FM} = \mathbb{E}\!\left[\sum_i \frac{1}{N_i}\left\lVert D^{(i)}(I_{gt}) - D^{(i)}(\hat{I})\right\rVert_1 + \sum_i \frac{1}{M_i}\left\lVert \phi^{(i)}(I_{gt}) - \phi^{(i)}(\hat{I})\right\rVert_1\right]$$

where $D^{(i)}$ and $\phi^{(i)}$ are the feature maps output by the $i$-th layer of the discriminator and of the VGG network respectively, and $N_i$ and $M_i$ are the numbers of activations in those layers.
Step S3: a fine repair network model is constructed and trained with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired photo and thereby obtaining the repaired old photo. The fine repair network model comprises a multi-scale densely connected edge repair network and a Fourier-convolution-based completion network. The edge repair network comprises three layers of the same dense residual network; this structure strengthens feature propagation, encourages feature reuse and reduces the number of parameters. Each layer of the network takes the outputs of all previous layers as additional input through skip connections, so that local features are extracted from the input. At the same time, a single dense residual network extracts image features at only one resolution, limiting the network's receptive field; and because missing edges vary in scale and shape, features of different scales are combined to capture diverse edge characteristics. Specifically, step S3 includes:
Step S3.1: the grayscale map of the coarse repair result, the corresponding extracted edge map and the corresponding mask are concatenated along the channel dimension as the input of the edge repair network, which further repairs the coarsely repaired edge map. The input is additionally downsampled by factors of 4 and 8 and fed into the upper, middle and lower layers of the repair network respectively; computing features at different scales gives the network a larger receptive field. The edge repair network can be trained with a binary cross entropy loss (Binary Cross Entropy Loss, BCE). Assembling the multi-scale inputs is sketched below.
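In the sketch below, gray, edge and mask are assumed to be (B, 1, H, W) tensors:

```python
import torch
import torch.nn.functional as F

x_full = torch.cat([gray, edge, mask], dim=1)  # channel-wise concatenation -> (B, 3, H, W)
x_mid = F.interpolate(x_full, scale_factor=1 / 4, mode='bilinear', align_corners=False)
x_low = F.interpolate(x_full, scale_factor=1 / 8, mode='bilinear', align_corners=False)
# x_full, x_mid and x_low feed the upper, middle and lower branches respectively.
```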
Step S3.2: structural information is extracted from the further repaired edge map through a structural feature extraction module. An edge map is a sparse image in which most of the content consists of meaningless zero values, making it difficult to extract structural features directly. The structural feature extraction module therefore comprises an encoder-decoder structure formed by gated convolutions; it takes the outputs of the last three upsampling gated convolutions as the extracted structural information and adds them, with three trainable weights, to the inputs of the first three layers of the completion network, as sketched below.
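In the sketch below, the additive fusion form is an assumption, since the patent does not spell out the fusion rule:

```python
import torch
import torch.nn as nn

class WeightedStructureInjection(nn.Module):
    """Adds each of the three structural feature maps, scaled by a trainable
    scalar, to the corresponding early-layer input of the completion network."""
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.ones(3))  # one trainable weight per scale

    def forward(self, layer_inputs, structure_feats):
        # both arguments: lists of three tensors with matching shapes
        return [x + self.w[i] * s
                for i, (x, s) in enumerate(zip(layer_inputs, structure_feats))]
```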
Step S3.3: the completion network completes the repair of the old photo from the coarsely repaired old photo and the structural information. The completion network is built by stacking fast Fourier convolution blocks. Specifically, the fast Fourier convolution layer is based on a channel-wise fast Fourier transform (FFT) and has a receptive field covering the entire image. The Fourier convolution layer splits the channels into two parallel branches: a local branch using conventional convolution, and a global branch using a real-signal FFT to account for the global context. Using Fourier convolution therefore lets the generator network take the global context into account from its early layers, which is critical for high-resolution image restoration. The global (spectral) branch is sketched below.
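In the sketch below, a 1x1 convolution applied to the stacked real and imaginary FFT channels gives every output position an image-wide receptive field; channel counts and the activation are illustrative:

```python
import torch
import torch.nn as nn

class SpectralTransform(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.conv = nn.Conv2d(ch * 2, ch * 2, kernel_size=1)  # acts in the frequency domain
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        f = torch.fft.rfft2(x, norm='ortho')        # real-signal FFT over H and W
        f = torch.cat([f.real, f.imag], dim=1)      # (B, 2C, H, W//2 + 1)
        f = self.act(self.conv(f))
        real, imag = f.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm='ortho')
```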
Losses in the training of the fine repair network model include the reconstruction loss, adversarial loss, feature matching loss, gradient constraint and high-receptive-field perceptual loss.
Specifically, the reconstruction loss, adversarial loss and feature matching loss are implemented the same way as in the training of the coarse repair network model. When computing the adversarial loss, the fine repair stage treats only the region covered by the mask as fake samples in the discriminator. The gradient constraint and the high-receptive-field perceptual loss are calculated as follows:
The gradient constraint is calculated as:

$$\mathcal{L}_{GP} = \lambda_{GP}\,\mathbb{E}_{I_{gt}}\!\left[\left\lVert \nabla D_{\xi}(I_{gt})\right\rVert^2\right]$$

where $\nabla$ is the gradient operator, $D_{\xi}$ denotes the discriminator, $\lambda_{GP}$ is the weight of the gradient constraint loss, and $I_{gt}$ is the real clean picture.
The perceptual loss evaluates the distance between features extracted from the predicted image and from the target image by a base pre-trained network $\phi(\cdot)$. A conventional supervised loss requires the generator to reconstruct exactly the same content as the original; however, the visible part of the image usually does not contain enough information to reconstruct the masked part exactly. Using such supervision therefore pushes the model toward generating an average of the many plausible completions, producing blurry results. The perceptual loss, in contrast, does not demand an exact reconstruction and allows the result to vary within a reasonable range of content. Since the focus of large-mask repair is understanding the global structure, a base network whose receptive field grows rapidly is important for computing the perceptual loss. The calculation formula is:

$$\mathcal{L}_{HRF} = \lambda_{HRF}\,\mathbb{E}\!\left[\left\lVert \phi_{HRF}(\hat{I}_f) - \phi_{HRF}(I_{gt})\right\rVert_2^2\right]$$

where $\phi_{HRF}$ denotes a pre-trained ResNet50 network, $\lambda_{HRF}$ is the weight of the perceptual loss, and $\hat{I}_f$ is the reconstruction result of the fine repair stage, as sketched below.
Further, in the coarse repair stage some detail information in the image is inevitably lost during downsampling, domain alignment and mapping, causing the generated result to be too blurred and smooth. To solve this, detail feature extraction modules at three different scales are added to the first variational autoencoder, and the computed detail features are added to the decoder of the second variational autoencoder. Taking the outputs of the first three layers of the first variational autoencoder ensures that not too much picture information is lost, while part of the noise is attenuated during downsampling. Meanwhile, the output of the coarse repair model serves as the input of the fine repair model, and the two are jointly trained end to end. Reconstructing globally first and then locally gradually yields a clean, structurally complete and richly detailed repair result.
The old photo restoration method based on a deep neural network provided by the invention further comprises a scratch detection step:
a scratch detection network with a U-Net as the backbone is established and first trained using only synthesized images; the losses in the training process include a cross entropy loss and a Focal loss.
The cross entropy loss is calculated as:

$$\mathcal{L}_{CE} = -\frac{1}{HW}\sum_i\left[\alpha\, y_i \log \hat{y}_i + (1 - y_i)\log\!\left(1 - \hat{y}_i\right)\right]$$

where $\mathcal{L}_{CE}$ denotes the cross entropy loss, $s_i$ and $y_i$ denote a picture with scratches and the corresponding mask respectively, $H$ and $W$ denote the height and width of the picture, and $\hat{y}_i$ denotes the mask predicted by the network from $s_i$. Since the scratched area is typically a small part of the whole image, the weight $\alpha$ is used to compensate for the imbalance between positive and negative pixel samples; its value is determined by the ratio of negative to positive pixels in $y_i$:

$$\alpha = \frac{\sum_i (1 - y_i)}{\sum_i y_i}$$

The Focal loss is calculated as:

$$\mathcal{L}_{focal} = -\frac{1}{HW}\sum_i \left(1 - p_i\right)^{\gamma}\log p_i$$

where $\gamma$ denotes an adjustable hyperparameter, $\mathcal{L}_{focal}$ denotes the Focal loss, and $p_i$ is calculated as:

$$p_i = \begin{cases}\hat{y}_i, & y_i = 1\\ 1 - \hat{y}_i, & \text{otherwise}\end{cases}$$

Both losses are sketched below.
To further improve detection performance on real old photos, collected old photos with scratches are manually annotated and the detection network is fine-tuned on them. A user interaction interface is also provided: the user can run automatic scratch detection or manually paint the region to repair, and the input old photo is repaired according to the resulting mask, adapting to different repair and application requirements.
Example two
The invention also provides an old photo restoration system based on a deep neural network. A person skilled in the art can implement the system by executing the steps of the old photo restoration method based on a deep neural network; that is, the method can be understood as a preferred embodiment of the system.
The old photo restoration system based on a deep neural network provided by the invention comprises:
Module M1: an old photo to be repaired is obtained.
Module M2: a coarse repair network model is constructed and trained to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo. The coarse repair network model includes variational autoencoders and a mapping network. The variational autoencoders comprise a first variational autoencoder, which includes an encoder, a decoder, a discriminator for the reconstruction result and a discriminator for the encoded vector, and a second variational autoencoder, which includes an encoder, a decoder and a discriminator for the reconstruction result. Specifically, the first variational autoencoder, the second variational autoencoder and the mapping network are trained separately and independently. The first variational autoencoder is trained to map real old photos and degraded pictures synthesized by applying various degradation kernels into the same latent space; the second variational autoencoder is trained with clean, undamaged pictures as input, learning the latent space of clean pictures while learning to reconstruct the image. The module further comprises the following sub-modules:
Module M2.1: the encoded vector is obtained through the encoder of the first variational autoencoder, while a detail feature extraction module, comprising three layers of gated convolution, extracts image detail information from the intermediate feature maps. Module M2.2: the encoded vector is converted from the degraded picture domain to the clean picture domain through the mapping network. Module M2.3: the reconstructed picture is obtained through the decoder of the second variational autoencoder.
Module M3: a fine repair network model is constructed and trained with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired old photo and thereby obtaining the repaired old photo. The fine repair network model comprises a multi-scale densely connected edge repair network and a Fourier-convolution-based completion network. The edge repair network comprises three layers of the same dense residual network, each layer taking the outputs of all previous layers as additional input through skip connections, so that local features are extracted from the input. Specifically, the module comprises the following sub-modules:
Module M3.1: the grayscale map of the coarse repair result, the corresponding extracted edge map and the corresponding mask are concatenated along the channel dimension as the input of the edge repair network, which further repairs the coarsely repaired edge map.
Module M3.2: structural information is extracted from the further repaired edge map through a structural feature extraction module. The structural feature extraction module comprises an encoder-decoder structure formed by gated convolutions, takes the outputs of the last three upsampling gated convolutions as the extracted structural information, and adds them with three trainable weights to the inputs of the first three layers of the completion network.
Module M3.3: the completion network completes the repair of the old photo from the coarsely repaired old photo and the structural information.
Module M4: a scratch detection network with a U-Net as the backbone is established and first trained using only synthesized images; the losses in the training process include a cross entropy loss and a Focal loss.
The cross entropy loss is calculated as:

$$\mathcal{L}_{CE} = -\frac{1}{HW}\sum_i\left[\alpha\, y_i \log \hat{y}_i + (1 - y_i)\log\!\left(1 - \hat{y}_i\right)\right]$$

where $\mathcal{L}_{CE}$ denotes the cross entropy loss, $s_i$ and $y_i$ denote a picture with scratches and the corresponding mask respectively, $H$ and $W$ denote the height and width of the picture, and $\hat{y}_i$ denotes the mask predicted by the network from $s_i$; the weight $\alpha$ compensates for the imbalance between positive and negative pixel samples and is determined by the ratio of negative to positive pixels in $y_i$:

$$\alpha = \frac{\sum_i (1 - y_i)}{\sum_i y_i}$$

The Focal loss is calculated as:

$$\mathcal{L}_{focal} = -\frac{1}{HW}\sum_i \left(1 - p_i\right)^{\gamma}\log p_i$$

where $\gamma$ denotes an adjustable hyperparameter, $\mathcal{L}_{focal}$ denotes the Focal loss, and $p_i$ is calculated as:

$$p_i = \begin{cases}\hat{y}_i, & y_i = 1\\ 1 - \hat{y}_i, & \text{otherwise}\end{cases}$$

Further, the losses in the training of the coarse and fine repair network models include a reconstruction loss, an adversarial loss and a feature matching loss.
The reconstruction loss is calculated as:

$$\mathcal{L}_{rec} = \lambda_1 \left\lVert \hat{I} - I_{gt} \right\rVert_1 + \lambda_2 \left\lVert \mathcal{T}(z_x) - z_y \right\rVert_1$$

where $\mathcal{T}$ denotes the mapping network, $\hat{I}$ is the reconstruction result, $I_{gt}$ is the real clean picture, $z_x$ is the latent code of the degraded picture, $z_y$ is the latent code of the clean picture, and $\lambda_1$, $\lambda_2$ are weights.

The adversarial loss is calculated as:

$$\mathcal{L}_{adv} = \mathbb{E}_{I_{gt}}\!\left[\log D(I_{gt})\right] + \mathbb{E}_{\hat{I}}\!\left[\log\!\left(1 - D(\hat{I})\right)\right]$$

where $D$ denotes the discriminator of the coarse repair model.

The feature matching loss is calculated as:

$$\mathcal{L}_{FM} = \mathbb{E}\!\left[\sum_i \frac{1}{N_i}\left\lVert D^{(i)}(I_{gt}) - D^{(i)}(\hat{I})\right\rVert_1 + \sum_i \frac{1}{M_i}\left\lVert \phi^{(i)}(I_{gt}) - \phi^{(i)}(\hat{I})\right\rVert_1\right]$$

where $D^{(i)}$ and $\phi^{(i)}$ are the feature maps output by the $i$-th layer of the discriminator and of the VGG network respectively, and $N_i$ and $M_i$ are the numbers of activations in those layers.
In addition, the losses in the training of the fine repair network model include a gradient constraint and a high-receptive-field perceptual loss. The gradient constraint is calculated as:

$$\mathcal{L}_{GP} = \lambda_{GP}\,\mathbb{E}_{I_{gt}}\!\left[\left\lVert \nabla D_{\xi}(I_{gt})\right\rVert^2\right]$$

where $\nabla$ is the gradient operator, $D_{\xi}$ denotes the discriminator, $\lambda_{GP}$ is the weight of the gradient constraint loss, and $I_{gt}$ is the real clean picture.

The perceptual loss evaluates the distance between features extracted from the predicted image and from the target image by a base pre-trained network $\phi(\cdot)$:

$$\mathcal{L}_{HRF} = \lambda_{HRF}\,\mathbb{E}\!\left[\left\lVert \phi_{HRF}(\hat{I}_f) - \phi_{HRF}(I_{gt})\right\rVert_2^2\right]$$

where $\phi_{HRF}$ denotes a pre-trained ResNet50 network, $\lambda_{HRF}$ is the weight of the perceptual loss, and $\hat{I}_f$ is the reconstruction result of the fine repair stage.
Those skilled in the art will appreciate that, in addition to being implemented as pure computer-readable program code, the system, apparatus and their modules provided by the present invention can be implemented entirely by logically programming the method steps, for example as logic gates, switches, application-specific integrated circuits, programmable logic controllers or embedded microcontrollers. Therefore, the system, apparatus and their modules may be regarded as a hardware component, and the modules included therein for implementing various programs may be regarded as structures within that hardware component; modules for implementing various functions may equally be regarded as software programs implementing the method or as structures within the hardware component.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the invention. The embodiments of the present application and features in the embodiments may be combined with each other arbitrarily without conflict.

Claims (10)

1. An old photo restoration method based on a deep neural network, characterized by comprising the following steps:
step S1: obtaining an old photo to be repaired;
step S2: constructing and training a coarse repair network model to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo;
step S3: constructing a fine repair network model and training it with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired old photo and thereby obtaining the repaired old photo.
2. The old photo restoration method based on a deep neural network according to claim 1, wherein the coarse repair network model includes variational autoencoders and a mapping network;
the variational autoencoders comprise a first variational autoencoder and a second variational autoencoder; the first variational autoencoder comprises an encoder, a decoder, a discriminator for the reconstruction result and a discriminator for the encoded vector, while the second variational autoencoder comprises an encoder, a decoder and a discriminator for the reconstruction result;
the fine repair network model comprises a multi-scale densely connected edge repair network and a Fourier-convolution-based completion network;
the edge repair network comprises three layers of the same dense residual network, each layer taking the outputs of all previous layers as additional input through skip connections, so that local features are extracted from the input.
3. The old photo restoration method based on a deep neural network according to claim 2, wherein step S2 includes:
step S2.1: obtaining the encoded vector through the encoder of the first variational autoencoder, while extracting image detail information from the intermediate feature maps through a detail feature extraction module;
step S2.2: converting the encoded vector from the degraded picture domain to the clean picture domain through the mapping network;
step S2.3: obtaining the reconstructed picture through the decoder of the second variational autoencoder;
the detail feature extraction module comprises three layers of gated convolution.
4. The old photo restoration method based on a deep neural network according to claim 3, comprising training the first variational autoencoder, the second variational autoencoder and the mapping network separately and independently;
the first variational autoencoder is trained to map real old photos and degraded pictures synthesized by applying various degradation kernels into the same latent space;
the second variational autoencoder is trained with clean, undamaged pictures as input, learning the latent space of clean pictures while learning to reconstruct the image.
5. The old photo restoration method based on a deep neural network according to claim 2, wherein step S3 includes:
step S3.1: concatenating the grayscale map of the coarse repair result, the corresponding extracted edge map and the corresponding mask along the channel dimension as the input of the edge repair network, to further repair the coarsely repaired edge map;
step S3.2: extracting structural information from the further repaired edge map through a structural feature extraction module;
step S3.3: the completion network completing the repair of the old photo from the coarsely repaired old photo and the structural information;
the structural feature extraction module comprises an encoder-decoder structure formed by gated convolutions, takes the outputs of the last three upsampling gated convolutions as the extracted structural information, and adds them with three trainable weights to the inputs of the first three layers of the completion network.
6. The old photo restoration method based on a deep neural network according to any one of claims 1 to 5, wherein the losses in the training of the coarse and fine repair network models include a reconstruction loss, an adversarial loss and a feature matching loss;
the reconstruction loss is calculated as:

$$\mathcal{L}_{rec} = \lambda_1 \left\lVert \hat{I} - I_{gt} \right\rVert_1 + \lambda_2 \left\lVert \mathcal{T}(z_x) - z_y \right\rVert_1$$

where $\mathcal{T}$ denotes the mapping network, $\hat{I}$ is the reconstruction result of the coarse repair stage, $I_{gt}$ is the real clean picture, $z_x$ is the latent code of the degraded picture, $z_y$ is the latent code of the clean picture, and $\lambda_1$, $\lambda_2$ are weights;
the adversarial loss is calculated as:

$$\mathcal{L}_{adv} = \mathbb{E}_{I_{gt}}\!\left[\log D(I_{gt})\right] + \mathbb{E}_{\hat{I}}\!\left[\log\!\left(1 - D(\hat{I})\right)\right]$$

where $D$ denotes the discriminator of the coarse repair model;
the feature matching loss is calculated as:

$$\mathcal{L}_{FM} = \mathbb{E}\!\left[\sum_i \frac{1}{N_i}\left\lVert D^{(i)}(I_{gt}) - D^{(i)}(\hat{I})\right\rVert_1 + \sum_i \frac{1}{M_i}\left\lVert \phi^{(i)}(I_{gt}) - \phi^{(i)}(\hat{I})\right\rVert_1\right]$$

where $D^{(i)}$ and $\phi^{(i)}$ are the feature maps output by the $i$-th layer of the discriminator and of the VGG network respectively, and $N_i$ and $M_i$ are the numbers of activations in those layers.
7. The old photo restoration method based on a deep neural network according to claim 5, wherein the losses in the training of the fine repair network model further include a gradient constraint and a high-receptive-field perceptual loss:
the gradient constraint is calculated as:

$$\mathcal{L}_{GP} = \lambda_{GP}\,\mathbb{E}_{I_{gt}}\!\left[\left\lVert \nabla D_{\xi}(I_{gt})\right\rVert^2\right]$$

where $\nabla$ is the gradient operator, $D_{\xi}$ denotes the discriminator, $\lambda_{GP}$ is the weight of the gradient constraint loss, and $I_{gt}$ is the real clean picture;
the perceptual loss evaluates the distance between features extracted from the predicted image and from the target image by a base pre-trained network $\phi(\cdot)$, calculated as:

$$\mathcal{L}_{HRF} = \lambda_{HRF}\,\mathbb{E}\!\left[\left\lVert \phi_{HRF}(\hat{I}_f) - \phi_{HRF}(I_{gt})\right\rVert_2^2\right]$$

where $\phi_{HRF}$ denotes a pre-trained ResNet50 network, $\lambda_{HRF}$ is the weight of the perceptual loss, and $\hat{I}_f$ is the reconstruction result of the fine repair stage.
8. The old photo restoration method based on a deep neural network according to claim 1, further comprising a scratch detection step;
a scratch detection network with a U-Net as the backbone is established and first trained using only synthesized images, the losses in the training process including a cross entropy loss and a Focal loss;
the cross entropy loss is calculated as:

$$\mathcal{L}_{CE} = -\frac{1}{HW}\sum_i\left[\alpha\, y_i \log \hat{y}_i + (1 - y_i)\log\!\left(1 - \hat{y}_i\right)\right]$$

where $\mathcal{L}_{CE}$ denotes the cross entropy loss, $s_i$ and $y_i$ denote a picture with scratches and the corresponding mask respectively, $H$ and $W$ denote the height and width of the picture, and $\hat{y}_i$ denotes the mask predicted by the network from $s_i$; the weight $\alpha$ compensates for the imbalance between positive and negative pixel samples and is determined by the ratio of negative to positive pixels in $y_i$:

$$\alpha = \frac{\sum_i (1 - y_i)}{\sum_i y_i}$$

the Focal loss is calculated as:

$$\mathcal{L}_{focal} = -\frac{1}{HW}\sum_i \left(1 - p_i\right)^{\gamma}\log p_i$$

where $\gamma$ denotes an adjustable hyperparameter, $\mathcal{L}_{focal}$ denotes the Focal loss, and $p_i$ is calculated as:

$$p_i = \begin{cases}\hat{y}_i, & y_i = 1\\ 1 - \hat{y}_i, & \text{otherwise}\end{cases}$$
9. An old photo restoration system based on a deep neural network, comprising:
module M1: obtaining an old photo to be repaired;
module M2: constructing and training a coarse repair network model to eliminate the unstructured damage in the old photo, obtaining a coarsely repaired old photo;
module M3: constructing a fine repair network model and training it with the coarsely repaired old photo as input, eliminating the structured damage in the coarsely repaired old photo and thereby obtaining the repaired old photo.
10. The old photo restoration system based on a deep neural network according to claim 9, wherein the coarse repair network model includes variational autoencoders and a mapping network;
the variational autoencoders comprise a first variational autoencoder and a second variational autoencoder; the first variational autoencoder comprises an encoder, a decoder, a discriminator for the reconstruction result and a discriminator for the encoded vector, while the second variational autoencoder comprises an encoder, a decoder and a discriminator for the reconstruction result;
the fine repair network model comprises a multi-scale densely connected edge repair network and a Fourier-convolution-based completion network;
the edge repair network comprises three layers of the same dense residual network, each layer taking the outputs of all previous layers as additional input through skip connections, so that local features are extracted from the input.
CN202310152602.8A 2023-02-21 2023-02-21 Old photo restoration method and system based on deep neural network Pending CN116402702A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310152602.8A CN116402702A (en) 2023-02-21 2023-02-21 Old photo restoration method and system based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310152602.8A CN116402702A (en) 2023-02-21 2023-02-21 Old photo restoration method and system based on deep neural network

Publications (1)

Publication Number Publication Date
CN116402702A 2023-07-07

Family

ID=87013081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310152602.8A Pending CN116402702A (en) 2023-02-21 2023-02-21 Old photo restoration method and system based on deep neural network

Country Status (1)

Country Link
CN (1) CN116402702A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117197472A (en) * 2023-11-07 2023-12-08 四川农业大学 Efficient teacher and student semi-supervised segmentation method and device based on endoscopic images of epistaxis
CN117197472B (en) * 2023-11-07 2024-03-08 四川农业大学 Efficient teacher and student semi-supervised segmentation method and device based on endoscopic images of epistaxis

Similar Documents

Publication Publication Date Title
Lim et al. DSLR: Deep stacked Laplacian restorer for low-light image enhancement
CN111292264B (en) Image high dynamic range reconstruction method based on deep learning
CN108460746B (en) Image restoration method based on structure and texture layered prediction
CN111539887B (en) Channel attention mechanism and layered learning neural network image defogging method based on mixed convolution
CN113240613B (en) Image restoration method based on edge information reconstruction
CN108230278B (en) Image raindrop removing method based on generation countermeasure network
CN114140353A (en) Swin-Transformer image denoising method and system based on channel attention
CN113658051A (en) Image defogging method and system based on cyclic generation countermeasure network
CN109087258A (en) A kind of image rain removing method and device based on deep learning
CN113989129A (en) Image restoration method based on gating and context attention mechanism
CN113269224B (en) Scene image classification method, system and storage medium
CN114936605A (en) Knowledge distillation-based neural network training method, device and storage medium
CN114627006B (en) Progressive image restoration method based on depth decoupling network
CN112381716B (en) Image enhancement method based on generation type countermeasure network
CN114332070B (en) Meteorite detection method based on intelligent learning network model compression
CN114387365A (en) Line draft coloring method and device
CN116402702A (en) Old photo restoration method and system based on deep neural network
CN116343015A (en) Medical food water content measurement system based on artificial intelligence
CN113936318A (en) Human face image restoration method based on GAN human face prior information prediction and fusion
CN116051408A (en) Image depth denoising method based on residual error self-coding
Wang et al. Multiscale supervision-guided context aggregation network for single image dehazing
CN116757986A (en) Infrared and visible light image fusion method and device
CN115546273A (en) Scene structure depth estimation method for indoor fisheye image
CN112686822B (en) Image completion method based on stack generation countermeasure network
CN115035402B (en) Multistage feature aggregation system and method for land cover classification problem

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination