CN116739950A - Image restoration method and device, terminal equipment and storage medium - Google Patents


Info

Publication number
CN116739950A
CN116739950A
Authority
CN
China
Prior art keywords
image
repaired
quantized
sample
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310488070.5A
Other languages
Chinese (zh)
Inventor
滕建新
刘勤山
何杰锋
袁锦春
黄研
彭育新
王员根
李子轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Broadcasting Network
Guangzhou University
Original Assignee
Guangzhou Broadcasting Network
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Broadcasting Network and Guangzhou University
Priority to CN202310488070.5A
Publication of CN116739950A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 — Image enhancement or restoration
    • G06T 5/10 — Image enhancement or restoration using non-spatial domain filtering
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks
    • G06N 3/0455 — Auto-encoder networks; Encoder-decoder networks
    • G06N 3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N 3/048 — Activation functions
    • G06N 3/0495 — Quantised networks; Sparse networks; Compressed networks
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 — Special algorithmic details
    • G06T 2207/20081 — Training; Learning
    • G06T 2207/20084 — Artificial neural networks [ANN]


Abstract

The invention discloses an image restoration method, an image restoration device, terminal equipment and a storage medium, wherein the method comprises the following steps: inputting an image to be repaired into an image repair model, so that a self-encoder module in the image repair model extracts feature vectors of the image to be repaired and generates a target feature map; a non-quantized Transformer module in the image restoration model predicts latent vectors for the missing region of the target feature map, and after a first quantized feature vector corresponding to the feature vectors of the known region and a second quantized feature vector corresponding to the latent vectors are obtained, a decoder in the self-encoder module reconstructs the missing region of the image to be repaired according to the first and second quantized feature vectors to generate a repaired image. The decoder of the invention introduces an FFC residual block, can more accurately capture global information and missing detail information, and improves the quality of image restoration.

Description

Image restoration method and device, terminal equipment and storage medium
Technical Field
The present invention relates to the field of image/video restoration, and in particular, to an image restoration method, an image restoration device, a terminal device, and a storage medium.
Background
Image restoration is the task of predicting and restoring missing regions in an image by using the information available in the image (such as texture, structure and other valid data), so as to produce a high-quality and visually plausible result.
In recent years, optimizing a Transformer with a convolutional neural network (CNN), also called a hybrid model, has made great progress in the field of image restoration. However, when processing an image to be repaired with a large-scale missing area, the prior art chooses traditional convolution for the convolution module of the image restoration model, and traditional convolution suffers from a limited receptive field: it can only perform linear operations in the spatial domain, and each convolution operation can only process a small local receptive field, especially when handling high-resolution images or complex scenes. As a result, the image restoration model has difficulty capturing global information and grasping the missing detail information, which reduces the quality of image restoration.
Disclosure of Invention
The embodiments of the invention provide an image restoration method, an image restoration device, terminal equipment and a storage medium, which can effectively solve the prior-art problems that traditional convolution can only perform linear operations in the spatial domain and each convolution operation can only process a small local receptive field, so that the image restoration model has difficulty capturing global information and missing detail information, reducing the quality of image restoration.
An embodiment of the present invention provides an image restoration method, including:
acquiring an image to be repaired; wherein the image to be repaired comprises a known region and a missing region;
inputting the image to be repaired into an image repair model so that the image repair model reconstructs a missing area in the image to be repaired to generate a repaired image;
wherein the image restoration model comprises a self-encoder module and a non-quantized Transformer module;
the encoder in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the non-quantized Transformer module is used for predicting latent vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map;
the vector quantization double codebook module in the self-encoder module is used for quantizing the feature vectors of the known region and quantizing the latent vectors to obtain a first quantized feature vector corresponding to the feature vectors of the known region and a second quantized feature vector corresponding to the latent vectors;
and the decoder fused with the FFC residual block in the self-encoder module is used for reconstructing a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantized double codebook module to generate a repaired image corresponding to the image to be repaired.
Preferably, before inputting the image to be repaired into the image repair model, the method further comprises:
dividing all areas of the image to be repaired into a plurality of patch areas.
The encoder in the self-encoder module is configured to extract a feature vector of the image to be repaired, and combine a plurality of feature vectors to generate a target feature map, and specifically includes:
and the encoder in the self-encoder module is used for processing the patch areas in a non-overlapping block mode through a plurality of linear layers to obtain a feature vector corresponding to each patch area, and combining the feature vectors to generate a target feature map.
Preferably, the self-encoder module further comprises: vector quantization double codebook module;
after obtaining the first quantized feature vector corresponding to the feature vectors of the known region and the second quantized feature vector corresponding to the latent vectors, the method further comprises:
and the vector quantization double codebook module marks the first quantized feature vector corresponding to the known region and the second quantized feature vector corresponding to the missing region in the image to be repaired, respectively.
Preferably, the reconstructing the missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantization dual codebook module to generate a repaired image corresponding to the image to be repaired specifically includes:
The FFC residual block in the decoder extracts local features from all quantized feature vectors using traditional convolution in its local branch, and extracts global features from all quantized feature vectors in the spectral domain of the global context in its global branch;
fusing the local features and the global features to obtain fused features;
and the decoder rebuilds the missing area in the image to be restored according to the fusion characteristics to generate a restored image corresponding to the image to be restored.
Preferably, the training process of the image restoration model includes:
the training operation of the following image restoration model is repeatedly executed until the image restoration model is judged to be converged:
acquiring a sample to-be-repaired image and a sample repaired image corresponding to the sample to-be-repaired image;
inputting the sample image to be repaired into an image repair model, so that an encoder in a self-encoder module of the image repair model extracts feature vectors of the sample image to be repaired, and combining a plurality of feature vectors to generate a sample feature map;
a non-quantized Transformer module in the image restoration model predicts sample latent vectors of the missing region in the sample feature map according to the feature vectors corresponding to the missing region in the sample feature map;
the vector quantization double codebook module in the self-encoder module of the image restoration model quantizes the feature vectors of the known region in the sample feature map and quantizes the sample latent vectors, to obtain sample first quantized feature vectors corresponding to the feature vectors of the known region in the sample feature map and sample second quantized feature vectors corresponding to the sample latent vectors;
reconstructing a missing region in a sample to-be-repaired image by a decoder fused with an FFC residual block according to a sample first quantization feature vector and a sample second quantization feature vector stored in a vector quantization double codebook module, and generating a predicted repaired image corresponding to the sample to-be-repaired image;
and comparing the predicted repaired image with the sample repaired image, and, when the image repair model is judged not to have converged according to the comparison result, acquiring an updated sample to-be-repaired image and its corresponding sample repaired image.
Preferably, when training the image restoration model, the method further comprises:
updating the self-encoder module in the image restoration model according to the reconstruction loss function; the reconstruction loss function consists of a pixel loss function, a gradient loss function, an adversarial loss function, a perceptual loss function and a style loss function;
The pixel loss function is calculated according to the following formula:

L_pix = mean(|I_out ⊖ I_gt|)

wherein L_pix is the pixel loss function, I_out is the repaired image predicted from the sample to-be-repaired image, I_gt is the sample repaired image corresponding to the sample to-be-repaired image, ⊖ represents element-wise subtraction, and mean(·) represents the mean value operation;
the gradient loss function is calculated according to the following formula:

L_grad = mean(|grad(I_out) ⊖ grad(I_gt)|)

wherein L_grad is the gradient loss function and grad[·] represents the function that computes the image gradient;
the adversarial loss function is calculated according to the following formula:

L_adv = −mean(D_adv(I_out))

wherein L_adv is the adversarial loss function and D_adv(·) is the function corresponding to the discriminator network;
the perceptual loss function is calculated according to the following formula:

L_per = mean(|φ(I_out) ⊖ φ(I_gt)|)

wherein L_per is the perceptual loss function, and φ(·) denotes feature maps extracted by a pretrained feature network;
the style loss function is calculated according to the following formula:

L_style = mean(|G(φ(I_out)) ⊖ G(φ(I_gt))|)

wherein L_style is the style loss function and G(·) is the Gram matrix used to acquire the parameters;
the reconstruction loss function is calculated according to the following formula:

L_rec = L_pix + λ_g·L_grad + λ_a·L_adv + λ_p·L_per + λ_s·L_style

wherein L_rec is the reconstruction loss function, λ_g is a first preset parameter, λ_a is a second preset parameter, λ_p is a third preset parameter, and λ_s is a fourth preset parameter.
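As an informal illustration of how the individual terms combine, a NumPy sketch of the pixel and gradient terms follows; the adversarial, perceptual and style terms are omitted because they require trained networks, and the weight name `lam_g` (mirroring the first preset parameter) is an assumption:

```python
import numpy as np

def pixel_loss(pred, target):
    # Mean absolute element-wise difference between the predicted repaired
    # image and the sample repaired (ground-truth) image.
    return np.mean(np.abs(pred - target))

def gradient_loss(pred, target):
    # Compare image gradients along both spatial axes.
    gy_p, gx_p = np.gradient(pred)
    gy_t, gx_t = np.gradient(target)
    return np.mean(np.abs(gy_p - gy_t)) + np.mean(np.abs(gx_p - gx_t))

def reconstruction_loss(pred, target, lam_g=1.0):
    # Weighted sum of the individual terms (remaining terms omitted here).
    return pixel_loss(pred, target) + lam_g * gradient_loss(pred, target)

pred = np.zeros((4, 4))
target = np.ones((4, 4))
loss = reconstruction_loss(pred, target)  # constant images: only the pixel term contributes
```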
Preferably, when training the image restoration model, the method further comprises:
updating the non-quantized Transformer module in the image restoration model according to the cross entropy loss function;
wherein the cross entropy loss function is expressed as follows:

L_trans = E[ −Σ_{(i,j)} (1 − m_{i,j}) · log p_{i,j} ]

wherein L_trans is the cross entropy loss function, p_{i,j} is the distribution probability, predicted by the non-quantized Transformer module, of the latent vector of the missing region at position (i, j) in the target feature map given the encoder output E(·) (E being the encoder function), and m is a binary mask: m_{i,j} = 0 indicates that the pixel at (i, j) is missing, while m_{i,j} = 1 indicates that the pixel at (i, j) is valid.
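A NumPy sketch of this masked cross entropy, counting only token positions where m = 0 (missing) per the mask convention above; the shapes and variable names are illustrative assumptions:

```python
import numpy as np

def masked_cross_entropy(logits, targets, mask):
    # Log-softmax over the codebook (last) dimension, computed stably.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    n = logits.shape[0]
    nll = -log_probs[np.arange(n), targets]  # per-position negative log-likelihood
    return nll[mask == 0].mean()             # average over missing positions only

logits = np.array([[9.0, 0.0],   # position 0: confident and correct
                   [0.0, 9.0]])  # position 1: known region, ignored by the mask
targets = np.array([0, 1])
mask = np.array([0, 1])          # 0 = missing, 1 = valid
loss = masked_cross_entropy(logits, targets, mask)
```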
On the basis of the method embodiment, the invention correspondingly provides the device item embodiment.
An embodiment of the present invention provides an image restoration apparatus including: the system comprises an image acquisition module to be repaired and a repaired image generation module;
the image acquisition module to be repaired is used for acquiring an image to be repaired; wherein the image to be repaired comprises a known region and a missing region;
the repaired image generation module is used for inputting the image to be repaired into an image repair model so that the image repair model reconstructs a missing area in the image to be repaired to generate a repaired image;
wherein the image restoration model comprises a self-encoder module and a non-quantized Transformer module;
the encoder in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the non-quantized Transformer module is used for predicting latent vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map;
the vector quantization double codebook module in the self-encoder module is used for quantizing the feature vectors of the known region and quantizing the latent vectors to obtain a first quantized feature vector corresponding to the feature vectors of the known region and a second quantized feature vector corresponding to the latent vectors;
and the decoder fused with the FFC residual block in the self-encoder module is used for reconstructing a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantized double codebook module to generate a repaired image corresponding to the image to be repaired.
Based on the method embodiment, the invention correspondingly provides the terminal equipment item embodiment.
Another embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement an image restoration method according to the embodiment of the present invention.
Based on the method embodiments described above, the present invention correspondingly provides storage medium item embodiments.
Another embodiment of the present invention provides a storage medium, wherein the computer readable storage medium comprises a stored computer program, and when the computer program runs, a device on which the computer readable storage medium is located is controlled to execute the image restoration method according to the embodiment of the present invention.
The invention has the following beneficial effects:
the embodiments of the invention provide an image restoration method, an image restoration device, terminal equipment and a storage medium. An image to be repaired is input into an image repair model, so that the encoder in the self-encoder module of the model extracts feature vectors of the image to be repaired and combines them into a target feature map. The non-quantized Transformer module in the model then predicts latent vectors of the missing region from the feature vectors corresponding to that region, and after the first quantized feature vector (for the known region) and the second quantized feature vector (for the latent vectors) are obtained, the decoder in the self-encoder module reconstructs the missing region of the image to be repaired from the two quantized feature vectors. The decoder not only restores the content of the missing region from the latent vectors predicted by the non-quantized Transformer module, but also keeps the content of the known region unchanged. When reconstructing the missing region from the quantized feature vectors, the decoder introduces an FFC residual block, which applies a Fourier transform to the feature map in its global branch and performs updates in a spectral domain that affects the global context. Compared with traditional convolution, which performs only linear operations in the spatial domain and processes a small local receptive field per operation, especially for high-resolution images or complex scenes, this lets the model obtain a receptive field covering the whole image at shallower network layers. The enlarged receptive field makes the captured image information richer, thereby improving the accuracy and quality of image restoration.
Drawings
Fig. 1 is a flowchart of an image restoration method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an image restoration model according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an image restoration device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a schematic flow chart of an image restoration method according to an embodiment of the invention;
the image restoration method provided by the embodiment of the invention comprises the following steps:
step S1: acquiring an image to be repaired; wherein the image to be repaired comprises a known region and a missing region;
step S2: inputting the image to be repaired into an image repair model so that the image repair model reconstructs the missing area in the image to be repaired to generate a repaired image; wherein the image restoration model comprises a self-encoder module and a non-quantized Transformer module;
The encoder in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the non-quantized Transformer module is used for predicting latent vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map;
the vector quantization double codebook module in the self-encoder module is used for quantizing the feature vectors of the known region and quantizing the latent vectors to obtain a first quantized feature vector corresponding to the feature vectors of the known region and a second quantized feature vector corresponding to the latent vectors;
and the decoder fused with the FFC residual block in the self-encoder module is used for reconstructing a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantized double codebook module to generate a repaired image corresponding to the image to be repaired.
For step S1, in a preferred embodiment, an image to be repaired is acquired that includes known regions and missing regions; note that the missing region is a region shielded by a mask;
In a preferred embodiment, before inputting the image to be repaired into the image repair model, the method further comprises:
dividing all areas of the image to be repaired into a plurality of patch areas. Illustratively, the pixels of each patch area are 8 x 8 in size.
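The division into non-overlapping 8 × 8 patch areas can be sketched in NumPy as follows (a single-channel image is used for brevity; the function name is illustrative):

```python
import numpy as np

def to_patches(img, p=8):
    # Split an H x W image into non-overlapping p x p patches and flatten
    # each patch into a row vector (one row per patch area).
    h, w = img.shape
    assert h % p == 0 and w % p == 0, "image must divide evenly into patches"
    blocks = img.reshape(h // p, p, w // p, p).swapaxes(1, 2)
    return blocks.reshape(-1, p * p)

img = np.arange(32 * 32, dtype=float).reshape(32, 32)
patches = to_patches(img)  # 4 x 4 grid of 8 x 8 patches -> shape (16, 64)
```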
For step S2, in a preferred embodiment, the image to be repaired is input into an image repair model, so that the image repair model reconstructs a missing region in the image to be repaired to generate a repaired image;
specifically, as shown in fig. 2, the image restoration model includes a self-encoder module and a non-quantized Transformer module;
the Encoder in the self-Encoder module includes an Encoder P-Enc (Patch-based Encoder), a vector quantization double Codebook module D-Codes (Dual-Codebook) and a Decoder RR-Dec (Resolution-Robust Decoder);
the encoder P-Enc in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the encoder P-Enc processes the plurality of patch areas as non-overlapping blocks through a plurality of linear layers to obtain a feature vector corresponding to each patch area, whereas a traditional CNN encoder processes the input image in a sliding-window manner using a plurality of convolution kernels.
The vector quantization double codebook module D-Codes of the self-encoder module marks the quantized feature vectors corresponding to the known region and to the missing region in the image to be repaired, respectively. Specifically, the vector quantization double codebook module D-Codes is a method for effectively representing high-dimensional vectors, in which quantized feature vectors are represented as combinations of entries from two codebooks, and the marks of the quantized feature vectors of the known region and of the missing region are stored as e and e′, respectively. This further eliminates the difference between missing patches and known patches, so that the model can give more reasonable results when predicting the features of a missing patch.
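A minimal sketch of nearest-neighbour quantization against two separate codebooks; the codebook sizes, dimensions and variable names here are illustrative assumptions, not the patent's actual configuration:

```python
import numpy as np

def quantize(vectors, codebook):
    # Map each feature vector to its nearest codebook entry (L2 distance);
    # return the entry indices ("marks") and the quantized vectors.
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
codebook_known = rng.normal(size=(16, 4))    # entries e for known-region features
codebook_missing = rng.normal(size=(16, 4))  # entries e' for predicted latent vectors

known_feats = rng.normal(size=(5, 4))
idx_known, q_known = quantize(known_feats, codebook_known)
```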
In a preferred embodiment, the non-quantized Transformer module is configured to predict latent vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map; the non-quantized Transformer module is also called the UQ-Transformer model;
specifically, the target feature map generated by the self-encoder module is used directly as the input of the non-quantized Transformer module, which avoids the information loss caused by downsampling and quantization. Because the UQ-Transformer model can directly take the patch-based feature vectors from P-Enc as input when predicting the latent vectors of the missing region, the corresponding latent vectors can be predicted more accurately; using unquantized image features directly as the input of the UQ-Transformer model completes the reconstruction of the input damaged image while preserving as much information as possible.
The core mechanism of the UQ-Transformer model is self-attention, which lets the model weigh the importance of different elements in the feature vectors and predict from their weighted sum. The self-attention mechanism is implemented as a series of parallel multi-head attention layers that focus on different parts of the input sequence; the outputs of these attention layers are then concatenated and transformed by a feed-forward layer to produce the final output representation.
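The weighted-sum behaviour described above can be sketched as single-head scaled dot-product attention; the UQ-Transformer itself stacks parallel multi-head layers and feed-forward layers, and the weight shapes here are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    # Each output token is a weighted sum of value vectors, with weights
    # given by softmax-normalised query/key similarity.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))                      # 6 tokens (patch features), dim 8
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
```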
In a preferred embodiment, the latent vectors output by the UQ-Transformer model and the feature vectors of the known region are quantized: quantizing the feature vectors of the known region yields the first quantized feature vector, and quantizing the latent vectors yields the second quantized feature vector;
and inputting the first quantized feature vector and the second quantized feature vector into a decoder, so that the decoder rebuilds a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector, and a repaired image corresponding to the image to be repaired is generated.
Illustratively, the decoder RR-Dec not only recovers the content of the missing region from the repaired marks, but also keeps the content of the known region unchanged. The decoder consists of a main branch and a reference branch: the main branch generates the repaired image using a plurality of deconvolution layers, while the reference branch extracts a multi-scale feature map from the image to be repaired; the feature map obtained by the reference branch is fused with the features of the main branch through a mask-guided addition module (MGA), so that the features of the known region can be used for restoration and the predicted quantized vectors in the missing region are used to recover the unknown region occluded by the mask m. The fusion can be expressed as:

F̂^l = m^l ⊙ F_r^l + (1 − m^l) ⊙ F_m^l

wherein F_m^l and F_r^l represent the features on the main branch and the reference branch, respectively, with spatial size H/2^l × W/2^l (0 ≤ l ≤ log₂ r), ⊙ denotes element-wise multiplication, and m^l denotes a binary mask of the corresponding spatial size, where a value of 1 indicates a known region and a value of 0 indicates that the region contains missing pixels.
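Under the mask convention above (1 = known, 0 = missing), the mask-guided fusion can be sketched as follows; the function name and the simple element-wise form are assumptions for illustration:

```python
import numpy as np

def mask_guided_add(f_main, f_ref, mask):
    # Known positions (mask == 1) take reference-branch features extracted
    # from the undamaged input; missing positions (mask == 0) keep the
    # main-branch prediction decoded from the quantized vectors.
    return mask * f_ref + (1.0 - mask) * f_main

f_main = np.full((4, 4), 2.0)   # main-branch (predicted) features
f_ref = np.full((4, 4), 5.0)    # reference-branch features
mask = np.zeros((4, 4))
mask[:2, :] = 1.0               # top half of the feature map is known
fused = mask_guided_add(f_main, f_ref, mask)
```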
According to the invention, the FFC residual block is introduced into the decoder RR-Dec, so that, compared with the prior art, the decoder can obtain a receptive field covering the whole image at shallower network layers. The enlarged effective receptive field lets the decoder capture richer image information and finer details, improving the accuracy with which the content of the missing region is predicted during restoration.
The FFC residual block is based on a channel-wise fast Fourier transform (FFT) and divides the channels into two parallel branches: the local branch uses traditional convolution to process local features, while the global branch uses the FFT to analyze global context features. One important difference between the FFC and traditional convolution is the way data is processed: traditional convolution is a linear operation in the spatial domain, and each convolution operation can only process a small local receptive field; the FFC is a linear operation in the frequency domain (complex space), which allows convolution operations to handle the global receptive field more effectively. This is because all information in the frequency domain is efficiently represented as a series of complex components, and the FFC operation can combine these components through simple multiplications and additions, simplifying the convolution procedure. The decoder can therefore convert data from the spatial domain to the frequency domain via the fast Fourier transform, perform the convolution operation, and obtain a global receptive field faster than traditional convolution; finally, the outputs of the local and global branches are fused to obtain the fused features, from which the decoder RR-Dec reconstructs the missing region of the image to be repaired and generates the corresponding repaired image.
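The global branch's frequency-domain pathway can be sketched as FFT, a per-frequency weighting, then inverse FFT; the real FFC applies convolutions and activations in the spectral domain, so the simple element-wise weighting here is an illustrative stand-in:

```python
import numpy as np

def spectral_branch(x, weight):
    # Transform the feature map to the frequency domain, modulate each
    # frequency component, and transform back; one FFT step already gives
    # every output position a receptive field covering the whole input.
    spec = np.fft.rfft2(x)                # spatial domain -> frequency domain
    spec = spec * weight                  # per-frequency modulation
    return np.fft.irfft2(spec, s=x.shape)

x = np.random.default_rng(1).normal(size=(16, 16))
w = np.ones((16, 9))                      # all-ones weighting: identity sanity check
y = spectral_branch(x, w)                 # y reconstructs x (up to FFT round-off)
```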
Prior-art Transformers downsample the input image to a lower resolution and quantize it into a smaller pixel space, which causes serious information loss that is difficult for a subsequent refinement network to compensate and seriously harms image restoration. Although auto-encoder-optimized Transformers already avoid downsampling and quantizing the image, they still use the traditional convolution method, which clearly suffers from a limited receptive field; this limits the potential of the hybrid model to a certain extent and prevents an effective learning process for global structure and detail information. The invention alleviates the limited receptive field caused by traditional convolution in the hybrid model while preserving the efficiency of Transformer computation: the model designs a self-encoder module (P-VQVAE+), introduces an FFC residual block into the original auto-encoder architecture in place of the original traditional-convolution residual block, and combines the self-encoder module with the UQ-Transformer, so that the final image restoration model achieves a receptive field covering the whole image in early network layers. In other words, the restoration model obtains a large and effective receptive field, effectively reduces information loss when reconstructing the image, and outputs a more faithful repaired image; meanwhile, the model uses less training data and computation while improving performance.
In a preferred embodiment, the training process of the image restoration model includes:
the training operation of the following image restoration model is repeatedly executed until the image restoration model is judged to be converged:
acquiring a sample to-be-repaired image and a sample repaired image corresponding to the sample to-be-repaired image;
inputting the sample image to be repaired into an image repair model, so that an encoder in a self-encoder module of the image repair model extracts feature vectors of the sample image to be repaired, and combining a plurality of feature vectors to generate a sample feature map;
a non-quantized Transformer module in the image restoration model predicts a sample potential vector for the missing region in the sample feature map according to the feature vectors corresponding to the missing region in the sample feature map;
a vector quantization double codebook module in the self-encoder module of the image restoration model quantizes the feature vectors of the known region in the sample feature map and quantizes the sample potential vectors, obtaining sample first quantized feature vectors corresponding to the feature vectors of the known region in the sample feature map and sample second quantized feature vectors corresponding to the sample potential vectors;
reconstructing a missing region in a sample to-be-repaired image by a decoder fused with an FFC residual block according to a sample first quantization feature vector and a sample second quantization feature vector stored in a vector quantization double codebook module, and generating a predicted repaired image corresponding to the sample to-be-repaired image;
and comparing the predicted repaired image with the sample repaired image, and, when the image restoration model is judged not to have converged according to the comparison result, acquiring an updated sample image to be repaired and its corresponding sample repaired image.
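The dual-codebook quantization step in the training procedure above (which produces the sample first and second quantized feature vectors) amounts to a nearest-neighbour lookup into two separate codebooks. A hedged NumPy sketch, with the codebook sizes, feature dimension, and the `quantize` helper all assumed for illustration:

```python
import numpy as np

def quantize(features, codebook):
    """Nearest-neighbour vector quantization: map each feature vector to the
    closest codebook entry (illustrative sketch of one codebook lookup)."""
    # Squared distances between every feature and every codebook vector.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(1)
dim = 4
known_codebook = rng.standard_normal((16, dim))    # for known-region features
missing_codebook = rng.standard_normal((16, dim))  # for predicted latent vectors

known_feats = rng.standard_normal((10, dim))       # encoder features, known region
predicted_latents = rng.standard_normal((5, dim))  # Transformer predictions, missing region

q1, _ = quantize(known_feats, known_codebook)          # "first quantized feature vectors"
q2, _ = quantize(predicted_latents, missing_codebook)  # "second quantized feature vectors"
print(q1.shape, q2.shape)  # (10, 4) (5, 4)
```

Keeping two codebooks lets known-region features and predicted missing-region latents be quantized against separately learned vocabularies before the decoder fuses them.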
In a preferred embodiment, when training the image restoration model, further comprising:
updating the encoder module in the image restoration model according to the reconstruction loss function; the reconstruction loss function consists of a pixel loss function, a gradient loss function, an antagonism loss function, a perception loss function and a style loss function;
the pixel loss function is calculated according to the following formula:
L_pix = E[ | I − Î | ]
wherein L_pix is the pixel loss function, I is the sample image to be repaired, Î is the sample repaired image corresponding to the sample image to be repaired, − represents element-wise subtraction, and E[·] represents the mean-value operation;
the gradient loss function is calculated according to the following formula:
L_grad = E[ | grad[I] − grad[Î] | ]
wherein L_grad is the gradient loss function and grad[·] represents a function that computes the image gradient;
the antagonistic loss function is calculated according to the following formula:
L_adv = −E[ D(Î) ]
wherein L_adv is the adversarial loss function and D(·) is the function corresponding to the discriminator network;
The perceptual loss function is calculated according to the following formula:
L_pcp = Σ_i E[ | φ_i(I) − φ_i(Î) | ]
wherein L_pcp is the perceptual loss function and φ_i(·) denotes the i-th feature map of a pretrained feature-extraction network;
the style loss function is calculated according to the following formula:
L_style = E[ | G(φ(I)) − G(φ(Î)) | ]
wherein L_style is the style loss function and G(·) is the Gram matrix used to acquire the style parameters;
the reconstruction loss function is calculated according to the following formula:
L_rec = L_pix + λ_g·L_grad + λ_a·L_adv + λ_p·L_pcp + λ_s·L_style
wherein L_rec is the reconstruction loss function, λ_g is the first preset parameter, λ_a is the second preset parameter, λ_p is the third preset parameter, and λ_s is the fourth preset parameter;
reconstruction lossIs a method for calculating the input image +.>Difference from reconstructed image->Is composed of five parts including L 1 Loss->Gradient ∈two images>Resistance loss->Loss of perceptionAnd loss of style->In a preferred embodiment lambda g =5,λ a =0.1,λ p =0.1,λ s =250。
The final loss function of the encoder module is:
L_vae = L_rec + ‖ sg[z_e(I)] − e ‖² + α·‖ z_e(I) − sg[e] ‖²
wherein L_vae is the final loss function, the second term is the codebook loss used to optimize the potential vectors, the third term is the commitment loss that commits gradient information from the decoder to the encoder, sg[·] denotes the stop-gradient operator, z_e(I) is the encoder output, e is the matched codebook entry, and α is a weight parameter.
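The encoder objective combines the reconstruction loss with the two vector-quantization terms. The sketch below follows the standard VQ-VAE form of the codebook and commitment losses (the stop-gradient behaviour, the value of α, and the placeholder reconstruction loss are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
z_e = rng.standard_normal((10, 4))       # encoder outputs (potential vectors)
codebook = rng.standard_normal((16, 4))  # learned codebook
d = ((z_e[:, None] - codebook[None]) ** 2).sum(-1)
e = codebook[d.argmin(1)]                # nearest codebook entries

# In a real framework, sg[.] is a stop-gradient (e.g. .detach() in PyTorch);
# with plain NumPy both terms reduce to the same number, but they drive
# gradients to different parameters during training.
codebook_loss = np.mean((z_e - e) ** 2)  # pulls codebook toward encoder outputs
commit_loss = np.mean((z_e - e) ** 2)    # pulls encoder outputs toward codebook

alpha = 0.25  # weight parameter (assumed value for illustration)
l_rec = 0.1   # placeholder for the reconstruction loss computed earlier
l_vae = l_rec + codebook_loss + alpha * commit_loss
print(l_vae > 0)  # True
```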
In a preferred embodiment, when training the image restoration model, further comprising:
updating the unquantized Transformer module in the image restoration model according to the cross-entropy loss function;
wherein the cross-entropy loss function is expressed as follows:
L_trans = E[ − (1/|Ω|) · Σ_{(i,j)∈Ω} log p(t_{i,j} | E(I), m) ], Ω = { (i, j) : m_{i,j} = 0 }
wherein L_trans is the cross-entropy loss function, p(·) is the distribution probability of the potential vectors of the missing region in the target feature map predicted by the unquantized Transformer module, E(·) is the encoder function, and m is a binary mask, where m_{i,j} = 0 indicates that the pixel at (i, j) is missing and m_{i,j} = 1 indicates that the pixel at (i, j) is valid. The target token t_{i,j} is obtained from the indices of the quantized vectors of the ground-truth feature map, and O(·) sets the given parameter value to 1. The invention can simultaneously integrate the strong modeling capability of a CNN for information such as texture and structure with the strong modeling capability of a Transformer for long-distance relations, so that the model can recover more detailed image information when repairing a high-resolution image with a large-scale missing region while ensuring that the restoration result is visually plausible.
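The Transformer objective — cross-entropy over the indices of the quantized vectors, evaluated only at positions the mask marks as missing — can be sketched as follows (vocabulary size, shapes, and the random data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n_tokens, vocab = 6, 16
logits = rng.standard_normal((n_tokens, vocab))   # Transformer predictions
targets = rng.integers(0, vocab, size=n_tokens)   # ground-truth quantized-vector indices
mask = np.array([0, 0, 1, 1, 0, 1])               # 0 = pixel missing at that position

# Softmax over the codebook vocabulary.
p = np.exp(logits - logits.max(1, keepdims=True))
p /= p.sum(1, keepdims=True)

# Cross-entropy averaged over the missing positions only (mask == 0).
missing = mask == 0
l_trans = -np.mean(np.log(p[missing, targets[missing]]))
print(l_trans > 0)  # True
```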
The invention uses the auto-encoder to eliminate the downsampling and quantization of the image, i.e., the unquantized image features are taken directly as the input of the UQ-Transformer model, ensuring that the damaged input image is reconstructed while preserving as much information as possible. This solves the serious information loss that arises when a Transformer is used for image restoration, and improves the restoration effect while the overall model uses a smaller amount of data and lower computational complexity.
According to the invention, the FFC residual block is introduced into the hybrid model, so that the model obtains a receptive field covering the whole image in the shallower layers of the network. Enlarging the receptive field enriches the captured image information, captures detail information in the image more accurately, and improves the accuracy of missing-region prediction. Meanwhile, across all training and test data in the selected datasets, the restoration results of the model improve in various scenes, and the improvement is particularly remarkable when the images have large-scale missing regions. Because the FFC residual block is introduced, the model can better extract information that is effective for restoring pictures with periodic structures, so the restoration results better match human visual perception; that is, the model achieves excellent results when repairing the repetitive structures common in man-made environments.
As shown in fig. 3, on the basis of the above-mentioned various embodiments of the image restoration method, the present invention correspondingly provides an embodiment of a device item;
an embodiment of the present invention provides an image restoration apparatus including: the system comprises an image acquisition module to be repaired and a repaired image generation module;
The image acquisition module to be repaired is used for acquiring an image to be repaired; wherein the image to be repaired comprises a known region and a missing region;
the repaired image generation module is used for inputting the image to be repaired into an image repair model so that the image repair model reconstructs a missing area in the image to be repaired to generate a repaired image;
wherein the image restoration model comprises a self-encoder module and a non-quantized Transformer module;
the encoder in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the non-quantized Transformer module is used for predicting potential vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map;
the vector quantization double codebook module in the self-encoder module is used for quantizing the feature vector of the known region and quantizing the potential vector to obtain a first quantized feature vector corresponding to the feature vector of the known region and a second quantized feature vector corresponding to the potential vector;
and the decoder fused with the FFC residual block in the self-encoder module is used for reconstructing a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantized double codebook module to generate a repaired image corresponding to the image to be repaired.
It should be noted that the above-described apparatus embodiments are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the invention, the connection relation between modules indicates that they have communication connections, which may be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
It will be clearly understood by those skilled in the art that, for convenience and brevity, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Based on the various embodiments of the image restoration method, the invention correspondingly provides embodiments of the terminal equipment item.
An embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement an image restoration method according to any one of the embodiments of the present invention.
The terminal equipment can be computing terminal equipment such as a desktop computer, a notebook computer, a palm computer, a cloud server and the like. The terminal device may include, but is not limited to, a processor, a memory.
The processor may be a central processing unit (Central Processing Unit, CPU), other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, which is a control center of the terminal device, and which connects various parts of the entire terminal device using various interfaces and lines.
The memory may be used to store the computer program, and the processor implements various functions of the terminal device by running or executing the computer program stored in the memory and invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory card, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic-disk storage device, a flash memory device, or other non-volatile solid-state storage device.
Based on the various embodiments of the image restoration methods described above, the present invention correspondingly provides embodiments of storage media items.
An embodiment of the present invention provides a storage medium, where the storage medium includes a stored computer program, where when the computer program runs, the device where the computer readable storage medium is located is controlled to execute an image restoration method according to any one of the embodiments of the present invention.
The storage medium is a computer-readable storage medium in which the computer program is stored; when executed by a processor, the computer program can implement the steps of the above-mentioned method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium can be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.

Claims (10)

1. An image restoration method, comprising:
acquiring an image to be repaired; wherein the image to be repaired comprises a known region and a missing region;
inputting the image to be repaired into an image repair model so that the image repair model reconstructs a missing area in the image to be repaired to generate a repaired image;
wherein the image restoration model comprises a self-encoder module and a non-quantized Transformer module;
the encoder in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the non-quantized Transformer module is used for predicting potential vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map;
the vector quantization double codebook module in the self-encoder module is used for quantizing the feature vector of the known region and quantizing the potential vector to obtain a first quantized feature vector corresponding to the feature vector of the known region and a second quantized feature vector corresponding to the potential vector;
And the decoder fused with the FFC residual block in the self-encoder module is used for reconstructing a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantized double codebook module to generate a repaired image corresponding to the image to be repaired.
2. An image restoration method according to claim 1, further comprising, before inputting the image to be restored to an image restoration model:
dividing all areas of the image to be repaired into a plurality of patch areas;
the encoder in the self-encoder module is configured to extract a feature vector of the image to be repaired, and combine a plurality of feature vectors to generate a target feature map, and specifically includes:
and the encoder in the self-encoder module is used for processing the patch areas in a non-overlapping block mode through a plurality of linear layers to obtain a feature vector corresponding to each patch area, and combining the feature vectors to generate a target feature map.
3. The image restoration method according to claim 1, further comprising, after obtaining a first quantized feature vector corresponding to a feature vector of the known region and a second quantized feature vector corresponding to a potential vector:
And the vector quantization double codebook module marks a first quantization characteristic vector corresponding to a known region and a second quantization characteristic vector corresponding to a missing region in the image to be repaired respectively.
4. The method for image restoration as set forth in claim 3, wherein reconstructing a missing region in the image to be restored according to the first quantized feature vector and the second quantized feature vector stored in the vector quantization dual codebook module to generate the restored image corresponding to the image to be restored specifically includes:
the FFC residual block in the decoder extracts, through its local branch, local features of all quantized feature vectors using conventional convolution, and extracts, through its global branch, features of all quantized feature vectors in the spectral domain of the global context to obtain global features;
fusing the local features and the global features to obtain fused features;
and the decoder rebuilds the missing area in the image to be restored according to the fusion characteristics to generate a restored image corresponding to the image to be restored.
5. The image restoration method as recited in claim 1, wherein the training process of the image restoration model includes:
The training operation of the following image restoration model is repeatedly executed until the image restoration model is judged to be converged:
acquiring a sample to-be-repaired image and a sample repaired image corresponding to the sample to-be-repaired image;
inputting the sample image to be repaired into an image repair model, so that an encoder in a self-encoder module of the image repair model extracts feature vectors of the sample image to be repaired, and combining a plurality of feature vectors to generate a sample feature map;
a non-quantized Transformer module in the image restoration model predicts a sample potential vector for the missing region in the sample feature map according to the feature vectors corresponding to the missing region in the sample feature map;
the method comprises the steps that a vector quantization double codebook module in a self-encoder module of an image restoration model quantizes feature vectors of known areas in a sample feature map and quantizes sample potential vectors to obtain sample first quantized feature vectors corresponding to the feature vectors of the known areas in the sample feature map and sample second quantized feature vectors corresponding to the sample potential vectors;
reconstructing a missing region in a sample to-be-repaired image by a decoder fused with an FFC residual block according to a sample first quantization feature vector and a sample second quantization feature vector stored in a vector quantization double codebook module, and generating a predicted repaired image corresponding to the sample to-be-repaired image;
And comparing the predicted repaired image with the sample repaired image, and acquiring an updated sample image to be repaired and a corresponding sample image to be repaired when the image repair model is judged not to be converged according to the comparison result.
6. The image restoration method as recited in claim 5, wherein when training the image restoration model, further comprising:
updating the encoder module in the image restoration model according to the reconstruction loss function; the reconstruction loss function consists of a pixel loss function, a gradient loss function, an antagonism loss function, a perception loss function and a style loss function;
the pixel loss function is calculated according to the following formula:
L_pix = E[ | I − Î | ]
wherein L_pix is the pixel loss function, I is the sample image to be repaired, Î is the sample repaired image corresponding to the sample image to be repaired, − represents element-wise subtraction, and E[·] represents the mean-value operation;
the gradient loss function is calculated according to the following formula:
L_grad = E[ | grad[I] − grad[Î] | ]
wherein L_grad is the gradient loss function and grad[·] represents a function that computes the image gradient;
the antagonistic loss function is calculated according to the following formula:
L_adv = −E[ D(Î) ]
wherein L_adv is the adversarial loss function and D(·) is the function corresponding to the discriminator network;
the perceptual loss function is calculated according to the following formula:
L_pcp = Σ_i E[ | φ_i(I) − φ_i(Î) | ]
wherein L_pcp is the perceptual loss function and φ_i(·) denotes the i-th feature map of a pretrained feature-extraction network;
the style loss function is calculated according to the following formula:
L_style = E[ | G(φ(I)) − G(φ(Î)) | ]
wherein L_style is the style loss function and G(·) is the Gram matrix used to acquire the style parameters;
the reconstruction loss function is calculated according to the following formula:
L_rec = L_pix + λ_g·L_grad + λ_a·L_adv + λ_p·L_pcp + λ_s·L_style
wherein L_rec is the reconstruction loss function, λ_g is the first preset parameter, λ_a is the second preset parameter, λ_p is the third preset parameter, and λ_s is the fourth preset parameter.
7. The image restoration method as recited in claim 5, wherein when training the image restoration model, further comprising:
updating the unquantized Transformer module in the image restoration model according to the cross-entropy loss function;
wherein the cross entropy loss function is represented as follows:
L_trans = E[ − (1/|Ω|) · Σ_{(i,j)∈Ω} log p(t_{i,j} | E(I), m) ], Ω = { (i, j) : m_{i,j} = 0 }
wherein L_trans is the cross-entropy loss function, p(·) is the distribution probability of the potential vectors of the missing region in the target feature map predicted by the unquantized Transformer module, E(·) is the encoder function, and m is a binary mask, where m_{i,j} = 0 indicates that the pixel at (i, j) is missing and m_{i,j} = 1 indicates that the pixel at (i, j) is valid.
8. An image restoration device, comprising: the system comprises an image acquisition module to be repaired and a repaired image generation module;
The image acquisition module to be repaired is used for acquiring an image to be repaired; wherein the image to be repaired comprises a known region and a missing region;
the repaired image generation module is used for inputting the image to be repaired into an image repair model so that the image repair model reconstructs a missing area in the image to be repaired to generate a repaired image;
wherein the image restoration model comprises a self-encoder module and a non-quantized Transformer module;
the encoder in the self-encoder module is used for extracting the feature vectors of the image to be repaired and combining a plurality of feature vectors to generate a target feature map;
the non-quantized Transformer module is used for predicting potential vectors of the missing region in the target feature map according to the feature vectors corresponding to the missing region in the target feature map;
the vector quantization double codebook module in the self-encoder module is used for quantizing the feature vector of the known region and quantizing the potential vector to obtain a first quantized feature vector corresponding to the feature vector of the known region and a second quantized feature vector corresponding to the potential vector;
and the decoder fused with the FFC residual block in the self-encoder module is used for reconstructing a missing region in the image to be repaired according to the first quantized feature vector and the second quantized feature vector stored in the vector quantized double codebook module to generate a repaired image corresponding to the image to be repaired.
9. A terminal device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing an image restoration method according to any of claims 1 to 7 when the computer program is executed.
10. A storage medium comprising a stored computer program, wherein the computer program, when run, controls a device in which the storage medium is located to perform an image restoration method according to any one of claims 1 to 7.
CN202310488070.5A 2023-04-28 2023-04-28 Image restoration method and device, terminal equipment and storage medium Pending CN116739950A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310488070.5A CN116739950A (en) 2023-04-28 2023-04-28 Image restoration method and device, terminal equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116739950A true CN116739950A (en) 2023-09-12

Family

ID=87917672


Country Status (1)

Country Link
CN (1) CN116739950A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372720A (en) * 2023-10-12 2024-01-09 南京航空航天大学 Unsupervised anomaly detection method based on multi-feature cross mask repair
CN117372720B (en) * 2023-10-12 2024-04-26 南京航空航天大学 Unsupervised anomaly detection method based on multi-feature cross mask repair

Similar Documents

Publication Publication Date Title
CN112308763A (en) Generating a composite digital image using a neural network with a dual stream encoder architecture
Fu et al. Residual scale attention network for arbitrary scale image super-resolution
CN108492249A (en) Single frames super-resolution reconstruction method based on small convolution recurrent neural network
Chen et al. MICU: Image super-resolution via multi-level information compensation and U-net
CN116739950A (en) Image restoration method and device, terminal equipment and storage medium
Muhammad et al. Multi-scale Xception based depthwise separable convolution for single image super-resolution
CN116205820A (en) Image enhancement method, target identification method, device and medium
Liu et al. Hallucinating color face image by learning graph representation in quaternion space
CN116309148A (en) Image restoration model training method, image restoration device and electronic equipment
CN116205962A (en) Monocular depth estimation method and system based on complete context information
Wu et al. Hprn: Holistic prior-embedded relation network for spectral super-resolution
Cui et al. Progressive dual-branch network for low-light image enhancement
Song et al. Learning an effective transformer for remote sensing satellite image dehazing
Zhou et al. A superior image inpainting scheme using Transformer-based self-supervised attention GAN model
Luo et al. A fast denoising fusion network using internal and external priors
Liu et al. Non-homogeneous haze data synthesis based real-world image dehazing with enhancement-and-restoration fused CNNs
Cai et al. Lightweight spatial-channel adaptive coordination of multilevel refinement enhancement network for image reconstruction
Ren et al. Enhanced latent space blind model for real image denoising via alternative optimization
Shao et al. Pixel-level self-paced adversarial network with multiple attention in single image super-resolution
Ancuti et al. NTIRE 2023 HR nonhomogeneous dehazing challenge report
Attarde et al. Super resolution of image using sparse representation of image patches with LASSO approximation on CUDA platform
Zhou et al. Supervised-unsupervised combined transformer for spectral compressive imaging reconstruction
Hua et al. An Efficient Multiscale Spatial Rearrangement MLP Architecture for Image Restoration
Yu et al. Adaptive multi-information distillation network for image dehazing
Shi et al. DDABNet: a dense Do-conv residual network with multisupervision and mixed attention for image deblurring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination