CN108460746B - Image restoration method based on structure and texture layered prediction - Google Patents


Info

Publication number: CN108460746B
Application number: CN201810316116.4A
Authority: CN (China)
Prior art keywords: image, network, repaired, edge, layer
Priority/filing date: 2018-04-10
Legal status: Active (granted)
Original language: Chinese (zh)
Other versions: CN108460746A
Inventors: 胡瑞敏, 廖良, 肖晶, 朱荣, 王中元, 陈宇, 陈宇静
Assignee: Wuhan University (WHU)
Application filed by Wuhan University (WHU)
Publication of CN108460746A: 2018-08-28
Application granted; publication of CN108460746B: 2021-09-03

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/77: Retouching; Inpainting; Scratch removal
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks


Abstract

The invention discloses an image restoration method based on structure and texture layered prediction, comprising a network model training part and an image restoration part. The network model training part comprises preprocessing the training data set, extracting edge structure maps, and constructing and training a structure completion network N1 and a texture transformation network N2. The image restoration part comprises inputting a test image to be repaired, preprocessing it, extracting its edge structure map, generating the edge structure of the defect area with the structure completion network N1, generating the image content of the defect area with the texture transformation network N2, and filling the generated defect-area content into the image to be repaired. The method decomposes the image restoration problem into layered prediction of structure and texture: it automatically generates the missing structure and uses the repaired structure edge map to constrain the texture generation process, thereby effectively avoiding texture confusion and shape distortion and greatly improving restoration of large-area defects in natural images.

Description

Image restoration method based on structure and texture layered prediction
Technical Field
The invention relates to an image restoration method for large-area defects, and in particular to an image restoration method based on structure and texture layered prediction.
Background
Since Bertalmio et al. first proposed it at the SIGGRAPH conference in 2000, image restoration (inpainting) technology has been widely applied in fields such as cultural relic protection, film special effects, lossy image compression, and real-time image/video transmission. The technology automatically fills a defect area or removes a specific target according to the undamaged image information; however, because the image content of the defect area is unknown, many technical problems remain to be solved, especially when a large image area is missing.
Traditional image restoration algorithms fall mainly into two directions: methods based on information diffusion and methods based on sample matching. The former diffuses information from the boundary of the defect region inward and is suitable for repairing small defect regions, but when the region to be repaired is large the result becomes blurry. The latter, based on searching for similar image blocks in image space or a sparse-representation space, can alleviate the blurring problem, but when the content to be generated does not exist in the search space, the repair result is often distorted.
These conventional image restoration algorithms mine local prior information present in the image to be repaired, but when a large image region is damaged, local prior information is insufficient to recover the missing content.
In recent years, driven by deep learning, methods based on image context learning have been proposed for repairing images with large-area defects. These methods generate new content by understanding the high-level semantics of the image to be repaired, and on specific image sets (such as human faces) they can generate correct and realistic defect-area content. However, when applied directly to natural-scene pictures containing composite structures and textures, the generated content often suffers from texture confusion and shape distortion; trying to understand the content of the whole picture at once, such methods cannot handle complex environments.
The invention provides an image restoration method based on structure and texture layered prediction, which decomposes the image restoration problem into layered prediction of structure and texture. On one hand, compared with a natural image containing high-frequency texture, structural information is lower-dimensional information that still expresses the scene context, and is therefore easier to repair than the original image; on the other hand, regions of homogeneous texture in natural images are usually bounded by structure, so structural information is also crucial constraint information in image inpainting.
Disclosure of Invention
Aiming at the shortcomings of the prior art in repairing large-area defects of natural images, the invention provides an image restoration method based on structure and texture layered prediction.
The technical scheme adopted by the invention is an image restoration method based on structure and texture layered prediction, comprising a network model training part and an image restoration part.
The network model training part comprises the following substeps:
step 1.1, training data set preprocessing, comprising randomly masking images in a training image set to simulate image defects, wherein the masked area is the defect area and the remaining image serves as the image to be repaired, thereby obtaining a to-be-repaired image set and a corresponding defect-area image set, and preprocessing them;
step 1.2, extracting edge structure maps, comprising extracting the to-be-repaired edge structure maps and the defect-area edge structure maps from the to-be-repaired image set and the defect-area image set obtained after preprocessing in step 1.1;
step 1.3, construction and training of the structure completion network N1, comprising inputting the to-be-repaired edge structure map into the structure completion network N1 to reconstruct the edge structure of the defect area;
step 1.4, when the preset iteration stop condition is met, saving the structure completion network N1 model and proceeding to step 1.5; otherwise, returning to step 1.3 to continue the next training iteration;
step 1.5, construction and training of the texture transformation network N2, comprising inputting the defect-area edge structure map and the corresponding image to be repaired into the texture transformation network N2 to reconstruct the image content of the defect area;
step 1.6, when the preset iteration stop condition is met, saving the texture transformation network N2 model; otherwise, returning to step 1.5 to continue the next training iteration.
The image restoration part comprises the following substeps:
step 2.1, inputting a test image to be repaired and preprocessing it;
step 2.2, extracting the edge structure map, comprising extracting the edge structure map from the image to be repaired preprocessed in step 2.1;
step 2.3, generating the edge structure of the defect area using the structure completion network N1, comprising inputting the edge structure map of the image to be repaired extracted in step 2.2 into the structure completion network N1 to generate a defect-area edge structure map, and merging the edge structure map of the image to be repaired with the defect-area edge structure map to obtain the repaired image edge structure map;
step 2.4, generating the image content of the defect area using the texture transformation network N2, comprising inputting the repaired image edge structure map obtained in step 2.3 and the image to be repaired preprocessed in step 2.1 into the texture transformation network N2 to generate the image content of the defect area;
step 2.5, outputting the repaired image, wherein the generated defect-area image content is filled into the image to be repaired and the result is the repaired image.
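To summarize the flow of steps 2.1 to 2.5, a minimal Python sketch follows; it is a hedged reading of the method, not the patented implementation. The names inpaint, n1, n2 and extract_edge_map are hypothetical, and the mask convention (1 inside the defect area, with the defect content zeroed out) is an assumption.

```python
def inpaint(image_to_repair, mask, n1, n2, extract_edge_map):
    """Hedged sketch of steps 2.1-2.5. image_to_repair: (1, 3, 128, 128)
    tensor with the defect area zeroed out; mask: (1, 1, 128, 128) binary
    tensor that is 1 inside the defect area (an assumed convention)."""
    # Step 2.2: extract the edge structure map of the damaged image
    edge_map = extract_edge_map(image_to_repair)
    # Step 2.3: N1 generates the defect-area edge structure, which is
    # merged with the known edges into the repaired edge structure map
    generated_edges = n1(edge_map)
    repaired_edges = edge_map * (1 - mask) + generated_edges * mask
    # Step 2.4: N2 converts (repaired edges, damaged image) into content
    generated_image = n2(repaired_edges, image_to_repair)
    # Step 2.5: fill only the generated defect-area content back in
    return image_to_repair * (1 - mask) + generated_image * mask
```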
Furthermore, in step 1.3, when training the structure completion network N1, the parameters of the structure completion network N1 are optimized by minimizing the output-layer error, defined as follows.

Let the original defect-area edge structure map be $x_1$, the generated defect-area edge structure map be $x_2$, and let $D$ be a discrimination network that judges whether an input edge structure map is a real one. The output-layer error $\mathcal{L}_{N_1}$ is defined as

$$\mathcal{L}_{N_1} = \lambda_{rec}\,\mathcal{L}_{rec}(x_2, x_1) + \lambda_{a}\,\mathcal{L}_{a}(x_2) + \lambda_{g}\,\mathcal{L}_{g}(x_2, x_1)$$

where $\mathcal{L}_{rec}(x_2, x_1)$ denotes the reconstruction loss between the generated edge structure map $x_2$ and the original edge structure map $x_1$, used to constrain the content of the generated structure; $\mathcal{L}_{a}(x_2)$ denotes the competition loss between the generated edge structure map $x_2$ and the original edge structure map $x_1$, computed through the discrimination network $D$ and used to constrain the authenticity of the generated structure; $\mathcal{L}_{g}(x_2, x_1)$ denotes the histogram-of-gradient (HOG) difference between $x_2$ and $x_1$, defined as the gradient consistency loss and used to constrain the gradient consistency of the generated structure; and $\lambda_{rec}$, $\lambda_{a}$ and $\lambda_{g}$ are the error weights of the reconstruction loss, competition loss and gradient consistency loss, respectively.
Furthermore, in step 1.5, when training the texture transformation network N2, the parameters of the texture transformation network N2 are optimized by minimizing the output-layer error.

Let the original natural image be $I_1$, the generated image be $I_2$, and let $D$ be a discrimination network that judges whether an input image is a natural image. The output-layer error $\mathcal{L}_{N_2}$ is defined as

$$\mathcal{L}_{N_2} = \lambda_{rec}\,\mathcal{L}_{rec}(I_2, I_1) + \lambda_{a}\,\mathcal{L}_{a}(I_2)$$

where $\mathcal{L}_{rec}(I_2, I_1)$ denotes the generation loss between the generated image $I_2$ and the original natural image $I_1$, used to constrain the content of the generated image; $\mathcal{L}_{a}(I_2)$ denotes the competition loss between $I_2$ and $I_1$, used to constrain the realism of the generated image; and $\lambda_{rec}$ and $\lambda_{a}$ are the error weights of the generation loss and the competition loss, respectively.
By designing an end-to-end image restoration framework based on structure and texture layered prediction, the invention first recovers the structural information of objects in a natural image while keeping it consistent, then performs a pixel-level structure-to-texture conversion based on the recovered structure and the existing texture information, completing automatic generation of the defect region. This avoids the texture confusion and shape distortion of existing deep learning methods and greatly improves the accuracy of image restoration.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
fig. 2 is a structural diagram of a texture transformation network N2 according to an embodiment of the present invention.
Detailed Description
In order to facilitate the understanding and implementation of the present invention for those of ordinary skill in the art, the present invention is further described in detail below with reference to the accompanying drawings and examples.
As shown in fig. 1, the flow of the embodiment comprises two parts. The network model training part is: preprocessing the training data set; extracting edge structure maps; training the structure completion network N1 and saving the N1 model once the training error converges; and training the texture transformation network N2 and saving the N2 model once its training error converges.
The image restoration part is: inputting the image to be repaired and preprocessing it; extracting its edge structure map; generating the defect-area edge structure with the trained structure completion network N1 model; generating the defect-area image content with the trained texture transformation network N2 model; and fusing the generated defect-area content with the image to be repaired to obtain the final restored image.
The network model training part comprises the following specific steps:
step 1.1: preprocessing a training data set: and automatically randomly covering the images in the training image set to simulate image defects. The covering area is a defect area, the residual defect images are used as images to be repaired, so that an image set to be repaired and a defect area image set corresponding to the image set to be repaired are obtained, and the two image sets are preprocessed. The preprocessing steps in the implementation include, but are not limited to, scaling the input image (to fit the image to the network input requirements), and random cropping (to fit the image to the network input requirements). In the embodiment, the images in the training data set are subjected to scale transformation and randomly cropped to 128 × 128 resolution, and the central 64 × 64 image area is masked as a defective area, and the original image content of the residual image as the defective area can be used as annotation data.
Step 1.2: extracting edge structure maps: in specific implementation, an existing edge detection method can be used to extract the to-be-repaired edge structure maps and the defect-area edge structure maps from the image sets preprocessed in step 1.1, which is not detailed further here. Because an edge representation of single-pixel granularity cannot distinguish primary from secondary structures in the image well, the invention adopts the gradient form of the edge as the edge structure map representing the structural information of the image. This form distinguishes primary and secondary structures through gradient values and line thickness, while also retaining more object-level semantic information.
Preferably, existing edge detection methods include, but are not limited to, Holistically-Nested Edge Detection (the HED edge detection algorithm).
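The gradient-form representation of step 1.2 can be illustrated concretely. The sketch below uses the Sobel gradient magnitude purely as a simple stand-in for a learned detector such as HED (which the patent prefers); it shows only the representation, with edges carrying gradient values rather than binary single-pixel lines.

```python
import cv2
import numpy as np

def gradient_edge_map(image_bgr):
    """Gradient-form edge structure map. Sobel magnitude is an assumed
    stand-in for HED; the normalization to [0, 1] is also an assumption."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)   # horizontal gradient
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)   # vertical gradient
    mag = np.sqrt(gx * gx + gy * gy)                  # gradient magnitude
    return mag / (mag.max() + 1e-8)
```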
Step 1.3: construction and training of the structure completion network N1:
after the training is finished, in the image repairing part, inputting a structure of the corresponding edge structure diagram to be repaired of the tested image to be repaired to complete the reconstruction of the edge structure of the defect area by the network N1.
In the embodiment, the structure completion network N1 is a convolutional neural network in encoder-decoder form. The structural units of the encoder network include, but are not limited to, convolution operations, batch normalization operations and nonlinear mappings; the structural units of the decoder network include, but are not limited to, transposed convolution operations, batch normalization operations and nonlinear mappings.
Preferably, the structure and parameters of the structure completion network N1 are given in the following table. In the last column, the "output size" is the data size output by that layer, expressed as a triple whose first number is the number of feature maps and whose second and third numbers are the feature map height and width.

[Table shown as an image in the original; its contents are spelled out layer by layer below.]
The specific settings of the network are as follows:
the structure completion network N1 includes 13 layers, in which the first layer is an input layer, the second to sixth layers are encoder network hidden layers, the seventh layer is a feature layer, the eighth to twelfth layers are decoder network hidden layers, the thirteenth layer is an output layer, and each layer has the following structure:
a first layer: and inputting the extracted structure diagram of the edge to be repaired in the step 1.2 by the input layer.
A second layer: hidden layer 1, comprising a convolutional layer with convolutional kernel number of 64, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
And a third layer: hidden layer 2, comprising a convolution layer with convolution kernel number of 64, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A fourth layer: hidden layer 3, comprising a convolution layer with convolution kernel number 128, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
And a fifth layer: hidden layers 4, including a convolution layer with a convolution kernel number of 256, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A sixth layer: hidden layer 5, comprising a convolution layer with convolution kernel number 512, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A seventh layer: the characteristic layer comprises a convolution layer with the convolution kernel number of 4000, the kernel size of 4 multiplied by 4, the step length of 1 and the edge filling of 0, a batch normalization layer and a Tanh activation function layer, and characteristic vectors extracted from the structure diagram of the edge to be repaired are output.
An eighth layer: the hidden layer 6 comprises a transposed convolutional layer with the convolutional kernel number of 512, the kernel size of 4 multiplied by 4, the step size of 1 and the edge filling of 0, a batch normalization layer and a ReLu activation function layer.
A ninth layer: the hidden layer 7 comprises a transposed convolutional layer with the number of convolutional kernels of 256, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A tenth layer: hidden layers 8, including a transposed convolutional layer with 128 convolutional kernel number, kernel size 4 x 4, step 2, and edge fill 1, a batch normalization layer, and a ReLu activation function layer.
The eleventh layer: the hidden layer 9 comprises a transposed convolutional layer with 64 convolutional kernel number, 4 × 4 kernel size, step 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A twelfth layer: the hidden layer 10 comprises a transposed convolutional layer with 64 convolutional kernel number, 4 x 4 kernel size, step 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A thirteenth layer: and the output layer comprises a transposed convolution layer with convolution kernel number of 3, kernel size of 4 multiplied by 4, step size of 2 and edge filling 1 and a Tanh activation function layer, and outputs the generated defect area edge structure chart.
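The thirteen-layer specification above maps directly onto an encoder-decoder module. A PyTorch sketch under stated assumptions follows: kernel counts, sizes, strides and paddings are taken from the layer descriptions; the 3-channel 128 × 128 input (mirroring the 3-kernel output layer) and everything else, including initialization and training details, are assumptions.

```python
import torch.nn as nn

def enc(ci, co, k=4, s=2, p=1):   # conv + batch norm + ReLU
    return nn.Sequential(nn.Conv2d(ci, co, k, s, p),
                         nn.BatchNorm2d(co), nn.ReLU(True))

def dec(ci, co, k=4, s=2, p=1):   # transposed conv + batch norm + ReLU
    return nn.Sequential(nn.ConvTranspose2d(ci, co, k, s, p),
                         nn.BatchNorm2d(co), nn.ReLU(True))

class StructureCompletionNet(nn.Module):
    """Sketch of N1: 3 x 128 x 128 edge map in, 3 x 128 x 128 edge map out.
    Channel counts follow the text; the input width is an assumption."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            enc(3, 64), enc(64, 64), enc(64, 128),       # hidden layers 1-3
            enc(128, 256), enc(256, 512),                # hidden layers 4-5
            nn.Conv2d(512, 4000, 4, 1, 0),               # feature layer: 4x4 -> 1x1
            nn.BatchNorm2d(4000), nn.Tanh())
        self.decoder = nn.Sequential(
            dec(4000, 512, s=1, p=0),                    # hidden 6: 1x1 -> 4x4
            dec(512, 256), dec(256, 128),                # hidden 7-8
            dec(128, 64), dec(64, 64),                   # hidden 9-10
            nn.ConvTranspose2d(64, 3, 4, 2, 1),          # output layer -> 128x128
            nn.Tanh())

    def forward(self, x):
        return self.decoder(self.encoder(x))
```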
The invention further provides that the structure completion network N1 is constrained by three loss functions: 1) reconstruction loss, the Euclidean distance between the original defect-area edge structure map and the generated one, used to constrain the content of the generated structure; 2) competition loss, obtained by feeding the generated defect-area edge structure map into the discrimination network to measure the difference between the generated structure and the real structure data distribution, used to constrain the authenticity of the generated structure; 3) gradient consistency loss, the gradient-histogram difference over small regions just inside and outside the boundary of the defect area, used to constrain gradient consistency between the generated structure and the structure to be repaired.
The loss function in the embodiment is set as follows. Suppose the original defect-area edge structure map is $x_1$, the generated defect-area edge structure map is $x_2$, and $D$ is a discrimination network that judges whether an input edge structure map is a real one. The output-layer error $\mathcal{L}_{N_1}$ is defined as:

$$\mathcal{L}_{N_1} = \lambda_{rec}\,\mathcal{L}_{rec}(x_2, x_1) + \lambda_{a}\,\mathcal{L}_{a}(x_2) + \lambda_{g}\,\mathcal{L}_{g}(x_2, x_1)$$

where $\mathcal{L}_{rec}(x_2, x_1)$ represents the reconstruction loss between the generated edge structure map $x_2$ and the original edge structure map $x_1$, used to constrain the content of the generated structure; $\mathcal{L}_{a}(x_2)$ represents the competition loss between $x_2$ and $x_1$, used to constrain the authenticity of the generated structure; $\mathcal{L}_{g}(x_2, x_1)$ represents the histogram-of-gradient (HOG) difference between $x_2$ and $x_1$, defined as the gradient consistency loss and used to constrain the gradient consistency of the generated structure; and $\lambda_{rec}$, $\lambda_{a}$, $\lambda_{g}$ are the error weights of the reconstruction loss, competition loss and gradient consistency loss, respectively.
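To make the weighted objective concrete, here is a hedged PyTorch sketch. Only the three-term weighted-sum structure and the role of each term come from the text; the inner forms shown (mean squared error for reconstruction, a log(1 - D) generator term for competition, an L1 difference of HOG descriptors for gradient consistency) and the weight values are assumptions. D is the discrimination network and hog() a hypothetical differentiable HOG-like descriptor.

```python
import torch
import torch.nn.functional as F

def n1_loss(x2, x1, D, hog, lam_rec=1.0, lam_a=0.001, lam_g=0.01):
    """L = lam_rec * Lrec + lam_a * La + lam_g * Lg. The weights shown
    are placeholders, not values from the patent."""
    l_rec = F.mse_loss(x2, x1)                     # Euclidean reconstruction
    l_adv = torch.log(1.0 - D(x2) + 1e-8).mean()   # competition (GAN) term;
                                                   # D assumed to output (0, 1)
    l_grad = F.l1_loss(hog(x2), hog(x1))           # gradient-consistency term
    return lam_rec * l_rec + lam_a * l_adv + lam_g * l_grad
```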
The invention provides that the to-be-repaired edge structure maps are taken from the training data set in batches; to minimize the output-layer loss, the loss is back-propagated to update the parameters of each layer. When all to-be-repaired edge structure maps in the training data set have been used, one training pass is complete, and step 1.4 is entered for the iteration judgment.
Step 1.4: saving the structure completion network N1 model: when the preset iteration stop condition is met, the structure completion network N1 model is saved and training of N1 ends; otherwise, return to step 1.3 for the next training iteration. The structure completion network N1 constructed in step 1.3 is trained with the to-be-repaired edge structure maps extracted in step 1.2. Training may stop under either of two conditions: the output-layer error falls below a set threshold, or the number of training iterations reaches a fixed count. The trained N1 model is saved for repairing the edge structure of the image to be repaired in the image restoration stage.
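The minibatch procedure and the two stop conditions of step 1.4 fit a conventional training-loop shape; the sketch below assumes a PyTorch DataLoader yielding pairs of damaged and complete edge structure maps, and uses illustrative threshold and epoch values.

```python
import torch

def train_n1(n1, loader, optimizer, loss_fn, max_epochs=200, threshold=1e-3):
    """Minibatch training with the two stop conditions named above; all
    hyperparameter values are illustrative, not patent values."""
    for epoch in range(max_epochs):                # stop condition 2: max count
        epoch_err = 0.0
        for x_damaged, x_full in loader:           # batches of edge maps
            err = loss_fn(n1(x_damaged), x_full)
            optimizer.zero_grad()
            err.backward()                         # back-propagate the loss
            optimizer.step()                       # update each layer
            epoch_err += err.item()
        if epoch_err / len(loader) < threshold:    # stop condition 1: threshold
            break
    torch.save(n1.state_dict(), "n1.pth")          # save the N1 model
```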
Step 1.5: construction and training of the texture transformation network N2:
after the training is finished, in the image repairing part, the tested edge structure diagram of the image to be repaired and the corresponding image to be repaired are input into the texture transformation network N2 to reconstruct the image content of the defect area.
In the embodiment, the texture transformation network is a convolutional neural network in dual-encoder/single-decoder form, which combines low-level texture information with high-level structure information to generate the whole image; its structure is shown in fig. 2.
Step 1.5.1: in the texture transformation network N2, the dual encoder consists of an edge structure encoder and a to-be-repaired image encoder, whose structural units include, but are not limited to, convolution operations, batch normalization operations and nonlinear mappings. The two encoders do not share network parameters: the edge structure encoder extracts the high-level semantic features of the training-image edge structure map, while each layer of the to-be-repaired image encoder extracts the low-level texture features of the corresponding image to be repaired.
Preferably, the structure and parameters of the dual-encoder part of the texture transformation network N2 are given in the following table. In the last column, the "output size" is the data size output by that layer, expressed as a triple whose first number is the number of feature maps and whose second and third numbers are the feature map height and width. In specific implementation, the image size and the network model structure can be preset by the user.

[Table shown as an image in the original; its contents are spelled out layer by layer below.]
The specific settings of the network are as follows:
the edge structure encoder comprises 7 layers, wherein the first layer is an input layer, the second layer to the sixth layer are encoder network hidden layers, the seventh layer is a characteristic layer, and each layer has the following structure:
a first layer: and (3) an input layer 1, namely combining the edge structure diagram of the image to be repaired extracted in the step 1.2 with the edge structure diagram of the defect area as input, namely using the edge structure diagram of the original training image as the input of the layer.
A second layer: hidden layer 1_1, which includes a convolutional layer with convolutional kernel number of 64, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
And a third layer: hidden layer 1_2, which includes a convolutional layer with convolutional kernel number of 64, kernel size 4 × 4, step 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A fourth layer: hidden layer 1_3, which includes a convolution layer with convolution kernel number 128, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
And a fifth layer: hidden layer 1_4, which includes a convolution layer with convolution kernel number 256, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A sixth layer: hidden layer 1_5, which includes a convolution layer with convolution kernel number 512, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A seventh layer: the feature layer 1 includes a convolution layer with a convolution kernel number of 4000, a kernel size of 4 × 4, a step size of 1, and an edge fill of 0, a batch normalization layer, and a Tanh activation function layer, and outputs feature vectors extracted from the edge structure diagram.
The image encoder to be repaired comprises 7 layers, wherein the first layer is an input layer, the second layer to the sixth layer are encoder network hidden layers, the seventh layer is a characteristic layer, and the structures of the layers are as follows:
a first layer: and inputting the image to be repaired preprocessed in the step 1.1 into the layer 2.
A second layer: hidden layer 2_1, which includes a convolutional layer with convolutional kernel number of 64, kernel size of 4 × 4, step size of 2 and edge padding of 1, a batch normalization layer and a ReLu activation function layer.
And a third layer: hidden layer 2_2, which includes a convolution layer with convolution kernel number of 64, kernel size 4 × 4, step size 2 and edge padding 1, a batch normalization layer and a ReLu activation function layer.
A fourth layer: hidden layer 2_3, which includes a convolution layer with convolution kernel number 128, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
And a fifth layer: hidden layer 2_4, which includes a convolution layer with convolution kernel number 256, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A sixth layer: hidden layer 2_5, which includes a convolution layer with convolution kernel number 512, kernel size 4 × 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A seventh layer: the feature layer 2 includes convolution layers with a convolution kernel number of 4000, a kernel size of 4 × 4, a step size of 1, and an edge fill of 0, a batch normalization layer, and a Tanh activation function layer, and outputs feature vectors extracted from the image to be restored.
Step 1.5.2: the single decoder is a conditional edge feature decoder whose structural units include, but are not limited to, transposed convolution operations, batch normalization operations and nonlinear mappings. The feature vector extracted from the training-image edge structure map and the feature vector extracted from the image to be repaired in step 1.5.1 are concatenated as the input of the decoder network; the input of each subsequent decoder layer is obtained by concatenating the edge-structure decoding feature map from the previous layer with the to-be-repaired image encoding feature map of the corresponding layer (see the sketch after the layer list below). The decoder finally outputs the generated texture-converted image, completing the texture conversion from edge structure map to natural image.
Preferably, the structure and parameters of the single-decoder part of the texture transformation network N2 are given in the following table. In the last column, the "output size" is the data size output by that layer, expressed as a triple whose first number is the number of feature maps and whose second and third numbers are the feature map height and width. In specific implementation, the image size and the network model structure can be preset by the user.

[Table shown as an image in the original; its contents are spelled out layer by layer below.]
The specific settings of the network are as follows:
the conditional edge feature decoder comprises 7 layers, wherein the first layer is an input layer, the second layer to the sixth layer are hidden layers, the seventh layer is an output layer, and the structures of the layers are as follows:
a first layer: an input layer, connecting the characteristic vector extracted from the training image edge structure chart in the step 1.5.1 and the characteristic vector extracted from the image to be repaired as the input of the layer
A second layer: hidden layers including a convolutional layer with a convolutional kernel number of 512, a kernel size of 4 x 4, step 1 and edge fill of 0, a batch normalization layer and a ReLu activation function layer.
And a third layer: hidden layers, including a convolution layer with a convolution kernel number of 256, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
A fourth layer: hidden layers, including a convolution layer with a convolution kernel number of 128, kernel size 4 x 4, step size 2 and edge fill 1, a batch normalization layer and a ReLu activation function layer.
And a fifth layer: hidden layers, including a convolutional layer with a convolutional kernel number of 64, kernel size 4 x 4, step size 2, and edge fill 1, a batch normalization layer, and a ReLu activation function layer.
A sixth layer: hidden layers, including a convolutional layer with a convolutional kernel number of 64, kernel size 4 x 4, step size 2, and edge fill 1, a batch normalization layer, and a ReLu activation function layer.
A seventh layer: and the output layer comprises convolution layers with convolution kernel number of 3, kernel size of 4 multiplied by 4, step length of 2 and edge filling of 1, a batch normalization layer and a Tanh activation function layer, and outputs the generated image after texture transformation.
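Putting the two encoder tables and the decoder specification together, the dual-encoder/single-decoder wiring of step 1.5.2 can be sketched in PyTorch as follows, using the same conv/transposed-conv helpers as in the N1 sketch (redefined here for completeness). The channel widths follow the stated kernel counts; the skip-connection arithmetic, with concatenation doubling each decoder stage's input width, is our reading of step 1.5.2 and an assumption rather than patent text.

```python
import torch
import torch.nn as nn

def enc(ci, co, k=4, s=2, p=1):   # conv + batch norm + ReLU
    return nn.Sequential(nn.Conv2d(ci, co, k, s, p),
                         nn.BatchNorm2d(co), nn.ReLU(True))

def dec(ci, co, k=4, s=2, p=1):   # transposed conv + batch norm + ReLU
    return nn.Sequential(nn.ConvTranspose2d(ci, co, k, s, p),
                         nn.BatchNorm2d(co), nn.ReLU(True))

class TextureTransformNet(nn.Module):
    """Sketch of N2: two non-shared encoders plus a conditional decoder in
    which each stage receives the previous decoded feature map concatenated
    with the image-encoder feature map of matching size."""
    def __init__(self):
        super().__init__()
        widths = [64, 64, 128, 256, 512]
        self.edge_enc, self.img_enc = nn.ModuleList(), nn.ModuleList()
        ci = 3
        for co in widths:                        # hidden layers x_1 .. x_5
            self.edge_enc.append(enc(ci, co))
            self.img_enc.append(enc(ci, co))
            ci = co
        make_feat = lambda: nn.Sequential(nn.Conv2d(512, 4000, 4, 1, 0),
                                          nn.BatchNorm2d(4000), nn.Tanh())
        self.edge_feat, self.img_feat = make_feat(), make_feat()
        self.dec_stages = nn.ModuleList([
            dec(8000, 512, s=1, p=0),            # concat feats: 1x1 -> 4x4
            dec(512 + 512, 256),                 # + img feats 512@4  -> 8x8
            dec(256 + 256, 128),                 # + img feats 256@8  -> 16x16
            dec(128 + 128, 64),                  # + img feats 128@16 -> 32x32
            dec(64 + 64, 64)])                   # + img feats 64@32  -> 64x64
        self.out = nn.Sequential(
            nn.ConvTranspose2d(64 + 64, 3, 4, 2, 1),   # -> 3 x 128 x 128
            nn.BatchNorm2d(3), nn.Tanh())

    def forward(self, edges, image):
        skips, e, i = [], edges, image
        for fe, fi in zip(self.edge_enc, self.img_enc):
            e, i = fe(e), fi(i)
            skips.append(i)                      # image features 64@64 .. 512@4
        x = torch.cat([self.edge_feat(e), self.img_feat(i)], 1)   # 8000 @ 1x1
        x = self.dec_stages[0](x)
        for stage, skip in zip(self.dec_stages[1:], skips[::-1]):
            x = stage(torch.cat([x, skip], 1))   # concat matching-size skip
        return self.out(torch.cat([x, skips[0]], 1))
```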
The invention further provides that the texture transformation network N2 is constrained by two loss functions: 1) reconstruction loss, the Euclidean distance between the original natural image and the generated texture-converted image, used to constrain the content of the generated image; 2) competition loss, obtained by feeding the generated texture-converted image into a discrimination network to measure the difference between the generated-image and real-image data distributions, used to constrain the realism of the generated image.
The loss function in the embodiment is set as follows. Suppose the original natural image is $I_1$, the generated image is $I_2$, and $D$ is a discrimination network that judges whether an input image is a natural image. The output-layer error $\mathcal{L}_{N_2}$ is defined as:

$$\mathcal{L}_{N_2} = \lambda_{rec}\,\mathcal{L}_{rec}(I_2, I_1) + \lambda_{a}\,\mathcal{L}_{a}(I_2)$$

where $\mathcal{L}_{rec}(I_2, I_1)$ represents the generation loss between the generated image $I_2$ and the original natural image $I_1$, used to constrain the content of the generated image; $\mathcal{L}_{a}(I_2)$ represents the competition loss between $I_2$ and $I_1$, used to constrain the realism of the generated image; and $\lambda_{rec}$ and $\lambda_{a}$ represent the error weights of the generation loss and the competition loss, respectively.
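As with N1, the N2 objective can be sketched as a weighted sum. Only the two-term structure and the role of each term come from the text; the inner loss forms and the weight values are assumptions.

```python
import torch
import torch.nn.functional as F

def n2_loss(I2, I1, D, lam_rec=1.0, lam_a=0.001):
    """L = lam_rec * Lrec + lam_a * La; weights are placeholders."""
    l_rec = F.mse_loss(I2, I1)                     # Euclidean generation loss
    l_adv = torch.log(1.0 - D(I2) + 1e-8).mean()   # competition loss via D
    return lam_rec * l_rec + lam_a * l_adv
```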
The invention provides that training-image edge structure maps and the corresponding images to be repaired are taken from the training data set in batches; to minimize the output-layer loss, the loss is back-propagated to update the parameters of each layer. When all training-image edge structure maps and images to be repaired in the training data set have been used, one training pass is complete, and step 1.6 is entered for the iteration judgment.
Step 1.6: saving the texture transformation network N2 model: when the preset iteration stop condition is met, the texture transformation network N2 model is saved and training of N2 ends; otherwise, return to step 1.5 for the next training iteration. The texture transformation network N2 constructed in step 1.5 is trained with the training-image edge structure maps extracted in step 1.2 and the images to be repaired preprocessed in step 1.1. Training may stop under either of two conditions: the output-layer error falls below a set threshold, or the number of training iterations reaches a fixed count. The trained N2 model is saved for repairing the image content of the image to be repaired in the image restoration stage.
The image restoration part comprises the following sub-steps:
step 2.1: inputting a tested image to be repaired and preprocessing the image, wherein the preprocessing comprises the steps of carrying out scale transformation and cutting on the image to be repaired according to a specific defect area needing to be restored, so that the image to be repaired meets the size of 128 multiplied by 128 required by a repair network, namely the image to be repaired corresponds to the input size of a network model training part.
Step 2.2: extracting the edge structure map, i.e. extracting the edge structure map from the image to be repaired preprocessed in step 2.1 using an existing edge detection method.
Step 2.3: generating the edge structure of the defect area using the structure completion network N1: the edge structure map of the image to be repaired extracted in step 2.2 is input into the structure completion network N1 to generate the defect-area edge structure map, which is then merged with the edge structure map of the image to be repaired to produce the repaired image edge structure map.
Step 2.4: generating the image content of the defect area using the texture transformation network N2: the repaired image edge structure map obtained in step 2.3 and the image to be repaired preprocessed in step 2.1 are input into the texture transformation network N2 to generate the image content of the defect area.
Step 2.5: outputting the repaired image: the generated defect-area image content is filled into the image to be repaired, and the result is the repaired image.
In specific implementation, the above processes can be run automatically using computer software.
The invention addresses the problem that, when a large-area defect occurs in a natural image, the prior art lacks sufficient prior information to restore structure and texture well. The proposed image restoration method based on structure and texture layered prediction decomposes the complex image restoration problem into the simpler problems of layered structure and texture prediction; on this basis it restores the structure of objects in the image well and effectively constrains the region in which texture is generated. It has the following key points:
1) Structural information is introduced as important prior information for image restoration, and a structure completion network framework with a gradient consistency loss function is designed. The technical effect is that the to-be-repaired edge structure map can be input to generate the edge structure of the defect area.
2) The image restoration problem is converted into a texture conversion problem based on the edge structure map and the existing texture information, and a texture conversion network framework is designed that integrates high-level structural features with low-level texture features well. The technical effect is that the complete edge structure map and the image to be repaired can be input, and texture filling of the defect area is completed by generating the whole image.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clearness of understanding, and no unnecessary limitations should be inferred from it; those skilled in the art may make modifications and alterations without departing from the scope of the invention as defined by the appended claims.

Claims (3)

1. An image restoration method based on structure and texture layered prediction, characterized in that it comprises a network model training part and an image restoration part,
the network model training part comprising the following substeps:
step 1.1, training data set preprocessing, comprising randomly masking images in a training image set to simulate image defects, wherein the masked area is the defect area and the remaining image serves as the image to be repaired, thereby obtaining a to-be-repaired image set and a corresponding defect-area image set, and preprocessing them;
step 1.2, extracting edge structure maps, comprising extracting the to-be-repaired edge structure maps and the defect-area edge structure maps from the to-be-repaired image set and the defect-area image set obtained after preprocessing in step 1.1;
step 1.3, construction and training of the structure completion network N1, comprising inputting the to-be-repaired edge structure map into the structure completion network N1 to reconstruct the edge structure of the defect area;
step 1.4, when the preset iteration stop condition is met, saving the structure completion network N1 model and proceeding to step 1.5; otherwise, returning to step 1.3 to continue the next training iteration;
step 1.5, construction and training of the texture transformation network N2, comprising inputting the defect-area edge structure map and the corresponding image to be repaired into the texture transformation network N2 to reconstruct the image content of the defect area;
step 1.6, when the preset iteration stop condition is met, saving the texture transformation network N2 model; otherwise, returning to step 1.5 to continue the next training iteration;
the image restoration part comprising the following substeps:
step 2.1, inputting a test image to be repaired and preprocessing it;
step 2.2, extracting the edge structure map, comprising extracting the edge structure map from the image to be repaired preprocessed in step 2.1;
step 2.3, generating the edge structure of the defect area using the structure completion network N1, comprising inputting the edge structure map of the image to be repaired extracted in step 2.2 into the structure completion network N1 to generate a defect-area edge structure map, and merging the edge structure map of the image to be repaired with the defect-area edge structure map to obtain the repaired image edge structure map;
step 2.4, generating the image content of the defect area using the texture transformation network N2, comprising inputting the repaired image edge structure map obtained in step 2.3 and the image to be repaired preprocessed in step 2.1 into the texture transformation network N2 to generate the image content of the defect area;
step 2.5, outputting the repaired image, wherein the generated defect-area image content is filled into the image to be repaired and the result is the repaired image.
2. The image restoration method based on structure and texture layered prediction as claimed in claim 1, characterized in that: in step 1.3, when training the structure completion network N1, the parameters of the structure completion network N1 are optimized by minimizing the output-layer error, defined as follows:

let the original defect-area edge structure map be $x_1$, the generated defect-area edge structure map be $x_2$, and let $D$ be a discrimination network that judges whether an input edge structure map is a real one; the output-layer error $\mathcal{L}_{N_1}$ is defined as

$$\mathcal{L}_{N_1} = \lambda_{rec}\,\mathcal{L}_{rec}(x_2, x_1) + \lambda_{a}\,\mathcal{L}_{a}(x_2) + \lambda_{g}\,\mathcal{L}_{g}(x_2, x_1)$$

wherein $\mathcal{L}_{rec}(x_2, x_1)$ represents the reconstruction loss between the generated edge structure map $x_2$ and the original edge structure map $x_1$, for constraining the content of the generated structure; $\mathcal{L}_{a}(x_2)$ represents the competition loss between the generated edge structure map $x_2$ and the original edge structure map $x_1$, for constraining the authenticity of the generated structure; $\mathcal{L}_{g}(x_2, x_1)$ represents the histogram-of-gradient (HOG) difference between $x_2$ and $x_1$, defined as the gradient consistency loss, for constraining the gradient consistency of the generated structure; and $\lambda_{rec}$, $\lambda_{a}$ and $\lambda_{g}$ represent the error weights of the reconstruction loss, competition loss and gradient consistency loss, respectively.
3. The image restoration method based on structure and texture layered prediction as claimed in claim 1, characterized in that: in step 1.5, when training the texture transformation network N2, the parameters of the texture transformation network N2 are optimized by minimizing the output-layer error:

let the original natural image be $I_1$, the generated image be $I_2$, and let $D$ be a discrimination network that judges whether an input image is a natural image; the output-layer error $\mathcal{L}_{N_2}$ is defined as

$$\mathcal{L}_{N_2} = \lambda_{rec}\,\mathcal{L}_{rec}(I_2, I_1) + \lambda_{a}\,\mathcal{L}_{a}(I_2)$$

wherein $\mathcal{L}_{rec}(I_2, I_1)$ represents the generation loss between the generated image $I_2$ and the original natural image $I_1$, for constraining the content of the generated image; $\mathcal{L}_{a}(I_2)$ represents the competition loss between $I_2$ and $I_1$, for constraining the realism of the generated image; and $\lambda_{rec}$ and $\lambda_{a}$ represent the error weights of the generation loss and the competition loss, respectively.
Application CN201810316116.4A (priority and filing date 2018-04-10): Image restoration method based on structure and texture layered prediction. Status: Active. Granted publication: CN108460746B (en).

Priority Applications (1)

Application Number: CN201810316116.4A; Priority Date: 2018-04-10; Filing Date: 2018-04-10; Title: Image restoration method based on structure and texture layered prediction

Publications (2)

Publication Number: CN108460746A; Publication Date: 2018-08-28
Publication Number: CN108460746B (grant); Publication Date: 2021-09-03

Family

Family ID: 63234441

Family Applications (1)

Application Number: CN201810316116.4A (granted); Priority Date: 2018-04-10; Filing Date: 2018-04-10; Title: Image restoration method based on structure and texture layered prediction

Country Status (1)

Country: CN; Document: CN108460746B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108765315B (en) * 2018-05-04 2021-09-07 Oppo广东移动通信有限公司 Image completion method and device, computer equipment and storage medium
CN109446870A (en) * 2018-09-07 2019-03-08 佛山市顺德区中山大学研究院 A kind of QR code view finding graphic defects restoration methods based on CNN
CN109584178A (en) * 2018-11-29 2019-04-05 腾讯科技(深圳)有限公司 Image repair method, device and storage medium
CN109741268B (en) * 2018-12-05 2023-05-09 天津大学 Damaged image complement method for wall painting
CN109712098B (en) * 2019-01-08 2022-07-08 重庆邮电大学 Image restoration method based on statistical reasoning
CN109961407B (en) * 2019-02-12 2021-01-26 北京交通大学 Face image restoration method based on face similarity
CN111597846A (en) * 2019-02-20 2020-08-28 中科院微电子研究所昆山分所 Fold two-dimensional code recovery method, device and equipment and readable storage medium
CN110675339A (en) * 2019-09-16 2020-01-10 山东师范大学 Image restoration method and system based on edge restoration and content restoration
CN110717550A (en) * 2019-10-18 2020-01-21 山东大学 Multi-modal image missing completion based classification method
CN111047522B (en) * 2019-11-07 2023-04-07 北京科技大学 Image restoration method based on edge generation
CN111161158B (en) * 2019-12-03 2022-08-26 河海大学 Image restoration method based on generated network structure
CN111223049B (en) * 2020-01-07 2021-10-22 武汉大学 Remote sensing image variation fusion method based on structure-texture decomposition
CN111242874B (en) * 2020-02-11 2023-08-29 北京百度网讯科技有限公司 Image restoration method, device, electronic equipment and storage medium
CN111669587B (en) * 2020-04-17 2021-07-20 北京大学 Mimic compression method and device of video image, storage medium and terminal
CN113674176B (en) * 2021-08-23 2024-04-16 北京市商汤科技开发有限公司 Image restoration method and device, electronic equipment and storage medium
CN114463209B (en) * 2022-01-25 2022-12-16 广州大学 Image restoration method based on deep multi-feature collaborative learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07222148A (en) * 1994-01-31 1995-08-18 Oki Electric Ind Co Ltd Picture restoration device
CN101571950A (en) * 2009-03-25 2009-11-04 湖南大学 Image restoring method based on isotropic diffusion and sparse representation
CN103700093A (en) * 2013-12-03 2014-04-02 杭州电子科技大学 Criminisi image restoration method based on textures and edge features

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8064712B2 (en) * 2007-01-24 2011-11-22 Utc Fire & Security Americas Corporation, Inc. System and method for reconstructing restored facial images from video
US9576347B2 (en) * 2015-01-26 2017-02-21 Sony Corporation Structure analysis method for recovering missing structures in an image after object removal


Also Published As

Publication Number: CN108460746A (en); Publication Date: 2018-08-28

Similar Documents

Publication Publication Date Title
CN108460746B (en) Image restoration method based on structure and texture layered prediction
CN113240613B (en) Image restoration method based on edge information reconstruction
CN112288647B (en) Remote sensing image cloud and shadow restoration method based on gating convolution
CN108230278B (en) Image raindrop removing method based on generation countermeasure network
CN111539887B (en) Channel attention mechanism and layered learning neural network image defogging method based on mixed convolution
CN109087375B (en) Deep learning-based image cavity filling method
CN108537753B (en) Image restoration method based on context feature space constraint
WO2023000159A1 (en) Semi-supervised classification method, apparatus and device for high-resolution remote sensing image, and medium
CN110349087B (en) RGB-D image high-quality grid generation method based on adaptive convolution
CN111787187B (en) Method, system and terminal for repairing video by utilizing deep convolutional neural network
CN112184585B (en) Image completion method and system based on semantic edge fusion
CN112381716B (en) Image enhancement method based on generation type countermeasure network
CN112801914A (en) Two-stage image restoration method based on texture structure perception
CN115829880A (en) Image restoration method based on context structure attention pyramid network
CN116645598A (en) Remote sensing image semantic segmentation method based on channel attention feature fusion
CN112686822B (en) Image completion method based on stack generation countermeasure network
CN114155171A (en) Image restoration method and system based on intensive multi-scale fusion
Yu et al. MagConv: Mask-guided convolution for image inpainting
CN116051407A (en) Image restoration method
CN116402702A (en) Old photo restoration method and system based on deep neural network
CN113554655B (en) Optical remote sensing image segmentation method and device based on multi-feature enhancement
CN115187775A (en) Semantic segmentation method and device for remote sensing image
CN114331894A (en) Face image restoration method based on potential feature reconstruction and mask perception
CN116958317A (en) Image restoration method and system combining edge information and appearance stream operation
CN114494387A (en) Data set network generation model and fog map generation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant