CN112381733B - Image recovery-oriented multi-scale neural network structure searching method and network application - Google Patents


Info

Publication number
CN112381733B
Authority
CN
China
Prior art keywords
image
layer
neural network
module
scale neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011269424.XA
Other languages
Chinese (zh)
Other versions
CN112381733A (en)
Inventor
彭玺 (Xi Peng)
缑元彪 (Yuanbiao Gou)
李伯运 (Boyun Li)
Current Assignee
Sichuan University
Original Assignee
Sichuan University
Priority date
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202011269424.XA
Publication of CN112381733A
Application granted
Publication of CN112381733B

Classifications

    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T5/70 Denoising; Smoothing
    • G06T5/73 Deblurring; Sharpening
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]


Abstract

The invention discloses a multi-scale neural network structure searching method oriented to image recovery, and an application of the resulting network. The searching method comprises: determining the number S of cells in the multi-scale neural network and the number of columns of each cell, wherein each column of a cell contains only a parallel module or a fusion module; constructing a super network to be searched on a multi-scale search space, the super network consisting of, connected in sequence, two parallel modules, S pairs of combination modules (each formed by serially connecting a transition module and a cell), 1 fusion module, and 1 convolution layer; and determining the finally selected module for each column of each cell in the super network with a search strategy based on gradient optimization, thereby obtaining the multi-scale neural network.

Description

Image recovery-oriented multi-scale neural network structure searching method and network application
Technical Field
The invention relates to neural network structure methods, and in particular to a multi-scale neural network structure searching method oriented to image recovery, and an application of the resulting network.
Background
Owing to complex acquisition environments and limitations of acquisition equipment, digital images are often degraded by various kinds of noise; image recovery aims to restore clean, clear images from such degraded images.
At present, such networks are commonly obtained by neural architecture search (NAS), an automated architecture-engineering technique that aims to find an ideal neural network architecture in a search space automatically, using search strategies such as evolutionary algorithms, reinforcement learning, and gradient optimization.
E-CAE encodes a network structure as a genotype and uses an evolutionary algorithm to initialize a population of network structures and iteratively apply crossover, mutation, and selection, thereby searching, on a standard convolutional autoencoder architecture, for the number and size of the convolution kernels of each convolution layer and for whether skip connections are used. HiNAS adopts a gradient-optimization-based method with a hierarchical strategy: using cells as the search unit, it constructs the super network to be searched through continuous relaxation of the structure representation and a multi-candidate-path strategy, thereby searching the internal structure of the cells and the number of outer convolution kernels simultaneously.
Although network structure search techniques such as E-CAE and HiNAS achieve excellent performance, they often face a combinatorial explosion of candidate structures and require large amounts of computing resources. For example, E-CAE needs 384 GPU-hours on a Tesla P100 GPU, where a GPU-hour is one hour of computation on a single GPU.
To alleviate this cost, recent work has shifted to gradient-based approaches, yet even these remain expensive: HiNAS, for example, still needs 16.5 GPU-hours on a Tesla V100 GPU. In addition, most existing network structure search techniques were originally designed for classification tasks and do not take the characteristics of image restoration tasks into consideration.
Disclosure of Invention
Aiming at the above defects in the prior art, the image-recovery-oriented multi-scale neural network structure searching method and network application provided herein solve the problem that existing network structure searching methods do not take the image recovery task into account.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
in a first aspect, a multi-scale neural network structure search method facing image recovery is provided, which includes:
s1, determining the number S of cells in the multi-scale neural network and the number of columns of each cell, wherein each column of the cells only has a parallel module or a fusion module;
s2, constructing a super network to be searched on a multi-scale search space, wherein the super network comprises two parallel modules, S pairs of combination modules each formed by serially connecting a transition module and a cell, 1 fusion module, and 1 convolution layer, connected in sequence;
s3, determining a finally selected module of each cell column in the super network based on a search strategy of gradient optimization to obtain a multi-scale neural network; the method of searching for each column of finally selected modules for each cell includes:
s31, initializing the current iteration number q to 1, and initializing the number of iterations e, the structure update threshold t, the super-network weight parameters θ, and the super-network structure parameters {a_p, a_f};
S32, acquiring a data set consisting of degraded images and corresponding clean images, and dividing the data set into a training set and a verification set, wherein the degraded images and the corresponding clean images serve as the network input and the target output, respectively;
s33, judging whether the current iteration number q is larger than a structure updating threshold value t, if so, entering a step S34, and otherwise, entering a step S35;
s34, based on the loss function, updating the structure parameters {a_p, a_f} of the super network with the data in the verification set, and then proceeding to step S35;
s35, updating the weight parameter theta of the super network by using the data in the training set based on the loss function, and then entering the step S36;
s36, determining whether the current iteration number q is greater than the iteration number e, if so, entering step S37, otherwise, making q equal to q +1, and then returning to step S33;
s37, judging, for each column in each cell, whether the structure parameter a_p of the parallel module is greater than the structure parameter a_f of the fusion module; if so, selecting the parallel module corresponding to that column, otherwise selecting the fusion module corresponding to that column.
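The alternating schedule of steps S31–S37 can be sketched in a few lines. The gradient updates on the verification and training sets are replaced here by deterministic toy increments, and every name is illustrative rather than taken from the patent:

```python
def search_schedule(e=200, t=50):
    """Toy walk-through of steps S31-S37: the weight parameters are updated
    every iteration, the structure parameters only once q exceeds t."""
    theta = 0.0                # super-network weight parameters (toy scalar)
    a_p, a_f = 0.0, 0.0        # structure parameters of one cell column
    structure_updates = 0

    for q in range(1, e + 1):  # S33-S36: iterate until q > e
        if q > t:              # S33: warm up the weights before the structure
            a_p += 0.10        # S34: toy "verification-set" structure step
            a_f += 0.05
            structure_updates += 1
        theta += 0.01          # S35: toy "training-set" weight step

    # S37: per column, keep the module whose structure parameter is larger
    chosen = "parallel" if a_p > a_f else "fusion"
    return structure_updates, chosen

print(search_schedule())  # (150, 'parallel')
```

With e = 200 and t = 50, the structure parameters are touched only in the last 150 iterations, mirroring the warm-up behaviour described above.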
In a second aspect, an application of the multi-scale neural network is provided, wherein the multi-scale neural network is obtained by the image-recovery-oriented multi-scale neural network structure searching method, the application comprising:
training the multi-scale neural network, and then inputting the degraded image into the trained multi-scale neural network to obtain a clean image; the method for training the multi-scale neural network comprises the following steps:
acquiring a data set consisting of a degraded image and a clean image, taking the degraded image as the input of a network structure, and taking the corresponding clean image as the target output of the network structure;
updating the weight parameters of the multi-scale neural network with the data in the training set based on the loss function;
and when the weight parameters of the multi-scale neural network are converged, obtaining the trained multi-scale neural network.
The invention has the following beneficial effects. According to the technical scheme, a super network to be searched is constructed in a multi-scale search space, and a search strategy based on gradient optimization is used to search the network structure directly on a specific image recovery task.
The multi-scale search space of this scheme is composed of modularized operations rather than basic operators; that is, the search is over network modules rather than over the basic operations inside each module, which greatly reduces the space to be searched, shortens the search time, and improves search efficiency.
When determining the module for each column of each cell, introducing the structure update threshold gives the network weight parameters a better initial value before the structure search begins, which prevents the structure parameters and weight parameters from falling into a locally optimal region together during alternating optimization and thus from yielding a suboptimal network structure.
Drawings
Fig. 1 is a flowchart of a multi-scale neural network structure search method for image restoration.
FIG. 2 is a flow chart of a method of searching for a module finally selected for each column of each cell.
FIG. 3 is an example of the search space of this scheme: (a) schematic structures of the parallel module, the transition module, and the fusion module; (b) the structure of a cell, each column of which contains a parallel module or a fusion module; (c) the schematic structure of the super network.
Detailed Description
The following description of the embodiments of the invention is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments; for those of ordinary skill in the art, all inventions and creations that make use of the inventive concept without departing from the spirit and scope of the invention as defined by the appended claims fall within its protection.
Referring to fig. 1, fig. 1 shows a flowchart of the multi-scale neural network structure searching method for image recovery; as shown in fig. 1, the method includes steps S1 to S3.
In step S1, the number S of cells in the multi-scale neural network and the number of columns per cell are determined, each column of cells having only parallel modules or fusion modules.
In step S2, a super network to be searched is constructed on a multi-scale search space, where the super network consists of two parallel modules, S pairs of combination modules each formed by serially connecting a transition module and a cell, 1 fusion module, and 1 convolution layer, connected in sequence.
In one embodiment of the invention, the parallel module keeps the resolution of the features on each parallel path unchanged, and comprises at least one parallel convolution block, the number of convolution blocks being equal to the number of resolutions of the images input to the neural network;
the transition module adds a new low-resolution parallel path while keeping the features of the existing parallel paths unchanged, and comprises null-operation blocks and a downsampling block, the number of null-operation blocks being equal to the number of resolutions input to the transition module, with the lowest resolution downsampled by the downsampling block;
the fusion module fuses the features on each parallel path into every path by downsampling, upsampling, and/or null operations on the input resolutions, and comprises null-operation blocks, upsampling blocks, and downsampling blocks.
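As an illustration of the fusion module's role, the sketch below resizes the features of every parallel path to each path's resolution and sums them. The 2×2 average-pooling downsampler and nearest-neighbour upsampler are assumed stand-ins; in the patent these operations are the learned downsampling and upsampling blocks:

```python
import numpy as np

def down2x(x):
    """Halve spatial resolution by 2x2 average pooling (stand-in for a
    learned downsampling block)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up2x(x):
    """Double spatial resolution by nearest-neighbour repetition (stand-in
    for a learned upsampling block)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fuse(paths):
    """Fusion-module sketch: output path j is the sum of all input paths,
    each resized (down/up/null operation) to path j's resolution."""
    out = []
    for j, target in enumerate(paths):
        acc = np.zeros_like(target)
        for i, feat in enumerate(paths):
            f = feat
            for _ in range(max(j - i, 0)):   # higher-resolution path: downsample
                f = down2x(f)
            for _ in range(max(i - j, 0)):   # lower-resolution path: upsample
                f = up2x(f)
            acc += f                          # i == j: null operation
        out.append(acc)
    return out

# three parallel paths at resolutions 16, 8, 4 (single-channel for brevity)
paths = [np.ones((16, 16)), np.ones((8, 8)), np.ones((4, 4))]
fused = fuse(paths)
print([f.shape for f in fused])
```

Each output path keeps its own resolution while receiving information from all three branches, which is exactly the cross-scale exchange the fusion module provides.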
Taking fig. 3 as an example of the network structure, consider the case of three parallel branches in (a): first, a transition module in the super network converts 2 parallel branches into 3 parallel branches; then a parallel module and a fusion module, each containing 3 parallel branches, are combined into a cell, see (b) in fig. 3; finally, the cell is connected after the transition module, see (c) in fig. 3. Through this process, the part of the super network containing 3 parallel branches is constructed.
In implementation, this scheme preferably makes the 1st layer of the downsampling block a convolution layer and the 2nd layer a batch normalization layer, then repeats the structure of layers 1–2 a total of R1 = log2(S_res / T_res) times, so that layer 2R1+1 is a linear rectification function (ReLU) layer, where S_res is the input resolution of the downsampling block and T_res is the target resolution of the output;
the 1st layer of the convolution block is a convolution layer, the 2nd layer a batch normalization layer, and the 3rd layer a linear rectification function layer; the 4th layer is a convolution layer, the 5th layer a batch normalization layer, and the 6th layer a linear rectification function layer;
the 1st layer of the upsampling block is a convolution layer, the 2nd layer a batch normalization layer, and the 3rd layer a linear rectification function layer, with the structure of layers 1–3 repeated a total of R2 = log2(T_res / S'_res) times, where S'_res is the input resolution of the upsampling block and T_res is the target resolution of the output.
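A small helper makes the layer counts concrete. It assumes one repetition per 2× resolution change, i.e. R1 = log2(S_res/T_res) and R2 = log2(T_res/S'_res), which is an inference from the "2R1+1" layer index in the text rather than a formula stated explicitly:

```python
import math

def downsample_block_layers(s_res, t_res):
    """Layer list of a downsampling block: (conv, bn) repeated
    R1 = log2(s_res / t_res) times, then one ReLU as layer 2*R1 + 1."""
    r1 = int(math.log2(s_res // t_res))
    return ["conv", "bn"] * r1 + ["relu"]

def upsample_block_layers(s_res, t_res):
    """Layer list of an upsampling block: (conv, bn, relu) repeated
    R2 = log2(t_res / s_res) times."""
    r2 = int(math.log2(t_res // s_res))
    return ["conv", "bn", "relu"] * r2

print(downsample_block_layers(64, 16))  # R1 = 2 -> 5 layers
print(upsample_block_layers(16, 64))    # R2 = 2 -> 6 layers
```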
In step S3, determining a finally selected module for each column of each cell in the super network based on a search strategy of gradient optimization to obtain a multi-scale neural network;
wherein the method of searching for each column of finally selected modules for each cell comprises:
s31, initializing the current iteration number q to 1, and initializing the number of iterations e, the structure update threshold t, the super-network weight parameters θ, and the super-network structure parameters {a_p, a_f};
S32, acquiring a data set consisting of degraded images and corresponding clean images, and dividing the data set into a training set and a verification set, wherein the degraded images and the corresponding clean images serve as the network input and the target output, respectively;
In implementation, the image recovery task preferably comprises image rain removal, image denoising, and image defogging;
when the image recovery task is image rain removal, the degraded image is a shot natural image containing rain, and the clean image is a shot natural image without rain in a corresponding scene;
when the image recovery task is image denoising, the degraded image is a shot natural image containing noise, and the clean image is a shot natural image without noise in a corresponding scene;
when the image recovery task is image defogging, the degraded image is a shot foggy natural image, and the clean image is a shot fogless natural image in a corresponding scene.
The multi-scale neural network obtained by this search can be applied to any target object in nature: as long as noise, rain, or fog is present in the image of the target, the method can be used to restore it, i.e., to remove the noise, rain, or fog from the image.
S33, judging whether the current iteration number q is larger than a structure updating threshold value t, if so, entering a step S34, and otherwise, entering a step S35;
s34, based on the loss function, updating the structure parameters {a_p, a_f} of the super network with the data in the verification set, and then entering step S35;
s35, updating the weight parameter theta of the super network by using the data in the training set based on the loss function, and then entering the step S36;
In further detail, step S34 updates the network structure parameters with the data in the verification set using an adaptive moment estimation (Adam) optimizer based on the loss function; step S35 updates the network weight parameters with the data in the training set using a stochastic gradient descent optimizer with a cosine annealing strategy based on the loss function.
By comparing the iteration number with the structure update threshold t to decide which network parameters to update, this scheme gives the network weight parameters a better initial value before the structure search begins, avoiding the situation in which the structure parameters and weight parameters fall into a locally optimal region together during alternating optimization and a suboptimal network structure is found.
In implementation, the preferred loss function of this scheme is calculated as:

loss = L_rec + λ1 · L_reg + λ2 · L_com

wherein λ1 and λ2 are both weight coefficients; L_rec is the recovery loss between the real clean image x̄ and the clean image f(x) recovered by the network,

L_rec = (1/N) Σ_i (x̄_i − f(x)_i)²,

where i is the image pixel index and N is the number of image pixels; L_reg is the structural regularization loss, computed over the n structure parameters a in {a_p, a_f}; and L_com is a differentiable loss term used to measure the complexity of the model,

L_com = Σ_j (ā_p^j · C_p + ā_f^j · C_f),

where C_p and C_f are the complexities of the parallel module and the fusion module, respectively, ā_p^j and ā_f^j are the normalized structure parameters of column j, and the sum runs over all columns of all cells.
In the loss function, a_p denotes the set of the parallel-module structure parameters a_p^j over all columns of every cell, and a_f denotes the set of the fusion-module structure parameters a_f^j over all columns of every cell.
The complexity of the parallel and fusion modules may be predefined by module size, time cost, FLOPs, etc.; this scheme adopts the module size to measure model complexity in the loss term L_com. Through this term, the scheme can control the trade-off between model performance and model size, which is of high value in a wide range of application scenarios, including resource-constrained ones.
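A numerical sketch of the composite loss follows. The ℓ2 recovery loss, the softmax normalization of the structure parameters, and the binarization-encouraging regularizer are assumed stand-ins, since the exact component formulas appear only as images in the source:

```python
import numpy as np

def softmax2(a_p, a_f):
    """Normalize the two structure parameters of one cell column."""
    e = np.exp([a_p, a_f])
    return e / e.sum()

def composite_loss(clean, restored, alphas, c_p, c_f, lam1=0.1, lam2=0.0):
    """loss = L_rec + lam1 * L_reg + lam2 * L_com (sketch).

    alphas: (a_p, a_f) structure-parameter pairs, one per cell column.
    c_p, c_f: predefined complexities of the parallel / fusion modules.
    """
    # recovery loss between the real clean image and the restored image
    l_rec = np.mean((clean - restored) ** 2)

    # structural regularization (illustrative stand-in): pushes each
    # normalized parallel-module weight towards 0 or 1
    w_p = np.array([softmax2(a_p, a_f)[0] for a_p, a_f in alphas])
    l_reg = np.mean(w_p * (1.0 - w_p))

    # differentiable complexity term: structure-weighted module complexities
    l_com = sum(softmax2(a_p, a_f) @ np.array([c_p, c_f])
                for a_p, a_f in alphas)

    return l_rec + lam1 * l_reg + lam2 * l_com

clean = np.zeros((4, 4))
restored = np.full((4, 4), 0.1)      # uniform error of 0.1 -> L_rec = 0.01
alphas = [(0.0, 0.0), (2.0, -2.0)]   # two cell columns
print(composite_loss(clean, restored, alphas, c_p=1.0, c_f=2.0))
```

Setting lam2 > 0 makes columns that lean toward the heavier module pay a complexity penalty, which is the performance-versus-size trade-off discussed above.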
S36, determining whether the current iteration number q is greater than the iteration number e, if so, entering step S37, otherwise, making q equal to q +1, and then returning to step S33;
s37, judging, for each column in each cell, whether the structure parameter a_p of the parallel module is greater than the structure parameter a_f of the fusion module; if so, selecting the parallel module corresponding to that column, otherwise selecting the fusion module corresponding to that column.
In one embodiment of the invention, the output of each column of a cell within the super network is:

y_{j+1} = ā_p^j · O_p^j(y_j) + ā_f^j · O_f^j(y_j), with ā_p^j = e^{a_p^j} / (e^{a_p^j} + e^{a_f^j}) and ā_f^j = e^{a_f^j} / (e^{a_p^j} + e^{a_f^j}),

wherein y_{j+1} is the output of column j+1 of the cell; O_p^j(y_j) and O_f^j(y_j) are the outputs of the parallel module and the fusion module of the j-th column of the cell, respectively; and a_p^j and a_f^j are the structure parameters of the parallel module and the fusion module of the j-th column of the cell, respectively.
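The continuously relaxed column output — the two candidate modules weighted by their structure parameters — can be sketched as follows, with softmax normalization assumed:

```python
import numpy as np

def column_output(y, a_p, a_f, op_parallel, op_fusion):
    """Relaxed output of one cell column: softmax-normalized structure
    parameters weight the parallel- and fusion-module outputs."""
    w = np.exp([a_p, a_f])
    w = w / w.sum()
    return w[0] * op_parallel(y) + w[1] * op_fusion(y)

y = np.ones((2, 2))
# toy stand-ins for the two candidate modules of one column
out = column_output(y, a_p=1.0, a_f=1.0,
                    op_parallel=lambda x: 2.0 * x,
                    op_fusion=lambda x: 4.0 * x)
print(out)  # equal parameters -> the plain average, 3.0 everywhere
```

As a_p grows relative to a_f, the output approaches that of the parallel module alone, which is exactly the soft version of the discrete choice made in step S37.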
This scheme also provides an application of the multi-scale neural network obtained by the image-recovery-oriented multi-scale neural network structure searching method, the application comprising:
training the multi-scale neural network, and then inputting the degraded image into the trained multi-scale neural network to obtain a clean image; the method for training the multi-scale neural network comprises the following steps:
acquiring a data set consisting of a degraded image and a clean image, taking the degraded image as the input of a network structure, and taking the corresponding clean image as the target output of the network structure;
updating the weight parameters of the multi-scale neural network with the data in the training set based on the loss function;
and when the weight parameters of the multi-scale neural network are converged, obtaining the trained multi-scale neural network.
In order to verify the effectiveness of the multi-scale neural network obtained by the search in this scheme, the searched multi-scale neural network is applied to two image restoration tasks: image denoising and image rain removal.
In image denoising, the network input is a noisy image and the output is the corresponding clean image; in image rain removal, the input is a rainy image and the output is the corresponding clean image. Both tasks follow the same experimental setup: the constructed super network contains 3 cells, each cell contains 4 columns, and each column consists of a continuously relaxed parallel module and fusion module, i.e., S = 3 and N_i = 4 for i ∈ {1, …, S}.
The super network comprises 4 parallel paths corresponding to 4 feature resolutions from high to low, with 32, 64, 128, and 256 feature channels, respectively. During training, the weight parameters of the super network are updated with a stochastic gradient descent optimizer with momentum 0.9 and weight decay 0.0003, and a cosine annealing strategy automatically decays the learning rate from 0.025 to 0.001. The structure parameters of the super network are updated with an adaptive moment estimation (Adam) optimizer with learning rate 0.0003 and weight decay 0.001.
The super network is trained for e = 10,000 iterations; in each iteration, 32 image patches of size 64 × 64 are randomly cropped from the verification set and the training set and used to update the network structure parameters and the network weight parameters, respectively. To avoid the structure parameters and weight parameters falling into a locally optimal region together during alternating optimization, only the weight parameters are updated in the first 1,000 iterations; that is, the weight parameters and structure parameters are updated alternately only after the 1,000th iteration.
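The cosine-annealing decay of the weight-parameter learning rate from 0.025 to 0.001 over e = 10,000 iterations can be written out directly; the standard cosine-annealing formula is assumed here:

```python
import math

def cosine_annealed_lr(q, e=10_000, lr_max=0.025, lr_min=0.001):
    """Cosine annealing: lr(q) = lr_min + 0.5*(lr_max - lr_min)*(1 + cos(pi*q/e))."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * q / e))

print(cosine_annealed_lr(0))       # start of training: 0.025
print(cosine_annealed_lr(10_000))  # end of training: 0.001
```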
For a fair comparison, λ1 = 0.1 and λ2 = 0 are set; that is, as in the other schemes, model complexity is not taken into account. For evaluation, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the images are used; for both, larger is better.
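Of the two evaluation indices, PSNR is computed directly from the mean squared error between the restored and clean images (shown here for images scaled to [0, 1]; SSIM involves local statistics and is omitted):

```python
import numpy as np

def psnr(clean, restored, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(data_range^2 / MSE)."""
    mse = np.mean((clean.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")      # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

clean = np.zeros((8, 8))
noisy = clean + 0.1              # uniform error of 0.1 -> MSE = 0.01
print(psnr(clean, noisy))        # about 20 dB
```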
Experiment one: the denoising performance of this scheme is evaluated using the BSD500 data set. BSD500 consists of 500 natural images: the training set comprises 200 images, the verification set 100 images, and the test set 200 images. In the experiment, the training and verification sets are used to train the super network and derive the searched multi-scale neural network; the multi-scale neural network is then trained and verified on the training and verification sets; finally, the converged multi-scale neural network is applied to the test set, and the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are calculated.
In order to verify the superiority of this scheme, it is compared with 6 prior-art technical schemes in the cases where the noise standard deviation σ = 30, 50, and 70. The 6 schemes are block matching and 3-D filtering (BM3D), weighted nuclear norm minimization (WNNM), memory network (MemNet), non-local recurrent network (NLRN), evolutionary convolutional autoencoder (E-CAE), and hierarchical neural architecture search (HiNAS); the test results are shown in Table 1.
Table 1: test results of this scheme and the 6 comparison schemes with noise standard deviation σ = 30, 50, and 70
(Table 1 is provided as an image in the source document.)
As can be seen from Table 1, the restoration effect of this scheme is better than that of the other 6 technical schemes. Specifically, when the noise standard deviation σ = 30, the PSNR and SSIM of this scheme are 1.44 and 0.0384 higher than those of E-CAE, respectively, and 0.53 and 0.0028 higher than those of HiNAS. Similar results are observed for σ = 50 and σ = 70; the results in Table 1 verify the effectiveness of this scheme for image restoration.
Experiment two: the image rain removal performance of this scheme is evaluated using the Rain800 data set. Rain800 is a synthetic data set of 800 images, containing 700 training images and 100 test images. In the experiment, 100 of the 700 training images are randomly taken as a verification set and the remaining 600 as the training set; the training and verification sets are used to train the super network and derive the searched multi-scale neural network; the multi-scale neural network is then trained and verified on the training and verification sets; finally, the converged multi-scale neural network is applied to the test set, and the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are calculated.
In order to verify the superiority of this scheme, it is compared with 8 prior-art technical schemes: discriminative sparse coding (DSC), layer priors (LP), deep detail network (DetailsNet), deep joint rain detection and removal (JORDER and its recurrent variant JORDER-R), the squeeze-and-excitation context aggregation network (SCAN), the recurrent squeeze-and-excitation context aggregation network (RESCAN), and hierarchical neural architecture search (HiNAS); the comparison results are shown in Table 2.
Table 2: peak signal-to-noise ratio and structural similarity test results for this scheme and the 8 comparison schemes

Method       PSNR   SSIM
DSC          18.56  0.5996
LP           20.46  0.7297
DetailsNet   21.16  0.7320
JORDER       22.24  0.7763
JORDER-R     22.29  0.7922
SCAN         23.45  0.8112
RESCAN       24.09  0.8410
HiNAS        26.31  0.8685
This scheme  26.58  0.8530
As can be seen from Table 2, the performance of this scheme and HiNAS is significantly better than that of the other comparison schemes. For example, compared with the best technical scheme other than HiNAS, the PSNR and SSIM of this scheme are 2.49 and 0.012 higher, respectively. Although this scheme is slightly lower than HiNAS in SSIM, it is higher in PSNR; the test results in Table 2 again demonstrate the effectiveness of this scheme for image restoration.

Claims (6)

1. The image recovery-oriented multi-scale neural network structure searching method is characterized by comprising the following steps:
s1, determining the number S of cells in the multi-scale neural network and the number of columns of each cell, wherein each column of the cells only has a parallel module or a fusion module;
s2, constructing a super network to be searched on a multi-scale search space, wherein the super network comprises two parallel modules, S pairs of combination modules each formed by serially connecting a transition module and a cell, 1 fusion module, and 1 convolution layer, connected in sequence;
s3, determining a finally selected module of each cell column in the super network based on a search strategy of gradient optimization to obtain a multi-scale neural network; the method of searching for each column of finally selected modules for each cell includes:
s31, initializing the current iteration number q to 1, and initializing the number of iterations e, the structure update threshold t, the super-network weight parameters θ, and the super-network structure parameters {a_p, a_f};
S32, acquiring a data set consisting of a degraded image and a corresponding clean image, and dividing the data set into a training set and a verification set, wherein the degraded image and the corresponding clean image are respectively used as a network input and a target output;
s33, judging whether the current iteration number q is larger than a structure updating threshold value t, if so, entering a step S34, and otherwise, entering a step S35;
s34, updating the structure parameters {a_p, a_f} of the super network with the data in the verification set based on the loss function, and then entering step S35;
s35, updating the weight parameter theta of the super network by using the data in the training set based on the loss function, and then entering the step S36;
s36, determining whether the current iteration number q is greater than the iteration number e, if so, entering step S37, otherwise, making q equal to q +1, and then returning to step S33;
s37, judging the structural parameter a of the parallel module of each column in each cellpWhether or not it is greater than the structural parameter a of the fusion modulefIf so, selecting a parallel module corresponding to the row, otherwise, selecting a fusion module corresponding to the row;
the parallel module is used for keeping the resolution of the features on each parallel path unchanged through operation, and comprises at least one parallel rolling block, and the number of the rolling blocks is equal to the resolution number of the images input to the neural network;
the transition module is used for adding a new parallel path with low resolution, and simultaneously keeping the characteristics of the existing parallel path unchanged, and comprises null operation blocks and sampling blocks, wherein the number of the null operation blocks is equal to the number of the resolutions input into the transition module, and the minimum resolution is downsampled by the downsampling block;
the fusion module is used for fusing the features on each parallel path with the features on each path respectively through downsampling, upsampling and/or null operation on the input resolution, and comprises a null operation block, an upsampling block and a downsampling block.
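As a hedged illustration, the alternating search loop of steps S31–S37 can be sketched in Python; the function names, the `train_step`/`val_step` callbacks, and the 0.5 initialization of the structure parameters are assumptions for the sketch, not details given in the patent:

```python
def search_cell_columns(num_iters, arch_threshold, columns, train_step, val_step):
    """Sketch of the gradient-based search loop of steps S31-S37.

    train_step / val_step stand in for one optimisation step on the
    weight parameters theta (training set, S35) and on the structure
    parameters {a_p, a_f} (validation set, S34); their exact form is
    not specified in the claim.
    """
    # S31: one structure-parameter pair per cell column (init value assumed).
    arch = [{"a_p": 0.5, "a_f": 0.5} for _ in range(columns)]
    for q in range(1, num_iters + 1):
        if q > arch_threshold:       # S33: warm-up phase updates weights only
            val_step(arch)           # S34: update structure parameters
        train_step(arch)             # S35: update weight parameters
    # S37: per column, keep whichever module has the larger structure parameter.
    return ["parallel" if c["a_p"] > c["a_f"] else "fusion" for c in arch]
```

Under this sketch, the discrete architecture falls out of a simple comparison once the alternating updates finish, which is what makes the search differentiable end to end.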
2. The image recovery-oriented multi-scale neural network structure searching method according to claim 1, wherein the output of each column of a cell in the super network is:

y_{j+1} = a_j^p · O_j^p(y_j) + a_j^f · O_j^f(y_j)

wherein y_{j+1} is the output of the (j+1)-th column of the cell; O_j^p(y_j) and O_j^f(y_j) are the outputs of the parallel module and the fusion module of the j-th column of the cell, respectively; a_j^p and a_j^f are the structure parameters of the parallel module and the fusion module of the j-th column of the cell, respectively.
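A minimal sketch of the column output in claim 2, with the module internals abstracted away as callables (the operator names are assumptions):

```python
import numpy as np

def column_output(y, op_parallel, op_fusion, a_p, a_f):
    """y_{j+1} = a_p * O_p(y_j) + a_f * O_f(y_j): the column output is a
    structure-parameter-weighted sum of the two candidate modules."""
    return a_p * op_parallel(y) + a_f * op_fusion(y)
```

Because the output is a continuous mixture of both candidates, gradients with respect to a_p and a_f exist, which is what step S34 exploits.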
3. The image recovery-oriented multi-scale neural network structure searching method according to claim 1, wherein the loss function is calculated as:

L = L_rec + λ1 · L_reg + λ2 · L_c

L_rec = (1/N) · Σ_{i=1}^{N} (x̂_i − f(x)_i)²

L_c = Σ_j (a_j^p · C_p + a_j^f · C_f)

wherein λ1 and λ2 are weight coefficients; L_rec is the recovery loss between the real clean image x̂ and the clean image f(x) recovered by the network; i is the image pixel index; N is the number of image pixels; L_reg is the structural regularization loss computed over the n structure parameters a in {a^p, a^f}; L_c is a differentiable loss term used to measure model complexity; C_p and C_f are the complexities of the parallel module and the fusion module, respectively.
4. The image recovery-oriented multi-scale neural network structure searching method according to claim 1, wherein, in the down-sampling block, layer 1 is a convolution layer and layer 2 is a batch normalization layer; the structure of layers 1–2 is repeated until it appears R1 = log2(S_res / T_res) times in total, and layer 2·R1 + 1 is a linear rectification function layer, where S_res is the input resolution of the down-sampling block and T_res is the target output resolution;
in the convolution block, layer 1 is a convolution layer, layer 2 is a batch normalization layer, and layer 3 is a linear rectification function layer; layer 4 is a convolution layer, layer 5 is a batch normalization layer, and layer 6 is a linear rectification function layer;
in the up-sampling block, layer 1 is a convolution layer, layer 2 is a batch normalization layer, and layer 3 is a linear rectification function layer; the structure of layers 1–3 is repeated until it appears log2(T_res / S'_res) times in total, where S'_res is the input resolution of the up-sampling block and T_res is the target output resolution.
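The layer count of the down-sampling block in claim 4 can be made concrete with a small helper. The log2 repeat count is a reconstruction from the garbled formula image, assuming each (conv, BN) stage halves the resolution, and is consistent with the claim's "layer 2·R1 + 1 is a ReLU" statement:

```python
import math

def downsample_block_layers(s_res, t_res):
    """Layer list of the claim-4 down-sampling block: (conv, BN) repeated
    R1 = log2(s_res / t_res) times, followed by one ReLU at position 2*R1 + 1.
    Assumes s_res and t_res are powers of two with s_res > t_res."""
    r1 = int(math.log2(s_res // t_res))
    return ["conv", "bn"] * r1 + ["relu"]
```

For example, going from resolution 64 to 16 needs two halving stages, so the block has five layers with the ReLU at position 2·2 + 1 = 5.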
5. The image recovery-oriented multi-scale neural network structure searching method according to claim 1, wherein the image recovery comprises image rain removal, image denoising and image defogging;
when the image recovery task is image rain removal, the degraded image is a captured natural image containing rain, and the clean image is the rain-free captured natural image of the corresponding scene;
when the image recovery task is image denoising, the degraded image is a captured natural image containing noise, and the clean image is the noise-free captured natural image of the corresponding scene;
when the image recovery task is image defogging, the degraded image is a captured foggy natural image, and the clean image is the fog-free captured natural image of the corresponding scene.
6. An application method of a multi-scale neural network, the multi-scale neural network being obtained by the image recovery-oriented multi-scale neural network structure searching method according to any one of claims 1 to 5, characterized by comprising:
training the multi-scale neural network, and then inputting a degraded image into the trained multi-scale neural network to obtain a clean image; the method of training the multi-scale neural network comprising:
acquiring a data set consisting of degraded images and clean images, the degraded images serving as the input of the network and the corresponding clean images serving as the target output of the network;
updating the weight parameters of the multi-scale neural network with the data in the training set based on the recovery loss between the network output and the clean target image;
obtaining the trained multi-scale neural network when the weight parameters of the multi-scale neural network converge.
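The train-to-convergence procedure of claim 6 can be sketched as plain gradient descent with a convergence test on the weight change. The `grad_fn` callback, the learning rate, and the tolerance-based stopping rule are assumptions standing in for backpropagation through the searched network:

```python
def train_multiscale(weights, grad_fn, lr=0.1, tol=1e-10, max_iters=10000):
    """Sketch of claim-6 training: gradient steps on the recovery loss
    until the largest per-weight change falls below tol (convergence
    criterion assumed, as the claim only says 'when the weights converge')."""
    for _ in range(max_iters):
        g = grad_fn(weights)
        new = [w - lr * gi for w, gi in zip(weights, g)]
        if max(abs(n - w) for n, w in zip(new, weights)) < tol:
            return new
        weights = new
    return weights
```

With a toy one-dimensional loss (w − 3)², whose gradient is 2(w − 3), the loop converges to the minimizer w = 3.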
CN202011269424.XA 2020-11-13 2020-11-13 Image recovery-oriented multi-scale neural network structure searching method and network application Active CN112381733B (en)

Publications (2)

Publication Number Publication Date
CN112381733A CN112381733A (en) 2021-02-19
CN112381733B true CN112381733B (en) 2022-07-01

Family

ID=74582212

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant