CN113962882A - JPEG image compression artifact eliminating method based on controllable pyramid wavelet network - Google Patents

JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Info

Publication number
CN113962882A
Authority
CN
China
Prior art keywords
network
channel
image
cbcr
wavelet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111155935.3A
Other languages
Chinese (zh)
Other versions
CN113962882B (en)
Inventor
张译
禹冬晔
牟轩沁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202111155935.3A priority Critical patent/CN113962882B/en
Publication of CN113962882A publication Critical patent/CN113962882A/en
Application granted granted Critical
Publication of CN113962882B publication Critical patent/CN113962882B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/77 - Retouching; Inpainting; Scratch removal
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 - Image mosaicing, e.g. composing plane images from plane sub-images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/90 - Determination of colour characteristics
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 - Image coding
    • G06T9/002 - Image coding using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10024 - Color image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

The invention discloses a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The method first extracts image features related to the compression level, then uses these features to guide the restoration of the Y-channel image, next uses the features together with the restored Y channel to guide the restoration of the CbCr-channel image, and finally transforms the image back to RGB space to obtain the final restoration result. The method requires no prediction of image coding parameters and restores images of many different compression levels well, and each restoration network needs only a single trained network model. The model uses skip connections to avoid the gradient vanishing and gradient explosion problems that may occur during training, and uses a shared-parameter recursive-module strategy to reduce model complexity, ensuring efficient operation of the algorithm. The method therefore offers a simple model, a small number of parameters, a wide application range and a remarkable restoration effect.

Description

JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Technical Field
The invention belongs to the field of image processing, and particularly relates to a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network.
Background
Due to limitations of transmission bandwidth and storage capacity, images and videos captured by cameras need to be compressed before use, and lossy compression methods represented by JPEG are now widely applied throughout the image-processing pipeline. Because high-frequency information is lost in the quantization stage, a compressed image may contain compression artifacts such as blocking, ringing and blurring, which not only reduce the perceived quality of the image but also degrade the performance of computer vision algorithms that take compressed images as input. Designing fast and effective restoration algorithms for JPEG-compressed images therefore has broad application prospects and practical value.
Currently, JPEG image compression artifact removal techniques can be roughly classified into the following three types:
1) Filter-based methods, which eliminate compression artifacts by filtering along block boundaries in the spatial or frequency domain. Spatial-domain filtering typically selects an appropriate filter near block boundaries according to the characteristics of different image regions, i.e., spatially adaptive filtering. More sophisticated filtering methods were later developed, including shifted-window filtering of image blocks, nonlinear filtering, adaptive non-local means filtering and adaptive bilateral filtering. Frequency-domain filtering restores image detail mainly by adjusting discrete cosine transform (DCT) coefficients.
2) Inverse-problem-based methods, which treat compression-artifact removal as the optimization and solution of an inverse problem and recover the original image using prior knowledge about images. Typical image priors include low-rank priors, quantization-constraint priors, non-local similarity and sparse-representation priors. Some methods combine several priors to obtain an optimal solution to the inverse problem. Because of the complex optimization involved, most prior-based algorithms are time-consuming.
3) Machine-learning-based methods, which learn an image mapping/transformation from a large number of original and compressed image pairs and use it to map compressed images back to original images. Typical methods employ convolutional neural networks (CNNs), such as ARCNN, TNRD, DnCNN, CAS-CNN, MemNet, S-Net, deep convolutional sparse coding (DCSC) networks and generative adversarial network (GAN) models. Some methods (e.g., DMCNN, DDCN, MWCNN, DPW-SDNet) use CNNs to restore the image in both the spatial and frequency domains, obtaining better restoration performance.
Among the three categories, filter-based methods yield poor restoration performance, while inverse-problem-based methods have high computational complexity and are slow. By comparison, with the development of GPU parallel computing, machine-learning-based methods achieve both better restoration performance and faster speed. However, most current machine-learning methods require predicting the coding information of the compressed image and are effective only for images of certain compression levels, which limits their range of application. DnCNN overcomes these limitations by adjusting its training data, but its restoration performance is mediocre and it works only on grayscale images. Other methods handle multiple compression levels by training multiple network models, but multiple models occupy more storage space. What is needed, therefore, is a unified network model that requires no prior knowledge of the compressed image's coding information, is effective for grayscale and color images across compression levels, and occupies little storage space, making it easy to deploy on small devices.
Disclosure of Invention
The invention aims to overcome the above defects by providing a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The Y-channel restoration network of the invention comprises six recursion modules and takes multi-scale, multi-direction wavelet coefficients as input to predict the wavelet coefficients of the original image. The CbCr-channel restoration network comprises six recursive U-Net structures and takes the CbCr channels of the compressed image and the restored Y channel as input to obtain the restored CbCr-channel image.
In order to achieve the above object, the method comprises the following steps:
step one, converting an input JPEG compressed image into the YCbCr color space;
step two, inputting the Y channel of the JPEG compressed image into a QF prediction network, extracting nonlinear feature maps with cascaded convolutional layers, and rescaling the nonlinear feature maps along the third dimension before they are input into the restoration networks;
step three, decomposing the Y channel of the JPEG compressed image into three scales and eight directions through a controllable pyramid wavelet transform to generate one high-pass sub-band, twenty-four band-pass sub-bands and one low-pass sub-band, sending all sub-band coefficients, together with the related feature maps from the QF prediction network and the compressed images at the corresponding scales, into a Y-channel restoration network, and applying the inverse wavelet transform to the sub-band coefficients output by the Y-channel restoration network to obtain the restored Y-channel image in the spatial domain;
step four, inputting the nonlinear feature maps from the QF prediction network, the Y-channel image restored in step three, and the CbCr channels of the compressed image into a CbCr-channel restoration network, thereby obtaining the restored CbCr-channel image;
step five, converting the YCbCr-channel image into the RGB color space to obtain the restored color image.
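For illustration only, the five steps can be sketched in PyTorch-style Python as follows; qf_net, y_net and cbcr_net are hypothetical stand-ins for the three networks described below, while the color conversions follow the standard full-range JPEG YCbCr definition:

    import torch

    def rgb_to_ycbcr(rgb):                         # rgb: (B, 3, H, W), values in [0, 1]
        r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
        y  =  0.299 * r + 0.587 * g + 0.114 * b
        cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
        cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
        return y, torch.cat([cb, cr], dim=1)

    def ycbcr_to_rgb(y, cbcr):
        cb, cr = cbcr[:, 0:1] - 0.5, cbcr[:, 1:2] - 0.5
        r = y + 1.402 * cr
        g = y - 0.344136 * cb - 0.714136 * cr
        b = y + 1.772 * cb
        return torch.cat([r, g, b], dim=1)

    def restore(jpeg_rgb, qf_net, y_net, cbcr_net):
        y, cbcr = rgb_to_ycbcr(jpeg_rgb)                        # step one
        qf_feats = qf_net(y)                                    # step two: QF-related features
        y_restored = y_net(y, qf_feats)                         # step three: wavelet-domain restoration
        cbcr_restored = cbcr_net(cbcr, y_restored, qf_feats)    # step four
        return ycbcr_to_rgb(y_restored, cbcr_restored).clamp(0, 1)  # step five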
The structure of the QF prediction network is as follows:
[The QF prediction network structure is given as Table 1 in the Description; it is reproduced only as an image in this record.]
the size of the input image block is 128 x 128 pixels.
In the second step, the outputs of the fourth, sixth and eighth convolutional layers of the QF prediction network are input into the Y-channel and CbCr-channel restoration networks after channel-number adjustment and rescaling.
In the third step, the Y-channel restoration network comprises six recursion modules; each recursion module comprises four parallel convolutional neural networks corresponding to image inputs of different scales. After feature-map concatenation and two convolutional layers, feature maps of different scales undergo up-sampling and down-sampling operations so that feature fusion is achieved in each network branch; the fused feature maps then pass through two convolutional layers and a parametric rectified linear unit (PReLU) to produce the network output.
In the third step, the nonlinear feature maps extracted by the QF prediction network comprise feature maps of four scales, which are fed into the corresponding four convolutional-network branches by concatenation with the wavelet-coefficient feature maps; compressed images of different scales are concatenated with the corresponding wavelet sub-band coefficients and serve as the input of each network branch of the recursion module. Because the wavelet transform coefficients are complex-valued, the real and imaginary parts are extracted separately, and the real-part and imaginary-part feature maps are concatenated as network input.
The real and imaginary parts of the wavelet coefficients output by the recursion module in the Y-channel restoration network are summed element-wise with the real and imaginary parts of the originally input wavelet coefficients to obtain the restored wavelet coefficients; this calculation is repeated six times, and the six recursion modules share the same network parameters.
In the fourth step, the CbCr-channel restoration network comprises six recursion modules with the same network structure; the input of each recursion module comprises the CbCr channels of the compressed image, the nonlinear feature maps extracted by the QF prediction network, and the restored Y-channel image obtained in step three. The CbCr-channel restoration network repeatedly predicts the CbCr-channel image through residual learning, using the restored Y-channel image and the related features extracted by the QF prediction network.
The network structure of a recursion module in the CbCr-channel restoration network comprises an encoder end and a decoder end; the encoder end performs three down-samplings and the decoder end performs three up-samplings, and feature fusion is achieved between encoder and decoder feature maps of the same scale through skip connections.
Each convolutional layer of the CbCr-channel restoration network outputs 64 channels, and at the encoder end the outputs of convolutional layers at different scales are concatenated with the features of the corresponding scale extracted by the QF prediction network.
The output of the recursion module is summed element-wise with the CbCr channels of the original compressed image to obtain the restored CbCr channels; this calculation is repeated six times, and the six recursion modules share the same network parameters.
Compared with the prior art, the invention uses a QF prediction network to extract image features related to the compression level, then restores the Y channel and the CbCr channels of the image with two restoration networks, and finally converts the image to RGB space to obtain the final restoration result. Because the QF prediction network extracts compression-level-related features that guide the restoration networks to adapt to different compression levels, the method needs no prediction of image coding information and handles images of many different compression levels well. Each restoration network requires only a single trained model; the model uses skip connections to avoid the gradient vanishing and explosion problems that may occur during training, and a shared-parameter recursive-module strategy to reduce model complexity, ensuring efficient operation of the algorithm. Extracting image features with the QF prediction network also increases the nonlinear capacity of the network, allowing it to learn more complex image mappings/transformations. The Y-channel restoration network uses the controllable pyramid wavelet transform to analyze the multi-scale, multi-directional spatial correlations of the image, and simultaneously analyzes the amplitude and phase information of the wavelet sub-band coefficients, yielding better restoration performance. Feeding the restored Y-channel image into the CbCr-channel restoration network allows the structure and texture of the luminance image to assist in restoring the color components.
Furthermore, the restoration networks of the invention use a shared-parameter recursive-module strategy, which reduces model complexity and ensures efficient operation of the algorithm.
Furthermore, the restoration networks of the invention use skip connections to increase network depth while avoiding the gradient vanishing and explosion problems that may occur during training.
Drawings
FIG. 1 is a process framework diagram of the present invention;
FIG. 2 is a schematic diagram of the QF-related feature extraction network of the present invention;
FIG. 3 is a network structure diagram of a recursion module in the Y-channel restoration network of the present invention;
FIG. 4 illustrates the method of stitching wavelet feature maps of different scales;
FIG. 5 is a network structure diagram of a recursion module in the CbCr-channel restoration network;
FIG. 6 compares the restoration performance of the method of the present invention with that of other methods on images of different compression levels; (a) shows the PSNR gain when different algorithms are used to restore the images; (b) shows the SSIM gain when different algorithms are used to restore the images.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
In the description of the present invention, it is to be understood that the embodiments described herein are exemplary; the specific parameters given are intended only to describe the invention and are not intended to be limiting.
The method comprises the following steps:
Step one: the input JPEG compressed image is converted from the RGB color space to the YCbCr color space.
Step two: the Y channel of the compressed image is input into the QF prediction network, nonlinear feature maps are extracted with its cascaded convolutional layers, and these maps are rescaled along the third dimension (the channel number) before being input into the restoration networks.
The size of the input image block is 128 × 128, and the QF prediction network structure is shown in table 1.
TABLE 1 QF prediction network architecture model
[Table 1 is reproduced only as an image in this record.]
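Since Table 1 is not reproduced here, the following is one plausible sketch of the trunk, based solely on the layer dimensions stated in the next paragraph (second/fourth/sixth/eighth-layer outputs of W × H × 64, W/2 × H/2 × 128, W/4 × H/4 × 256 and W/8 × H/8 × 512); the 3 × 3 kernels, the placement of the stride-2 convolutions, the PReLU activations and the pooled regression head are all assumptions:

    import torch
    import torch.nn as nn

    class QFNet(nn.Module):
        # Hypothetical reconstruction: channel widths and downsampling points are
        # chosen so the 2nd/4th/6th/8th layer outputs match the stated dimensions.
        def __init__(self):
            super().__init__()
            cfg = [(1, 64, 1), (64, 64, 1),
                   (64, 128, 2), (128, 128, 1),
                   (128, 256, 2), (256, 256, 1),
                   (256, 512, 2), (512, 512, 1)]
            self.layers = nn.ModuleList(
                nn.Sequential(nn.Conv2d(cin, cout, 3, stride=s, padding=1),
                              nn.PReLU(init=0.1))
                for cin, cout, s in cfg)
            self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 1))

        def forward(self, y):                      # y: (B, 1, 128, 128) Y-channel block
            feats = []
            for i, layer in enumerate(self.layers):
                y = layer(y)
                if i in (1, 3, 5, 7):              # 2nd, 4th, 6th, 8th layer outputs
                    feats.append(y)                # QF-related features before channel averaging
            return self.head(y), feats             # predicted QF value and feature maps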
QF-related features are extracted from the second, fourth, sixth and eighth convolutional layers of the QF prediction network, as shown in FIG. 2. For an input image block of W × H, the output feature maps of the second, fourth, sixth and eighth convolutional layers have dimensions W × H × 64, W/2 × H/2 × 128, W/4 × H/4 × 256 and W/8 × H/8 × 512, respectively. The channel numbers of the fourth-, sixth- and eighth-layer outputs are then adjusted to 64: for the W/2 × H/2 × 128 feature maps, every 2 adjacent feature maps are averaged; for the W/4 × H/4 × 256 feature maps, every 4 adjacent feature maps are averaged; and for the W/8 × H/8 × 512 feature maps, every 8 adjacent feature maps are averaged. The extracted QF-related features are denoted C1, f1, f2 and f3, with dimensions W × H × 64, W/2 × H/2 × 64, W/4 × H/4 × 64 and W/8 × H/8 × 64, respectively.
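The channel adjustment just described (averaging each group of 2, 4 or 8 adjacent feature maps so that every output has 64 channels) amounts to a single reshape-and-mean; a minimal sketch:

    import torch

    def reduce_to_64(feat):
        # feat: (B, C, H, W) with C in {64, 128, 256, 512}; averages each group of
        # C // 64 adjacent channels, so the result always has 64 channels.
        b, c, h, w = feat.shape
        return feat.view(b, 64, c // 64, h, w).mean(dim=2)

    f3 = torch.randn(1, 512, 16, 16)           # e.g. the W/8 x H/8 x 512 feature map
    print(reduce_to_64(f3).shape)              # torch.Size([1, 64, 16, 16])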
Step three: the Y-channel of the compressed image undergoes a controlled pyramid wavelet transform, decomposing at 3 scales and 8 directions, producing 1 high-pass sub-band, 24 band-pass sub-bands and 1 low-pass sub-band. The sub-band coefficients, related QF characteristics and original compressed images with different scales are input into a recovery network consisting of 6 recursive modules, and the sub-band coefficients output by the network are subjected to inverse wavelet transformation to a spatial domain, so that a recovered Y-channel image is obtained.
As shown in FIG. 1, the Y-channel image is first decomposed by the steerable pyramid wavelet transform (SPWT) into 3 scales and 8 directions, yielding 1 high-pass sub-band (H0), 24 band-pass sub-bands of different scales and directions (the sub-bands of the three scales are denoted B1, B2 and B3), and 1 low-pass sub-band (L0). H0, B1 and the Y-channel image are concatenated as the first-scale input of the recursion module; B2 is concatenated with the once-downsampled Y-channel image as the second-scale input; B3 is concatenated with the twice-downsampled Y-channel image as the third-scale input; and L0 is the fourth-scale input. For B1, B2 and B3, the real and imaginary parts of the wavelet coefficients are extracted separately, and the real-part and imaginary-part feature maps are concatenated. Thus, the numbers of input channels at the four scales of the recursion module are 18, 17, 17 and 1, respectively.
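For illustration, this per-scale input assembly can be sketched as follows, assuming an SPWT implementation that returns H0 and L0 as real tensors and B1-B3 as complex tensors with 8 orientation channels (the pyramid transform itself is not shown), and assuming average pooling as the downsampling operator for the Y image:

    import torch
    import torch.nn.functional as F

    def assemble_scale_inputs(y, h0, bands, l0):
        # y:  (B, 1, H, W) compressed Y channel; h0: (B, 1, H, W) real high-pass band
        # bands: [B1, B2, B3], complex tensors of shape (B, 8, H/2^k, W/2^k), k = 0, 1, 2
        # l0: (B, 1, H/8, W/8) real low-pass band
        def ri(c):                                 # complex -> stacked real/imag (16 channels)
            return torch.cat([c.real, c.imag], dim=1)
        s1 = torch.cat([h0, ri(bands[0]), y], dim=1)                  # 1 + 16 + 1 = 18
        s2 = torch.cat([ri(bands[1]), F.avg_pool2d(y, 2)], dim=1)     # 16 + 1 = 17
        s3 = torch.cat([ri(bands[2]), F.avg_pool2d(y, 4)], dim=1)     # 16 + 1 = 17
        s4 = l0                                                        # 1
        return s1, s2, s3, s4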
In the recursion module, the inputs of the four scales first undergo convolution and parametric rectified linear unit (PReLU) operations, producing 64-channel feature maps at the four scales. As shown in FIG. 3, these feature maps are concatenated with the corresponding QF features (C1, f1, f2 and f3), and after two convolution-plus-PReLU operations, feature maps of different scales are fused at each scale. As shown in FIG. 4, the feature maps of different scales are brought to a common size by mean pooling and deconvolution: a 1 × 1 convolution kernel keeps the spatial size of a feature map unchanged, a mean-pooling operation with scale factor 2 halves its height and width, a deconvolution with scale factor 2 doubles them, and repeated mean-pooling/deconvolution operations halve or double the size multiple times. Finally, the output of the recursion module is a wavelet-coefficient residual map at four scales, with 17, 16, 16 and 1 channels, respectively.
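A sketch of the three scale-matching operators, with assumed layer parameters (64-channel feature maps throughout):

    import torch
    import torch.nn as nn

    same  = nn.Conv2d(64, 64, kernel_size=1)                     # keeps H x W
    down2 = nn.AvgPool2d(kernel_size=2)                          # H x W -> H/2 x W/2
    up2   = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2)  # H x W -> 2H x 2W

    x_full = torch.randn(1, 64, 32, 32)        # scale-1 feature map
    x_half = torch.randn(1, 64, 16, 16)        # scale-2 feature map
    fused_at_scale1 = torch.cat([same(x_full), up2(x_half)], dim=1)    # both 32 x 32
    fused_at_scale2 = torch.cat([down2(x_full), same(x_half)], dim=1)  # both 16 x 16
    # larger scale gaps are bridged by chaining down2/up2, e.g. down2(down2(x_full))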
As shown in FIG. 1, the output of the recursion module is added pixel-by-pixel to the original wavelet sub-band coefficients, the result is fed into the recursion module again, and its output is again added pixel-by-pixel to the original wavelet sub-band coefficients. This calculation is repeated 6 times, and the 6 recursion modules share the same network parameters, reducing model complexity. Finally, the restored wavelet sub-band coefficients are transformed back to the spatial domain by the inverse wavelet transform (inverse SPWT), yielding the restored Y-channel image. As for parameter choices, all convolution kernels are 3 × 3 pixels except those of the first and last convolutional layers, which are 7 × 7 and 5 × 5 pixels, respectively. The first and last convolutional layers use mirror (reflection) padding of 3 and 2 pixels, respectively, and the remaining convolutional layers use 1-pixel zero padding, so that the input and output dimensions of every layer match. In addition, a PReLU follows every convolutional layer except the last to provide the nonlinearity.
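The padding and kernel conventions and the six-step shared-parameter residual recursion can be sketched as follows; RecursiveModule is a schematic single-branch stand-in for the four-branch module of FIG. 3, with the QF-feature inputs omitted for brevity:

    import torch
    import torch.nn as nn

    class RecursiveModule(nn.Module):
        def __init__(self, ch_in=18, ch_out=17):
            super().__init__()
            self.body = nn.Sequential(
                nn.ReflectionPad2d(3), nn.Conv2d(ch_in, 64, 7), nn.PReLU(init=0.1),  # first: 7x7, mirror-3
                nn.Conv2d(64, 64, 3, padding=1), nn.PReLU(init=0.1),                 # middle: 3x3, zero-1
                nn.ReflectionPad2d(2), nn.Conv2d(64, ch_out, 5))                     # last: 5x5, mirror-2, no PReLU
        def forward(self, x):
            return self.body(x)

    module = RecursiveModule()
    coeffs = torch.randn(1, 17, 64, 64)        # scale-1 coefficients (H0 + B1 real/imag)
    y_img  = torch.randn(1, 1, 64, 64)
    restored = coeffs
    for _ in range(6):                          # six passes through one shared module,
        restored = coeffs + module(torch.cat([restored, y_img], dim=1))  # residual added to originals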
Step four: and inputting the QF related features extracted in the second step, the Y-channel image restored in the third step and the CbCr channel of the compressed image into a CbCr channel restoration network comprising 6 recursion modules, thereby obtaining a restored CbCr channel image.
As shown in FIG. 1, the CbCr-channel restoration network adopts a structure similar to that of the Y-channel restoration network: the CbCr-channel image is predicted repeatedly through residual learning using the restored Y-channel image and the QF-related features, and all recursion modules share the same network parameters, reducing model complexity. As shown in FIG. 5, each recursion module adopts a U-Net-like structure, except that at the encoder end the outputs of convolutional layers at different scales are concatenated with the QF-related features of the corresponding scale, so that the network can adaptively restore images of different compression levels. Because the human eye is insensitive to color distortion, the outputs of all layers are set to 64 channels, i.e., the CbCr channels are restored with a lightweight network. As for parameter choices, similar to the Y-channel restoration network, all convolution kernels are 3 × 3 pixels except those of the first and last convolutional layers (7 × 7 and 5 × 5 pixels, respectively) and the deconvolution layers (2 × 2 pixels). The first and last convolutional layers use mirror padding of 3 and 2 pixels, respectively, and the remaining convolutional layers use 1-pixel zero padding, so that the input and output dimensions of every layer match. A PReLU follows every convolutional layer except the last.
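A schematic sketch of one CbCr recursion module under the constraints stated above (three mean-pool down-samplings, three 2 × 2 stride-2 deconvolutions, skip connections, 64 channels throughout, and QF features concatenated at each encoder scale); the number of convolutions per scale is an assumption:

    import torch
    import torch.nn as nn

    def conv(cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.PReLU(init=0.1))

    class CbCrModule(nn.Module):
        def __init__(self):
            super().__init__()
            self.head = nn.Sequential(nn.ReflectionPad2d(3),
                                      nn.Conv2d(2 + 1, 64, 7), nn.PReLU(init=0.1))   # CbCr + restored Y
            self.enc = nn.ModuleList([conv(64 + 64, 64) for _ in range(3)])   # + QF features
            self.down = nn.AvgPool2d(2)
            self.mid = conv(64 + 64, 64)                                      # bottleneck, fuses f3
            self.up = nn.ModuleList([nn.ConvTranspose2d(64, 64, 2, stride=2) for _ in range(3)])
            self.dec = nn.ModuleList([conv(64 + 64, 64) for _ in range(3)])   # + skip connections
            self.tail = nn.Sequential(nn.ReflectionPad2d(2), nn.Conv2d(64, 2, 5))  # CbCr residual

        def forward(self, cbcr, y, qf_feats):      # qf_feats: [C1, f1, f2, f3], 64 channels each
            x = self.head(torch.cat([cbcr, y], dim=1))
            skips = []
            for enc, f in zip(self.enc, qf_feats[:3]):
                x = enc(torch.cat([x, f], dim=1))  # fuse QF features at this encoder scale
                skips.append(x)
                x = self.down(x)
            x = self.mid(torch.cat([x, qf_feats[3]], dim=1))
            for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
                x = dec(torch.cat([up(x), skip], dim=1))
            return self.tail(x)

    module = CbCrModule()
    cbcr = torch.randn(1, 2, 64, 64); y = torch.randn(1, 1, 64, 64)
    qf_feats = [torch.randn(1, 64, s, s) for s in (64, 32, 16, 8)]
    restored = cbcr
    for _ in range(6):                              # shared-parameter residual recursion
        restored = cbcr + module(restored, y, qf_feats)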
The network training method can be summarized as follows:
1) The invention mainly comprises three networks: the QF prediction network, the Y-channel restoration network and the CbCr-channel restoration network. Because of the variable dependencies among them, the QF prediction network is trained first; then the QF prediction network, with parameters fixed, extracts the QF-related features of the compressed image blocks and the Y-channel restoration network is trained; finally, the QF prediction network and the Y-channel restoration network, both with parameters fixed, extract the QF-related features of the compressed image blocks and compute the restored Y-channel images, and the CbCr-channel restoration network is trained.
2) The QF prediction network is trained on the VOC2012 database. Specifically, each RGB image in the database is compressed with a random QF value, where QF is an integer in [5, 95], generating 12700 compressed images; the compressed images are then converted to YCbCr space, and 105234 non-overlapping 128 × 128 pixel image blocks are extracted from the Y-channel images for network training with an L1 loss function.
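This data generation can be sketched with PIL as follows; file handling and storage layout are assumptions:

    import io, random
    import numpy as np
    from PIL import Image

    def make_qf_training_blocks(path, block=128):
        # Compress one image at a random QF in [5, 95] and cut the decoded
        # Y channel into non-overlapping block x block patches.
        qf = random.randint(5, 95)
        buf = io.BytesIO()
        Image.open(path).convert("RGB").save(buf, format="JPEG", quality=qf)
        buf.seek(0)
        y = np.asarray(Image.open(buf).convert("YCbCr"), dtype=np.float32)[:, :, 0] / 255.0
        blocks = [y[i:i + block, j:j + block]
                  for i in range(0, y.shape[0] - block + 1, block)
                  for j in range(0, y.shape[1] - block + 1, block)]
        return qf, blocks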
3) The Y-channel and CbCr-channel restoration networks are trained on the Berkeley Segmentation Database (BSD), the DIV2K database, the Waterloo Exploration Database (WED) and the Flickr2K database, where BSD contributes 400 images (200 training and 200 test images), DIV2K 900 images, WED 4744 images and Flickr2K 2000 images. Specifically, each reference image is compressed with a random QF value, where QF is an integer drawn from {10:20, 22:2:30, 35:5:60, 70:10:90} (MATLAB range notation), generating 8044 compressed images; the reference and compressed images are then converted to YCbCr space, and 583625 non-overlapping 128 × 128 pixel image blocks are extracted for network training. Training the Y-channel restoration network uses Y-channel image blocks as training data, with a loss function that is a linear combination of the pixel mean-square-error loss (l_MSE) and the structural-similarity loss (l_SSIM), i.e., L = l_MSE + λ·l_SSIM with λ = 0.001; training the CbCr-channel restoration network uses CbCr-channel image blocks as training data, with the pixel mean-square-error loss.
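The MATLAB-style range notation {10:20, 22:2:30, 35:5:60, 70:10:90} expands to 25 distinct QF values, e.g. in Python:

    qfs = (list(range(10, 21)) + list(range(22, 31, 2))
           + list(range(35, 61, 5)) + list(range(70, 91, 10)))
    # [10, ..., 20, 22, 24, 26, 28, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90]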
4) The pixel mean-square-error loss (l_MSE) is calculated as

    l_MSE = (1 / (W · H)) · Σ_{i=1..W} Σ_{j=1..H} [ I(i, j) − I_R(i, j) ]²

where I(i, j) and I_R(i, j) denote the pixel values of the reference image I and the restored image I_R at spatial position (i, j), and W and H denote the width and height of the image. The structural-similarity loss (l_SSIM) is calculated as

    l_SSIM = 1 − SSIM(I, I_R)

where SSIM(I, I_R) is calculated as

    SSIM(I, I_R) = [ (2 · μ_I · μ_{I_R} + C1) · (2 · σ_{I,I_R} + C2) ] / [ (μ_I² + μ_{I_R}² + C1) · (σ_I² + σ_{I_R}² + C2) ]

in which μ_I and μ_{I_R} denote the local means of I and I_R, σ_I and σ_{I_R} their local standard deviations, σ_{I,I_R} their local covariance, and C1 and C2 are constants taking the same values as in the SSIM method.
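A sketch of the combined Y-channel loss implementing the formulas above, for image tensors in [0, 1]; the 8 × 8 uniform local window and the standard SSIM constants are assumptions, since the patent defers these details to the SSIM method:

    import torch

    def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=8):
        # x, y: (B, 1, H, W); local statistics via a uniform sliding window.
        pool = torch.nn.AvgPool2d(win, stride=1)
        mx, my = pool(x), pool(y)
        vx, vy = pool(x * x) - mx * mx, pool(y * y) - my * my
        cov = pool(x * y) - mx * my
        s = ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
        return s.mean()

    def y_loss(restored, reference, lam=0.001):
        l_mse = torch.mean((restored - reference) ** 2)   # l_MSE
        l_ssim = 1.0 - ssim(restored, reference)          # l_SSIM
        return l_mse + lam * l_ssim                       # L = l_MSE + lambda * l_SSIM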
5) Experiments are performed with the PyTorch deep-learning framework on a workstation with an 8-core Intel i9-9900K 3.60 GHz CPU and an NVIDIA GeForce RTX 2080 SUPER GPU. Network parameters are initialized by sampling the normal distribution N(1, 0.02), and PReLU slopes are initialized to 0.1; optimization uses the Adam algorithm with an initial learning rate of 2 × 10⁻⁴ and first/second-order moment exponential decay rates of 0.9 and 0.999. When training the QF prediction network, the batch size is 64 and the learning rate is multiplied by 0.8 every epoch, for 120 epochs in total. When training the two restoration networks, the batch size is 4 and the learning rate is multiplied by 0.9 every 20000 iterations, for 4 epochs in total.
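The stated optimizer settings translate directly to PyTorch; the stand-in module is only for self-containedness, and the scheduler below is stepped once per iteration, as used for the restoration networks:

    import torch

    net = torch.nn.Conv2d(1, 1, 3, padding=1)     # stand-in for a restoration network
    opt = torch.optim.Adam(net.parameters(), lr=2e-4, betas=(0.9, 0.999))
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=20000, gamma=0.9)
    # per iteration: opt.zero_grad(); loss.backward(); opt.step(); sched.step()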
The network testing method can be summarized as follows:
1) Recovery of compressed images of known QF values
Reference images from the LIVE, CSIQ, BSD100 (the 100 images of the BSD validation set) and Urban100 databases are selected for performance testing. Specifically, each reference image of each database is JPEG-compressed at eight different compression levels, with QF values of 10, 20, 30, 40, 50, 60, 70 and 80; the compressed images are then restored with different algorithms/network models; finally, the restored images are evaluated with two indices, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Table 2 compares the restoration performance of the invention with that of other methods on JPEG-compressed images with known QF values.
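For reference, a minimal PSNR implementation for images in [0, 1] (SSIM can be computed as in the loss sketch above):

    import torch

    def psnr(x, y, peak=1.0):
        # Peak signal-to-noise ratio in dB.
        mse = torch.mean((x - y) ** 2)
        return 10.0 * torch.log10(peak ** 2 / mse)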
TABLE 2 Comparison of test performance of the present invention (SPW-Net) with other methods on the LIVE, BSD100, CSIQ and Urban100 databases
[Table 2 is reproduced only as an image in this record.]
2) Recovery of compressed images of unknown QF values
The SDIVL database is selected for performance testing, with two parts: (1) the JPEG-compressed images with QF ∈ [10, 90] already in the SDIVL database are tested directly; (2) each original image of the SDIVL database is JPEG-compressed with QF from 10 to 90 in steps of 1, and the resulting 1620 compressed images are restored. Table 3 and FIG. 6 compare the performance of the different methods on these two tests, respectively. The experimental results show that, compared with other methods, the proposed method achieves better restoration performance on images of different compression levels.
TABLE 3 Comparison of test performance of the present invention (SPW-Net) with other methods on the SDIVL database
[Table 3 is reproduced only as an image in this record.]
In summary, the JPEG image compression artifact elimination method based on the controllable pyramid wavelet network first uses the QF prediction network to extract image features related to the compression level, then restores the Y channel and the CbCr channels of the image with two restoration networks, and finally converts the image to RGB space to obtain the final restoration result. The method needs no prediction of image coding parameters, achieves good restoration for grayscale and color images of different compression levels, and requires only a single trained model per restoration network; the model uses skip connections to avoid the gradient vanishing and explosion problems that may occur during training, and a shared-parameter recursive-module strategy to reduce model complexity, ensuring efficient operation of the algorithm. The method therefore offers a simple model, few parameters, a wide application range and a remarkable restoration effect.
Although embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments, which are illustrative rather than restrictive. Those skilled in the art may make many variations without departing from the scope of the claimed invention, and such variations fall within the protection scope of the present invention.

Claims (8)

1. A JPEG image compression artifact eliminating method based on a controllable pyramid wavelet network, characterized by comprising the following steps:
step one, converting an input JPEG compressed image into the YCbCr color space;
step two, inputting the Y channel of the JPEG compressed image into a QF prediction network, extracting nonlinear feature maps with cascaded convolutional layers, and rescaling the nonlinear feature maps along the third dimension before they are input into the restoration networks;
step three, decomposing the Y channel of the JPEG compressed image into three scales and eight directions through a controllable pyramid wavelet transform to generate one high-pass sub-band, twenty-four band-pass sub-bands and one low-pass sub-band, sending all sub-band coefficients, together with the related feature maps from the QF prediction network and the compressed images at the corresponding scales, into a Y-channel restoration network, and applying the inverse wavelet transform to the sub-band coefficients output by the Y-channel restoration network to obtain the restored Y-channel image in the spatial domain;
step four, inputting the nonlinear feature maps from the QF prediction network, the Y-channel image restored in step three, and the CbCr channels of the compressed image into a CbCr-channel restoration network, thereby obtaining the restored CbCr-channel image;
step five, converting the YCbCr-channel image into the RGB color space to obtain the restored color image.
2. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein the QF prediction network has the following structure:
[The QF prediction network structure table is reproduced only as an image in this record; see Table 1 of the Description.]
the size of the input image block is 128 x 128 pixels.
3. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 2, wherein in step two, the outputs of the fourth, sixth and eighth convolutional layers of the QF prediction network are input into the Y-channel and CbCr-channel restoration networks after channel-number adjustment and rescaling.
4. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein in step three, the Y-channel restoration network comprises six recursion modules with the same network structure, each recursion module comprising four parallel convolutional neural networks corresponding to image inputs of different scales; after feature-map concatenation and two convolutional layers, feature fusion is achieved in each network branch through up-sampling and down-sampling of feature maps of different scales, and the fused feature maps pass through two convolutional layers and a parametric rectified linear unit (PReLU) to produce the network output.
5. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein in step three, the nonlinear feature maps extracted by the QF prediction network comprise feature maps of four scales, which are fed into the corresponding four convolutional-network branches by concatenation with the wavelet-coefficient feature maps, and compressed images of different scales are concatenated with the corresponding wavelet sub-band coefficients to serve as the input of each network branch of the recursion module.
6. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 4, wherein the real and imaginary parts of the wavelet coefficients output by a recursion module in the Y-channel restoration network are summed element-wise with the real and imaginary parts of the originally input wavelet coefficients to obtain the restored wavelet coefficients; the calculation is repeated six times, and the six recursion modules share the same network parameters.
7. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein in step four, the CbCr-channel restoration network comprises six recursion modules with the same network structure; the input of each recursion module comprises the CbCr channels of the compressed image, the nonlinear feature maps extracted by the QF prediction network, and the restored Y-channel image obtained in step three; and the CbCr-channel restoration network repeatedly predicts the CbCr-channel image through residual learning, using the restored Y-channel image and the related features extracted by the QF prediction network.
8. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 7, wherein the network structure of a recursion module in the CbCr-channel restoration network comprises an encoder end and a decoder end, the encoder end performing three down-samplings and the decoder end performing three up-samplings; feature fusion is achieved between encoder and decoder feature maps of the same scale through skip connections;
each convolutional layer of the CbCr-channel restoration network outputs 64 channels, and at the encoder end the outputs of convolutional layers at different scales are concatenated with the features of the corresponding scale extracted by the QF prediction network;
and the output of the recursion module is summed element-wise with the CbCr channels of the original compressed image to obtain the restored CbCr channels; the calculation is repeated six times, and the six recursion modules share the same network parameters.
CN202111155935.3A 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network Active CN113962882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Publications (2)

Publication Number Publication Date
CN113962882A true CN113962882A (en) 2022-01-21
CN113962882B CN113962882B (en) 2023-08-25

Family

ID=79463365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155935.3A Active CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Country Status (1)

Country Link
CN (1) CN113962882B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452465A (en) * 2023-06-13 2023-07-18 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN117291962A (en) * 2023-11-27 2023-12-26 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112419238A (en) * 2020-11-03 2021-02-26 广东机电职业技术学院 Copy-paste counterfeit image evidence obtaining method based on end-to-end deep neural network
CN112509094A (en) * 2020-12-22 2021-03-16 西安交通大学 JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113362225A (en) * 2021-06-03 2021-09-07 太原科技大学 Multi-description compressed image enhancement method based on residual recursive compensation and feature fusion

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419238A (en) * 2020-11-03 2021-02-26 广东机电职业技术学院 Copy-paste counterfeit image evidence obtaining method based on end-to-end deep neural network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112509094A (en) * 2020-12-22 2021-03-16 西安交通大学 JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113362225A (en) * 2021-06-03 2021-09-07 太原科技大学 Multi-description compressed image enhancement method based on residual recursive compensation and feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Wenbin; CUI Xueying; SHANGGUAN Hong; LIU Bin: "Recursive residual encoder-decoder network for low-dose CT image denoising", Journal of Taiyuan University of Science and Technology, no. 04 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452465A (en) * 2023-06-13 2023-07-18 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN116452465B (en) * 2023-06-13 2023-08-11 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN117291962A (en) * 2023-11-27 2023-12-26 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition
CN117291962B (en) * 2023-11-27 2024-02-02 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition

Also Published As

Publication number Publication date
CN113962882B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
Luo et al. Lattice network for lightweight image restoration
CN111709895A (en) Image blind deblurring method and system based on attention mechanism
CN113962882B (en) JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Sharma et al. From pyramids to state‐of‐the‐art: a study and comprehensive comparison of visible–infrared image fusion techniques
Perumal et al. A hybrid discrete wavelet transform with neural network back propagation approach for efficient medical image compression
CN112509094A (en) JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN114331913B (en) Motion blurred image restoration method based on residual attention block
CN115187455A (en) Lightweight super-resolution reconstruction model and system for compressed image
CN112150356A (en) Single compressed image super-resolution reconstruction method based on cascade framework
CN112991169B (en) Image compression method and system based on image pyramid and generation countermeasure network
Alsayyh et al. A Novel Fused Image Compression Technique Using DFT, DWT, and DCT.
CN117593187A (en) Remote sensing image super-resolution reconstruction method based on meta-learning and transducer
CN115272131B (en) Image mole pattern removing system and method based on self-adaptive multispectral coding
CN116128722A (en) Image super-resolution reconstruction method and system based on frequency domain-texture feature fusion
CN106846286B (en) Video super-resolution algorithm for reconstructing based on a variety of complementary priori
CN114219738A (en) Single-image multi-scale super-resolution reconstruction network structure and method
CN113362241A (en) Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
Yu et al. Local excitation network for restoring a jpeg-compressed image
Hilles Spatial Frequency Filtering Using Sofm For Image Compression
Deshmukh Image compression using neural networks
CN111246205B (en) Image compression method based on directional double-quaternion filter bank
Dumitrescu et al. Image compression and noise reduction through algorithms in wavelet domain
Boudechiche et al. Ensemble leaning-CNN for reducing JPEG artifacts
CN115330635B (en) Image compression artifact removing method, device and storage medium
Pushpalatha et al. Interpolative Model on Hueristic Projection Transform for Image Compression in Cloud Services

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant