CN113962882B - JPEG image compression artifact eliminating method based on controllable pyramid wavelet network


Info

Publication number
CN113962882B
CN113962882B (application CN202111155935.3A)
Authority
CN
China
Prior art keywords
network
channel
image
cbcr
wavelet
Prior art date
Legal status
Active
Application number
CN202111155935.3A
Other languages
Chinese (zh)
Other versions
CN113962882A (en)
Inventor
张译 (Zhang Yi)
禹冬晔 (Yu Dongye)
牟轩沁 (Mou Xuanqin)
Current Assignee
Xi'an Jiaotong University
Original Assignee
Xi'an Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xi'an Jiaotong University
Priority to CN202111155935.3A
Publication of CN113962882A
Application granted
Publication of CN113962882B
Legal status: Active


Classifications

    • G06T 5/77: Image enhancement or restoration; retouching, inpainting, scratch removal
    • G06N 3/045: Neural-network architectures; combinations of networks
    • G06N 3/08: Neural-network learning methods
    • G06T 3/4038: Scaling of whole images or parts thereof; image mosaicing
    • G06T 7/90: Image analysis; determination of colour characteristics
    • G06T 9/002: Image coding using neural networks
    • G06T 2207/10024: Image acquisition modality; color image
    • G06T 2207/20081: Special algorithmic details; training, learning
    • G06T 2207/20084: Special algorithmic details; artificial neural networks [ANN]
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The method first extracts image features related to the compression level, then uses these features to guide restoration of the Y-channel image, next uses the features together with the restored Y channel to guide restoration of the CbCr-channel images, and finally transforms the result into RGB space to obtain the final restored image. The method needs no prediction of the image coding parameters and restores images at many different compression levels well, and each recovery network needs only a single trained model: skip connections avoid the vanishing- and exploding-gradient problems that may occur during training, while a shared-parameter recursion-module strategy reduces model complexity and ensures efficient operation of the algorithm. The method therefore has the advantages of a simple model, few parameters, a wide application range and a marked restoration effect.

Description

JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Technical Field
The invention belongs to the field of image processing, and particularly relates to a JPEG image compression artifact eliminating method based on a controllable pyramid wavelet network.
Background
Owing to limits on transmission bandwidth and storage capacity, images and videos shot by cameras must be compressed before use, and lossy methods, with JPEG compression as the representative, are now applied in every link of the image-processing chain. Because high-frequency image information is lost in the quantization stage, a compressed image contains compression artifacts such as blocking, ringing and blurring, which not only degrade the perceived quality of the image but also hurt the performance of computer-vision algorithms that take compressed images as input. Designing a fast and effective restoration algorithm for JPEG-compressed images therefore has broad application prospects and practical value.
Currently, JPEG image compression artifact removal techniques can be broadly divided into three types:
1) Filtering-based methods eliminate compression artifacts by filtering along block boundaries in the spatial or frequency domain. Spatial-domain methods generally select a suitable filter near each block boundary according to the characteristics of different image regions, i.e., spatially adaptive filtering. Later, more sophisticated filtering methods were developed, including shifted-window filtering of image blocks, nonlinear filtering, adaptive non-local means filtering and adaptive bilateral filtering. Frequency-domain methods recover image detail mainly by adjusting discrete cosine transform (DCT) coefficients.
2) Methods based on inverse-problem optimization treat decompression-artifact removal as the optimization and solution of an inverse problem, recovering the original image with the help of certain image priors. Typical priors include low-rank priors, quantization-constraint priors, non-local similarity and sparse-representation priors; some methods combine several priors to obtain the optimal solution of the inverse problem. Because of their complex optimization procedures, these prior-based methods are time-consuming.
3) Machine-learning-based methods learn an image mapping/transformation from a large number of original and compressed image pairs and use it to map the compressed image back to the original. Typical methods implement the mapping with convolutional neural networks (CNNs), such as ARCNN, TNRD, DnCNN, CAS-CNN, MemNet, S-Net, deep convolutional sparse coding (DCSC) networks and generative adversarial network (GAN) models. Some methods (e.g., DMCNN, DDCN, MWCNN, DPW-SDNet) use CNNs to restore the image in the spatial domain and the frequency domain separately, obtaining better restoration performance.
Among the three categories, filtering-based methods restore images poorly, while inverse-problem-based methods have high computational complexity and are time-consuming. In comparison, and aided by the development of GPU parallel computing, machine-learning-based methods achieve both better restoration performance and higher speed. However, most current machine-learning methods require the coding information of the compressed image to be predicted and are effective only for images at some compression levels, which limits their range of application. Although DnCNN overcomes this limitation by adjusting its training data, its restoration performance is only moderate and it works only on grayscale images. Other methods handle multiple compression levels by training multiple network models, which in turn occupies more storage space. What is needed, therefore, is a single unified network model that requires no knowledge of the coding information of the compressed image, is effective for grayscale and color images across compression levels, and occupies little storage space, so that it is easy to deploy on small devices.
Disclosure of Invention
The invention aims to overcome the above defects by providing a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The method first extracts image features related to the compression level (represented by the quality factor, QF), then uses these features to guide restoration of the Y-channel image, next uses the features and the restored Y channel to guide restoration of the CbCr-channel images, and finally transforms the result into RGB space to obtain the final restored image. The Y-channel recovery network comprises 6 recursion modules and takes multi-scale, multi-directional wavelet coefficients as input to predict the wavelet coefficients of the original image. The CbCr channel recovery network contains 6 recursive U-Net structures and takes the CbCr channels of the compressed image and the restored Y channel as input to obtain restored CbCr-channel images.
In order to achieve the above object, the method comprises the following steps:
step one, converting an input JPEG compressed image into a YCbCr color space;
step two, inputting the Y channel of the JPEG compressed image into a QF prediction network, extracting nonlinear feature maps using cascaded convolution layers, and rescaling the nonlinear feature maps in the third (channel) dimension before they enter the recovery networks;
step three, decomposing the Y channel of the JPEG compressed image at three scales and in eight directions via the controllable pyramid wavelet transform, generating one high-pass subband, twenty-four band-pass subbands and one low-pass subband; feeding all subband coefficients, the relevant feature maps of the QF prediction network and the compressed image at the corresponding scales together into a Y-channel recovery network; and applying the inverse wavelet transform to the subband coefficients output by the Y-channel recovery network to return to the spatial domain, obtaining the restored Y-channel image;
step four, inputting the nonlinear feature maps of the QF prediction network, the Y-channel image restored in step three and the CbCr channels of the compressed image together into a CbCr channel recovery network, so as to obtain restored CbCr-channel images;
step five, converting the YCbCr image into the RGB color space to obtain the recovered color image.
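The five steps can be summarized in the following sketch, a minimal illustration in PyTorch-style Python. Every callable name here is a hypothetical placeholder for a network or transform described in this disclosure, not an identifier from the invention itself:

```python
import torch

def restore_jpeg(rgb, rgb2ycbcr, qf_net, spwt, y_net, inv_spwt, cbcr_net, ycbcr2rgb):
    """Five-step pipeline sketch; every argument after `rgb` is a callable
    standing in for a network or transform described in the patent."""
    ycbcr = rgb2ycbcr(rgb)                        # step 1: RGB -> YCbCr
    y, cbcr = ycbcr[:, :1], ycbcr[:, 1:]
    qf_feats = qf_net(y)                          # step 2: compression-level features
    subbands = spwt(y)                            # step 3: 1 high-pass, 24 band-pass,
    restored = y_net(subbands, qf_feats, y)       #         1 low-pass subband
    y_rest = inv_spwt(restored)                   #         back to the spatial domain
    cbcr_rest = cbcr_net(cbcr, qf_feats, y_rest)  # step 4: guided CbCr recovery
    return ycbcr2rgb(torch.cat([y_rest, cbcr_rest], dim=1))  # step 5: -> RGB
```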
The structure of the QF prediction network is as follows:
the size of the input image block is 128 x 128 pixels.
In step two, the outputs of the fourth, sixth and eighth convolution layers of the QF prediction network are rescaled by channel-number adjustment and then input into the recovery networks of the Y channel and the CbCr channels.
In step three, the Y-channel recovery network comprises six recursion modules, each containing four parallel convolutional neural networks that correspond to image inputs at four different scales. After feature-map splicing and two convolution layers, the feature maps of the different scales undergo upsampling and downsampling so that feature fusion is realized on every network path; after fusion, the feature maps pass through two further convolution layers and parametric rectified linear units to give the network output.
In step three, the nonlinear feature maps extracted by the QF prediction network comprise feature maps at four scales, which are connected into the corresponding four convolutional network paths by splicing with the wavelet-coefficient feature maps; the compressed image at each scale is likewise spliced with the corresponding wavelet subband coefficients as the input of each network path of the recursion module. Since the wavelet transform coefficients are complex, the real and imaginary parts of the coefficients are extracted separately, and the real-part and imaginary-part feature maps are spliced as the network input.
The real part and the imaginary part of the wavelet coefficient output by the recursion module in the Y channel recovery network are summed with the real part and the imaginary part of the wavelet coefficient input originally element by element, so that the recovered wavelet coefficient is obtained; this calculation process is repeated six times in total, with the six recursion modules sharing the same network parameters.
In the fourth step, the CbCr channel recovery network includes six recursive modules with the same network structure, and the input of each recursive module includes the CbCr channel of the compressed image, the nonlinear feature map extracted by the QF prediction network, and the recovered Y channel image obtained in the third step, and the CbCr channel recovery network predicts the CbCr channel image through residual learning repeatedly using the recovered Y channel image and the relevant features extracted by the QF prediction network.
The network structure of the recursion module in the CbCr channel recovery network comprises an encoder end and a decoder end; the encoder end performs three downsamplings and the decoder end performs three upsamplings, and feature fusion between feature maps of the same scale at the encoder and decoder ends is realized through skip connections;
the output of every convolution layer of the CbCr channel recovery network is 64 channels, and at the encoder end the convolution-layer outputs of the different scales are spliced with the features of the corresponding scales extracted by the QF prediction network;
the recursion module output is summed element by element with the CbCr channels of the original compressed image to obtain the restored CbCr channels; this calculation is repeated six times, and the six recursion modules share the same network parameters.
Compared with the prior art, the invention extracts compression-level-related image features with a QF prediction network, restores the Y channel and the CbCr channels of the image through two separate recovery networks, and finally transforms the image into RGB space to obtain the final result. Because the QF prediction network extracts features related to the compression level, it guides the networks to adaptively restore images at different compression levels; the method therefore needs no prediction of the image coding information and handles images of many compression levels well. Each proposed recovery network needs only a single trained model: skip connections avoid the vanishing- and exploding-gradient problems that may occur during training, while the shared-parameter recursion-module strategy reduces model complexity and ensures efficient operation. Using the QF prediction network to extract image features also increases the nonlinear capacity of the network, allowing more complex image mappings/transformations to be learned. The Y-channel recovery network analyzes the multi-scale, multi-directional spatial correlation of the image via the controllable pyramid wavelet transform and simultaneously exploits the amplitude and phase information of the wavelet subband coefficients, obtaining better restoration performance. Feeding the restored Y-channel image into the CbCr channel recovery network lets the structure and texture of the luminance image assist the restoration of the color components.
Furthermore, the recovery networks share model parameters across their recursion modules, which helps reduce model complexity and ensures efficient operation of the algorithm.
Furthermore, the recovery networks use skip connections to increase network depth while avoiding the vanishing- and exploding-gradient problems that may occur during training.
Drawings
FIG. 1 is a method framework of the present invention;
FIG. 2 is a diagram of a QF-related feature extraction network according to the present invention;
FIG. 3 is a diagram of a recursive modular network in a Y-channel recovery network in accordance with the present invention;
FIG. 4 shows the method of splicing wavelet feature maps of different scales;
fig. 5 is a diagram of a recursive modular network in a CbCr channel recovery network;
FIG. 6 compares the recovery performance of the method of the present invention with other methods on images of different compression levels; (a) shows the PSNR gain when images are restored with the different algorithms; (b) shows the SSIM gain when images are restored with the different algorithms.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
In the description of the present invention, it should be understood that the embodiments described in the present invention are exemplary, and specific parameters are presented in the description of the embodiments for convenience of description of the invention only and should not be construed as limitations on the invention.
Step one: the input JPEG compressed image is converted from the RGB color space to the YCbCr color space.
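One possible implementation of this conversion is sketched below, using the full-range BT.601 matrix that JPEG/JFIF employs; the patent itself does not spell out the coefficients, so this matrix is an assumption based on the JPEG standard:

```python
import torch

def rgb_to_ycbcr(rgb: torch.Tensor) -> torch.Tensor:
    """Convert an (N, 3, H, W) RGB tensor in [0, 1] to YCbCr (JFIF/BT.601)."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return torch.cat([y, cb, cr], dim=1)
```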
Step two: the Y-channel of the compressed image is input into the QF prediction network, the nonlinear feature map is extracted using a cascaded convolutional layer, and the nonlinear feature map is rescaled in a third dimension (channel number) before being input into the recovery network.
The size of the input image block is 128×128 pixels, and the QF prediction network structure is shown in Table 1.
Table 1. QF prediction network architecture model
As shown in fig. 2, QF-related features are extracted from the second, fourth, sixth and eighth convolution layers of the QF prediction network. Assuming a W×H input image block, the output feature maps of these layers have dimensions W×H×64, W/2×H/2×128, W/4×H/4×256 and W/8×H/8×512, respectively. The channel numbers of the fourth-, sixth- and eighth-layer outputs are then adjusted to 64: for the W/2×H/2×128 feature maps, every 2 adjacent maps are averaged; for the W/4×H/4×256 feature maps, every 4 adjacent maps are averaged; for the W/8×H/8×512 feature maps, every 8 adjacent maps are averaged. The extracted QF-related features at the four scales (denoted f0, f1, f2 and f3 below) thus have dimensions W×H×64, W/2×H/2×64, W/4×H/4×64 and W/8×H/8×64, respectively.
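The channel rescaling just described, averaging groups of 2, 4 or 8 adjacent feature maps so that 128, 256 or 512 channels become 64, reduces to a reshape-and-mean. A minimal sketch, under the assumption that channels are grouped contiguously:

```python
import torch

def rescale_channels(feat: torch.Tensor, out_channels: int = 64) -> torch.Tensor:
    """Average each group of adjacent channels so that C becomes out_channels."""
    n, c, h, w = feat.shape
    group = c // out_channels          # 2, 4 or 8 for the 4th/6th/8th layers
    return feat.view(n, out_channels, group, h, w).mean(dim=2)

# e.g. the sixth-layer output (N, 256, H/4, W/4) becomes (N, 64, H/4, W/4)
```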
Step three: the Y-channel of the compressed image is decomposed in 3 dimensions and 8 directions via a controllable pyramid wavelet transform, resulting in 1 high-pass subband, 24 band-pass subbands, and 1 low-pass subband. These sub-band coefficients, together with QF related features and original compressed images of different scales, are input into a restoration network consisting of 6 recursion modules, and sub-band coefficients output from the network are transformed into a spatial domain through inverse wavelet, so that a restored Y-channel image is obtained.
As shown in fig. 1, the Y-channel image is first decomposed at 3 scales and in 8 directions using the steerable pyramid wavelet transform (SPWT), resulting in 1 high-pass subband (H0), 24 band-pass subbands of different scales and directions (the subbands at the three scales are denoted B1, B2 and B3), and 1 low-pass subband (L0). H0, B1 and the Y-channel image are combined as the first-scale input of the recursion module; B2 and the once-downsampled Y-channel image are combined as the second-scale input; B3 and the twice-downsampled Y-channel image are combined as the third-scale input; L0 is the fourth-scale input. For the complex band-pass coefficients B1, B2 and B3, the real and imaginary parts of the wavelet coefficients are extracted and their feature maps are concatenated (the high-pass and low-pass bands are real). The channel numbers of the four scale inputs of the recursion module are therefore 18, 17, 17 and 1, respectively.
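A sketch of how these four scale inputs and their channel counts arise, under the assumptions that the band-pass coefficients are complex-valued tensors with 8 orientation channels per scale, that the high-pass and low-pass bands are real, and that plain average pooling stands in for the unspecified downsampling of the Y channel:

```python
import torch

def assemble_scale_inputs(h0, b1, b2, b3, l0, y):
    """Concatenate subband coefficients with the (downsampled) Y image.

    h0: (N, 1, H, W) real high-pass band
    b1, b2, b3: complex band-pass bands with 8 orientations each,
                (N, 8, H, W), (N, 8, H/2, W/2), (N, 8, H/4, W/4)
    l0: (N, 1, H/8, W/8) real low-pass band
    y:  (N, 1, H, W) compressed Y channel
    """
    def ri(band):   # split complex coefficients into real/imaginary maps
        return torch.cat([band.real, band.imag], dim=1)      # 8 -> 16 channels

    down = torch.nn.functional.avg_pool2d                    # assumed downsampler
    s1 = torch.cat([h0, ri(b1), y], dim=1)                   # 1 + 16 + 1 = 18
    s2 = torch.cat([ri(b2), down(y, 2)], dim=1)              # 16 + 1     = 17
    s3 = torch.cat([ri(b3), down(y, 4)], dim=1)              # 16 + 1     = 17
    s4 = l0                                                  #              1
    return s1, s2, s3, s4
```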
In the recursion module, the four scale inputs first undergo convolution and parametric rectified linear unit (PReLU) operations, giving 64-channel feature maps at four scales. As shown in fig. 3, these feature maps are concatenated with the corresponding QF features (f0, f1, f2 and f3); after two convolutions and a PReLU operation, the feature maps of the different scales are merged at every scale. As shown in fig. 4, feature maps of different scales are brought to a common size by mean pooling and deconvolution. Specifically, a 1×1 convolution keeps the spatial size of a feature map unchanged, a mean-pooling operation with scale factor 2 halves its height and width, a deconvolution with scale factor 2 doubles them, and repeated mean-pooling/deconvolution operations halve/double them further. Finally, the output of the recursion module is a four-scale wavelet-coefficient residual map with channel numbers 17, 16, 16 and 1, respectively.
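The scale-matching operations of fig. 4 might be realized as follows. This is an illustrative sketch: the 64-channel width and the three operations (1×1 convolution, factor-2 mean pooling, factor-2 deconvolution) follow the text, while the module layout itself is an assumption:

```python
import torch.nn as nn

class ScaleAlign(nn.Module):
    """Bring a 64-channel feature map from one pyramid scale to another."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.same = nn.Conv2d(channels, channels, kernel_size=1)   # keep size
        self.half = nn.AvgPool2d(kernel_size=2)                    # H,W -> H/2,W/2
        self.double = nn.ConvTranspose2d(channels, channels,
                                         kernel_size=2, stride=2)  # H,W -> 2H,2W

    def forward(self, x, src_scale: int, dst_scale: int):
        if src_scale == dst_scale:
            return self.same(x)
        step = self.half if dst_scale > src_scale else self.double
        for _ in range(abs(dst_scale - src_scale)):   # repeat pooling/deconvolution
            x = step(x)
        return x
```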
As shown in fig. 1, the output of the recursion module is added pixel by pixel to the original wavelet subband coefficients, the result is fed into the recursion module again, and its output is again added pixel by pixel to the original coefficients. This process is repeated 6 times, and the 6 recursion modules share the same network parameters, reducing model complexity. Finally, the restored wavelet subband coefficients are transformed back to the spatial domain by the inverse transform (inverse SPWT), giving the restored Y-channel image. As for parameter choices, the convolution kernels of the first and last convolution layers are 7×7 and 5×5 pixels, respectively, and all other kernels are 3×3 pixels. The first and last convolution layers use mirror padding of 3 and 2 pixels, respectively, while the remaining convolution layers use zero padding of 1 pixel, keeping the input and output sizes of every layer consistent. A PReLU follows every convolution layer except the last to provide the nonlinearity.
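The shared-parameter recursion itself reduces to a short loop; in the sketch below, `module` stands in for the four-branch network of fig. 3, and the tuple layout of the subband inputs is an assumption:

```python
def recursive_restore(module, subbands, qf_feats, y_scales, steps: int = 6):
    """Apply one shared-parameter recursion module `steps` times.

    subbands: tuple of the four original scale inputs (wavelet coefficients).
    Each pass predicts a residual that is added to the ORIGINAL coefficients,
    and the sum is fed back into the same module.
    """
    current = subbands
    for _ in range(steps):                     # six passes, identical parameters
        residual = module(current, qf_feats, y_scales)
        current = tuple(r + s for r, s in zip(residual, subbands))
    return current                             # restored subband coefficients
```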
Step four: the QF-related features extracted in step two, the Y-channel image restored in step three and the CbCr channels of the compressed image are input into a CbCr channel recovery network containing 6 recursion modules, so as to obtain restored CbCr-channel images.
As shown in fig. 1, the CbCr channel recovery network adopts a structure similar to the Y-channel recovery network: it repeatedly predicts the CbCr-channel images through residual learning using the restored Y-channel image and the QF-related features, and all recursion modules share the same network parameters, reducing model complexity. As shown in fig. 5, the recursion module adopts a U-Net-like network structure; the difference is that, at the encoder end, the convolution-layer outputs of the different scales are spliced with the QF-related features of the corresponding scales, so that the network can adaptively restore images at different compression levels. Since the human eye is insensitive to image color distortion, the output of every layer is set to 64 channels, i.e., the CbCr channels are restored with a lightweight network. As for parameter choices, similar to the Y-channel recovery network, the convolution kernels of the first and last convolution layers are 7×7 and 5×5 pixels, the deconvolution layers use 2×2 kernels, and all other kernels are 3×3 pixels. The first and last convolution layers use mirror padding of 3 and 2 pixels, respectively, the remaining convolution layers use zero padding of 1 pixel, keeping the input and output sizes of every layer consistent, and a PReLU follows every convolution layer except the last.
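A compact sketch of such a recursion module is given below. The three downsamplings/upsamplings, the 64-channel layers, the skip connections and the QF-feature concatenation at the encoder follow the text; the exact layer counts per block and the choice of average pooling are assumptions:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, k=3, p=1):
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=p), nn.PReLU(cout))

class CbCrRecursionModule(nn.Module):
    """U-Net-style recursion module: 3 downsamplings, 3 upsamplings,
    64 channels per layer, QF features concatenated at each encoder scale.
    An illustrative sketch, not the exact layer list of the patent."""
    def __init__(self):
        super().__init__()
        self.inc = conv_block(3, 64)                     # CbCr (2) + restored Y (1)
        self.down = nn.ModuleList([conv_block(64 + 64, 64) for _ in range(3)])
        self.pool = nn.AvgPool2d(2)
        self.up = nn.ModuleList([
            nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2) for _ in range(3)])
        self.fuse = nn.ModuleList([conv_block(64 + 64, 64) for _ in range(3)])
        self.out = nn.Conv2d(64, 2, kernel_size=3, padding=1)   # CbCr residual

    def forward(self, cbcr, y_restored, qf_feats):       # qf_feats: 4 scales, 64 ch
        x = self.inc(torch.cat([cbcr, y_restored], dim=1))
        skips = []
        for i, enc in enumerate(self.down):              # encoder: concat QF feature
            x = enc(torch.cat([x, qf_feats[i]], dim=1))  # of matching scale,
            skips.append(x)                              # then downsample
            x = self.pool(x)
        for i, (up, fuse) in enumerate(zip(self.up, self.fuse)):
            x = up(x)                                    # decoder: upsample, then
            x = fuse(torch.cat([x, skips[-1 - i]], dim=1))  # skip-connection fusion
        return self.out(x)

# Shared-parameter recursion: the same module is applied six times, each time
# adding its residual output to the ORIGINAL compressed CbCr channels:
#   out = cbcr0
#   for _ in range(6):
#       out = module(out, y_rest, qf_feats) + cbcr0
```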
The training method of the network can be summarized as follows:
1) The invention mainly comprises three networks: the QF prediction network, the Y-channel recovery network and the CbCr channel recovery network. Because of the variable dependencies among them, the QF prediction network is trained first; then, with its parameters fixed, it extracts the QF-related features of the compressed image blocks and the Y-channel recovery network is trained; finally, with the parameters of both the QF prediction network and the Y-channel recovery network fixed, the QF-related features are extracted and the restored Y-channel images are computed, and the CbCr channel recovery network is trained.
2) The QF prediction network was trained using the VOC2012 database. Specifically: each RGB image in the database is compressed with a random QF value, where QF is an integer in [5, 95], yielding 12700 compressed images; the compressed images are then converted to YCbCr space, and 105234 non-overlapping 128×128-pixel image blocks are extracted from the Y-channel images for network training with the L1 loss function.
3) The Y-channel and CbCr channel recovery networks were trained using the Berkeley Segmentation Dataset (BSD), the DIV2K database, the Waterloo Exploration Database (WED) and the Flickr2K database; BSD contains 400 images (200 each in its training and test sets), DIV2K 900 images, WED 4744 images and Flickr2K 2000 images. Specifically: each reference image is compressed with a random QF value, where QF is an integer in {10:20, 22:2:30, 35:5:60, 70:10:90} (MATLAB-style start:step:end ranges), generating a total of 8044 compressed images; the reference and compressed images are then converted to YCbCr space, and 583625 mutually non-overlapping 128×128-pixel image blocks are extracted for network training. Training the Y-channel recovery network uses Y-channel image blocks as training data, with a loss function that is a linear combination of the pixel mean squared error loss ($l_{MSE}$) and the structural similarity loss ($l_{SSIM}$), i.e., $l = l_{MSE} + \lambda \cdot l_{SSIM}$ with $\lambda = 0.001$; training the CbCr channel recovery network uses CbCr-channel image blocks as training data, with the pixel mean squared error loss alone.
4) The pixel mean squared error loss ($l_{MSE}$) is calculated as

$$l_{MSE} = \frac{1}{WH}\sum_{i=1}^{W}\sum_{j=1}^{H}\bigl(I(i,j) - I_R(i,j)\bigr)^2$$

where $I(i,j)$ and $I_R(i,j)$ denote the pixel values at spatial position $(i,j)$ of the reference image $I$ and the restored image $I_R$, respectively, and $W$ and $H$ denote the width and height of the image. The structural similarity loss ($l_{SSIM}$) is calculated as

$$l_{SSIM} = 1 - \overline{\mathrm{SSIM}(I, I_R)}$$

where $\overline{\mathrm{SSIM}(I, I_R)}$ denotes the mean of the local SSIM map, computed as

$$\mathrm{SSIM}(I, I_R) = \frac{(2\mu_I\,\mu_{I_R} + C_1)(2\sigma_{I I_R} + C_2)}{(\mu_I^2 + \mu_{I_R}^2 + C_1)(\sigma_I^2 + \sigma_{I_R}^2 + C_2)}$$

where $\mu_I$ ($\mu_{I_R}$) and $\sigma_I$ ($\sigma_{I_R}$) denote the local mean and local standard deviation of $I$ ($I_R$), $\sigma_{I I_R}$ is the local covariance between $I$ and $I_R$, and $C_1$ and $C_2$ are constants taking the same values as in the SSIM method.
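In code, the combined Y-channel loss might be assembled as follows; `ssim_fn` is an assumed SSIM helper (for example from the pytorch-msssim package), since the patent does not prescribe a particular implementation:

```python
import torch.nn.functional as F

def y_channel_loss(restored, reference, ssim_fn, lam: float = 0.001):
    """l = l_MSE + lambda * l_SSIM, with l_SSIM = 1 - mean SSIM.
    `ssim_fn` must return the mean SSIM over the batch (assumed helper)."""
    l_mse = F.mse_loss(restored, reference)       # pixel mean squared error
    l_ssim = 1.0 - ssim_fn(restored, reference)   # structural similarity loss
    return l_mse + lam * l_ssim

# The CbCr recovery network is trained with F.mse_loss alone.
```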
5) The experiments use the PyTorch deep-learning framework on a workstation with an 8-core Intel i9-9900K 3.60 GHz CPU and an NVIDIA GeForce RTX 2080 SUPER GPU. Network parameters are initialized with samples from the normal distribution N(1, 0.02), and the initial PReLU slope is 0.1. Optimization uses the Adam algorithm with an initial learning rate of 2×10⁻⁴ and first/second-moment exponential decay rates of 0.9 and 0.999. When training the QF prediction network, the batch size is 64 and the learning rate is multiplied by 0.8 after every epoch, for 120 epochs in total. When training the two recovery networks, the batch size is 4 and the learning rate is multiplied by 0.9 every 20000 iterations, for 4 epochs in total.
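One reading of this training configuration as PyTorch optimizer/scheduler objects; the `model` below is a stand-in placeholder, and interpreting the two schedules as multiplicative decay is an assumption:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(1, 64, 3, padding=1)   # placeholder for any of the three networks

# Adam with lr 2e-4 and first/second-moment decay rates 0.9 / 0.999
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))

# QF prediction network: lr multiplied by 0.8 after every epoch (120 epochs, batch 64)
qf_sched = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.8)

# Recovery networks: lr multiplied by 0.9 every 20000 iterations (4 epochs, batch 4)
rec_sched = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20000, gamma=0.9)
```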
The test method of the network can be summarized as follows:
1) Restoration of a compressed image of known QF values
The reference images of the LIVE, CSIQ, BSD100 (the 100 images of the BSD validation set) and Urban100 databases were selected for the performance tests. Specifically: each reference image of each database is JPEG-compressed at eight different compression levels (QF values of 10, 20, 30, 40, 50, 60, 70 and 80); the compressed images are then restored using the different algorithms/network models; finally, the restored images are evaluated with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indices. Table 2 compares the recovery performance of the invention with other methods on JPEG compressed images of known QF values.
Table 2. Comparison of the test performance of the invention (SPW-Net) with other methods on the LIVE, BSD100, CSIQ and Urban100 databases
2) Restoration of compressed images of unknown QF values
The SDILL database is selected for the performance test, which comprises two parts: (1) the JPEG compressed images with QF ∈ [10, 90] in the SDILL database are tested directly; (2) every original image of the SDILL database is JPEG-compressed at each QF from 10 to 90 (step size 1), and the resulting 1620 compressed images are restored. Table 3 and fig. 6 compare the performance of the different methods on these two tests, respectively. The experimental results show that, compared with other methods, the proposed method restores images of different compression levels better.
Table 3. Comparison of the test performance of the invention (SPW-Net) with other methods on the SDILL database
In summary, the JPEG image compression artifact elimination method based on the controllable pyramid wavelet network first extracts compression-level-related image features with the QF prediction network, then restores the Y channel and the CbCr channels of the image through two recovery networks, and finally transforms the image into RGB space to obtain the final result. The proposed method needs no prediction of the image coding parameters and restores grayscale and color images at different compression levels well, and each recovery network needs only a single trained model: skip connections avoid the vanishing- and exploding-gradient problems that may occur during training, while the shared-parameter recursion-module strategy reduces model complexity and ensures efficient operation of the algorithm. The method therefore has the advantages of a simple model, few parameters, a wide application range and a marked restoration effect.
Although specific embodiments of the present invention have been described above with reference to the accompanying drawings, the invention is not limited to them. The embodiments above are merely instructive and illustrative, not restrictive. Those skilled in the art, enlightened by this disclosure, may devise many further forms of JPEG image compression artifact removal without departing from the scope of the invention as claimed.

Claims (5)

1. The JPEG image compression artifact eliminating method based on the controllable pyramid wavelet network is characterized by comprising the following steps of:
step one, converting an input JPEG compressed image into a YCbCr color space;
inputting a Y channel of the JPEG compressed image into a QF prediction network, extracting a nonlinear feature map by using a cascade convolution layer, and rescaling the nonlinear feature map in a third dimension before inputting a recovery network;
step three, the Y channel of the JPEG compressed image is subjected to controllable pyramid wavelet transformation and is decomposed in three dimensions and eight directions, so that a high-pass sub-band, twenty-four band-pass sub-bands and a low-pass sub-band are generated, all sub-band coefficients, a relevant characteristic diagram of a QF prediction network and compressed images corresponding to different dimensions are sent into a Y channel recovery network together, and sub-band coefficients output by the Y channel recovery network are subjected to inverse wavelet transformation to a spatial domain, so that a recovered Y channel image is obtained;
the Y channel recovery network comprises six recursion modules with the same network structure, the recursion modules comprise four parallel convolutional neural networks, the four convolutional neural networks respectively correspond to image inputs with different scales, after characteristic image splicing and two-layer convolutional layer operation, characteristic images with different scales are subjected to up-sampling and down-sampling operation, characteristic fusion is realized on each path of neural network, and after the characteristic images are fused, network output is obtained after the characteristic images pass through the two-layer convolutional layers and the parameter correction linear units;
inputting the nonlinear feature map in the QF prediction network, the Y channel image restored in the step three and the CbCr channel of the compressed image into a CbCr channel restoration network together, so as to obtain a restored CbCr channel image;
the CbCr channel recovery network comprises six recursion modules with the same network structure, the input of each recursion module comprises a CbCr channel of a compressed image, a nonlinear feature map extracted by a QF prediction network and a recovered Y-channel image obtained in the step three, and the CbCr channel recovery network predicts the CbCr channel image repeatedly through residual error learning by utilizing the recovered Y-channel image and relevant features extracted by the QF prediction network;
the network structure of the recursion module in the CbCr channel recovery network comprises an encoder end and a decoder end, wherein the encoder end finishes three downsampling and the decoder end finishes three upsampling; feature fusion is realized between feature graphs with the same scale at the encoder end and the decoder end through jump connection;
the output of each convolution layer of the CbCr channel recovery network is 64 channels, and at the coding end, the convolution layer outputs with different scales are spliced and combined with the features extracted by the QF prediction network and corresponding to the scales;
the recursion module outputs and the original compressed image CbCr channel is summed element by element to obtain a restored CbCr channel, the calculation process is repeated six times, and the six recursion modules share the same network parameters;
and fifthly, converting the YCbCr channel image into an RGB color space to obtain a recovered color image.
2. The method for eliminating JPEG image compression artifacts based on controllable pyramid wavelet network according to claim 1, wherein the structure of QF prediction network is as follows:
the size of the input image block is 128 x 128 pixels.
3. The method for removing artifacts from JPEG image compression based on controllable pyramid wavelet network according to claim 2, wherein in step two, the outputs of the fourth, sixth and eighth convolution layers of the QF prediction network are input to the restoration network of the Y channel and CbCr channel after being scaled by the channel number adjustment.
4. The method for eliminating JPEG image compression artifacts based on the controllable pyramid wavelet network according to claim 1, wherein in the third step, the nonlinear feature map extracted by the QF prediction network comprises four scale feature maps, the four scale feature maps are respectively connected into corresponding four paths of convolution neural networks in a mode of splicing with the wavelet coefficient feature maps, compressed images with different scales are respectively spliced with corresponding wavelet subband coefficients to serve as inputs of each path of the neural network of the recursion module, and the real part and the imaginary part of the coefficients are respectively extracted due to the fact that wavelet transformation coefficients are complex, and the real part feature map and the imaginary part feature map are spliced to serve as inputs of the network.
5. A method for eliminating JPEG image compression artifacts based on a controllable pyramid wavelet network according to claim 3, wherein the real part and the imaginary part of the wavelet coefficient outputted by the recursion module in the Y-channel restoration network are summed with the real part and the imaginary part of the wavelet coefficient inputted originally element by element, thereby obtaining the restored wavelet coefficient; this calculation process is repeated six times in total, with the six recursion modules sharing the same network parameters.
CN202111155935.3A 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network Active CN113962882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Publications (2)

Publication Number Publication Date
CN113962882A (en) 2022-01-21
CN113962882B (en) 2023-08-25

Family

ID=79463365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155935.3A Active CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Country Status (1)

Country Link
CN (1) CN113962882B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452465B (en) * 2023-06-13 2023-08-11 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN117291962B (en) * 2023-11-27 2024-02-02 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419238A (en) * 2020-11-03 2021-02-26 广东机电职业技术学院 Copy-paste counterfeit image evidence obtaining method based on end-to-end deep neural network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112509094A (en) * 2020-12-22 2021-03-16 西安交通大学 JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113362225A (en) * 2021-06-03 2021-09-07 太原科技大学 Multi-description compressed image enhancement method based on residual recursive compensation and feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Liu Wenbin (刘文斌), Cui Xueying (崔学英), Shangguan Hong (上官宏), Liu Bin (刘斌): "Recursive residual encoder-decoder network for low-dose CT image denoising", Journal of Taiyuan University of Science and Technology, No. 04; full text *

Also Published As

Publication number Publication date
CN113962882A (en) 2022-01-21

Similar Documents

Publication Publication Date Title
CN110969577B (en) Video super-resolution reconstruction method based on deep double attention network
CN109886871B (en) Image super-resolution method based on channel attention mechanism and multi-layer feature fusion
CN112330542A (en) Image reconstruction system and method based on CRCSAN network
CN109272452B (en) Method for learning super-resolution network based on group structure sub-band in wavelet domain
CN113962882B (en) JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Luo et al. Lattice network for lightweight image restoration
CN111951164B (en) Image super-resolution reconstruction network structure and image reconstruction effect analysis method
Sharma et al. From pyramids to state‐of‐the‐art: a study and comprehensive comparison of visible–infrared image fusion techniques
CN112509094A (en) JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN115829834A (en) Image super-resolution reconstruction method based on half-coupling depth convolution dictionary learning
CN114972036A (en) Blind image super-resolution reconstruction method and system based on fusion degradation prior
CN112150356A (en) Single compressed image super-resolution reconstruction method based on cascade framework
Amaranageswarao et al. Residual learning based densely connected deep dilated network for joint deblocking and super resolution
CN114022356A (en) River course flow water level remote sensing image super-resolution method and system based on wavelet domain
Alsayyh et al. A Novel Fused Image Compression Technique Using DFT, DWT, and DCT.
Xin et al. FISTA-CSNet: a deep compressed sensing network by unrolling iterative optimization algorithm
CN116563167A (en) Face image reconstruction method, system, device and medium based on self-adaptive texture and frequency domain perception
CN114549361B (en) Image motion blur removing method based on improved U-Net model
CN114331853B (en) Single image restoration iteration framework based on target vector updating module
Zhang et al. Multi-domain residual encoder–decoder networks for generalized compression artifact reduction
Li et al. Compression artifact removal with stacked multi-context channel-wise attention network
CN114219738A (en) Single-image multi-scale super-resolution reconstruction network structure and method
Abd-Elhafiez Image compression algorithm using a fast curvelet transform
Hilles Spatial Frequency Filtering Using Sofm For Image Compression
Luo et al. Super-resolving compressed images via parallel and series integration of artifact reduction and resolution enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant