WO2021042270A1 - Compression artifacts reduction method based on dual-stream multi-path recursive residual network - Google Patents
- Publication number: WO2021042270A1 (application PCT/CN2019/104234)
- Authority
- WO
- WIPO (PCT)
Classifications
- G06T5/77
- G06T5/60
- G06T2207/20081—Training; Learning (G06T—Image data processing or generation, in general; G06T2207/20—Special algorithmic details)
- G06T2207/20084—Artificial neural networks [ANN] (G06T—Image data processing or generation, in general; G06T2207/20—Special algorithmic details)
Definitions
- The invention relates to a method for removing the compression artifacts that arise during the storage and transmission of media files such as images and videos.
- Compression methods fall into two categories: lossless compression (e.g., PNG) and lossy compression (e.g., JPEG), which achieves a higher compression ratio.
- Lossy compression, however, often causes irreversible information loss and compression artifacts such as ringing, blocking, and blur, especially at low-bit-rate encoding. These artifacts not only degrade the user experience but also adversely affect many downstream image processing tasks.
- JPEG is currently the most widely used image compression standard.
- JPEG applies a block-based discrete cosine transform (BDCT) followed by coarse quantization; this reduces the statistical redundancy between pixels and achieves a high compression ratio.
- Because each block is transformed and quantized independently, intensity discontinuities are prone to appear at block boundaries, and the truncation of high-frequency DCT coefficients also causes ringing and blurring artifacts. The artifacts produced by lossy compression can therefore generally be regarded as hybrid artifacts.
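As a concrete illustration, the toy sketch below (an assumption of this edit, not code from the patent) pushes a random image through a JPEG-like pipeline: an orthonormal 8x8 block DCT, uniform quantization of every coefficient, and the inverse transform. A coarse quantization step discards far more information than a fine one, which is the source of the hybrid artifacts described above.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (the transform JPEG applies per block)."""
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] /= np.sqrt(2)
    return M * np.sqrt(2.0 / n)

def bdct_roundtrip(img, step):
    """Transform each 8x8 block, quantize every coefficient with a uniform
    step, and inverse-transform -- a toy stand-in for JPEG's pipeline
    (real JPEG uses a per-frequency quantization table)."""
    D = dct_matrix()
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            block = img[y:y+8, x:x+8].astype(float)
            coeffs = D @ block @ D.T                 # forward BDCT
            coeffs = np.round(coeffs / step) * step  # uniform quantization
            out[y:y+8, x:x+8] = D.T @ coeffs @ D     # inverse BDCT
    return out

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, (16, 16))
err_fine = np.abs(img - bdct_roundtrip(img, step=1.0)).mean()
err_coarse = np.abs(img - bdct_roundtrip(img, step=64.0)).mean()
```

With a fine step the round trip is nearly lossless, while the coarse step leaves a large, irrecoverable reconstruction error concentrated at block boundaries and high frequencies.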
- Early compression-artifact removal relied mainly on filtering methods, which reduced the blocking effect with manually designed filters applied at block boundaries.
- Later approaches include wavelet-transform methods, the shape-adaptive discrete cosine transform (SA-DCT), and methods based on sparse coding. When these methods are used to recover an artifact-free image from a compressed one, the result is usually accompanied by noisy edges and unnaturally smooth regions.
- Methods based on deep learning obtain the best results by learning the nonlinear mapping between the compressed image and the original image, although the reconstructed image can still appear excessively smooth. While deeper networks provide better performance, they have too many parameters, are difficult to train, and incur higher model-storage costs than traditional methods.
- Compression artifact reduction (CAR) can be divided into three subtasks: de-blocking, de-ringing, and de-blurring. De-blocking and de-ringing require suppressing interfering high-frequency information, whereas de-blurring requires enhancing sharp edges and useful high-frequency information.
- the object of the present invention is to provide a compression artifact removal method based on a dual-stream multi-path recursive residual network that can significantly suppress compression artifacts, reduce distortion, and further enhance reconstructed images.
- the present invention achieves the above-mentioned objects in this way:
- The compression-artifact removal method is based on a dual-stream multi-path recursive residual network architecture that uses the high-frequency (HF) and low-frequency (LF) components of the image to remove compression artifacts in a targeted manner. First, the network decomposes the compressed, distorted image into a texture layer (containing the HF component) and a structure layer (containing the LF component); next, a multi-path recursive residual network enhances the texture and structure information separately; finally, the structure and texture information are merged and fed into a regression network to generate the final reconstructed image.
- A weight-sharing strategy is adopted across the residual units within each branch of the dual stream, so the number of training parameters per branch is fixed and equivalent to that of a 4-layer convolutional neural network.
- The decomposition is written I_lq = I_s + I_t, where I_lq denotes the compressed, distorted image, I_s the structure layer corresponding to the coarse information of the distorted image, and I_t the texture layer corresponding to its fine information.
- The network architecture is composed of several recursive residual units (RRU) and intermediate residual blocks (IRB), and it integrates three kinds of residual learning: global, local, and multi-path intermediate residual learning. Recursion means that the same weights are used across the feature maps, i.e., parameter sharing; although the number of learned parameters would otherwise grow linearly as the network deepens, weight sharing between the recursive units keeps it bounded.
- The RRU consists mainly of two convolutional layers and a ReLU activation layer, and is expressed as
X_u′ = F_u(X_u) = X_u + f_u^2(f_u^1(X_u)), with f_u^i(X) = W_u^i ∗ σ(X),
where X_u and X_u′ are the input and output of the u-th RRU, F_u is the mapping function of the RRU, f_u^i is the mapping function of the i-th convolutional layer in the u-th RRU, W_u^i is the weight of that layer, and σ is the ReLU activation function.
- In each IRB, skip connections pass low-level features to the following network layer of the residual block. Let g_b denote the representation function of the b-th IRB, with input X_b and output X_b′, and let W_b^i denote the weight of the i-th residual unit of the b-th recursive block; an IRB composed of two RRUs is then expressed as
X_b′ = g_b(X_b) = F_b^2(F_b^1(X_b)) + X_b.
A network with two IRBs is accordingly written
X′ = G(X) = f_rec(g_2(g_1(f(X)))) + X,
where X and X′ are the input and output of the network, f and f_rec are the mapping functions of the first and last convolutional layers in the entire dual-stream network, and G denotes the overall mapping function of the proposed network.
- The beneficial effect of the present invention is that, for removing artifacts from lossy-compressed images, it provides a dual-stream multi-path recursive residual network architecture.
- Figure 1 is a schematic structural diagram of the dual-stream multi-path recursive residual network of the present invention
- Figure 2 is a schematic structural diagram of the recursive residual unit of the present invention.
- The present invention designs a dual-stream multi-path recursive residual network architecture that uses the high-frequency (HF) and low-frequency (LF) components of an image to remove compression artifacts in a targeted manner. First, the network decomposes the compressed, distorted image into a texture layer (containing the HF component) and a structure layer (containing the LF component); next, a multi-path recursive residual network enhances the texture and structure information separately; finally, the structure and texture information are merged and fed into a regression network to generate the final reconstructed image.
- A weight-sharing strategy is adopted across the residual units within each branch, so the number of training parameters per branch is fixed and equivalent to that of a 4-layer convolutional neural network.
- First, the desired structure layer is obtained by minimizing the L0 norm; the difference between the compressed, distorted image and the structure layer is then computed and taken as the corresponding texture layer. In this way the compressed, distorted image is decomposed into a structure layer (containing the LF component) and a texture layer (containing the HF component).
- The decomposition is written I_lq = I_s + I_t, where I_lq denotes the compressed, distorted image, I_s the structure layer corresponding to the coarse information of the distorted image, and I_t the texture layer corresponding to its fine information.
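A minimal sketch of the decomposition follows, with a box blur standing in for the L0-norm smoothing (which requires an iterative solver and is not reproduced here); the additive relation I_lq = I_s + I_t holds exactly regardless of the smoother chosen.

```python
import numpy as np

def decompose(i_lq, k=5):
    """Split a distorted image into structure (LF) and texture (HF) layers.
    A k x k box blur is an assumed stand-in for the patent's L0 smoothing;
    the texture layer is simply the residual, so I_lq = I_s + I_t exactly."""
    pad = k // 2
    padded = np.pad(i_lq, pad, mode="edge")
    i_s = np.zeros_like(i_lq, dtype=float)
    h, w = i_lq.shape
    for y in range(h):
        for x in range(w):
            i_s[y, x] = padded[y:y+k, x:x+k].mean()   # local average = LF
    i_t = i_lq - i_s                                  # residual = HF
    return i_s, i_t

rng = np.random.default_rng(1)
i_lq = rng.uniform(0, 255, (12, 12))
i_s, i_t = decompose(i_lq)
# The decomposition reconstructs the input exactly: I_lq = I_s + I_t
assert np.allclose(i_s + i_t, i_lq)
```

The structure layer is smoother than the input by construction, and the texture layer carries whatever the smoother removed, blocking and ringing included.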
- a recursive residual network based on multi-path is designed.
- The entire network is composed of several recursive residual units (RRU) and intermediate residual blocks (IRB), integrating three kinds of residual learning: global, local, and multi-path intermediate residual learning.
- Recursion refers to using the same weights across the feature maps, i.e., parameter sharing. As the network deepens, the number of learned parameters would otherwise increase linearly; weight sharing between the recursive units therefore limits the number of parameters to be learned.
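The parameter-bounding effect of weight sharing can be checked with simple arithmetic. The sketch below assumes 3x3 convolutions, 64 feature channels, and single-channel input and output layers; none of these numbers are specified at this point in the text.

```python
def conv_params(c_in, c_out, k=3):
    # weights plus biases of one convolutional layer
    return k * k * c_in * c_out + c_out

def branch_params(U, B, channels=64, k=3):
    """Trainable parameters of one branch (structure or texture stream).
    The two conv layers inside the RRU are shared across all U*B recursive
    units, so only the first conv, the two shared RRU convs, and the last
    conv contribute -- four layers' worth of parameters. U and B are
    accepted only to emphasize that they do not affect the count."""
    first = conv_params(1, channels, k)           # assumed 1-channel input
    shared_rru = 2 * conv_params(channels, channels, k)
    last = conv_params(channels, 1, k)
    return first + shared_rru + last

# The same parameter budget regardless of how many recursive units run
assert branch_params(9, 1) == branch_params(3, 3) == branch_params(1, 9)
```

Without sharing, the RRU term would scale by U*B; with sharing it is constant, which is exactly the "equivalent to a 4-layer convolutional neural network" claim.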
- The RRU consists mainly of two convolutional layers and a ReLU activation layer; its structure is shown in Figure 2. It has the same structure as the residual unit in ResNet, differing only in the activation order: in each ResNet residual unit the activation function follows the convolutional layer, whereas the RRU applies the activation layers (BN and ReLU) before the convolutional layer. The RRU is therefore expressed as
X_u′ = F_u(X_u) = X_u + f_u^2(f_u^1(X_u)), with f_u^i(X) = W_u^i ∗ σ(X),
where X_u and X_u′ are the input and output of the u-th RRU and σ is the ReLU activation function.
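A compact sketch of one pre-activation RRU follows, using 1x1 convolutions (per-pixel channel mixing via einsum) in place of the patent's spatial convolutions and omitting batch normalization; both simplifications are assumptions made to keep the example short.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def rru(x, w1, w2):
    """One recursive residual unit, pre-activation style: each convolution
    is preceded by the activation, and the input is added back at the end.
    w1, w2 have shape (out_channels, in_channels); x is C x H x W."""
    h = np.einsum("oc,chw->ohw", w1, relu(x))   # f1: activation, then conv
    y = np.einsum("oc,chw->ohw", w2, relu(h))   # f2: activation, then conv
    return x + y                                # local residual connection

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 4, 4))              # C x H x W feature map
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = rru(x, w1, w2)
# With zero weights the residual branch vanishes and the unit is the identity
assert np.allclose(rru(x, np.zeros((8, 8)), np.zeros((8, 8))), x)
```

Because the weights w1, w2 are the same objects for every recursive call, stacking many RRUs in a loop reuses them, which is the weight sharing described above.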
- In each IRB, skip connections pass low-level features to the following network layer of the residual block. There are therefore multiple skip connections between the global connection and the local connections, as shown by the curved arrows in the lower middle of Figure 1.
- Let g_b denote the representation function of the b-th IRB, with input X_b and output X_b′, and let W_b^i denote the weight of the i-th residual unit of the b-th recursive block. An IRB composed of two RRUs is then expressed as
X_b′ = g_b(X_b) = F_b^2(F_b^1(X_b)) + X_b.
- A network with two IRBs is accordingly written
X′ = G(X) = f_rec(g_2(g_1(f(X)))) + X,
where X and X′ are the input and output of the network, f and f_rec are the mapping functions of the first and last convolutional layers in the entire dual-stream network, and G denotes the overall mapping function of the proposed network.
- The network proposed by the present invention has a flexible combinatorial structure: given a specific number of network layers, the numbers of RRUs and IRBs can be adjusted freely. Denoting the number of RRUs per IRB as U and the number of IRBs as B, and counting the two convolutional layers in each RRU together with the first and last convolutional layers of the stream, the number of network layers is d = 2 × U × B + 2.
- the network structure has the following three different types:
- A. 1B9U: a single IRB block containing 9 RRUs;
- B. 3B3U: 3 IRB blocks, each containing 3 RRUs;
- C. 9B1U: 9 IRB blocks, each containing only one RRU.
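Reading the layer count as d = 2·U·B + 2 (two convolutions per RRU plus the first and last convolutional layers; an interpretation of this edit rather than the patent's exact expression), the three configurations all share the same depth because U·B = 9 in each case:

```python
def depth(U, B):
    """Layer count under the assumed reading d = 2*U*B + 2: two conv
    layers per RRU, times U RRUs per IRB, times B IRBs, plus the first
    and last conv layers of the stream."""
    return 2 * U * B + 2

configs = {"1B9U": (9, 1), "3B3U": (3, 3), "9B1U": (1, 9)}
depths = {name: depth(U, B) for name, (U, B) in configs.items()}
# All three combinations keep U*B = 9, hence the same layer count
assert len(set(depths.values())) == 1
```

This is why, for a fixed depth budget, U and B can be traded off freely: any factorization of the same U·B product yields the same number of layers.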
- The 3B3U combination includes all three kinds of residual learning and is applied to both the structure stream and the texture stream designed in the present invention. The combinatorial structure is flexible, i.e., many other combinations are possible.
- the purpose of the structure layer is to restore the high-frequency information lost in the image.
- Processing of the texture layer aims to remove compression artifacts while preserving details such as the edges of the original image.
- Supervised learning is performed against the texture layer of the original, undistorted image, and the designed network structure, comprising recursive residual units and intermediate residual blocks, can greatly suppress the strong blocking and ringing artifacts in the texture layer.
- The structure-stream and texture-stream branches operate in parallel and output the enhanced structure layer and the enhanced texture layer; the corresponding outputs of the two branches are then added pixel by pixel to obtain an enhanced image. Finally, the enhanced image is fed into a nonlinear regression network to further improve the reconstructed image.
- the structure of the regression network is the same as the network applied to the structure or texture stream.
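The parallel-branch fusion can be sketched as below, with the three callables standing in for trained sub-networks (hypothetical placeholders, since the patent does not define their code):

```python
import numpy as np

def dual_stream_reconstruct(i_s, i_t, enhance_s, enhance_t, regress):
    """Fuse the two branches as described: each stream enhances its layer
    in parallel, the outputs are added pixel by pixel, and the sum is
    refined by a regression network of the same structure as the streams."""
    s_hat = enhance_s(i_s)            # enhanced structure layer
    t_hat = enhance_t(i_t)            # enhanced texture layer
    fused = s_hat + t_hat             # pixel-wise addition
    return regress(fused)             # nonlinear regression refinement

# With identity stand-ins, fusion simply inverts the decomposition
rng = np.random.default_rng(3)
i_s = rng.uniform(0, 255, (8, 8))
i_t = rng.standard_normal((8, 8))
out = dual_stream_reconstruct(i_s, i_t, lambda x: x, lambda x: x, lambda x: x)
assert np.allclose(out, i_s + i_t)
```

In the trained system each callable would be a multi-path recursive residual network; the identity lambdas here only verify that the plumbing reproduces I_s + I_t.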
Claims (6)
- A compression-artifact removal method based on a dual-stream multi-path recursive residual network, characterized in that: based on the dual-stream multi-path recursive residual network architecture, the high-frequency (HF) and low-frequency (LF) components of the image are used to remove compression artifacts in a targeted manner. First, the network decomposes the compressed, distorted image into a texture layer (containing the HF component) and a structure layer (containing the LF component); next, a multi-path recursive residual network enhances the texture and structure information separately; finally, the structure and texture information are merged and fed into a regression network to generate the final reconstructed image.
- The compression-artifact removal method based on a dual-stream multi-path recursive residual network according to claim 1, characterized in that: a weight-sharing strategy is adopted across the residual units within each branch of the dual stream, so the number of training parameters of each branch is fixed and equivalent to that of a 4-layer convolutional neural network.
- The compression-artifact removal method based on a dual-stream multi-path recursive residual network according to claim 1, characterized in that: the desired structure layer is first obtained by minimizing the L0 norm; the difference between the compressed, distorted image and the structure layer is then computed and taken as the corresponding texture layer, expressed as
I_lq = I_s + I_t,
where I_lq denotes the compressed, distorted image, I_s the structure layer corresponding to the coarse information of the distorted image, and I_t the texture layer corresponding to its fine information.
- The compression-artifact removal method based on a dual-stream multi-path recursive residual network according to claim 1, characterized in that: the network architecture is composed of several recursive residual units (RRU) and intermediate residual blocks (IRB) and integrates three kinds of residual learning: global, local, and multi-path intermediate residual learning. Recursion refers to using the same weights across the feature maps, i.e., parameter sharing; as the network deepens the number of learned parameters would otherwise increase linearly, and weight sharing between the recursive units limits the number of parameters to be learned.
- The compression-artifact removal method based on a dual-stream multi-path recursive residual network according to claim 4, characterized in that: the RRU consists mainly of two convolutional layers and a ReLU activation layer and is expressed as
X_u′ = F_u(X_u) = X_u + f_u^2(f_u^1(X_u)), with f_u^i(X) = W_u^i ∗ σ(X),
where X_u and X_u′ are the input and output of the u-th RRU, F_u is the mapping function of the RRU, f_u^i is the mapping function of the i-th convolutional layer in the u-th RRU, W_u^i is the weight of that layer, and σ is the ReLU activation function.
- The compression-artifact removal method based on a dual-stream multi-path recursive residual network according to claim 4, characterized in that: in each IRB, skip connections pass low-level features to the following network layer of the residual block. Let g_b denote the representation function of each IRB, with the input and output of the b-th IRB being X_b and X_b′ and W_b^i denoting the weight of the i-th residual unit of the b-th recursive block; an IRB composed of two RRUs is then expressed as
X_b′ = g_b(X_b) = F_b^2(F_b^1(X_b)) + X_b.
A network with two IRBs is therefore expressed as
X′ = G(X) = f_rec(g_2(g_1(f(X)))) + X,
where X and X′ are the input and output of the network, f and f_rec are the mapping functions of the first and last convolutional layers in the entire dual-stream network, and G denotes the overall mapping function of the proposed network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/104234 WO2021042270A1 (en) | 2019-09-03 | 2019-09-03 | Compression artifacts reduction method based on dual-stream multi-path recursive residual network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/104234 WO2021042270A1 (en) | 2019-09-03 | 2019-09-03 | Compression artifacts reduction method based on dual-stream multi-path recursive residual network |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021042270A1 true WO2021042270A1 (en) | 2021-03-11 |
Family
ID=74852002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/104234 WO2021042270A1 (en) | 2019-09-03 | 2019-09-03 | Compression artifacts reduction method based on dual-stream multi-path recursive residual network |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2021042270A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107798665A (en) * | 2017-11-07 | 2018-03-13 | 天津大学 | Underwater picture Enhancement Method based on structural texture layering |
CN108460726A (en) * | 2018-03-26 | 2018-08-28 | 厦门大学 | A kind of magnetic resonance image super-resolution reconstruction method based on enhancing recurrence residual error network |
CN108921789A (en) * | 2018-06-20 | 2018-11-30 | 华北电力大学 | Super-resolution image reconstruction method based on recurrence residual error network |
CN109509152A (en) * | 2018-12-29 | 2019-03-22 | 大连海事大学 | A kind of image super-resolution rebuilding method of the generation confrontation network based on Fusion Features |
2019
- 2019-09-03: WO PCT/CN2019/104234 patent/WO2021042270A1/en, active Application Filing
Non-Patent Citations (1)
Title |
---|
ZHOU DENG-WEN , ZHAO LI-JUAN , DUAN RAN , CHAI XIAO-LIAN: "Image Super-resolution Based on Recursive Residual Networks", ACTA AUTOMATICA SINICA, vol. 45, no. 6, 30 June 2019 (2019-06-30), pages 1157 - 1165, XP055787602, ISSN: 0254-4156, DOI: 10.16383/j.aas.c180334 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | An efficient deep convolutional neural networks model for compressed image deblocking | |
Liu et al. | Data-driven sparsity-based restoration of JPEG-compressed images in dual transform-pixel domain | |
Chen et al. | DPW-SDNet: Dual pixel-wavelet domain deep CNNs for soft decoding of JPEG-compressed images | |
CN108900848B (en) | Video quality enhancement method based on self-adaptive separable convolution | |
JP2021016150A (en) | Loop filtering device and image decoding device | |
JPH07231450A (en) | Filter device and method for reducing artifact in moving video picture signal system | |
CN112509094B (en) | JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network | |
TWI250423B (en) | Method for processing video images | |
CN111105357B (en) | Method and device for removing distortion of distorted image and electronic equipment | |
Kim et al. | Towards the perceptual quality enhancement of low bit-rate compressed images | |
CN111091515B (en) | Image restoration method and device, and computer-readable storage medium | |
WO2021042270A1 (en) | Compression artifacts reduction method based on dual-stream multi-path recursive residual network | |
CN110175959B (en) | Typhoon cloud picture enhancement method | |
KR101998036B1 (en) | Artifact reduction method and apparatus | |
Komatsu et al. | Super-resolution decoding of JPEG-compressed image data with the shrinkage in the redundant DCT domain | |
CN114173130B (en) | Loop filtering method of deep neural network suitable for low bit rate condition | |
Najgebauer et al. | Fully convolutional network for removing dct artefacts from images | |
Cai et al. | Inpainting for compressed images | |
Takagi et al. | Image restoration of JPEG encoded images via block matching and wiener filtering | |
Malviya et al. | 2D-discrete walsh wavelet transform for image compression with arithmetic coding | |
Luo et al. | Residual Hybrid Attention Network for Compression Artifact Reduction | |
Albluwi et al. | Artifacts reduction in jpeg-compressed images using cnns | |
Dolar et al. | Total variation regularization filtering for video signal processing | |
US20080187237A1 (en) | Method, medium, and system reducing image block noise | |
Nikitin et al. | Adaptive bilateral filter for JPEG 2000 deringing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19944171 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19944171 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19.10.2022) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19944171 Country of ref document: EP Kind code of ref document: A1 |