CN110717948A - Image post-processing method, system and terminal equipment - Google Patents

Image post-processing method, system and terminal equipment Download PDF

Info

Publication number
CN110717948A
Authority
CN
China
Prior art keywords
post
network
processing
image
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911199740.1A
Other languages
Chinese (zh)
Inventor
李东阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Map Duck Mdt Infotech Ltd
Original Assignee
Hefei Map Duck Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Map Duck Mdt Infotech Ltd filed Critical Hefei Map Duck Mdt Infotech Ltd
Publication of CN110717948A
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks

Abstract

An embodiment of the invention provides an image post-processing method, which comprises the following steps: acquiring image data decompressed by a decoder; and processing the decompressed image data according to a post-processing network to obtain post-processed image data. The training method of the post-processing network comprises: compressing a training image through a coding network to obtain a compressed bit stream; inputting the compressed bit stream into a decoding network to obtain reconstructed image data; inputting the reconstructed image data into the post-processing network to obtain a post-processing reconstructed image; comparing the post-processing reconstructed image with the training image and, in combination with a code rate estimate, obtaining a rate-distortion optimization result; and adjusting the parameters of the post-processing network according to the rate-distortion optimization result. The method can improve the quality of decoded images and leaves considerable room for further performance improvement.

Description

Image post-processing method, system and terminal equipment
Technical Field
The present invention relates to the field of image compression, and in particular, to an image post-processing method, system and terminal device.
Background
Current image coding network methods suffer from problems such as easy loss of detail and mediocre quality of detail reconstruction.
Disclosure of Invention
In order to solve the above problem, embodiments of the present invention provide an image post-processing method, system and terminal device.
According to a first aspect of the present invention, there is provided an image post-processing method comprising:
acquiring image data decompressed by a decoder;
processing the decompressed image data according to a post-processing network to obtain post-processing image data;
the training method of the post-processing network comprises the following steps:
compressing the training image through a coding network to obtain a compressed bit stream;
inputting the compressed bit stream into a decoding network to obtain reconstructed image data;
inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image;
comparing the post-processing reconstructed image with the training image, and estimating according to a code rate to obtain a rate-distortion optimization result;
and adjusting parameters of the post-processing network according to the rate-distortion optimization result.
Further, the encoding network, the decoding network and the post-processing network are convolutional neural networks, and the structures of the respective networks may be the same or different.
Further, the post-processing network comprises a three-layer convolutional neural network.
According to a second aspect of the present invention, there is provided an image post-processing system comprising:
first acquiring means for acquiring image data decompressed by the decoder;
the post-processing device is used for processing the decompressed image data according to a post-processing network to obtain post-processed image data;
wherein the post-processing device comprises:
the first compression unit is used for compressing the training image through a coding network to obtain a compressed bit stream;
a first reconstruction unit, configured to input the compressed bitstream into a decoding network to obtain reconstructed image data;
the first post-processing unit is used for inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image;
the first optimization unit is used for comparing the post-processing reconstructed image with the training image and obtaining a rate-distortion optimization result according to code rate estimation;
and the first adjusting unit is used for adjusting the parameters of the post-processing network according to the rate-distortion optimization result.
Further, the encoding network, the decoding network and the post-processing network are convolutional neural networks, and the structures of the respective networks may be the same or different.
According to a third aspect of the present invention, there is provided a terminal device comprising:
a memory for storing a program;
a processor, coupled to the memory, for executing the program, which when executed performs the method provided by the present invention.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that other drawings can be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a flow chart of a method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of a training method provided by an embodiment of the present invention;
FIG. 3 is a flow chart of a method provided by an embodiment of the present invention;
FIG. 4 is a flowchart of a training method provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of model training provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of model training provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of an apparatus provided by an embodiment of the present invention;
FIG. 8 is a schematic diagram of a training apparatus provided by an embodiment of the present invention;
FIG. 9 is a schematic diagram of an apparatus provided by an embodiment of the present invention;
FIG. 10 is a schematic diagram of a training apparatus provided by an embodiment of the present invention;
FIG. 11 is a schematic diagram of a post-processing network provided by an embodiment of the present invention;
fig. 12 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As an embodiment of the present invention, there is provided an image post-processing method, as shown in fig. 1, including:
step 102, image data decompressed by a decoder is acquired.
Specifically, when the decoder decodes the compressed data, the decompressed image data is obtained.
And 104, processing the decompressed image data according to a post-processing network to obtain post-processing image data.
Specifically, the decompressed image data is processed according to a trained post-processing network to obtain post-processing image data.
The training method for the post-processing network is shown in fig. 2, and includes:
step 202, compressing the training image through a coding network to obtain a compressed bit stream.
In one possible embodiment, the training image may be compressed by a convolutional neural network based coding network.
And step 204, inputting the compressed bit stream into a decoding network to obtain reconstructed image data.
Specifically, the compressed bit stream is decoded according to a decoding network corresponding to the encoding network, so as to obtain reconstructed image data.
And step 206, inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image.
In one possible embodiment, the post-processing reconstructed image is obtained by processing the reconstructed image data through a post-processing network comprising a three-layer convolutional neural network.
FIG. 11 is a schematic diagram of the post-processing network. As shown in FIG. 11, the input data is the reconstructed image data and the output data is the post-processing reconstructed image. The first layer of the three-layer convolutional neural network is Conv96x5x5/1, where 96 denotes the number of channels, 5x5 the convolution kernel size, and /1 that no downsampling is used; the second layer is Conv64x5x5/1, where 64 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used; the third layer is Conv32x5x5/1, where 32 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used. The GDN blocks between the convolutional layers are generalized divisive normalization layers. Downsampling is not used here in order to avoid the distortion it would introduce.
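For illustration, the following is a minimal PyTorch-style sketch of a post-processing network with the layer layout described above (Conv96x5x5/1, GDN, Conv64x5x5/1, GDN, Conv32x5x5/1). The simplified GDN implementation, the final projection back to image channels, the residual connection, and all class and parameter names are assumptions made for the sketch and are not specified in this disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGDN(nn.Module):
    """Simplified generalized divisive normalization (GDN):
    y_c = x_c / sqrt(beta_c + sum_k gamma_{c,k} * x_k^2).
    The reparameterization tricks of the original GDN are omitted here."""
    def __init__(self, channels: int):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(channels))
        self.gamma = nn.Parameter(0.1 * torch.eye(channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply gamma as a 1x1 convolution over the squared inputs and add beta.
        norm = F.conv2d(x * x, self.gamma.view(*self.gamma.shape, 1, 1), self.beta)
        return x / torch.sqrt(norm.clamp(min=1e-6))

class PostProcessingNet(nn.Module):
    """Three-layer post-processing network following Fig. 11:
    Conv96x5x5/1 -> GDN -> Conv64x5x5/1 -> GDN -> Conv32x5x5/1.
    Stride 1 with padding 2 keeps the spatial size (no downsampling)."""
    def __init__(self, image_channels: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(image_channels, 96, kernel_size=5, stride=1, padding=2),
            SimpleGDN(96),
            nn.Conv2d(96, 64, kernel_size=5, stride=1, padding=2),
            SimpleGDN(64),
            nn.Conv2d(64, 32, kernel_size=5, stride=1, padding=2),
        )
        # Assumed projection back to image channels; the text does not specify an output layer.
        self.out = nn.Conv2d(32, image_channels, kernel_size=5, stride=1, padding=2)

    def forward(self, reconstructed: torch.Tensor) -> torch.Tensor:
        # Residual connection is a common design choice for restoration networks (an assumption here).
        return reconstructed + self.out(self.body(reconstructed))
```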
And 208, comparing the post-processing reconstructed image with the training image, and estimating according to the code rate to obtain a rate-distortion optimization result.
Specifically, the method comprises the following steps:
comparing the post-processing reconstruction image with the training image to obtain a distortion residual error;
and obtaining the rate-distortion optimization result according to the code rate estimation result and the distortion residual error.
In the training model, the distortion D can be expressed as the mean square error (MSE), D = ‖x − x̂‖², where x denotes the original (input) image and x̂ denotes the reconstructed image; alternatively, D can be computed with a subjective distortion metric such as MS-SSIM. The self-coding compression algorithm is optimized end to end according to a loss function R + λD that weights the code rate against the distortion, where R denotes the code rate, D the distortion, and λ the weight. In the optimization process, the loss function is defined first, and the network parameters are then optimized with the back-propagation algorithm.
And step 210, adjusting parameters of the post-processing network according to the rate-distortion optimization result.
Specifically, parameters of the post-processing network are trained according to the rate-distortion optimization result, and the parameters are optimized according to the training result.
Further, step 202 further comprises: and compressing the training image through a coding network to obtain compressed data, and quantizing the compressed data to obtain quantized compressed data.
Specifically, during training an additive uniform noise is used to simulate the quantizer, expressed as ŷ_i = y_i + ε, where ŷ_i is the quantized parameter, y_i is the coding feature, and ε is uniform random noise. The entropy of the variable ŷ_i can then be used in place of the entropy of y_i, so that in actual use of the model rounding can serve as the quantization operation; in this manner, the code rate can be estimated accurately.
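A minimal sketch of this quantization scheme, assuming the standard additive-uniform-noise relaxation during training and rounding at inference; the helper name is illustrative.

```python
import torch

def quantize(y: torch.Tensor, training: bool) -> torch.Tensor:
    """During training, simulate quantization with additive uniform noise in [-0.5, 0.5)
    so the operation stays differentiable; at inference time, use rounding."""
    if training:
        noise = torch.empty_like(y).uniform_(-0.5, 0.5)
        return y + noise
    return torch.round(y)
```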
As an embodiment of the present invention, there is provided an image compression method, as shown in fig. 3, including:
step 302, image data is acquired.
Specifically, image data to be compressed is acquired.
And step 304, compressing the image data according to a coding network to obtain a compressed bit stream.
Specifically, the image data is compressed according to the trained coding network, and compressed image data is obtained.
The method for training the coding network is shown in fig. 4, and includes:
step 402, extracting the features of the image through a feature extraction network.
Specifically, a three-layer convolutional neural network as shown in fig. 5 is used to extract features of the image. In an optional manner, the result obtained by each convolutional layer is used as an input for calculating the final feature, i.e., the normalized feature obtained after each convolution is convolved again and used as a cascaded input.
And step 404, estimating the characteristics according to a probability model to obtain a code rate estimation result.
Specifically, the method comprises the following steps:
and estimating the distribution according to a probability model, and estimating the code rate according to the entropy to obtain the code rate estimation result.
The data distribution of natural images is generally considered to be Gaussian-like, so the probability distribution of each feature y_i can be modeled with a Laplacian distribution L(μ, σ̂_i²) with zero mean and learned scale, where μ denotes the mean (zero here) and σ̂_i denotes the scale estimated by the hyper-parametric network from the compressed features.
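As an illustration, the likelihood of the quantized features under such a zero-mean Laplacian with a learned scale could be evaluated as sketched below; the CDF-difference form (accounting for the unit-width quantization bins), the clamping constants, and the function name are assumptions of the sketch, not taken from this disclosure.

```python
import torch
from torch.distributions import Laplace
from typing import Optional

def feature_likelihood(y_hat: torch.Tensor, scale: torch.Tensor,
                       mean: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Per-element likelihood of the quantized/noisy features y_hat under a Laplacian
    whose scale is predicted by the hyper-parametric network. The CDF difference over
    a unit-width bin accounts for the (uniform) quantization noise."""
    if mean is None:
        mean = torch.zeros_like(y_hat)        # zero-mean model, as described above
    dist = Laplace(mean, scale.clamp(min=1e-6))
    likelihood = dist.cdf(y_hat + 0.5) - dist.cdf(y_hat - 0.5)
    return likelihood.clamp(min=1e-9)         # avoid log(0) in the rate term
```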
Further, a self-coding network may be employed to learn the scale parameters; the structure of this self-coding network is shown in fig. 6. The compressed features are used as the input of a hyper-parametric self-coding network, which learns the standard-deviation distribution. In the hyper-parametric self-coding network, the variable z is expressed as z = h_e(y), where h_e denotes the encoder of the hyper-parametric learning network. z is then quantized, and the quantized representation ẑ may be transmitted as an additional variable.
The code rate of the features can be modeled by their entropy, i.e. R = E[−log₂ p(ŷ)]. The prior distribution can be fitted using a parameterized approach, and the prior probability model is then learned in a data-driven manner.
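A minimal sketch of this entropy-based rate estimate, assuming the per-element likelihoods from the previous sketch; the bits-per-pixel normalization and the function name are illustrative.

```python
import torch

def estimate_rate_bpp(likelihood: torch.Tensor, num_pixels: int) -> torch.Tensor:
    """Estimate the code rate as the entropy of the features, in bits per pixel:
    R = sum over all feature elements of -log2 p(y_hat), divided by the pixel count."""
    total_bits = torch.sum(-torch.log2(likelihood))
    return total_bits / num_pixels
```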
Step 406, quantizing the features to obtain quantized features.
Specifically, the method comprises the following steps:
in the training process, an additive uniform noise is used to set a quantifier, and the representation mode is
Figure BDA0002295570270000069
Where e is random noise. Wherein the variable
Figure BDA00022955702700000610
Entropy energy variable of
Figure BDA00022955702700000611
So that it can be used in the actual use of the model
Figure BDA0002295570270000071
As a quantization operation, in such a manner, the code rate can be estimated accurately.
And step 408, inputting the quantized features into a decoding network to obtain a reconstructed image.
Specifically, the method comprises the following steps:
and decoding the quantized features according to a decoding network to obtain a reconstructed image.
And step 410, inputting the reconstructed image into a post-processing network to obtain a post-processing reconstructed image.
In one possible embodiment, the post-processing reconstructed image is obtained by processing the reconstructed image data through a post-processing network comprising a three-layer convolutional neural network.
FIG. 11 is a schematic diagram of the post-processing network. As shown in FIG. 11, the input data is the reconstructed image data and the output data is the post-processing reconstructed image. The first layer of the three-layer convolutional neural network is Conv96x5x5/1, where 96 denotes the number of channels, 5x5 the convolution kernel size, and /1 that no downsampling is used; the second layer is Conv64x5x5/1, where 64 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used; the third layer is Conv32x5x5/1, where 32 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used. The GDN blocks between the convolutional layers are generalized divisive normalization layers. Downsampling is not used here in order to avoid the distortion it would introduce.
And step 412, comparing the post-processing reconstructed image with the training image, and estimating according to the code rate to obtain a rate-distortion optimization result.
Specifically, the method comprises the following steps:
comparing the post-processing reconstruction image with the training image to obtain a distortion residual error;
and obtaining the rate-distortion optimization result according to the code rate estimation result and the distortion residual error.
In the training model, the distortion D can be expressed as the mean square error (MSE), D = ‖x − x̂‖², where x denotes the original (input) image and x̂ denotes the reconstructed image; alternatively, D can be computed with a subjective distortion metric such as MS-SSIM. The self-coding compression algorithm is optimized end to end according to a loss function R + λD that weights the code rate against the distortion, where R denotes the code rate, D the distortion, and λ the weight. In the optimization process, the loss function is defined first, and the network parameters are then optimized with the back-propagation algorithm.
Step 414, adjusting parameters of the feature extraction network according to the rate-distortion optimization result.
Specifically, a gradient back propagation algorithm is adopted to update parameters of the convolutional neural network.
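Putting the pieces together, one end-to-end training step could look like the following sketch. The module names (encoder, decoder, postprocess, rate_model), the reuse of the quantize helper from the earlier sketch, and the λ value are all assumptions, not interfaces defined by this disclosure.

```python
import torch

def train_step(x, encoder, decoder, postprocess, rate_model, optimizer, lam=0.01):
    """One end-to-end optimization step:
    encode -> quantize (noisy) -> estimate rate -> decode -> post-process -> R + lambda*D -> backprop."""
    y = encoder(x)                               # compressed features y = f_e(x)
    y_hat = quantize(y, training=True)           # additive uniform noise during training
    likelihood = rate_model(y_hat)               # e.g. Laplacian model with hyperprior scales
    n, _, h, w = x.shape
    bits = torch.sum(-torch.log2(likelihood))    # entropy-based rate estimate
    x_rec = decoder(y_hat)                       # reconstructed image
    x_post = postprocess(x_rec)                  # post-processing reconstructed image
    loss = bits / (n * h * w) + lam * torch.mean((x - x_post) ** 2)
    optimizer.zero_grad()
    loss.backward()                              # gradient back-propagation
    optimizer.step()                             # update network parameters
    return loss.item()
```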
As an embodiment of the present invention, there is provided an image post-processing system, as shown in fig. 7, including:
first obtaining means 701 is used for obtaining the image data decompressed by the decoder.
Specifically, the first obtaining device 701 is used for obtaining the image data decompressed by the decoder.
A post-processing device 702, configured to process the decompressed image data according to a post-processing network, so as to obtain post-processed image data.
Specifically, the post-processing device 702 processes the decompressed image data according to a trained post-processing network, so as to obtain post-processed image data.
Specifically, as shown in fig. 8, the post-processing apparatus 702 includes:
the first compression unit 801 is configured to compress the training image through the coding network to obtain a compressed bitstream.
In particular, the training image may be compressed by a convolutional neural network-based coding network.
The function of the self-coding network is to convert data from the image space x to the data coding space y; it contains an encoder f_e, whose role is to convert the pixel values x of the image into the compressed features y = f_e(x).
A first reconstructing unit 802, configured to input the compressed bitstream into a decoding network, so as to obtain reconstructed image data.
Specifically, the compressed bit stream is decoded according to a decoding network corresponding to the encoding network, so as to obtain reconstructed image data.
A first post-processing unit 803, configured to input the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image.
Specifically, in one possible embodiment, the post-processing reconstructed image is obtained by processing the reconstructed image data through a post-processing network comprising a three-layer convolutional neural network.
FIG. 11 is a schematic diagram of the post-processing network. As shown in FIG. 11, the input data is the reconstructed image data and the output data is the post-processing reconstructed image. The first layer of the three-layer convolutional neural network is Conv96x5x5/1, where 96 denotes the number of channels, 5x5 the convolution kernel size, and /1 that no downsampling is used; the second layer is Conv64x5x5/1, where 64 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used; the third layer is Conv32x5x5/1, where 32 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used. The GDN blocks between the convolutional layers are generalized divisive normalization layers. Downsampling is not used here in order to avoid the distortion it would introduce.
A first optimization unit 804, configured to compare the post-processing reconstructed image with the training image, and obtain a rate-distortion optimization result according to code rate estimation.
Specifically, the post-processing reconstructed image is compared with the training image to obtain a distortion residual error;
and obtaining the rate-distortion optimization result according to the code rate estimation result and the distortion residual error.
In the training model, the distortion D can be expressed as the mean square error (MSE), D = ‖x − x̂‖², where x denotes the original (input) image and x̂ denotes the reconstructed image; alternatively, D can be computed with a subjective distortion metric such as MS-SSIM. The self-coding compression algorithm is optimized end to end according to a loss function R + λD that weights the code rate against the distortion, where R denotes the code rate, D the distortion, and λ the weight. In the optimization process, the loss function is defined first, and the network parameters are then optimized with the back-propagation algorithm.
A first adjusting unit 805, configured to adjust a parameter of the post-processing network according to the rate-distortion optimization result.
Specifically, parameters of the post-processing network are trained according to the rate-distortion optimization result, and the parameters are optimized according to the training result.
As an embodiment of the present invention, there is provided an image compression system, as shown in fig. 9, including:
second acquiring means 901 for acquiring image data.
Specifically, the second acquiring means 901 acquires image data to be compressed.
A compressing device 902, configured to compress the image data according to a coding network, so as to obtain a compressed bitstream.
Specifically, the compression device 902 compresses the image data according to the trained coding network to obtain compressed image data.
As shown in fig. 10, the compressing apparatus 902 includes:
the second compression unit 1001 is configured to compress the training image through the coding network, so as to obtain a compressed bitstream.
Specifically, a three-layer convolutional neural network as shown in fig. 5 is used to extract features of the image. In an optional manner, the result obtained by each convolutional layer is used as an input for calculating the final feature, i.e., the normalized feature obtained after each convolution is convolved again and used as a cascaded input.
A second reconstructing unit 1002, configured to input the compressed bitstream into a decoding network, so as to obtain reconstructed image data.
Specifically, the second reconstruction unit 1002 decodes the compressed bitstream according to a decoding network to obtain a reconstructed image.
A second post-processing unit 1003, configured to input the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image.
Specifically, in one possible embodiment, the post-processing reconstructed image is obtained by processing the reconstructed image data through a post-processing network comprising a three-layer convolutional neural network.
FIG. 11 is a schematic diagram of the post-processing network. As shown in FIG. 11, the input data is the reconstructed image data and the output data is the post-processing reconstructed image. The first layer of the three-layer convolutional neural network is Conv96x5x5/1, where 96 denotes the number of channels, 5x5 the convolution kernel size, and /1 that no downsampling is used; the second layer is Conv64x5x5/1, where 64 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used; the third layer is Conv32x5x5/1, where 32 denotes the number of channels, 5x5 the kernel size, and /1 that no downsampling is used. The GDN blocks between the convolutional layers are generalized divisive normalization layers. Downsampling is not used here in order to avoid the distortion it would introduce.
The code rate is modeled by the entropy of the features, i.e. R = E[−log₂ p(ŷ)]. The prior distribution can be fitted using a parameterized approach, and the prior probability model is then learned in a data-driven manner.
And a second optimization unit 1004, configured to compare the post-processing reconstructed image with the training image, and obtain a rate-distortion optimization result according to the code rate estimation.
Specifically, the post-processing reconstructed image is compared with the training image to obtain a distortion residual error;
and obtaining the rate-distortion optimization result according to the code rate estimation result and the distortion residual error.
In the training model, the distortion D can be expressed as the mean square error (MSE), D = ‖x − x̂‖², where x denotes the original (input) image and x̂ denotes the reconstructed image; alternatively, D can be computed with a subjective distortion metric such as MS-SSIM. The self-coding compression algorithm is optimized end to end according to a loss function R + λD that weights the code rate against the distortion, where R denotes the code rate, D the distortion, and λ the weight. In the optimization process, the loss function is defined first, and the network parameters are then optimized with the back-propagation algorithm.
A second adjusting unit 1005, configured to adjust parameters of the coding network according to the rate-distortion optimization result.
Specifically, a gradient back propagation algorithm is adopted to update parameters of the convolutional neural network.
As an embodiment of the present invention, there is provided a terminal device, as shown in fig. 12, including:
a memory 1201 and a processor 1202.
A memory 1201 for storing a program.
In addition to the above-described programs, the memory 1201 may also be configured to store other various data to support operations on the terminal device. Examples of such data include instructions for any application or method operating on the terminal device, contact data, phonebook data, messages, pictures, videos, etc.
The memory 1201 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), flash memory, magnetic or optical disks.
A processor 1202, coupled to the memory 1201, for executing a program in the memory 1201, the program when executed performing a method as described in any of fig. 1-4.
The above specific processing operations have been described in detail in the foregoing embodiments, and are not described again here.

Claims (10)

1. A method of image post-processing, the method comprising:
acquiring image data decompressed by a decoder;
processing the decompressed image data according to a post-processing network to obtain post-processing image data;
the training method of the post-processing network comprises the following steps:
compressing the training image through a coding network to obtain a compressed bit stream;
inputting the compressed bit stream into a decoding network to obtain reconstructed image data;
inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image;
comparing the post-processing reconstructed image with the training image, and estimating according to a code rate to obtain a rate-distortion optimization result;
and adjusting parameters of the post-processing network according to the rate-distortion optimization result.
2. The method of claim 1, wherein the encoding network, the decoding network, and the post-processing network are convolutional neural networks, and the structures of the respective networks are the same or different.
3. The method of claim 1, wherein the post-processing network comprises a three-layer convolutional neural network.
4. A method of image compression, the method comprising:
acquiring image data;
compressing the image data according to a coding network to obtain a compressed bit stream;
the training method of the coding network comprises the following steps:
compressing the training image through a coding network to obtain a compressed bit stream;
inputting the compressed bit stream into a decoding network to obtain reconstructed image data;
inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image;
comparing the post-processing reconstructed image with the training image, and estimating according to a code rate to obtain a rate-distortion optimization result;
and adjusting parameters of the coding network according to the rate-distortion optimization result.
5. The method of claim 4, wherein compressing the training image through the coding network to obtain a compressed bitstream comprises:
performing at least one downsampling compression on the training image according to the coding network to obtain a bit stream of the compressed image;
decompressing the bit stream of the compressed image according to the decoding network to obtain a reconstructed image;
calculating a residual error between the reconstructed image and the image to obtain a residual error;
carrying out at least one down-sampling compression on the residual error according to a residual error coding network to obtain a compressed residual error bit stream;
and decompressing the compressed residual bit stream according to a residual decoding network to obtain a residual reconstruction image.
6. The method according to claim 5, wherein the encoding network, the decoding network, the post-processing network and the residual encoding network are convolutional neural networks, and the structures of the respective networks are the same or different.
7. An image post-processing system, the system comprising:
first acquiring means for acquiring image data decompressed by the decoder;
the post-processing device is used for processing the decompressed image data according to a post-processing network to obtain post-processed image data;
wherein the post-processing device comprises:
the first compression unit is used for compressing the training image through a coding network to obtain a compressed bit stream;
a first reconstruction unit, configured to input the compressed bitstream into a decoding network to obtain reconstructed image data;
the first post-processing unit is used for inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image;
the first optimization unit is used for comparing the post-processing reconstructed image with the training image and obtaining a rate-distortion optimization result according to code rate estimation;
and the first adjusting unit is used for adjusting the parameters of the post-processing network according to the rate-distortion optimization result.
8. The system of claim 7, wherein the encoding network, the decoding network, and the post-processing network are convolutional neural networks, and the structures of the respective networks are the same or different.
9. An image compression system, the system comprising:
second acquiring means for acquiring image data;
the compression device is used for compressing the image data according to a coding network to obtain a compressed bit stream;
wherein the compression device comprises:
the second compression unit is used for compressing the training image through a coding network to obtain a compressed bit stream;
the second reconstruction unit is used for inputting the compressed bit stream into a decoding network to obtain reconstruction image data;
the second post-processing unit is used for inputting the reconstructed image data into a post-processing network to obtain a post-processing reconstructed image;
the second optimization unit is used for comparing the post-processing reconstructed image with the training image and obtaining a rate-distortion optimization result according to code rate estimation;
and the second adjusting unit is used for adjusting the parameters of the coding network according to the rate-distortion optimization result.
10. A terminal device, comprising:
a memory for storing a program;
a processor coupled to the memory for executing the program, the program when executed performing the method of any of claims 1-6.
CN201911199740.1A 2019-04-28 2019-11-29 Image post-processing method, system and terminal equipment Pending CN110717948A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910347491X 2019-04-28
CN201910347491 2019-04-28

Publications (1)

Publication Number Publication Date
CN110717948A true CN110717948A (en) 2020-01-21

Family

ID=69215744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911199740.1A Pending CN110717948A (en) 2019-04-28 2019-11-29 Image post-processing method, system and terminal equipment

Country Status (1)

Country Link
CN (1) CN110717948A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023169319A1 (en) * 2022-03-10 2023-09-14 华为技术有限公司 Feature map coding method, feature map decoding method, and apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018422A (en) * 2017-04-27 2017-08-04 四川大学 Still image compression method based on depth convolutional neural networks
CN109191402A (en) * 2018-09-03 2019-01-11 武汉大学 The image repair method and system of neural network are generated based on confrontation
CN109495741A (en) * 2018-11-29 2019-03-19 四川大学 Method for compressing image based on adaptive down-sampling and deep learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018422A (en) * 2017-04-27 2017-08-04 四川大学 Still image compression method based on depth convolutional neural networks
CN109191402A (en) * 2018-09-03 2019-01-11 武汉大学 The image repair method and system of neural network are generated based on confrontation
CN109495741A (en) * 2018-11-29 2019-03-19 四川大学 Method for compressing image based on adaptive down-sampling and deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
于波 (Yu Bo) et al.: "Image Reconstruction Algorithm Based on Deep Convolutional Neural Networks" *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023169319A1 (en) * 2022-03-10 2023-09-14 华为技术有限公司 Feature map coding method, feature map decoding method, and apparatus

Similar Documents

Publication Publication Date Title
CN111641832B (en) Encoding method, decoding method, device, electronic device and storage medium
CN110677651A (en) Video compression method
WO2020237646A1 (en) Image processing method and device, and computer-readable storage medium
CN110024391B (en) Method and apparatus for encoding and decoding a digital image or video stream
CN110753225A (en) Video compression method and device and terminal equipment
CN110892419B (en) Stop code tolerant image compression neural network
CN111641826B (en) Method, device and system for encoding and decoding data
CN111163314A (en) Image compression method and system
WO2020261314A1 (en) Image encoding method and image decoding method
CN107113426B (en) Method and apparatus for performing graph-based transformations using generalized graph parameters
CN113079378A (en) Image processing method and device and electronic equipment
CN116916036A (en) Video compression method, device and system
CN110730347A (en) Image compression method and device and electronic equipment
Chen et al. Neural network-based video compression artifact reduction using temporal correlation and sparsity prior predictions
CN110717948A (en) Image post-processing method, system and terminal equipment
CN115866252B (en) Image compression method, device, equipment and storage medium
CN111163320A (en) Video compression method and system
CN111161363A (en) Image coding model training method and device
CN111050170A (en) Image compression system construction method, compression system and method based on GAN
WO2022037146A1 (en) Image processing method, apparatus, device, computer storage medium, and system
CN110234011B (en) Video compression method and system
CN111565314A (en) Image compression method, coding and decoding network training method and device and electronic equipment
CN111565317A (en) Image compression method, coding and decoding network training method and device and electronic equipment
WO2019225344A1 (en) Encoding device, image interpolation system and encoding program
CN114501034B (en) Image compression method and medium based on discrete Gaussian mixture super prior and Mask

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20200121)