CN111432211B - Residual error information compression method for video coding - Google Patents

Residual error information compression method for video coding

Info

Publication number
CN111432211B
CN111432211B (application CN202010247702.5A; published as CN111432211A)
Authority
CN
China
Prior art keywords
coding
quantization
entropy
data
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010247702.5A
Other languages
Chinese (zh)
Other versions
CN111432211A (en)
Inventor
段强 (Duan Qiang)
汝佩哲 (Ru Peizhe)
李锐 (Li Rui)
金长新 (Jin Changxin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Scientific Research Institute Co Ltd
Original Assignee
Shandong Inspur Scientific Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Inspur Scientific Research Institute Co Ltd filed Critical Shandong Inspur Scientific Research Institute Co Ltd
Priority to CN202010247702.5A priority Critical patent/CN111432211B/en
Publication of CN111432211A publication Critical patent/CN111432211A/en
Application granted granted Critical
Publication of CN111432211B publication Critical patent/CN111432211B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence

Abstract

The invention provides a residual information compression method for video coding, relating to the fields of information compression and encoding/decoding. When the residual information is decoded, the stored entropy-coded data is decoded and inversely quantized through the reverse of the encoding flow, and a decoder with the mirrored structure restores the three-channel residual information from the feature map. By compressing, or secondarily compressing, the existing residual information, the required storage space is reduced severalfold and the storage cost is lowered.

Description

Residual error information compression method for video coding
Technical Field
The invention relates to the field of information compression, coding and decoding, in particular to a residual error information compression method for video coding.
Background
In the digital media era, huge amounts of image and video data are generated and stored across daily life, social networking, public-security surveillance, industrial production and other fields, consuming a great deal of storage space. The compression ratio of H.264, currently the mainstream video compression format, still leaves room for improvement, and its block-based motion estimation also introduces color artifacts; H.265, meanwhile, has yet to be popularized and is not well regarded, owing to its low compression efficiency and various patent disputes.
Motion compensation is an effective way to reduce redundant information in a frame sequence: the current local image is predicted and compensated from a preceding local image. The prediction usually differs from the real video information by a residual, and this residual information can restore what is lost during motion compensation.
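As a toy illustration with hypothetical pixel values (not data from the patent), the residual is simply the elementwise difference between the true frame and its motion-compensated prediction, and adding it back recovers the frame exactly:

```python
import numpy as np

# Hypothetical 2x2 luma patches; real frames would be full images.
frame = np.array([[52, 55], [61, 59]], dtype=np.int16)       # true pixels
prediction = np.array([[50, 54], [60, 60]], dtype=np.int16)  # motion-compensated guess

residual = frame - prediction          # this is what the method compresses
reconstructed = prediction + residual  # residual restores the lost detail

assert np.array_equal(reconstructed, frame)
print(residual)
```

The residual is typically small and sparse compared with the frame itself, which is why it compresses well.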
In view of the large-scale application of neural networks and deep learning techniques to tasks in the field of artificial intelligence, it is very promising to compress data by means of neural networks.
Disclosure of Invention
In view of the above technical problems, the present invention provides a residual information compression method for video coding that obtains compressed residual information at a low bit rate, for storing and compressing the residual information produced after motion estimation in video compression.
The method is based on an autoencoder neural network structure, uses the GDN (Generalized Divisive Normalization) activation function, and combines quantization with entropy coding to compress the residual information.
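As a minimal sketch of what a GDN-style activation computes (an illustrative assumption with fixed parameters, not the patent's trained layer), each channel is divided by a learned combination of all channels' energies:

```python
import numpy as np

def gdn(x, beta, gamma):
    """Generalized Divisive Normalization across channels:
    y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j**2).
    x: (C,) activations; beta: (C,) offsets; gamma: (C, C) weights."""
    return x / np.sqrt(beta + gamma @ (x ** 2))

C = 4
x = np.array([1.0, -2.0, 0.5, 3.0])  # hypothetical channel activations
beta = np.ones(C)                    # learned offsets (fixed at 1 here)
gamma = np.full((C, C), 0.1)         # learned cross-channel weights

y = gdn(x, beta, gamma)
# With beta = 1 and non-negative gamma the denominator exceeds 1,
# so the normalization shrinks every activation.
assert np.all(np.abs(y) <= np.abs(x))
```

In a real compression network beta and gamma are trained jointly with the convolution weights; GDN tends to gaussianize activations, which helps the later quantization and entropy model.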
An autoencoder is an artificial neural network that learns an efficient representation of its input data through unsupervised learning. The training data need no special labels: the loss is computed from the difference between input and output. The network's internal representation of the input can be regarded as a code, and since its dimensionality is usually smaller than that of the input, it achieves compression and dimensionality reduction. Naively training the network to reproduce its input exactly is of little value, so it is forced to learn an efficient representation either by imposing internal size constraints, such as a bottleneck layer, or by adding noise to the training data and training the autoencoder to recover the original data.
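A minimal linear autoencoder sketch (an illustrative assumption, far simpler than the patent's convolutional network): an 8-dimensional input is squeezed through a 3-dimensional bottleneck, the target is the input itself, and the loss is the input/output mean squared error:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))                 # toy "residual" vectors
W_enc = rng.normal(scale=0.1, size=(8, 3))    # encoder: 8 -> 3 (bottleneck)
W_dec = rng.normal(scale=0.1, size=(3, 8))    # decoder: 3 -> 8

def loss(X, W_enc, W_dec):
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

initial = loss(X, W_enc, W_dec)
lr = 0.05
for _ in range(2000):
    Z = X @ W_enc                      # the code (compressed representation)
    err = Z @ W_dec - X                # reconstruction error
    grad_dec = Z.T @ err * 2 / len(X)  # dMSE/dW_dec
    grad_enc = X.T @ (err @ W_dec.T) * 2 / len(X)  # dMSE/dW_enc
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final = loss(X, W_enc, W_dec)
assert final < initial  # the bottleneck learned a useful 3-dim code
```

Because the bottleneck has fewer dimensions than the input, the loss cannot reach zero; the network instead learns the most informative 3-dimensional projection, which is exactly the compression effect described above.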
Once an efficient representation is obtained, it can be quantized for further compression: high-precision floating-point numbers occupy considerable storage space, yet the extra bits after the decimal point contribute little to the actual task. In the back-propagation of a neural network, however, optimization proceeds by gradient descent, while quantization is a non-differentiable operation and cannot take part in the gradient computation. Several surrogates can replace direct quantization, such as adding uniform noise or soft quantization.
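The uniform-noise surrogate can be sketched in a few lines (toy values, assuming a unit quantization step): during training, noise in [-0.5, 0.5) stands in for rounding, because the two perturbations have nearly identical statistics:

```python
import numpy as np

rng = np.random.default_rng(1)
features = rng.normal(scale=4.0, size=10_000)  # toy feature-map values

# Training-time surrogate: additive uniform noise is differentiable
# (the gradient passes straight through the addition).
noisy = features + rng.uniform(-0.5, 0.5, size=features.shape)

# Inference-time quantization: hard rounding to integers.
quantized = np.round(features)

# Rounding error never exceeds half a step ...
assert np.max(np.abs(quantized - features)) <= 0.5
# ... and both error distributions have variance close to 1/12.
print(np.var(noisy - features), np.var(quantized - features))
```

This is why the description says the difference before and after quantization "resembles uniform noise": for smooth input distributions the rounding error is approximately uniform on [-0.5, 0.5].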
The quantized feature values are then further compressed by entropy coding; for the common entropy coders, such as arithmetic coding, Huffman coding and Shannon coding, designing an efficient probability model is the key.
Entropy coding is lossless data compression: it reduces bits by identifying and eliminating statistical redundancy, so no information is lost. The goal is to represent discrete data with fewer bits than the original representation requires, without any loss of information.
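As one compact illustration of lossless entropy coding (the patent itself uses arithmetic coding; Huffman coding is shown here only because it is short to sketch), a prefix code gives frequent symbols shorter bit strings, and decoding recovers the input exactly:

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a prefix code from symbol frequencies (shorter codes for
    more probable symbols). Assumes at least two distinct symbols."""
    freq = Counter(data)
    heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)  # unique tiebreaker so dicts are never compared
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        code = {s: "0" + c for s, c in lo[2].items()}
        code.update({s: "1" + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], counter, code])
        counter += 1
    return heap[0][2]

data = "AAAABBBCCD"  # skewed statistics -> statistical redundancy
code = huffman_code(data)
encoded = "".join(code[s] for s in data)

# Lossless: walking the bitstream with the inverse table recovers
# the input exactly, because no codeword is a prefix of another.
inverse = {v: k for k, v in code.items()}
decoded, buf = [], ""
for bit in encoded:
    buf += bit
    if buf in inverse:
        decoded.append(inverse[buf])
        buf = ""
assert "".join(decoded) == data
print(len(encoded))  # 19 bits vs 20 for a fixed 2-bit/symbol code
```

The better the probability model matches the true symbol statistics, the closer the bit count gets to the Shannon entropy, which is why the description stresses the probability model.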
The method for compressing the residual information based on the self-encoder and the entropy coding can obtain the compressed residual information under the condition of low bit rate, and is used for storing and compressing the residual information after motion estimation of video compression.
Using the autoencoder idea, the residual features are used to train the autoencoder network. A trained Encoder network then extracts features and generates a Feature Map; Quantization reduces the storage footprint of the data, and Entropy Coding further compresses the quantized data. When decoding the residual information, the stored entropy-coded data is decoded and dequantized through the reverse flow and passed to a Decoder with the mirrored structure, which restores the residual information from the feature map.
The implementation steps comprise: building the neural network architecture, encoding, quantization, entropy coding, storing the generated file, and entropy decoding. Specifically:
1) Build the neural network architecture, specifying the number of convolution layers, the convolution kernel sizes, the padding method, and the number of threads required for coding. In general, the design principle is: kernel sizes go from large to small; the number of kernels goes from small to large, or stays constant; and strides > 1 are placed at certain layers to reduce the feature-map size;
2) Train with a training set, where each residual sample serves as its own label; construct the loss function from MSE and bpp (bits per pixel), and optimize with the Adam optimizer. After many iterations, a trained neural network model is obtained;
3) The encoding process inputs the existing residual information into the Encoder part of the trained neural network and obtains a Feature Map through multi-step convolution, where the activation function of each convolutional layer is ReLU or GDN;
4) Two quantization approaches are in common use: adding uniform noise and soft quantization. Adding uniform noise replaces quantization during training; because the difference before and after quantization resembles uniform noise, it is simulated by adding noise artificially;
5) Entropy coding begins with binarization: non-binary numbers must be converted to binary before arithmetic coding. The probability density functions of all binary symbols are then estimated by counting, and each bit of the binary symbols is arithmetically coded according to those statistics;
6) The encoded file is stored in serialized form and can be handled with a serialization package such as pickle;
7) For entropy decoding, read the serialized file and convert it to a binary fraction, i.e., place a radix point in front of the most significant bit, then decode according to the known probability density function;
8) After entropy decoding, a feature map identical in size to the one before entropy encoding is obtained. A neural network mirroring the encoding network is then constructed, with the convolution layers replaced by deconvolution layers; it restores the feature map to three-channel residual information, applying a final rounding quantization at storage time.
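Steps 6) and 7) can be sketched end to end (a hypothetical 4-bit stream stands in for the arithmetic-coded output): the bitstream is serialized with pickle, read back, and interpreted as a binary fraction in [0, 1), which is the form an arithmetic decoder consumes:

```python
import pickle

# Stand-in for the arithmetic-coded bitstream (hypothetical value).
blob = pickle.dumps("1011")   # step 6: store in serialized form
bits = pickle.loads(blob)     # step 7: read the serialized file back

# Place an implicit radix point before the most significant bit:
# "1011" is read as the binary fraction 0.1011.
value = sum(int(b) * 2 ** -(i + 1) for i, b in enumerate(bits))
print(value)  # 0.1011 (binary) = 0.5 + 0.125 + 0.0625 = 0.6875
```

The arithmetic decoder then repeatedly compares this fraction against the cumulative probability intervals of the symbol model to emit the decoded bits.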
The invention has the advantages that
The method performs well on image compression and super-resolution tasks.
It can be applied to video encoding/decoding and compression: by compressing, or secondarily compressing, the existing residual information, the storage space and storage cost are reduced severalfold. The compressed residual information mainly supplements the information lost in video compression and improves the picture quality of the compressed video.
Drawings
FIG. 1 is a schematic workflow diagram of the present invention;
fig. 2 is an exemplary diagram of a neural network structure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of the present invention.
Using the autoencoder idea, residual information is passed through a trained Encoder network to generate a Feature Map; Quantization reduces the storage footprint of the data, and entropy coding further compresses the quantized data. When decoding the residual information, the stored entropy-coded data is decoded and dequantized through the reverse flow and passed to a Decoder with the mirrored structure, which recovers the three-channel residual information from the feature map.
The specific steps are: building the neural network architecture, encoding, quantization, entropy coding, storing the generated file, and entropy decoding. Specifically:
1) Build the neural network architecture, specifying the number of convolution layers, the convolution kernel sizes, the padding method, and the number of threads required for coding. In general, the design principle is: kernel sizes go from large to small; the number of kernels goes from small to large, or stays constant; and strides > 1 are placed at certain layers to reduce the feature-map size;
2) Train with a training set, where each residual sample serves as its own label; construct the loss function from MSE and bpp (bits per pixel), and optimize with the Adam optimizer. After many iterations, a trained neural network model is obtained;
3) The encoding process inputs the existing residual information into the Encoder part of the trained neural network and obtains a Feature Map through multi-step convolution, where the activation function of each convolutional layer is ReLU or GDN;
4) Two quantization approaches are in common use: adding uniform noise and soft quantization. Adding uniform noise replaces quantization during training; because the difference before and after quantization resembles uniform noise, it is simulated by adding noise artificially;
5) Entropy coding begins with binarization: non-binary numbers must be converted to binary before arithmetic coding. The probability density functions of all binary symbols are then estimated by counting, and each bit of the binary symbols is arithmetically coded according to those statistics;
6) The encoded file is stored in serialized form and can be handled with a serialization package such as pickle;
7) For entropy decoding, read the serialized file and convert it to a binary fraction, i.e., place a radix point in front of the most significant bit, then decode according to the known probability density function;
8) After entropy decoding, a feature map identical in size to the one before entropy encoding is obtained. A neural network mirroring the encoding network is then constructed, with the convolution layers replaced by deconvolution layers; it restores the feature map to three-channel residual information, applying a final rounding quantization at storage time.
The above description is only a preferred embodiment of the present invention, and is only used to illustrate the technical solutions of the present invention, and not to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (1)

1. A residual information compression method for video coding,
based on a neural network structure of an autoencoder, GDN activation function is used, and residual error information compression is carried out by combining quantization and entropy coding;
by adding internal size constraints, or by adding noise to the training data and training the autoencoder to restore the original data, the network is forced to learn an efficient representation of the data;
after the efficient representation is obtained, the efficient representation is quantized to achieve the effect of further compression;
the quantized feature values need entropy coding for further compression;
entropy coding belongs to lossless compression of data, reducing bits by identifying and eliminating portions of statistical redundancy, which makes it possible to perform compression without losing information;
using residual features to train an auto-encoder network by using the idea of an auto-encoder; then, extracting by using a trained encoder network to generate a characteristic diagram, then reducing the storage space of the data through quantization, and further compressing the quantized data by entropy coding; when residual information is decoded, the stored entropy coding data is decoded and dequantized by using an opposite flow, and is decoded by a decoder with an opposite structure, and the residual information is recovered from the characteristic diagram;
the method comprises the following steps: building a neural network architecture, coding, quantizing, entropy coding, storing a generated file, and decoding entropy;
wherein, the network structure at least comprises a group of convolution layers for downsampling by setting Strides, a group of deconvolution layers for upsampling by setting Strides and a group of layers for quantization and entropy coding;
the convolution kernel size and number of convolution layers are combined by experiment, and the activation function of the convolution layer uses GDN (generalized differentiated simulation) or ReLU;
the method comprises the following specific steps:
1) building a neural network architecture, and specifying the number of layers of convolution layers, the size of convolution kernels, a padding method and the number of threads required by coding;
2) training by using a training set, wherein each residual sample serves as its own label, constructing a loss function from mse and bpp, and optimizing with an Adam optimizer; after several iterations, a trained neural network model is obtained;
3) the coding process is a process of inputting the existing residual error information into the Encoder part of the trained neural network and obtaining a characteristic diagram through multi-step convolution;
4) the quantization commonly uses two modes of adding uniform noise and soft quantization; adding uniform noise is the process of adding noise to replace quantization in training;
5) starting entropy coding, namely carrying out binarization firstly and coding a binary number; the non-binary number must be binarized or converted to a binary number before arithmetic coding; counting probability density functions of all binary symbols, and carrying out arithmetic coding on each bit of the binary symbols according to the probability density functions obtained by counting;
6) serializing and storing the encoded file, and processing by using a serialized packet;
7) performing entropy decoding: reading the serialized file and converting it into a binary fraction, i.e., placing a radix point in front of the most significant bit, then decoding according to the existing probability density function;
8) after entropy decoding, a feature map identical in size to the one before entropy encoding is obtained; a neural network mirroring the encoding network is then constructed, with the convolution layers replaced by deconvolution layers; it restores the feature map to three-channel residual information, applying a final rounding quantization at storage time.
CN202010247702.5A 2020-04-01 2020-04-01 Residual error information compression method for video coding Active CN111432211B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010247702.5A CN111432211B (en) 2020-04-01 2020-04-01 Residual error information compression method for video coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010247702.5A CN111432211B (en) 2020-04-01 2020-04-01 Residual error information compression method for video coding

Publications (2)

Publication Number Publication Date
CN111432211A CN111432211A (en) 2020-07-17
CN111432211B 2021-11-12

Family

ID=71550390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010247702.5A Active CN111432211B (en) 2020-04-01 2020-04-01 Residual error information compression method for video coding

Country Status (1)

Country Link
CN (1) CN111432211B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115118972A (en) * 2021-03-17 2022-09-27 华为技术有限公司 Video image coding and decoding method and related equipment
WO2023160717A1 (en) * 2022-02-28 2023-08-31 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for video processing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018422A (en) * 2017-04-27 2017-08-04 四川大学 Still image compression method based on depth convolutional neural networks
CN107396124A (en) * 2017-08-29 2017-11-24 南京大学 Video-frequency compression method based on deep neural network
WO2019009447A1 (en) * 2017-07-06 2019-01-10 Samsung Electronics Co., Ltd. Method for encoding/decoding image and device therefor
TW201924342A * 2017-10-12 2019-06-16 MediaTek Inc. Method and apparatus of neural network for video coding
CN110753225A (en) * 2019-11-01 2020-02-04 合肥图鸭信息科技有限公司 Video compression method and device and terminal equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2213101A4 (en) * 2007-11-20 2011-08-10 Ubstream Ltd A method and system for compressing digital video streams
CN110472483B (en) * 2019-07-02 2022-11-15 五邑大学 SAR image-oriented small sample semantic feature enhancement method and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018422A (en) * 2017-04-27 2017-08-04 四川大学 Still image compression method based on depth convolutional neural networks
WO2019009447A1 (en) * 2017-07-06 2019-01-10 Samsung Electronics Co., Ltd. Method for encoding/decoding image and device therefor
CN107396124A (en) * 2017-08-29 2017-11-24 南京大学 Video-frequency compression method based on deep neural network
TW201924342A * 2017-10-12 2019-06-16 MediaTek Inc. Method and apparatus of neural network for video coding
CN111133756A (en) * 2017-10-12 2020-05-08 联发科技股份有限公司 Neural network method and apparatus for video coding
CN110753225A (en) * 2019-11-01 2020-02-04 合肥图鸭信息科技有限公司 Video compression method and device and terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DeepCoder: A Deep Neural Network Based Video Compression; Tong Chen; IEEE; 2018-03-01; full text *

Also Published As

Publication number Publication date
CN111432211A (en) 2020-07-17

Similar Documents

Publication Publication Date Title
CN111246206B (en) Optical flow information compression method and device based on self-encoder
US20200160565A1 (en) Methods And Apparatuses For Learned Image Compression
CN100403801C (en) Adaptive entropy coding/decoding method based on context
CN109889839B (en) Region-of-interest image coding and decoding system and method based on deep learning
CN110248190B (en) Multilayer residual coefficient image coding method based on compressed sensing
CN111432211B (en) Residual error information compression method for video coding
CN107046646B (en) Video coding and decoding device and method based on depth automatic encoder
CN111147862B (en) End-to-end image compression method based on target coding
CN103067022A (en) Nondestructive compressing method, uncompressing method, compressing device and uncompressing device for integer data
JPH07504546A (en) Method for encoding image data
CN113747163B (en) Image coding and decoding method and compression method based on context recombination modeling
CN103188494A (en) Apparatus and method for encoding depth image by skipping discrete cosine transform (DCT), and apparatus and method for decoding depth image by skipping DCT
CN115882866A (en) Data compression method based on data difference characteristic
Akbari et al. Learned multi-resolution variable-rate image compression with octave-based residual blocks
CN110930408A (en) Semantic image compression method based on knowledge reorganization
Kabir et al. Edge-based transformation and entropy coding for lossless image compression
Karthikeyan et al. An efficient image compression method by using optimized discrete wavelet transform and Huffman encoder
CN111343458B (en) Sparse gray image coding and decoding method and system based on reconstructed residual
CN111080729B (en) Training picture compression network construction method and system based on Attention mechanism
CN110677624B (en) Monitoring video-oriented foreground and background parallel compression method based on deep learning
Barman et al. A quantization based codebook formation method of vector quantization algorithm to improve the compression ratio while preserving the visual quality of the decompressed image
Shah et al. Vector quantization with codebook and index compression
CN112950729A (en) Image compression method based on self-encoder and entropy coding
CN110191341A (en) A kind of coding method of depth data and coding/decoding method
CN112887722A (en) Lossless image compression method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211026

Address after: 250100 building S02, No. 1036, Langchao Road, high tech Zone, Jinan City, Shandong Province

Applicant after: Shandong Inspur Scientific Research Institute Co.,Ltd.

Address before: 250100 First Floor of R&D Building 2877 Kehang Road, Sun Village Town, Jinan High-tech Zone, Shandong Province

Applicant before: JINAN INSPUR HIGH-TECH TECHNOLOGY DEVELOPMENT Co.,Ltd.

GR01 Patent grant