CN101938656B

CN101938656B - Video coding and decoding system based on keyframe super-resolution reconstruction

Info

Publication number: CN101938656B
Application number: CN 201010292294
Authority: CN
Inventors: 宋利; 周强; 张文军
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2010-09-27
Filing date: 2010-09-27
Publication date: 2012-07-04
Anticipated expiration: 2030-09-27
Also published as: CN101938656A

Abstract

The invention relates to a video coding and decoding system based on keyframe super-resolution reconstruction, belonging to the technical field of video image processing. The video coding and decoding system comprises a video coding end and a video decoding end, wherein the video coding end comprises a frame selection module, a keyframe coding module, a wavelet downsampling module, a non-keyframe coding module and a code stream integrating module; and the video decoding end comprises a code stream decomposing module, a keyframe decoding module, a wavelet decomposing module, a non-keyframe decoding module, a non-keyframe upsampling module and a video integrating module. The invention downsamples videos before coding and, after decoding, enhances their resolution with a video super-resolution reconstruction technology based on keyframe layered block matching, thereby reducing the bit rate of the coded video code stream and improving coding efficiency.

Description

Video coding and decoding system based on key frame super-resolution reconstruction

Technical Field

The invention relates to a system in the technical field of video image processing, in particular to a video coding and decoding system based on key frame super-resolution reconstruction.

Background

Video codec techniques are critical to video storage and transmission. Video coding standards have developed through several stages, such as MPEG-2, H.263, MPEG-4 and H.264. H.264 is a new-generation video coding standard; compared with H.263 and MPEG-4, it roughly doubles the video compression ratio.

However, conventional video coding techniques all follow a fixed framework that directly compresses the video entering the encoding end, and improvements to these techniques are likewise confined to that framework. The encoding end may instead be preceded by a preprocessing stage that prepares the video content so that it can be compressed more efficiently. This preprocessing may discard information in the video, such as high-frequency details, so the decoding end needs a corresponding post-processing stage that recovers the information lost at the front end.

In order to further improve video coding efficiency, researchers have introduced super-resolution reconstruction techniques from image processing and built a novel video coding framework on top of them.

A search of the existing literature shows that Brandi et al. proposed a key-frame-based super-resolution reconstruction method in "Super-resolution of video using key frames and motion estimation", published at the IEEE International Conference on Image Processing in 2008, and tried to apply the method to video coding. This concept is an innovation over conventional coding frameworks. However, Brandi et al. focus on the super-resolution reconstruction technique using key frames and do not propose a systematic coding framework. Moreover, their scheme has many defects in representing high-frequency signals.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a video coding and decoding system based on the key frame super-resolution reconstruction, which can reduce the bit rate of a coded video code stream and realize the improvement of the coding efficiency by down-sampling the video before coding and adopting the video super-resolution reconstruction technology based on the key frame hierarchical block matching to improve the resolution of the video after decoding.

The invention is realized by the following technical scheme. The invention comprises a video coding end consisting of a frame selection module, a key frame coding module, a wavelet down-sampling module, a non-key frame coding module and a code stream integration module, and a video decoding end consisting of a code stream decomposition module, a key frame decoding module, a wavelet decomposition module, a non-key frame decoding module, a non-key frame up-sampling module and a video integration module, wherein:

the frame selection module receives a video sequence and respectively outputs a key frame and a non-key frame in the video sequence to a key frame coding module and a wavelet down-sampling module, the key frame coding module is connected with the non-key frame coding module and transmits the key coding frame, the wavelet down-sampling module is connected with the non-key frame coding module and transmits the down-sampling non-key frame, the key frame coding module is connected with the code stream integration module and transmits the key coding frame, and the non-key frame coding module is connected with the code stream integration module and transmits the low-resolution non-key coding frame;

the code stream decomposition module is connected with the key frame decoding module and transmits key frame related code streams, the code stream decomposition module is connected with the non-key frame decoding module and transmits non-key frame related code streams, the key frame decoding module is connected with the wavelet decomposition module and transmits key decoding frames, the key frame decoding module is connected with the video integration module and transmits key decoding frames, the wavelet decomposition module is connected with the non-key frame decoding module and transmits down-sampling key frames, the wavelet decomposition module is connected with the non-key frame up-sampling module and transmits three key frame sub-bands and down-sampling key frames, the non-key frame decoding module is connected with the non-key frame up-sampling module and transmits low-resolution non-key decoding frames, and the non-key frame up-sampling module is connected with the video integration module and transmits non-key reconstruction frames.

The first frame of the video sequence is a key frame, and a fixed number of non-key frames lie between every two adjacent key frames.

The frame selection module selects and divides an input video sequence to obtain a key frame sequence and a non-key frame sequence.

The key frame coding module uses an intra-frame coding mode to compress and code the input key frame.

The wavelet down-sampling module applies a Haar wavelet filter to decompose each non-key frame, retains the sub-band that is low-frequency in both the horizontal and vertical directions, divides its coefficients by 2, and outputs the result as the down-sampled non-key frame.

The non-key frame encoding module comprises: a key frame decoding sub-module, a wavelet down-sampling sub-module, and a non-key frame encoding sub-module, wherein: the key frame coding module is connected with the key frame decoding submodule and transmits a key coding frame code stream, the key frame decoding submodule is connected with the wavelet down-sampling submodule and transmits a key decoding frame, the wavelet down-sampling submodule is connected with the non-key frame coding submodule and transmits a down-sampling decoding sequence, the non-key frame coding submodule is connected with the wavelet down-sampling module and receives a down-sampling non-key frame, and the non-key frame coding submodule is connected with the code stream integration module and transmits a low-resolution non-key frame after inter-frame coding.

The key frame decoding submodule decodes the encoded key frame code stream output by the key frame encoding module by using an intra-frame decoding mode, and prevents decoding drift of a video decoding end caused by mismatching of the video encoding end and the video decoding end.

The wavelet down-sampling sub-module applies a Haar wavelet filter to decompose each key decoding frame, retains the sub-band that is low-frequency in both the horizontal and vertical directions, and divides its coefficients by 2 to form the down-sampling decoding sequence.

The non-key frame coding sub-module takes a key frame in the down-sampling decoding sequence as a reference frame and encodes the input down-sampled non-key frames in an inter-frame coding mode, wherein each inter-coded non-key frame is a P frame or a B frame.

The code stream integration module integrates the key coding frame code stream and the coded non-key frame code stream into a single code stream for transmission or storage.

The code stream decomposition module decomposes the transmitted mixed code stream into an independent key frame code stream and a non-key frame code stream.

The key frame decoding module decodes the input key frame code stream by using an intra-frame decoding mode and outputs the decoded original resolution key frame.

The wavelet decomposition module decomposes the input decoded key frame into four sub-bands by using a Haar wavelet filter, transmits the three sub-bands containing high-frequency information to the non-key frame up-sampling module, and divides by 2 the coefficients of the sub-band with low frequency in both the horizontal and vertical directions and transmits the result, as the down-sampled key frame, to the non-key frame up-sampling module and the non-key frame decoding module.

The non-key frame decoding module takes the input downsampling key frame as a reference frame, uses an interframe decoding mode to decode a non-key frame code stream, and outputs a low-resolution non-key decoding frame.

The non-key frame up-sampling module comprises a layered block matching sub-module and a wavelet reconstruction sub-module, wherein: the wavelet decomposition module is connected with the layered block matching sub-module and is used for transmitting three sub-bands containing high-frequency information of the key frame after wavelet decomposition and a decoded key frame after down sampling; the non-key frame decoding module is connected with the layered block matching submodule and transmits the low-resolution non-key decoding frame; the hierarchical block matching sub-module is connected with the wavelet reconstruction sub-module and transmits three sub-bands containing high-frequency information after hierarchical block matching; the non-key frame decoding module is connected with the wavelet reconstruction sub-module and transmits the low-resolution non-key decoding frame; the wavelet reconstruction sub-module is connected with the video integration module and transmits non-key reconstruction frames.

The video integration module integrates the input key frames and the input non-key frames into a complete video according to the sequence of the key frames and the non-key frames in the original video sequence.

Compared with the prior art, the invention has the following beneficial effects: in medium- and low-bit-rate video coding scenarios, the scheme saves bit rate and increases coding speed without noticeably affecting the quality of the decoded video.

Drawings

FIG. 1 is a schematic diagram of the system of the present invention.

FIG. 2 is a schematic diagram illustrating the effects of the embodiment;

wherein: (a) is a block of size 320 × 300 at (450, 420) in frame 5 of the "city" sequence in the embodiment; (b) is the corresponding block of the video after the "city" sequence passes through a standard H.264 coding and decoding system; (c) is the corresponding block of the video after the "city" sequence is coded and decoded by the embodiment.

Detailed Description

The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.

As shown in FIG. 1, the present embodiment includes a video encoding end and a video decoding end. The video coding end comprises a frame selection module, a key frame coding module, a wavelet down-sampling module, a non-key frame coding module and a code stream integration module, wherein: the frame selection module is connected with the key frame coding module and transmits key frames in the video; the frame selection module is connected with the wavelet down-sampling module and transmits non-key frames in the video; the key frame coding module is connected with the non-key frame coding module and transmits key coding frames; the wavelet down-sampling module is connected with the non-key frame coding module and transmits the low-resolution non-key frames down-sampled by a factor of 2; the key frame coding module is connected with the code stream integration module and transmits key coding frames; the non-key frame coding module is connected with the code stream integration module and transmits the low-resolution non-key coding frames.

The frame selection module selects an input video sequence and decomposes key frames and non-key frames, wherein a fixed number of non-key frames are arranged between the two key frames, and the first frame of the video is a key frame.
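The frame selection rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_frames` and `merge_frames` and the parameter `gap` (the fixed number of non-key frames between adjacent key frames) are hypothetical, and the text does not give a value for that number.

```python
def split_frames(frames, gap):
    """Split a sequence into (index, frame) lists of key and non-key frames.

    `gap` is the fixed number of non-key frames between two adjacent key
    frames. It is a hypothetical parameter; the text does not state a value.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    period = gap + 1
    key, non_key = [], []
    for index, frame in enumerate(frames):
        # Frame 0 is always a key frame, followed by every `period`-th frame.
        target = key if index % period == 0 else non_key
        target.append((index, frame))
    return key, non_key


def merge_frames(key, non_key):
    """Restore the original frame order from the two indexed lists."""
    ordered = sorted(key + non_key, key=lambda pair: pair[0])
    return [frame for _, frame in ordered]


frames = ["f0", "f1", "f2", "f3", "f4", "f5", "f6"]
key, non_key = split_frames(frames, gap=2)
key_indices = [index for index, _ in key]
non_key_indices = [index for index, _ in non_key]
restored = merge_frames(key, non_key)
```

Keeping the original index with each frame is one simple way to let a later integration step put key and non-key frames back in their original order.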

The key frame coding module compresses the input key frames in H.264 intra-frame coding mode. The coding efficiency of the key frame coding module is controlled by a separate key frame quantization parameter.

The wavelet down-sampling module applies a Haar wavelet filter to decompose each input non-key frame, retains the sub-band that is low-frequency in both the horizontal and vertical directions, and divides its coefficients by 2 before output; the result is the non-key frame down-sampled by a factor of 2.
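A minimal NumPy sketch of this down-sampling step follows. It assumes an orthonormal single-level 2-D Haar transform (a normalization the text does not state) and frames with even height and width; the function name `haar_downsample` is hypothetical. Under that assumption, the low-frequency coefficient of each 2 × 2 block is half the sum of its four pixels, so dividing by 2 yields the block mean and keeps the result in the original pixel range.

```python
import numpy as np


def haar_downsample(frame):
    """Keep the low-low Haar sub-band and divide its coefficients by 2.

    Assumes an orthonormal Haar filter (an assumption, not stated in the
    text) and a 2-D frame with even height and width.
    """
    f = np.asarray(frame, dtype=np.float64)
    if f.ndim != 2 or f.shape[0] % 2 or f.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")
    a, b = f[0::2, 0::2], f[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = f[1::2, 0::2], f[1::2, 1::2]  # bottom-left, bottom-right
    low_low = (a + b + c + d) / 2.0      # orthonormal Haar LL coefficient
    return low_low / 2.0                 # equals the mean of each 2x2 block


frame = np.arange(16, dtype=np.float64).reshape(4, 4)
small = haar_downsample(frame)
```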

The non-key frame encoding module comprises: a key frame decoding sub-module, a wavelet down-sampling sub-module, and a non-key frame coding sub-module, wherein: the key frame coding module is connected with the key frame decoding submodule and transmits a key coding frame code stream; the key frame decoding submodule is connected with the wavelet down-sampling submodule and transmits a key decoding frame; the wavelet down-sampling sub-module is connected with the non-key frame coding sub-module and transmits the decoded key frame after down-sampling; the wavelet down-sampling module is connected with the non-key frame coding sub-module and is used for transmitting the down-sampled non-key frame; the non-key frame coding sub-module is connected with the code stream integration module and transmits the low-resolution non-key frame after H.264 interframe coding.

The key frame decoding submodule decodes the coded key frame code stream output by the key frame coding module by using an H.264 intra-frame decoding mode, and prevents decoding drift of a video decoding end caused by mismatching of the video coding end and the video decoding end.

The wavelet down-sampling sub-module has the same function as the wavelet down-sampling module; only its input and output differ. It is named a sub-module to distinguish its position in the coding framework.

The non-key frame coding sub-module takes the decoded, down-sampled low-resolution key frame as a reference frame and encodes the input down-sampled non-key frames in H.264 inter-frame coding mode. During encoding, non-key frames may be coded as P frames or B frames. The code stream size of the encoded non-key frames is controlled by the non-key frame quantization parameter.

The video decoding end comprises a code stream decomposition module, a key frame decoding module, a wavelet decomposition module, a non-key frame decoding module, a non-key frame up-sampling module and a video integration module, wherein: the code stream decomposition module is connected with the key frame decoding module and transmits the key frame related code stream; the code stream decomposition module is connected with the non-key frame decoding module and transmits the non-key frame related code stream; the key frame decoding module is connected with the wavelet decomposition module and transmits key decoding frames; the key frame decoding module is connected with the video integration module and transmits key decoding frames; the wavelet decomposition module is connected with the non-key frame decoding module and transmits down-sampled key frames; the wavelet decomposition module is connected with the non-key frame up-sampling module and transmits the three key frame sub-bands and the down-sampled key frames; the non-key frame decoding module is connected with the non-key frame up-sampling module and transmits the low-resolution non-key decoding frames; the non-key frame up-sampling module is connected with the video integration module and transmits non-key frames with the same resolution as the key frames.

The key frame decoding module decodes the input key frame code stream by using an H.264 intra-frame decoding mode and outputs the decoded original resolution key frame.

The non-key frame decoding module takes the input downsampling key frame as a reference frame, decodes the non-key frame code stream by using an inter-frame decoding mode in H.264 and outputs a low-resolution non-key decoding frame.

The non-key frame up-sampling module comprises a hierarchical block matching sub-module and a wavelet reconstruction sub-module, wherein: the wavelet decomposition module is connected with the hierarchical block matching sub-module and transmits the three sub-bands containing high-frequency information obtained by wavelet decomposition of the key frame, together with the down-sampled decoded key frame; the non-key frame decoding module is connected with the hierarchical block matching sub-module and transmits the low-resolution non-key decoding frame; the hierarchical block matching sub-module is connected with the wavelet reconstruction sub-module and transmits the three sub-bands containing high-frequency information after hierarchical block matching; the non-key frame decoding module is connected with the wavelet reconstruction sub-module and transmits the low-resolution non-key decoding frame; the wavelet reconstruction sub-module is connected with the video integration module and transmits non-key frames with the same resolution as the key frames.

The working principle of the hierarchical block matching sub-module is as follows:

Let $I'_{NK}$ be an input low-resolution non-key decoded frame, and let $I_{K1}$ and $I_{K2}$ be the two key frames temporally nearest to $I'_{NK}$. Haar wavelet decomposition is applied to $I_{K1}$ and $I_{K2}$ respectively, giving

$$[I_{K_{idx}-CA},\ I_{K_{idx}-CH},\ I_{K_{idx}-CV},\ I_{K_{idx}-CD}] = W(I_{K_{idx}}), \quad I'_{K_{idx}} = I_{K_{idx}-CA}/2, \quad idx \in \{1, 2\} \tag{1}$$

where $I'_{K_{idx}}$, $I_{K_{idx}-CH}$, $I_{K_{idx}-CV}$ and $I_{K_{idx}-CD}$ are the key frame information transmitted by the wavelet decomposition module, and $W$ is the Haar wavelet decomposition filter.
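The decomposition in formula (1) can be illustrated with a short NumPy sketch. It assumes an orthonormal single-level 2-D Haar transform and one particular labelling of the horizontal, vertical and diagonal detail sub-bands; both are assumptions, since the text fixes neither the normalization nor the sub-band orientation. The names `haar_decompose` and `key_frame_side_info` are hypothetical.

```python
import numpy as np


def haar_decompose(image):
    """Single-level orthonormal 2-D Haar transform (assumed normalization).

    Returns (CA, CH, CV, CD): the approximation sub-band and three detail
    sub-bands. The CH/CV/CD labelling is one common convention and is an
    assumption here.
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ca = (a + b + c + d) / 2.0
    ch = (a + b - c - d) / 2.0  # difference between top and bottom rows
    cv = (a - b + c - d) / 2.0  # difference between left and right columns
    cd = (a - b - c + d) / 2.0  # diagonal difference
    return ca, ch, cv, cd


def key_frame_side_info(key_frame):
    """Return (I'_K, CH, CV, CD) as in formula (1): CA is divided by 2."""
    ca, ch, cv, cd = haar_decompose(key_frame)
    return ca / 2.0, ch, cv, cd


key_frame = np.full((4, 4), 100.0)
low, ch, cv, cd = key_frame_side_info(key_frame)
```

For a flat frame such as the one above, the three detail sub-bands vanish and the down-sampled key frame keeps the original pixel value, which is a quick sanity check on the normalization.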

The hierarchical block matching sub-module adopts a hierarchical matching scheme. In the k-th layer of matching, $I'_{NK}$ is decomposed into $N$ blocks of size $M_k \times M_k$, where the n-th block is $B^{n}_{NK}$. Let $B^{(i,j)}_{Mat}$ denote the $M_k \times M_k$ block of a matrix $Mat$ located at $(i, j)$. Under these definitions, local block matching gives

$$[i_n,\ j_n,\ idx] = \underset{(i,j) \in I'_{K1}}{\arg\min} \left\{ MAD\left(B^{n}_{NK},\ B^{(i,j)}_{I'_{K1}}\right),\ MAD\left(B^{n}_{NK},\ B^{(i,j)}_{I'_{K2}}\right) \right\}, \quad idx \in \{1, 2\} \tag{2}$$

where

$$MAD\left(B^{n_1}_{Mat_1},\ B^{n_2}_{Mat_2}\right) = \sum_{(i,j) \in B^{n_1}_{Mat_1}} \left| B^{n_1}_{Mat_1}(i,j) - B^{n_2}_{Mat_2}(i,j) \right|$$

and $B^{n_1}_{Mat_1}(i,j)$ is the value of the pixel at $(i, j)$ in block $B^{n_1}_{Mat_1}$. That is, formula (2) searches $I'_{K1}$ and $I'_{K2}$ under the MAD criterion for the block that best matches $B^{n}_{NK}$; the resulting best-matching block is located at $(i_n, j_n)$ in key frame $I'_{K_{idx}}$.
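The search in formula (2) can be sketched as an exhaustive local search over both down-sampled key frames. The sketch below is illustrative only: the names `mad` and `best_match`, the `start` position and the search `radius` are assumptions, since the text says the matching is local but does not give a window size. The cost follows the formula as written, a sum of absolute differences without normalization.

```python
import numpy as np


def mad(block_a, block_b):
    """Matching cost as written in the text: sum of absolute differences."""
    diff = np.asarray(block_a, dtype=np.float64) - np.asarray(block_b, dtype=np.float64)
    return float(np.abs(diff).sum())


def best_match(block, key_frames, start, radius):
    """Search both key frames near `start`; return (i_n, j_n, idx), idx in {1, 2}.

    `radius` is a hypothetical search-window half-width.
    """
    m = block.shape[0]
    best = None
    for idx, ref in enumerate(key_frames, start=1):
        height, width = ref.shape
        # Clamp the window so every candidate block lies inside the frame.
        i_lo, i_hi = max(0, start[0] - radius), min(height - m, start[0] + radius)
        j_lo, j_hi = max(0, start[1] - radius), min(width - m, start[1] + radius)
        for i in range(i_lo, i_hi + 1):
            for j in range(j_lo, j_hi + 1):
                cost = mad(block, ref[i:i + m, j:j + m])
                if best is None or cost < best[0]:
                    best = (cost, i, j, idx)
    if best is None:
        raise ValueError("search window contains no valid block position")
    return best[1], best[2], best[3]


key1 = np.zeros((16, 16))
key2 = np.arange(256, dtype=np.float64).reshape(16, 16)
block = key2[5:9, 6:10].copy()  # a block that exists only in the second key frame
match = best_match(block, [key1, key2], start=(4, 4), radius=3)
```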

When matching each block in the k-th layer, the hierarchical block matching sub-module starts its search from an initial position. Let the initial position matrix of the k-th layer be $posMat_k$. This matrix has the same size as $I'_{NK}$, and each of its elements is the starting search position of the co-located pixel in $I'_{NK}$; the starting search position of a block $B^{n}_{NK}$ is the starting search position of its top-left pixel. The initial position matrix $posMat_k$ is the block matching result of layer $k-1$. For the first layer of matching, each element of the initial position matrix $posMat_0$ is the coordinate in $I'_{NK}$ of the co-located pixel.

In the first layer of matching, the block size is $M_1 = 128$, and the k-th layer uses blocks of size $M_k = 128 / 2^{k-1}$. The minimum block size is 8, so block matching is performed in 5 layers.
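The block-size schedule can be checked with a few lines of Python. The function name `block_sizes` and its parameters are hypothetical; the values 128 and 8 come from the text.

```python
def block_sizes(first=128, smallest=8):
    """Block size per layer: M_k = first / 2**(k - 1), stopping at `smallest`."""
    sizes = []
    k = 1
    while first // 2 ** (k - 1) >= smallest:
        sizes.append(first // 2 ** (k - 1))
        k += 1
    return sizes


sizes = block_sizes()
layers = len(sizes)
```

Halving the block size from 128 down to 8 gives the five layers stated in the text.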

In the last layer of matching, an overcomplete scheme is introduced: $I'_{NK}$ is decomposed into mutually overlapping 8 × 8 blocks, each offset by 4 pixels from its neighbors in the horizontal and vertical directions. After this layer of block matching is completed, each 8 × 8 block $B^{n}_{NK}$ has a corresponding best-matching block $B^{(i_n, j_n)}_{I'_{K_{idx}}}$. Let the wavelet-domain high-frequency coefficient matrices of the non-key frame to be reconstructed be $I'_{NK-CH}$, $I'_{NK-CD}$ and $I'_{NK-CV}$. Taking $I'_{NK-CH}$ as an example, it is obtained as follows:

$$I'_{NK-CH} = \frac{\{\, n \in [1, N] \mid B^{n}_{NK-CH} \,\}}{\alpha}, \quad B^{n}_{NK-CH} = B^{(i_n, j_n)}_{I_{K_{idx}}} \tag{3}$$

where $\alpha$ is the overlap coefficient matrix, each element of which is the overlap count of the co-located pixel in $I'_{NK}$, i.e. the number of 8 × 8 blocks that contain the pixel. The matrix division in the formula above is element-wise. In other words, $I'_{NK-CH}$ is obtained by overlap-adding the high-frequency coefficient sub-blocks of the key frame that correspond to the 8 × 8 blocks found by block matching. $I'_{NK-CD}$ and $I'_{NK-CV}$ are obtained in the same way.

$I'_{NK-CH}$, $I'_{NK-CD}$ and $I'_{NK-CV}$ are the signals transmitted to the wavelet reconstruction sub-module.
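The overlap-add with the overlap coefficient matrix in formula (3) can be sketched as follows. This is a simplified illustration: the names `overlapping_grid` and `overlap_add` are hypothetical, the blocks are placed directly on the 4-pixel grid, and constant blocks stand in for the high-frequency coefficient blocks that block matching would supply.

```python
import numpy as np


def overlapping_grid(height, width, block=8, step=4):
    """Top-left corners of overlapping blocks spaced `step` pixels apart."""
    return [(i, j)
            for i in range(0, height - block + 1, step)
            for j in range(0, width - block + 1, step)]


def overlap_add(blocks, positions, shape):
    """Sum the blocks at their positions and divide element-wise by the count.

    Returns the averaged coefficient matrix and the overlap coefficient
    matrix (alpha in the text).
    """
    total = np.zeros(shape, dtype=np.float64)
    alpha = np.zeros(shape, dtype=np.float64)
    for block, (i, j) in zip(blocks, positions):
        m = block.shape[0]
        total[i:i + m, j:j + m] += block
        alpha[i:i + m, j:j + m] += 1.0
    averaged = np.zeros(shape, dtype=np.float64)
    # Element-wise division; pixels covered by no block stay at zero.
    np.divide(total, alpha, out=averaged, where=alpha > 0)
    return averaged, alpha


shape = (12, 12)
positions = overlapping_grid(*shape)
blocks = [np.full((8, 8), 3.0) for _ in positions]  # stand-in coefficient blocks
averaged, alpha = overlap_add(blocks, positions, shape)
```

Dividing by the per-pixel block count, rather than by a constant, keeps corner pixels (covered once) and interior pixels (covered up to four times) on the same scale.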

The wavelet reconstruction sub-module uses a Haar wavelet reconstruction filter to reconstruct the decoded frame $I'_{NK}$ transmitted by the non-key frame decoding module and the three high-frequency sub-bands transmitted by the hierarchical block matching sub-module into a high-resolution image, namely

$$I_{NK} = W'(2I'_{NK},\ I'_{NK-CH},\ I'_{NK-CV},\ I'_{NK-CD}) \tag{4}$$

where $W'$ is the wavelet reconstruction operation using a Haar wavelet filter.
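The reconstruction in formula (4) can be sketched as the inverse of a single-level orthonormal Haar transform; the normalization and the sub-band labelling are assumptions, and the function names are hypothetical. The round trip below uses the true high-frequency sub-bands of the frame only to show that multiplying the low-resolution frame by 2 undoes the earlier division; in the system described here those sub-bands are instead estimated from the key frames.

```python
import numpy as np


def haar_decompose(image):
    """Single-level orthonormal 2-D Haar transform (assumed normalization)."""
    x = np.asarray(image, dtype=np.float64)
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2.0, (a + b - c - d) / 2.0,
            (a - b + c - d) / 2.0, (a - b - c + d) / 2.0)


def haar_reconstruct(ca, ch, cv, cd):
    """Inverse of haar_decompose: rebuild the full-resolution image."""
    height, width = ca.shape
    out = np.empty((2 * height, 2 * width), dtype=np.float64)
    out[0::2, 0::2] = (ca + ch + cv + cd) / 2.0
    out[0::2, 1::2] = (ca + ch - cv - cd) / 2.0
    out[1::2, 0::2] = (ca - ch + cv - cd) / 2.0
    out[1::2, 1::2] = (ca - ch - cv + cd) / 2.0
    return out


rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
ca, ch, cv, cd = haar_decompose(frame)
low_res = ca / 2.0                                   # the down-sampled frame
restored = haar_reconstruct(2.0 * low_res, ch, cv, cd)  # formula (4)
```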

Effects of the implementation

The first 80 frames of the video "city.yuv" (a 1280 × 720 YUV file in 4:2:0 format) are coded and decoded with the codec system of this embodiment and with a standard H.264 codec system, respectively:

in this embodiment, in the video encoding end, the Quantization Parameter (QP) used by the key frame encoding module is 22, the quantization parameter used by the non-key frame encoding module is 26, and the code rate of the code stream output by the code stream integration module is 3075 kbps. The standard H.264 coding system sets the quantization parameter to be 28, and the code stream code rate after coding is 3167 kbps.
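The relative bit-rate difference implied by these two figures can be computed directly. The function name `bitrate_saving_percent` is hypothetical; the two rates are the ones reported above.

```python
def bitrate_saving_percent(reference_kbps, proposed_kbps):
    """Relative bit-rate reduction of the proposed system, in percent."""
    return 100.0 * (reference_kbps - proposed_kbps) / reference_kbps


saving = bitrate_saving_percent(reference_kbps=3167, proposed_kbps=3075)
print(f"{saving:.1f}% lower bit rate than the H.264 reference")
```

With the reported rates the reduction is a little under 3 percent.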

The SSIM index is used in the present embodiment to compare the quality of the decoded video with that of the standard H.264 codec system. This index was proposed by Z. Wang et al. in "Image quality assessment: From error visibility to structural similarity", published in IEEE Transactions on Image Processing in 2004. SSIM reflects the degree of similarity between images more accurately than the PSNR index.
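For orientation, the SSIM formula can be sketched in NumPy as below. This is a deliberately simplified single-window version computed over the whole image, not the locally windowed procedure of the cited paper and not necessarily what was used to obtain the figures in this embodiment. The function name `global_ssim` is hypothetical; the constants 0.01 and 0.03 are the commonly used defaults for the stabilizing terms.

```python
import numpy as np


def global_ssim(x, y, dynamic_range=255.0):
    """SSIM computed once over the whole image (a simplification)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(16, 16)).astype(np.float64)
same_score = global_ssim(original, original)
inverted_score = global_ssim(original, 255.0 - original)
```

An image compared with itself scores 1, and any distortion lowers the score, which is why average values close to 1 indicate decoded video close to the original.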

In this embodiment, the average SSIM value of the decoded video output by the video integration module is 0.9794, and the average SSIM value of the decoded video output by the standard h.264 codec system is 0.9886. The objective quality of the video obtained by the two modes is not very different. Fig. 2 is a comparison of two video coding schemes in terms of subjective visual quality. Fig. 2(a) shows a block with a size of 320 × 300 at (450, 420) in the 5 th frame of the original video, fig. 2(b) shows the corresponding block of the video decoded by using the standard h.264 codec system, and fig. 2(c) shows the corresponding block of the video decoded by using the embodiment. The difference between fig. 2(b) and fig. 2(c) is not obvious. The two schemes also differ little in subjective visual quality.

Therefore, the system of the embodiment can reduce the code rate and improve the coding speed on the premise of ensuring the quality of the transmitted video, can extract high-quality video information (key frames in the video) from the coded code stream, has good performance and wide application prospect in the medium-low code rate video coding scene, and is a good supplement and innovative breakthrough to the existing video coding frame.

Claims

1. A video coding and decoding system based on key frame super-resolution reconstruction, comprising a video coding end and a video decoding end, wherein the video coding end comprises a frame selection module, a key frame coding module, a wavelet down-sampling module, a non-key frame coding module and a code stream integration module, and the video decoding end comprises a code stream decomposition module, a key frame decoding module, a wavelet decomposition module, a non-key frame decoding module, a non-key frame up-sampling module and a video integration module, characterized in that:

the video encoding end comprises: the frame selection module selects and divides an input video sequence to obtain a key frame sequence and a non-key frame sequence, and the key frame sequence and the non-key frame sequence are respectively output to the key frame coding module and the wavelet down-sampling module; the key frame coding module compresses the input key frame in an intra-frame coding mode and transmits the key coding frame to the non-key frame coding module and the code stream integration module; the wavelet down-sampling module adopts a Haar wavelet filter to carry out wavelet decomposition on the non-key frame, retains the sub-band with low frequency in both horizontal and vertical directions, and outputs the sub-band coefficients divided by 2 as a down-sampling non-key frame to the non-key frame coding module; the non-key frame coding module transmits the low-resolution non-key coding frame to the code stream integration module; the code stream integration module integrates the key coding frame code stream input by the key frame coding module and the coded non-key frame code stream input by the non-key frame coding module into a single code stream for transmission or storage;

the video decoding end comprises: the code stream decomposition module decomposes the transmitted mixed code stream into an independent key frame code stream and a non-key frame code stream, and respectively outputs the independent key frame code stream and the independent non-key frame code stream to the key frame decoding module and the non-key frame decoding module; the key frame decoding module decodes the input key frame code stream by using an intra-frame decoding mode and outputs the decoded original resolution key frame to the wavelet decomposition module and the video integration module; the wavelet decomposition module decomposes the input decoded key frame into four sub-bands by using a Haar wavelet filter, transmits three sub-bands containing high-frequency information to the non-key frame upsampling module, divides by 2 the coefficients of the sub-band with low frequency in the horizontal and vertical directions and transmits the resulting sub-band to the non-key frame upsampling module and the non-key frame decoding module; the non-key frame decoding module takes the input downsampling key frame as a reference frame, decodes a non-key frame code stream in an interframe decoding mode and outputs a low-resolution non-key decoding frame to the non-key frame upsampling module; the video integration module integrates the key frames input by the key frame decoding module and the non-key frames input by the non-key frame up-sampling module into a complete video according to the sequence of the key frames and the non-key frames in the original video sequence.

2. The system of claim 1, wherein the non-key frame coding module comprises: a key frame decoding sub-module, a wavelet down-sampling sub-module, and a non-key frame encoding sub-module, wherein: the key frame coding module is connected with the key frame decoding submodule and transmits a key coding frame code stream, the key frame decoding submodule is connected with the wavelet down-sampling submodule and transmits a key decoding frame, the wavelet down-sampling submodule is connected with the non-key frame coding submodule and transmits a down-sampling decoding sequence, the non-key frame coding submodule is connected with the wavelet down-sampling module and receives a down-sampling non-key frame, and the non-key frame coding submodule is connected with the code stream integration module and transmits a low-resolution non-key frame after inter-frame coding.

3. The video coding and decoding system based on the key frame super-resolution reconstruction of claim 2, wherein the key frame decoding submodule decodes the encoded key frame code stream output by the key frame encoding module by using an intra-frame decoding mode, so as to prevent decoding drift of a video decoding end caused by mismatching of the video encoding end and the video decoding end.

4. The video coding and decoding system based on key frame super-resolution reconstruction of claim 2, wherein the wavelet down-sampling sub-module performs wavelet decomposition on the key decoding frame by using a Haar wavelet filter, retains the sub-band with low frequency in both horizontal and vertical directions, and divides the sub-band coefficients by 2 to form a down-sampling decoding sequence.

5. The system of claim 2, wherein the non-key frame coding sub-module uses a key frame in the downsampled decoding sequence as a reference frame, and uses inter-frame coding to encode the input sampled non-key frame, and the non-key frame in the inter-frame coding is a P frame or a B frame.

6. The video coding and decoding system based on the key frame super resolution reconstruction of claim 1, wherein the non-key frame upsampling module comprises a hierarchical block matching sub-module and a wavelet reconstruction sub-module, wherein: the wavelet decomposition module is connected with the hierarchical block matching sub-module and transmits three sub-bands containing high-frequency information after the key frame is subjected to wavelet decomposition and a decoded key frame after down sampling; the non-key frame decoding module is connected with the hierarchical block matching sub-module and transmits the low-resolution non-key decoding frame; the hierarchical block matching sub-module is connected with the wavelet reconstruction sub-module and transmits three sub-bands containing high-frequency information after hierarchical block matching; the non-key frame decoding module is connected with the wavelet reconstruction sub-module and transmits the low-resolution non-key decoding frame; the wavelet reconstruction sub-module is connected with the video integration module and transmits non-key reconstruction frames.