CN101146227A

CN101146227A - Nested progressive scalable 3D wavelet video coding algorithm

Info

Publication number: CN101146227A
Application number: CN 200710146053
Authority: CN
Inventors: 涂国防; 张磊; 张灿; 吴伟仁; 陈康
Original assignee: University of Chinese Academy of Sciences
Current assignee: Graduate School of CAS; University of Chinese Academy of Sciences
Priority date: 2007-09-10
Filing date: 2007-09-10
Publication date: 2008-03-19

Abstract

The invention provides a nested progressive scalable three-dimensional wavelet video coding algorithm. It addresses the problem that, in the field of space multimedia communication, the international video coding standards H.261 to H.264 and MPEG-1 to MPEG-4 cannot meet the requirements of low complexity and high coding efficiency for video communication in the space communication environment. The algorithm codes video sequences using hierarchical motion-compensated temporal filtering combined with a two-dimensional nested progressive scalable coding algorithm. The hierarchical motion-compensated temporal filtering combines hierarchical block-matching motion estimation with one-dimensional motion-compensated temporal filtering, so as to remove temporal redundancy and facilitate scalable coding and decoding in the spatial domain. The two-dimensional nested progressive scalable coding algorithm codes the wavelet coefficients efficiently with a bit-plane coding method and then reorganizes the coded stream so that it has nested progressive scalability in both quality and resolution. The algorithm can be used in wireless and space video communications.

Description

Nested progressive scalable 3D wavelet video coding algorithm

Technical field

The present invention relates to video compression coding in the field of space multimedia communication technology, and in particular to a nested progressive scalable 3D wavelet video coding algorithm.

Background technology

Space communication is characterized by long communication distances, large propagation delays, limited bandwidth resources and limited processing capability of on-board satellite equipment. These characteristics are very unfavourable for image and video data communication, which involves large data volumes and demanding real-time requirements. At the same time, the heterogeneity of space communication networks and the diversity of terminal equipment pose a major challenge to existing coding techniques. There is therefore an urgent need to study coding algorithms with low complexity and high scalability in order to enable video applications in space communication.

The existing international video coding standards H.261 to H.264, MPEG-1 to MPEG-4 and so on can be used for space communication, but they were not designed for the space communication environment and therefore have the following two deficiencies: (1) H.261 to H.263, MPEG-1, MPEG-2 and so on have no scalability and are unsuited to the heterogeneity of space networks and the diversity of user equipment; (2) their complexity is high. For example, the decoding complexity of H.264 is 2 times that of H.263, and its encoding complexity is 3 times that of H.263. Existing streaming media coding standards thus have difficulty satisfying the requirements of coding efficiency, computational complexity and scalability at the same time, and a streaming media coding algorithm oriented to space communication needs to be studied.

The wavelet transform has good time-frequency localization, can effectively remove the redundancy between image pixels, and achieves good compression. In addition, its multiresolution property allows it to support spatial, temporal and quality scalable coding effectively. The wavelet transform is therefore widely used in image and video coding. Compared with two-dimensional wavelet video coding, three-dimensional wavelet compression coding treats the time axis as a one-dimensional signal with certain statistical characteristics and decorrelates it with a wavelet. This not only provides several kinds of scalability, such as frame rate and compression quality, but also, because there is no predictive-coding loop, avoids the error drift problem of predictive coding, which makes it well suited to scalable video coding for communication. Three-dimensional wavelet coding is currently a research focus in the field of video coding.

In view of the deficiencies of existing coding standards for space video communication, coding algorithms with low complexity and high scalability need to be studied in order to achieve reliable video transmission in space communication. To this end, the present invention proposes a nested progressive scalable 3D wavelet video coding algorithm.

Summary of the invention

A nested progressive scalable 3D wavelet video coding algorithm: every N frames of the video sequence form one group (Group of Pictures, GOP); a one-dimensional hierarchical motion-compensated temporal filtering (Hierarchical Motion Compensation Temporal Filtering, HMCTF) method performs the temporal transform; a two-dimensional spatial wavelet transform is then applied to each frame; and a two-dimensional nested progressive scalable coding algorithm (Nested Progressive Scalable Coding Algorithm, NPSCA) codes the wavelet coefficients of each frame, producing a nested progressive scalable code stream. The specific technical features are as follows:

1) Hierarchical motion-compensated temporal filtering

A hierarchical motion estimation method is used to obtain the motion vectors. The original image is reduced 3 times, giving 3 reduced images whose sizes are 1/2², 1/2⁴ and 1/2⁶ of the original image. Spatial block-matching motion estimation is performed on the lowest-resolution image to obtain a set of motion vectors at low resolution; the motion vectors of each higher-resolution image are then estimated from the motion vectors of the lower-resolution image. In this way the motion vectors of the original-resolution image are obtained. The motion vectors are used to identify the pixels that are related by motion in two adjacent frames, and a Haar lifting wavelet is used to perform the one-dimensional temporal wavelet transform.

2) Nested progressive scalable coding algorithm

A Daubechies 9/7 lifting wavelet is used to perform a two-dimensional spatial wavelet transform on each frame after the temporal transform, and the NPSCA algorithm codes the wavelet coefficients. The coefficients are first organized into different sets according to the correlation between coefficients of the same orientation at different levels and the correlation between coefficients of different subbands at the same level, and bit-plane coding is performed. A nested progressive code stream structure is established at the same time: the code stream is organized in the order low-frequency subband, coarsest-level high-frequency subbands, next-coarsest-level high-frequency subbands, finest-level high-frequency subbands, so that the coding is resolution scalable and, within each subband, quality scalable.

Description of drawings

Fig. 1: 3D wavelet coding structure

In Fig. 1: 1. video image sequence; 2. video frame grouping; 3. one GOP (8 frames); 4. hierarchical motion-compensated temporal filtering; 5. the GOP after one-dimensional temporal filtering; 6. Daubechies 9/7 two-dimensional spatial wavelet transform; 7. wavelet coefficients of all frames in the GOP; 8. nested progressive scalable wavelet coding; 9. coded streams of all frames in the GOP; 10. code stream organization; 11. final coded stream.

Fig. 2: hierarchical motion-compensated temporal filtering structure (Fig. 2 is the detailed structure of item 4 in Fig. 1)

In Fig. 2: 12. first-level temporal filtering; 13. 4 hierarchical block-matching motion estimations; 14. motion vectors of the first-level temporal filtering; 15. 4 Haar wavelet transforms; 16. 4 low-frequency L frames; 17. 4 high-frequency H frames; 18. second-level temporal filtering; 19. 2 hierarchical block-matching motion estimations; 20. motion vectors of the second-level temporal filtering; 21. 2 Haar wavelet transforms; 22. 2 low-frequency LL frames; 23. 2 high-frequency LH frames; 24. third-level temporal filtering; 25. 1 hierarchical block-matching motion estimation; 26. motion vectors of the third-level temporal filtering; 27. 1 Haar wavelet transform; 28. 1 low-frequency LLL frame; 29. 1 high-frequency LLH frame.

Fig. 3: hierarchical block-matching motion estimation structure (Fig. 3 is the detailed structure of the hierarchical block-matching motion estimation in item 13, 19 or 25 of Fig. 2; its input is a reference frame and a current frame, and its output is the corresponding set of motion vectors)

In Fig. 3: 30. reference frame; 31. current frame; 32. reference frame image reduction; 33. current frame image reduction; 34. image down-sampling (2x down-sampling in the vertical and horizontal directions); 35. image at 1/4 of the reference frame's original resolution; 36. image at 1/16 of the reference frame's original resolution; 37. image at 1/64 of the reference frame's original resolution; 38. image at 1/4 of the current frame's original resolution; 39. image at 1/16 of the current frame's original resolution; 40. image at 1/64 of the current frame's original resolution; 41. block-matching motion estimation on the 1/64 images; 42. motion vector MV₃ of the 1/64 image; 43. motion vector value doubling; 44. 2*MV₃; 45. block-matching motion estimation on the 1/16 images with 2*MV₃ as the initial value; 46. motion vector MV₂ of the 1/16 image; 47. 2*MV₂; 48. block-matching motion estimation on the 1/4 images with 2*MV₂ as the initial value; 49. motion vector MV₁ of the 1/4 image; 50. 2*MV₁; 51. block-matching motion estimation on the original-resolution images with 2*MV₁ as the initial value; 52. motion vector MV₀ of the original-resolution image.

Fig. 4: coding structure (Fig. 4 is the detailed structure of item 8 in Fig. 1, taking the coding of one frame in the GOP as an example; the input is the wavelet coefficients of that frame and the output is its coded stream)

In Fig. 4: 53. wavelet coefficients of one frame in the GOP; 54. low-frequency subband coding; 55. high-frequency subband coding; 56. low-frequency subband coefficients; 57. quantization; 58. entropy coding; 59. high-frequency subband coefficients; 60. bit-plane decomposition; 61. bit-plane coding; 62. coded stream of the frame.

Fig. 5: code stream organization structure (Fig. 5 is the detailed structure of item 10 in Fig. 1)

In Fig. 5: 63-67 denote the coded sub-streams of the individual frames. Taking the coded sub-stream 65 of one frame as an example, the code stream structure is as follows: 68. low-frequency quantization-coded codeword; 69. third-level motion vector MV₃; 70. level-3 high-frequency coefficient codeword; 71. level-1 motion vector MV₁; 72. level-1 high-frequency coefficient codeword.

Specific implementation steps

The overall structure of the nested progressive scalable 3D wavelet video coding is as follows:

As shown in Fig. 1, after the encoder receives the video image sequence 1, the frames are first grouped in video frame grouping 2. A fixed-frame grouping method is used, and one GOP 3 contains 8 frames. Hierarchical motion-compensated temporal filtering 4 applies 3 levels of one-dimensional temporal filtering to the GOP 3 (the detailed structure of the hierarchical motion-compensated temporal filtering is shown in Fig. 2). The Daubechies 9/7 two-dimensional spatial wavelet transform 6 applies a 3-level wavelet transform to each frame of the temporally filtered GOP 5, giving the wavelet coefficients 7 of all frames in the GOP. Nested progressive scalable wavelet coding 8 codes the wavelet coefficients of each frame (its detailed structure is shown in Fig. 4), giving the coded streams 9 of all frames in the GOP. Code stream organization 10 assembles them into the complete coded structure (its detailed structure is shown in Fig. 5), and the final coded stream 11 is output.

The specific implementation steps of hierarchical motion-compensated temporal filtering, NPSCA coding and code stream organization are described in detail below:

1) Hierarchical motion-compensated temporal filtering

As shown in Fig. 1, video frame grouping 2 produces a number of GOPs. Taking one GOP 3 as an example, hierarchical motion-compensated temporal filtering 4 proceeds as follows. As shown in Fig. 2, the one-dimensional temporal filtering HMCTF is carried out in 3 levels. In the first-level temporal filtering 12, HMCTF is applied to the GOP 3: 4 hierarchical block-matching motion estimations 13 are first performed to obtain the motion vectors 14 of the first-level temporal filtering, and the motion vectors are used to perform 4 Haar wavelet transforms 15, giving 4 low-frequency L frames 16 and 4 high-frequency H frames 17. Because the human eye is insensitive to high-frequency coefficients, the high-frequency frames can be discarded to reduce the frame rate. A single level of HMCTF, however, can only halve the frame rate, so the second-level temporal filtering 18 and the third-level temporal filtering 24 of Fig. 2 are applied in turn, starting from the 4 low-frequency L frames 16; their process is the same as that of the first-level temporal filtering 12. After the third-level temporal filtering, 1 low-frequency LLL frame 28 and 1 high-frequency LLH frame 29 are obtained.
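The three-level temporal decomposition of an 8-frame GOP can be sketched as follows. This is a minimal illustration, not the patented implementation: it omits motion compensation (frames are paired pixel by pixel), it assumes the unnormalized average/difference form of Haar lifting, which the text does not specify, and the names `haar_lift_pair` and `temporal_decompose` are hypothetical.

```python
from typing import List, Tuple

Frame = List[float]  # one frame, flattened to a list of pixel values


def haar_lift_pair(a: Frame, b: Frame) -> Tuple[Frame, Frame]:
    """Haar lifting on two frames (assumed unnormalized form)."""
    high = [y - x for x, y in zip(a, b)]        # predict: H = B - A
    low = [x + h / 2 for x, h in zip(a, high)]  # update:  L = A + H/2 = (A + B)/2
    return low, high


def temporal_decompose(gop: List[Frame], levels: int = 3):
    """Filter the GOP, then re-filter the low-frequency frames at each level.

    Returns the remaining low-frequency frames and, per level, the list of
    high-frequency frames produced at that level.
    """
    lows = gop
    highs_per_level = []
    for _ in range(levels):
        pairs = [haar_lift_pair(lows[i], lows[i + 1])
                 for i in range(0, len(lows), 2)]
        lows = [low for low, _ in pairs]
        highs_per_level.append([high for _, high in pairs])
    return lows, highs_per_level


# Toy GOP: 8 frames of 2 pixels each.
gop = [[float(t), float(2 * t)] for t in range(8)]
lows, highs = temporal_decompose(gop)
print(len(lows), [len(h) for h in highs])  # prints: 1 [4, 2, 1]
```

The frame counts match the description: 4 H frames after the first level, 2 LH frames after the second, and 1 LLH frame plus 1 LLL frame after the third. Dropping the high-frequency frames of a level halves the frame rate at that level.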

The hierarchical motion-compensated temporal filtering described above uses a hierarchical block-matching motion estimation method, whose detailed structure is shown in Fig. 3. The reference frame 30 and the current frame 31 are reduced (that is, the images are down-sampled) in reference frame image reduction 32 and current frame image reduction 33 respectively. Taking the reference frame as an example: image down-sampling 34 first down-samples the reference frame 30 by a factor of 2, giving the image 35 at 1/4 of the reference frame's original resolution; image down-sampling 34 is then applied to image 35, giving the image 36 at 1/16 of the original resolution; and image down-sampling 34 is applied to image 36, giving the image 37 at 1/64 of the original resolution. Current frame image reduction 33 follows the same process.

As shown in Fig. 3, block-matching motion estimation 41 is performed on the image 37 at 1/64 of the reference frame's original resolution and the image 40 at 1/64 of the current frame's original resolution, giving the motion vector MV₃ 42 of the 1/64 image. The value of MV₃ is doubled in motion vector value doubling 43 of Fig. 3, giving 2*MV₃ 44. With 2*MV₃ as the initial motion vector, motion estimation 45 of Fig. 3 is then performed on the 1/16-size images, giving the motion vector 46 of the 1/16 image, denoted MV₂. Continuing in the same way, the motion vector 52 of the original-resolution image is finally obtained, denoted MV₀.
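The coarse-to-fine search described above can be sketched for a single block as follows. Several details are assumptions made for illustration because the text does not state them: down-sampling by 2x2 averaging, the sum of absolute differences (SAD) as the matching criterion, the search radii, and tracking one block whose size is halved at each coarser level. All function names are hypothetical.

```python
import random
from typing import List, Tuple

Image = List[List[float]]
Vec = Tuple[int, int]  # (dx, dy)


def downsample2(img: Image) -> Image:
    """Halve width and height by averaging each 2x2 neighbourhood (assumed filter)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]


def sad(ref: Image, cur: Image, bx: int, by: int, bs: int, mv: Vec) -> float:
    """SAD between the current block and the displaced reference block."""
    dx, dy = mv
    h, w = len(ref), len(ref[0])
    if bx + dx < 0 or by + dy < 0 or bx + dx + bs > w or by + dy + bs > h:
        return float("inf")  # candidate falls outside the frame
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(bs) for i in range(bs))


def search(ref: Image, cur: Image, bx: int, by: int, bs: int,
           init: Vec, radius: int) -> Vec:
    """Full search within `radius` of the initial vector."""
    best, best_cost = init, sad(ref, cur, bx, by, bs, init)
    for dy in range(init[1] - radius, init[1] + radius + 1):
        for dx in range(init[0] - radius, init[0] + radius + 1):
            cost = sad(ref, cur, bx, by, bs, (dx, dy))
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def hierarchical_mv(ref: Image, cur: Image, bx: int, by: int, bs: int,
                    levels: int = 3, coarse_radius: int = 2,
                    refine_radius: int = 1) -> Vec:
    """Estimate MV3 on the 1/64 images, then double and refine up to MV0."""
    refs, curs = [ref], [cur]
    for _ in range(levels):
        refs.append(downsample2(refs[-1]))
        curs.append(downsample2(curs[-1]))
    s = 2 ** levels
    mv = search(refs[levels], curs[levels], bx // s, by // s, bs // s,
                (0, 0), coarse_radius)
    for lvl in range(levels - 1, -1, -1):
        s = 2 ** lvl
        init = (2 * mv[0], 2 * mv[1])  # doubled vector is the initial value
        mv = search(refs[lvl], curs[lvl], bx // s, by // s, bs // s,
                    init, refine_radius)
    return mv


# Synthetic test: the current frame is the reference shifted by (8, -8).
rng = random.Random(0)
N = 64
ref = [[rng.random() for _ in range(N)] for _ in range(N)]
true_mv = (8, -8)
cur = [[ref[(y + true_mv[1]) % N][(x + true_mv[0]) % N] for x in range(N)]
       for y in range(N)]
mv0 = hierarchical_mv(ref, cur, bx=24, by=24, bs=16)
print(mv0)
```

Because each level only searches a small window around the doubled vector, a displacement of 8 pixels at full resolution is found with a search radius of 2 at the coarsest level and 1 at every finer level, which is the source of the method's low complexity.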

2) Spatial wavelet transform and nested progressive scalable wavelet image coding

After the one-dimensional temporal filtering, a Daubechies 9/7 lifting wavelet is used to perform a two-dimensional spatial wavelet transform on each frame after the temporal transform. (Note: the number of wavelet transform levels is the same as the number of down-sampling steps in the hierarchical motion estimation; both are 3.) After the spatial wavelet transform, the NPSCA algorithm codes the wavelet coefficients of each frame. As shown in Fig. 4, taking the coding of one frame in the GOP as an example, the input signal is the wavelet coefficients 53 of that frame, on which coefficient coding is performed.
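For reference, one level of a one-dimensional 9/7 lifting transform can be sketched as below. The patent does not list the lifting coefficients; the values used here are the widely published ones for the Cohen-Daubechies-Feauveau 9/7 filter, and the boundary handling (mirror extension), the scaling convention and the function names are assumptions of this sketch. A two-dimensional transform would apply the same step to every row and then every column, and repeat it on the low-frequency subband for 3 levels.

```python
from typing import List, Tuple

# Commonly published CDF 9/7 lifting coefficients (not given in the patent).
ALPHA, BETA = -1.586134342, -0.05298011854
GAMMA, DELTA = 0.8829110762, 0.4435068522
K = 1.230174105  # scaling factor; conventions differ between implementations


def dwt97(x: List[float]) -> Tuple[List[float], List[float]]:
    """One forward level on an even-length signal: returns (low, high)."""
    if len(x) % 2:
        raise ValueError("this sketch handles even-length signals only")
    s, d = list(x[0::2]), list(x[1::2])  # even and odd samples
    n = len(s)
    for c_odd, c_even in ((ALPHA, BETA), (GAMMA, DELTA)):
        for i in range(n):  # predict odd samples from even neighbours
            d[i] += c_odd * (s[i] + s[min(i + 1, n - 1)])
        for i in range(n):  # update even samples from odd neighbours
            s[i] += c_even * (d[max(i - 1, 0)] + d[i])
    return [v / K for v in s], [v * K for v in d]


def idwt97(low: List[float], high: List[float]) -> List[float]:
    """Inverse of dwt97: undo the scaling, then the lifting steps in reverse."""
    s, d = [v * K for v in low], [v / K for v in high]
    n = len(s)
    for c_odd, c_even in ((GAMMA, DELTA), (ALPHA, BETA)):
        for i in range(n):
            s[i] -= c_even * (d[max(i - 1, 0)] + d[i])
        for i in range(n):
            d[i] -= c_odd * (s[i] + s[min(i + 1, n - 1)])
    out = [0.0] * (2 * n)
    out[0::2], out[1::2] = s, d
    return out


x = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]
low, high = dwt97(x)
err = max(abs(a - b) for a, b in zip(idwt97(low, high), x))
print(err < 1e-9)  # reconstruction is exact up to floating-point rounding
```

Each lifting step is individually invertible, so the inverse transform recovers the input regardless of the boundary rule, as long as both directions use the same rule.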

As shown in Fig. 4, the NPSCA algorithm is divided into two parts: low-frequency subband coding 54 and high-frequency subband coding 55. Low-frequency subband coding 54 proceeds as follows: the low-frequency subband coefficients 56 are first quantized in quantization unit 57, and the quantized coefficients are then coded in entropy coding 58 to give the coded stream. The low-frequency subband quantization-coded codeword serves as the base layer that satisfies coarse quality and resolution requirements; it can be matched to the minimum bandwidth that the network provides, so as to guarantee that the terminal obtains the base layer data. High-frequency subband coding 55 proceeds as follows: the high-frequency subband coefficients 59 are first decomposed in bit-plane decomposition unit 60 into n bit planes, from the most significant bit plane to the least significant bit plane. Each bit plane is called a coding layer (Coding Layer, CL). In bit-plane coding 61, each bit plane is coded separately and organized into a code stream. This code stream serves as the enhancement layer that characterizes texture and detail. It is used to improve reconstructed image quality and spatial resolution, adapts to the available bandwidth and to the computing capability of the terminal equipment, and can be processed or discarded according to the network bandwidth and the terminal's processing capability. The low-frequency and high-frequency subband coded streams together form the coded stream 62 of the frame.
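The bit-plane decomposition of the high-frequency coefficients, and the quality scalability it gives, can be illustrated with the sketch below. It assumes the coefficients have already been rounded to integers and uses a sign-magnitude representation; it shows only the decomposition into coding layers, not the set partitioning or the coding of each plane, and the function names are hypothetical.

```python
from typing import List, Tuple


def bit_planes(coeffs: List[int]) -> Tuple[List[int], List[List[int]]]:
    """Split integer coefficients into sign bits and magnitude bit planes.

    Planes are ordered from the most significant to the least significant.
    """
    mags = [abs(c) for c in coeffs]
    n = max(mags).bit_length() if mags else 0
    signs = [1 if c < 0 else 0 for c in coeffs]
    planes = [[(m >> b) & 1 for m in mags] for b in range(n - 1, -1, -1)]
    return signs, planes


def reconstruct(signs: List[int], planes: List[List[int]], keep: int) -> List[int]:
    """Rebuild coefficients from only the first `keep` (most significant) planes."""
    total = len(planes)
    mags = [0] * len(signs)
    for idx, plane in enumerate(planes[:keep]):
        shift = total - 1 - idx
        for i, bit in enumerate(plane):
            mags[i] |= bit << shift
    return [-m if s else m for m, s in zip(mags, signs)]


coeffs = [13, -6, 0, 3, -1]
signs, planes = bit_planes(coeffs)
print(len(planes))                     # prints: 4
print(reconstruct(signs, planes, 2))   # prints: [12, -4, 0, 0, 0]
print(reconstruct(signs, planes, 4))   # prints: [13, -6, 0, 3, -1]
```

Decoding only the leading planes yields a coarser approximation of every coefficient, so a receiver with less bandwidth or processing power can stop early and still reconstruct an image.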

3) Code stream organization

The code streams obtained after coding each frame must be organized to form the code stream of the video sequence. As shown in Fig. 5, the coded sub-streams 63-67 of the individual frames are concatenated end to end in coding order to form the coded stream of the whole video sequence. Taking the coded sub-stream 65 of one frame as an example, the specific structure is as follows:

The base layer codeword, that is, the low-frequency subband quantization-coded codeword 68, is placed first, followed by the level-3 high-frequency subband coefficient codeword 70; to better support spatially scalable decoding, the third-level motion vector MV₃ 69 is inserted before it. The remaining levels follow in the same way, ending with the level-1 motion vector MV₁ 71 and the corresponding level-1 high-frequency subband coefficient codeword 72. This code stream order enables nested progressive scalable decoding in both spatial resolution and quality.
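The per-frame ordering can be sketched as a list of labelled segments from which a decoder takes a prefix. This is only an illustration: the segment labels, the byte payloads and the functions `organize_frame` and `extract` are hypothetical, and the level-2 segments are inferred from the pattern, since the text names only levels 3 and 1 explicitly.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, bytes]  # (label, payload)


def organize_frame(base: bytes, mvs: Dict[int, bytes],
                   highs: Dict[int, bytes]) -> List[Segment]:
    """Order one frame's sub-stream: base layer, then MV and HF per level 3..1."""
    segments: List[Segment] = [("LL", base)]
    for level in (3, 2, 1):  # coarsest level first, finest level last
        segments.append((f"MV{level}", mvs[level]))
        segments.append((f"HF{level}", highs[level]))
    return segments


def extract(segments: List[Segment], finest_level: int) -> List[Segment]:
    """Keep the base layer plus every level from 3 down to `finest_level`.

    finest_level=4 keeps the base layer only; finest_level=1 keeps everything.
    """
    keep = 1 + 2 * (4 - finest_level)
    return segments[:keep]


frame = organize_frame(
    base=b"base",
    mvs={3: b"mv3", 2: b"mv2", 1: b"mv1"},
    highs={3: b"hf3", 2: b"hf2", 1: b"hf1"},
)
print([label for label, _ in frame])
# prints: ['LL', 'MV3', 'HF3', 'MV2', 'HF2', 'MV1', 'HF1']
print([label for label, _ in extract(frame, finest_level=3)])
# prints: ['LL', 'MV3', 'HF3']
```

Because each resolution level is a prefix of the next, a lower-resolution decoder can simply truncate the sub-stream; the motion vectors it needs at that resolution arrive before the corresponding high-frequency data.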

Claims

1. A nested progressive scalable 3D wavelet video coding algorithm, characterized in that: every N frames of the video sequence form one group; a one-dimensional hierarchical motion-compensated temporal filtering method performs the temporal transform; a two-dimensional spatial wavelet transform is then applied to each frame; and a two-dimensional nested progressive scalable coding algorithm codes the wavelet coefficients of each frame, producing a nested progressive scalable code stream.

2. The nested progressive scalable 3D wavelet video coding algorithm according to claim 1, further characterized in that: a hierarchical motion estimation method is used to obtain the motion vectors; the original image is reduced 3 times, giving 3 reduced images whose sizes are 1/2², 1/2⁴ and 1/2⁶ of the original image; spatial block-matching motion estimation is performed on the lowest-resolution image to obtain a set of motion vectors at low resolution; the motion vectors of each higher-resolution image are then estimated from the motion vectors of the lower-resolution image, until the motion vectors of the original-resolution image are obtained; the motion vectors are used to identify the pixels that are related by motion in two adjacent frames; and a Haar lifting wavelet performs the one-dimensional temporal wavelet transform.

3. The nested progressive scalable 3D wavelet video coding algorithm according to claim 1, further characterized in that: a Daubechies 9/7 lifting wavelet is used to perform a two-dimensional spatial wavelet transform on each frame after the temporal transform; the wavelet coefficients are organized into different sets according to the correlation between coefficients of the same orientation at different levels and the correlation between coefficients of different subbands at the same level, and bit-plane coding is performed; a nested progressive code stream structure is established at the same time, the code stream being organized in the order low-frequency subband, coarsest-level high-frequency subbands, next-coarsest-level high-frequency subbands, finest-level high-frequency subbands, so that the coding is resolution scalable and, within each subband, quality scalable.