CN101729076A

CN101729076A - Nonperfect code table based Huffman decoding method for analyzing code length

Info

Publication number: CN101729076A
Application number: CN200810218565A
Authority: CN
Inventors: 裴少芳; 苏丹; 叶广明; 胡胜发
Original assignee: ANKAI (GUANGZHOU) SOFTWARE TECHN Co Ltd
Current assignee: Guangzhou Ankai Microelectronics Co.,Ltd.
Priority date: 2008-10-22
Filing date: 2008-10-22
Publication date: 2010-06-09
Anticipated expiration: 2028-10-22
Also published as: CN101729076B

Abstract

The invention discloses a Huffman decoding method that resolves code length based on a non-perfect code table, comprising the following steps: constructing all code tables used for level-comparison parsing; determining the critical code length L of the non-perfect code table; constructing an L-bit non-perfect code table; reading a code stream value of the maximum code length and retrieving the level-(L+1) codeword from the corresponding codeword lookup table, which takes the minimum Huffman codeword of each level as a prefix; resolving the length of the first codeword, by table lookup or by level comparison after level L; looking up the symbol value that corresponds to the first codeword to complete the parsing of the first codeword in the code stream; removing the parsed codeword from the current code stream, and repeating the steps to complete the decoding of all Huffman codes. The invention can greatly reduce storage space and increase parsing speed; when the maximum code length is 16, the space complexity is only about 1/256 of that of the perfect code table method, so storage space is greatly saved.

Description

A Huffman decoding method that resolves code length based on a non-perfect code table

Technical field

The present invention relates to a Huffman decoding method, and in particular to a Huffman decoding method that quickly resolves code length based on a non-perfect code table.

Background technology

The Huffman algorithm encodes and decodes data according to the probability with which each element occurs in the data to be compressed, and can losslessly reduce the space the data occupies. Fig. 1 is an example of a Huffman codeword tree: the codewords are its leaves, and the depth of a leaf is the level to which its codeword belongs.

When parsing Huffman codes, the length of the first codeword in the code stream is determined first. The codeword is then extracted, and the data element it corresponds to can be found in the symbol table provided with the Huffman code. After the first codeword is removed from the code stream, the remaining stream is parsed in the same way, codeword by codeword, which completes the Huffman decoding process. Because Huffman coding is variable-length, one problem that must be solved throughout decoding is determining the length of each Huffman codeword. The common approaches to resolving codeword length in Huffman decoding are described below:

1. Level-comparison parsing method. Based on the Huffman codeword tree, build a leaf key that indicates the nearest next level (a level is equivalent to a code length) that contains Huffman codewords. Take the minimum codeword of each level as the prefix bits and pad the remaining bits with 0 to extend it to the maximum code length. With the level of the Huffman prefix code as the index and this extended code as the value, build a fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level. To resolve a code length, look up in the leaf key the next level after level 0 and use that level to retrieve a codeword from the fixed-length codeword lookup table. Take a code stream value of the maximum code length and compare it with the retrieved codeword. If the code stream value is not less than the codeword from the lookup table, replace the current level in the leaf key with the next level, use this new current level to retrieve the next level that contains leaves, retrieve the codeword of that new next level from the lookup table, and compare it with the code stream value again. Repeat until the code stream value is less than the codeword from the lookup table; at that point the current level in the leaf key equals the length of the codeword in the code stream.
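The level-comparison lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the six-symbol canonical code `CODES`, the sentinel level `N + 1`, and the names `build_level_tables` and `code_length` are all assumptions introduced here. The sketch also assumes a canonical code in which left-aligned codewords grow numerically with code length.

```python
# Hypothetical canonical Huffman code, for illustration only (symbol -> codeword).
CODES = {"a": "0", "b": "100", "c": "101", "d": "110", "e": "1110", "f": "1111"}
N = 4  # maximum code length of this example code


def build_level_tables(codes, n):
    """Build the leaf key and the fixed-length (n-bit) minimum-codeword table."""
    levels = sorted({len(c) for c in codes.values()})
    end = n + 1  # assumed sentinel level that terminates the search
    min_code = {}
    for lv in levels:
        smallest = min(int(c, 2) for c in codes.values() if len(c) == lv)
        min_code[lv] = smallest << (n - lv)  # pad with zeros to n bits
    min_code[end] = 1 << n  # larger than any n-bit stream value
    # leaf_key[lv] = nearest level above lv that contains codewords
    leaf_key = {lv: min((x for x in levels if x > lv), default=end)
                for lv in range(n + 1)}
    return leaf_key, min_code


def code_length(value, leaf_key, min_code):
    """Resolve the length of the codeword that starts the n-bit stream value."""
    current = 0
    nxt = leaf_key[current]
    while value >= min_code[nxt]:  # stream value not less than the table codeword
        current = nxt              # the next level becomes the current level
        nxt = leaf_key[current]
    return current
```

With this example code, a codeword of length 3 needs three comparisons before its length is known, which is the repeated-comparison cost that the later sections aim to avoid for short codewords.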

2. Perfect code table parsing method. Based on the leaf codewords of all Huffman codeword trees contained in the code stream, build a perfect code-length table. Each index of this table takes a Huffman codeword as its prefix and is extended to the maximum code length by a fixed rule; the value at that index is the code length of the corresponding Huffman prefix code. For the Huffman code stream currently being parsed, locate, according to the Huffman codeword tree to which the codeword belongs, the part of the perfect code-length table that corresponds to that tree. Take bits of the maximum codeword length from the current stream and use this value as the index into that part of the table; the value retrieved is the length of the first codeword in the current stream.
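A minimal sketch of the perfect code table lookup follows. The code `CODES` and the function name `build_perfect_table` are hypothetical, and the "fixed rule" for extending a codeword is assumed here to be filling the remaining bits with every value from all 0s to all 1s.

```python
# Hypothetical canonical Huffman code, for illustration only (symbol -> codeword).
CODES = {"a": "0", "b": "100", "c": "101", "d": "110", "e": "1110", "f": "1111"}
N = 4  # maximum code length of this example code


def build_perfect_table(codes, n):
    """Return a 2**n entry table mapping every n-bit value to a code length."""
    table = [0] * (1 << n)
    for bits in codes.values():
        pad = n - len(bits)
        base = int(bits, 2) << pad      # codeword as prefix, remaining bits 0
        for ext in range(1 << pad):     # remaining bits from all 0s to all 1s
            table[base + ext] = len(bits)
    return table


PERFECT = build_perfect_table(CODES, N)
# A single index operation resolves the code length of any n-bit stream value.
length_of_first_codeword = PERFECT[0b1011]
```

The lookup is a single index operation, but the table has 2^N entries per tree, which is the storage cost discussed in the following paragraphs.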

Once the code length is obtained, the first codeword is extracted, and the data corresponding to it can be found in the symbol table of the current Huffman codeword tree. Removing the parsed part from the code stream and repeating this operation on the remaining stream completes the parsing of all Huffman codes.

Existing, commonly used Huffman decoding algorithms exploit the codeword probabilities embodied in the code table and optimize the search according to which levels of the Huffman codeword tree contain leaves. For most codewords, however, the code length can be determined only after several comparisons, which accounts for a large share of the total decoding time. Resolving the code length with a perfect code table improves the resolution speed, but the increase in space complexity is too large to meet the hardware storage constraints of embedded systems.

In the audio and video fields, encoding and decoding algorithms based on Huffman data compression are widely used in embedded systems. In the Huffman algorithm, codewords are represented as variable-length binary prefix codes, so the length of a Huffman codeword must be resolved before the codeword can be parsed. Traditional code-length resolution algorithms are either too slow or require too much code table data. Reducing the code-length resolution time without increasing the amount of code table data is therefore of great significance for Huffman decoding.

Summary of the invention

The object of the invention is to provide a Huffman decoding method that resolves code length based on a non-perfect code table; the method can significantly reduce the storage space occupied by code tables and speed up decoding.

The object of the invention can be achieved by the following scheme: a Huffman decoding method based on a non-perfect code table, whose steps comprise:

1. Following the level-comparison parsing method, build all the code tables used for level-comparison parsing, including the leaf key and the fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level;

2. Determine the critical code length L of the non-perfect code table: select a value L between the minimum code length and the maximum code length as the critical code length for building the non-perfect code table;

3. Based on the leaf codewords of no more than L bits in all Huffman codeword trees contained in the code stream, build an L-bit non-perfect code table prefixed by Huffman codewords;

4. According to the Huffman codeword tree to which the part of the current code stream to be parsed belongs, read a code stream value of the maximum code length, to be used as the index value, and retrieve the codeword of level (code length) L+1 from the corresponding fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level;

5. Compare the code stream value with the level-(L+1) fixed-length codeword just retrieved. If the code stream value is less than the retrieved codeword, use the first L bits of the code stream value as a new index into the corresponding part of the non-perfect code-length table; the value retrieved is the length of the first codeword of the part of the current stream to be parsed. Otherwise, keep the original code stream value as the comparison object and resolve the length of the first codeword from level L onward by the level-comparison parsing method;

6. According to the resolved code length, extract the codeword from the current code stream and look up its symbol value in the symbol table corresponding to the codeword, which completes the parsing of the first codeword in the code stream;

7. Remove the parsed codeword from the current code stream and repeat steps 4, 5 and 6 on the remaining stream, which completes the decoding of all Huffman codes.
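Steps 1 to 7 can be put together in a short end-to-end sketch. Everything named here (`CODES`, `build_tables`, `decode`, the sentinel level `n + 1`, and zero-padding of the last bits of the stream) is an assumption made for illustration. In particular, the threshold codeword is taken as the minimum codeword of the first populated level above L, which coincides with level L+1 in this example.

```python
# Hypothetical canonical Huffman code, for illustration only (symbol -> codeword).
CODES = {"a": "0", "b": "100", "c": "101", "d": "110", "e": "1110", "f": "1111"}
N = 4  # maximum code length
L = 3  # critical code length, chosen between the minimum and maximum code length


def build_tables(codes, n, crit):
    levels = sorted({len(c) for c in codes.values()})
    end = n + 1  # assumed sentinel level
    # Step 1: minimum codeword per level padded to n bits, plus the leaf key.
    min_code = {lv: min(int(c, 2) for c in codes.values() if len(c) == lv) << (n - lv)
                for lv in levels}
    min_code[end] = 1 << n
    leaf_key = {lv: min((x for x in levels if x > lv), default=end)
                for lv in range(n + 1)}
    # Step 3: L-bit code-length table for codewords of at most crit bits.
    short = [0] * (1 << crit)
    for c in codes.values():
        if len(c) <= crit:
            pad = crit - len(c)
            for ext in range(1 << pad):
                short[(int(c, 2) << pad) + ext] = len(c)
    return leaf_key, min_code, short


def decode(bits, codes, n, crit):
    leaf_key, min_code, short = build_tables(codes, n, crit)
    symbol_of = {c: s for s, c in codes.items()}
    out, pos = [], 0
    while pos < len(bits):
        # Step 4: read n bits (zero-padded at the end of the stream).
        value = int(bits[pos:pos + n].ljust(n, "0"), 2)
        threshold = min_code[leaf_key[crit]]
        if value < threshold:
            # Step 5, short codeword: one lookup with the first crit bits.
            length = short[value >> (n - crit)]
        else:
            # Step 5, long codeword: level comparison starting after level crit.
            cur, nxt = crit, leaf_key[crit]
            while value >= min_code[nxt]:
                cur, nxt = nxt, leaf_key[nxt]
            length = cur
        out.append(symbol_of[bits[pos:pos + length]])  # step 6
        pos += length                                   # step 7
    return out


decoded = decode("0" + "100" + "1111" + "110" + "0", CODES, N, L)
```

In this sketch only the two 4-bit codewords take the level-comparison path; every codeword of 3 bits or fewer is resolved with one comparison and one table lookup.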

The non-perfect code-length table is built one Huffman codeword tree at a time, in the same way for each tree contained in the code stream. For each Huffman codeword tree, the index of the non-perfect code-length table takes a Huffman codeword of no more than L bits as its prefix, with the remaining bits extended from all 0s to all 1s up to a length of L bits; the value at each index is the code length of the Huffman prefix code.

The part of the code-length table that corresponds to each Huffman codeword tree is built as follows. First, for the corresponding Huffman codeword tree, create the indices of L-bit length, from all bits 0 to all bits 1, and initialize all index values (that is, the code-length table values) to 0. Then, taking every leaf codeword of the Huffman tree that is no longer than L bits as a prefix, fill the remaining bits with all 0s through all 1s up to L bits, and assign the length of the corresponding Huffman prefix code to the code-length table value of every extended code that has that Huffman codeword as its prefix.

The invention can significantly reduce storage space and speed up code-length resolution. For example, when the maximum code length is N (typically 16), the code-length resolution time complexity of the level-comparison parsing method is O(·) of an expression in p_i [formula missing in the source text], where p_i is the statistical probability of a codeword of code length i, and the space complexity of the code-length table for each descriptor is N. The perfect code table method has a code-length resolution time complexity of O(1), and the space complexity of each perfect code-length table for each descriptor is 2^N. Compared with these two techniques, the code-length resolution time complexity of the non-perfect code table method [formula missing in the source text] approaches that of the perfect code table method, while its space complexity for each descriptor is 2^8 + N. For the common code length of 16, its space complexity is only about 1/256 of that of the perfect code table method, which greatly saves storage space.
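The storage comparison above can be checked with a few lines of arithmetic. This is only a worked example using the figures stated in the text (N = 16, L = 8, table sizes 2^N and 2^8 + N entries); the variable names are introduced here for illustration.

```python
from fractions import Fraction

N = 16  # maximum code length
L = 8   # critical code length

perfect_entries = 2 ** N          # perfect code-length table: 65536 entries
non_perfect_entries = 2 ** L + N  # L-bit table plus N level entries: 272 entries

table_only_ratio = Fraction(2 ** L, perfect_entries)        # exactly 1/256
full_ratio = Fraction(non_perfect_entries, perfect_entries)  # 17/4096
```

The figure of about 1/256 quoted in the text corresponds to the ratio of the L-bit table alone; counting the N additional level entries gives 272/65536, which is roughly 1/241.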

Description of drawings

Fig. 1 is a Huffman codeword tree of the prior art;

Fig. 2 is a flow chart of generating the part of the non-perfect code-length table that corresponds to a single Huffman codeword tree according to the invention;

Fig. 3 is a schematic diagram of the code-length resolution flow of the invention.

Embodiment

First, build the code tables used to retrieve fixed-length codewords, based on the level-comparison parsing method: following the level-comparison parsing method of the prior art, build all the code tables used for level-comparison parsing, including the leaf key and the fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level.

Next, build an L-bit non-perfect code table prefixed by Huffman codewords: based on the leaf codewords of no more than L bits in all Huffman codeword trees contained in the code stream, build a non-perfect code-length table. L is the critical code length for building the non-perfect code table and is selected between the minimum code length and the maximum code length; for a code table whose maximum code length is 16, a critical code length of L = 8 is recommended. Each index of the non-perfect code-length table takes a Huffman codeword of no more than L bits as its prefix, with the remaining bits extended from all 0s to all 1s up to a length of L bits; the value at the index is the code length of the corresponding Huffman prefix code. The table is built one Huffman codeword tree at a time, in the same way for each tree contained in the code stream.

The part of the code-length table that corresponds to each Huffman codeword tree is built as follows. First, for the corresponding Huffman codeword tree, create the indices of L-bit length, from all bits 0 to all bits 1, and initialize all index values (that is, the code-length table values) to 0. Then, taking every leaf codeword of the Huffman tree that is no longer than L bits as a prefix, fill the remaining bits with all 0s through all 1s up to L bits, and assign the length of the corresponding Huffman prefix code to the code-length table value of every extended code that has that Huffman codeword as its prefix.

Fig. 2 shows the processing for a single Huffman codeword tree during construction of an 8-bit non-perfect code-length table, in which the current level, the current leaf number, and the position of the current codeword within the current level all start from 0. The concrete steps are as follows:

1. Initialize: set all code lengths to 0 and set the current level to 1;

2. Compute the total number of leaves at the current level; then set the current leaf position within the current level to 0;

3. Take the codeword of the leaf at the current position as the prefix code, pad it with 0 to L bits, and compute the total number of its extended codewords; then set the extended-codeword position of the current leaf to 0;

4. Add the extended-codeword position to the extended codeword to form the index; the index value is the current level value;

5. Compare whether the extended-codeword position is less than the total number of extended codewords of the current codeword. If yes, add 1 to the extended-codeword position and return to step 4; if no, go to the next step;

6. Add 1 to the prefix code to obtain the next prefix code, and compare whether the current leaf position is less than the total number of leaves at the level. If yes, add 1 to the current leaf position and return to step 3; if no, go to the next step;

7. Check whether the current codeword level is not greater than L. If yes, return to step 2 for the next level; if no, the code table generation is complete.
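The loop structure of Fig. 2 can be sketched as three nested loops over levels, leaves, and extended codewords. The function name `build_non_perfect_table` and its inputs `first_code` (smallest codeword at each level) and `count` (number of leaves at each level) are hypothetical, and the example uses L = 3 rather than 8 to keep the table small. The sketch assumes that the codewords within a level are consecutive integers, as step 6 implies.

```python
def build_non_perfect_table(first_code, count, crit_len):
    """Build the L-bit code-length table for a single Huffman codeword tree.

    first_code[lv]: smallest codeword at level lv (assumed input)
    count[lv]:      number of leaves at level lv (assumed input)
    """
    table = [0] * (1 << crit_len)              # step 1: all code lengths are 0
    for level in range(1, crit_len + 1):       # steps 2 and 7: levels 1..L
        prefix = first_code.get(level, 0)
        pad = crit_len - level                 # bits left after the prefix
        for _leaf in range(count.get(level, 0)):   # steps 3 and 6: each leaf
            base = prefix << pad               # prefix padded with 0 to L bits
            for ext in range(1 << pad):        # steps 4 and 5: each extension
                table[base + ext] = level      # index value is the level
            prefix += 1                        # next codeword at this level
    return table


# Example tree with codewords 0, 100, 101, 110 (and two 4-bit codewords
# 1110 and 1111 that exceed L = 3 and are therefore left out of the table).
TABLE = build_non_perfect_table({1: 0b0, 3: 0b100}, {1: 1, 3: 3}, 3)
```

The entry at index 0b111 stays 0 because it is the prefix of codewords longer than L; in the decoding flow such stream values are handled by level comparison instead of by this table.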

After the two code tables above are built, the code length is resolved as follows. According to the Huffman codeword tree to which the part of the current code stream to be parsed belongs, read a code stream value of the maximum code length, to be used as the index value, and retrieve the codeword of level (code length) L+1 from the corresponding fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level.

Compare the code stream value with the codeword just retrieved. If the code stream value is less than the retrieved codeword, use the first L bits of the code stream value as a new index into the part of the non-perfect code-length table that corresponds to the Huffman codeword tree of the part of the current stream to be parsed; the value retrieved is the length of the first codeword of that part of the stream. Otherwise, keep the original code stream value as the comparison object and, following the level-comparison parsing method, search from level L of the corresponding Huffman codeword tree onward for the length of the first codeword to be parsed in the current stream. Extract from the current code stream a codeword of the length just resolved; its corresponding data can be found in the symbol table of the Huffman codeword tree to which the codeword belongs.

Remove the parsed codeword from the current code stream and parse the remaining stream in the same way, which completes the decoding of all Huffman codes.

The concrete decoding flow is shown in Fig. 3. First, extract the maximum-code-length bits A from the current bit stream and the codeword B of level L+1 from the fixed-length codeword table, and compare A with B. If A is less than B, take the first L bits of A as the index into the corresponding part of the non-perfect code table; the code length is the value at that index, and resolution ends. If A is not less than B, retrieve from the leaf key the next level after level L of the Huffman codewords that contains codewords, then look up the codeword C of that next level in the fixed-length code table. If A is not less than C, replace the current level with the next level, retrieve from the leaf key the next level that contains codewords, and go back to retrieve codeword C again; if A is less than C, the code length is the current level value, and resolution ends.
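The Fig. 3 flow can be traced with precomputed tables for a small example. The table contents below (`FIXED`, `LEAF_KEY`, `SHORT`), the sentinel entry at level 5, and the function name `resolve_length` are assumptions for a hypothetical code with codewords 0, 100, 101, 110, 1110, 1111, maximum code length N = 4, and critical code length L = 3; the variables `a`, `b`, and `c` mirror A, B, and C in the text.

```python
N, L = 4, 3

# Level -> minimum codeword of that level, padded with 0 to N bits.
# Level 5 is an assumed sentinel larger than any N-bit stream value.
FIXED = {1: 0b0000, 3: 0b1000, 4: 0b1110, 5: 0b10000}
# Level -> nearest higher level that contains codewords.
LEAF_KEY = {0: 1, 1: 3, 2: 3, 3: 4, 4: 5}
# L-bit non-perfect code-length table (0 marks prefixes of longer codewords).
SHORT = [1, 1, 1, 1, 3, 3, 3, 0]


def resolve_length(a):
    """Return (code length, path taken) for the N-bit stream value a."""
    b = FIXED[L + 1]
    if a < b:
        # Short codeword: index the table with the first L bits of A.
        return SHORT[a >> (N - L)], "table"
    current = L
    nxt = LEAF_KEY[current]
    c = FIXED[nxt]
    while a >= c:
        current = nxt            # next level replaces the current level
        nxt = LEAF_KEY[current]
        c = FIXED[nxt]           # retrieve codeword C again
    return current, "levels"


short_case = resolve_length(0b1011)  # starts with codeword 101
long_case = resolve_length(0b1110)   # starts with codeword 1110
```

Only stream values that are not less than B enter the comparison loop, so the number of comparisons depends on how far the code length lies beyond L rather than on the full code length.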

Claims

1. A Huffman decoding method that resolves code length based on a non-perfect code table, characterized in that its steps comprise:

(a) following the level-comparison parsing method, building all the code tables used for level-comparison parsing, including the leaf key and the fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level;

(b) determining the critical code length L of the non-perfect code table: selecting a value L between the minimum code length and the maximum code length as the critical code length for building the non-perfect code table;

(c) based on the leaf codewords of no more than L bits in all Huffman codeword trees contained in the code stream, building an L-bit non-perfect code table prefixed by Huffman codewords;

(d) according to the Huffman codeword tree to which the part of the current code stream to be parsed belongs, reading a code stream value of the maximum code length, to be used as the index value, and retrieving the codeword of level (code length) L+1 from the corresponding fixed-length codeword lookup table prefixed by the minimum Huffman codeword of each level;

(e) comparing the code stream value with the level-(L+1) fixed-length codeword just retrieved; if the code stream value is less than the retrieved codeword, using the first L bits of the code stream value as a new index into the corresponding part of the non-perfect code-length table, the value retrieved being the length of the first codeword of the part of the current stream to be parsed; otherwise, keeping the original code stream value as the comparison object and resolving the length of the first codeword from level L onward by the level-comparison parsing method;

(f) according to the resolved code length, extracting the codeword from the current code stream and looking up its symbol value in the symbol table corresponding to the codeword, thereby completing the parsing of the first codeword in the code stream;

(g) removing the parsed codeword from the current code stream and repeating steps d, e and f on the remaining stream, thereby completing the decoding of all Huffman codes.

2. The Huffman decoding method that resolves code length based on a non-perfect code table according to claim 1, characterized in that the non-perfect code-length table is built one Huffman codeword tree at a time, in the same way for each tree contained in the code stream; for each Huffman codeword tree, the index of the non-perfect code-length table takes a Huffman codeword of no more than L bits as its prefix, with the remaining bits extended from all 0s to all 1s up to a length of L bits; and the value at each index is the code length of the Huffman prefix code.

3. The Huffman decoding method that resolves code length based on a non-perfect code table according to claim 1, characterized in that the part of the code-length table that corresponds to each Huffman codeword tree is built as follows: first, for the corresponding Huffman codeword tree, the indices of L-bit length are created, from all bits 0 to all bits 1, and all index values (that is, the code-length table values) are initialized to 0; then, taking every leaf codeword of the Huffman tree that is no longer than L bits as a prefix, the remaining bits are filled with all 0s through all 1s up to L bits, and the length of the corresponding Huffman prefix code is assigned to the code-length table value of every extended code that has that Huffman codeword as its prefix.