A Huffman decoding method that resolves code length using an incomplete code table
Technical field
The present invention relates to a Huffman decoding method, and in particular to a Huffman decoding method that quickly resolves code length using an incomplete code table.
Background technology
The Huffman algorithm encodes and decodes data according to the probability with which each element occurs in the data to be compressed, and losslessly reduces the space that the data occupies. Fig. 1 shows an example of a Huffman codeword generation tree: the codewords are its leaves, and the level of a leaf in the tree is the level to which its codeword belongs.
To parse a Huffman code, the decoder first determines the length of the first codeword in the code stream and extracts it; the symbol table supplied with the Huffman code then gives the data element corresponding to that codeword. The first codeword is removed from the code stream and the remaining stream is parsed in the same way, one codeword at a time, until Huffman decoding is complete. Because Huffman coding is a variable-length code, one problem that must be solved throughout decoding is determining the length of each Huffman codeword. The common solutions for resolving codeword length in Huffman decoding are described below:
1. Level comparison method. Based on the Huffman codeword generation tree, a leaf key is built that indicates, for each level, the nearest following level that contains Huffman codewords (a level is equivalent to a codeword length). For each level, the smallest codeword on that level is taken as the prefix bits and the remaining bits are padded with 0 up to the maximum code length. Using the level of the Huffman prefix code as the index and this expanded code as the value, a fixed-length codeword lookup table is built whose entries are prefixed by the smallest Huffman codeword of each level. To resolve a Huffman code length, the leaf key is searched from level 0 for the next level, and that next level is used to look up a codeword in the fixed-length codeword lookup table. A code stream value of maximum code length is taken and compared with it. If the code stream value is not less than the codeword from the lookup table, the current level in the leaf key is replaced by the next level, the new current level is used to find the next level that has leaves, the codeword for that new next level is looked up in the table, and the comparison with the code stream value is repeated. This continues until the code stream value is less than the codeword from the lookup table; at that point the current level in the leaf key equals the length of the codeword in the code stream.
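The level comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy code set `CODES`, the maximum length `N` and both function names are hypothetical, and the sketch assumes a canonical-style code in which shorter codewords are numerically smaller than longer ones once left-justified, which is what makes the comparison rule work.

```python
# Toy canonical-style prefix code (assumption, not taken from the source).
CODES = ["00", "01", "10", "110", "1110", "1111"]
N = 4  # maximum code length of the toy code


def build_min_code_table(codes, max_len):
    """Return sorted (level, smallest codeword zero-padded to max_len) pairs.

    Only levels that actually contain leaves appear, so the list plays the
    role of both the leaf key and the fixed-length codeword lookup table.
    """
    table = {}
    for c in codes:
        padded = int(c, 2) << (max_len - len(c))
        table[len(c)] = min(padded, table.get(len(c), padded))
    return sorted(table.items())


def resolve_length_by_levels(window, levels):
    """Resolve the length of the first codeword in `window`.

    `window` is the next max_len bits of the stream as an integer. The
    walk advances while the window is not less than the next level's
    smallest padded codeword.
    """
    current = levels[0][0]
    for level, min_code in levels[1:]:
        if window < min_code:
            break
        current = level
    return current


levels = build_min_code_table(CODES, N)
print(levels)
print(resolve_length_by_levels(0b1101, levels))
```

The number of comparisons grows with the codeword length, which is the cost that the later sections try to avoid for short codewords.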
2. Complete code table method. From the leaf codewords of all Huffman codeword generation trees contained in the code stream, a complete code-length table is built. The index of each entry of this table consists of a Huffman codeword as prefix, expanded to the maximum code length according to a fixed rule, and the value of the entry is the code length of the corresponding Huffman prefix code. For the Huffman code stream currently being parsed, the portion of the complete code-length table that corresponds to the generation tree of the codeword to be parsed is selected. The code stream is truncated to the maximum codeword length, the resulting value is used as the index into that portion of the table, and the value retrieved is the length of the first codeword in the code stream to be parsed.
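As a rough illustration of the complete code table method, the sketch below expands every codeword to the maximum length and stores its length at each expanded index. The toy code set `CODES`, the value of `N` and the function name are assumptions made for the example; the "fixed rule" for expansion is taken here to be filling the remaining bits with every pattern from all 0s to all 1s.

```python
# Toy prefix code (assumption, not taken from the source).
CODES = ["00", "01", "10", "110", "1110", "1111"]
N = 4  # maximum code length of the toy code


def build_complete_length_table(codes, max_len):
    """Build a 2**max_len entry table mapping any window to a code length."""
    table = [0] * (1 << max_len)
    for c in codes:
        pad = max_len - len(c)
        base = int(c, 2) << pad
        # Every bit pattern that starts with this codeword gets its length.
        for suffix in range(1 << pad):
            table[base + suffix] = len(c)
    return table


table = build_complete_length_table(CODES, N)
print(len(table), table[0b1101])
```

A single indexed read resolves the length, but the table has 2^N entries, which is the storage cost the text goes on to criticize.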
Once the code length is known, the first codeword is extracted, and the data corresponding to it is found in the symbol table of the current Huffman codeword generation tree. The parsed part is removed from the code stream and the procedure is repeated on the remaining stream until all Huffman codes are parsed.
Existing Huffman decoding algorithms in general use exploit the codeword probabilities embodied in the coding table, and optimize the search using the presence of leaves at each level of the generation tree. For most codewords, however, the code length still takes several comparisons to determine, and this accounts for a large share of the total decoding time. Resolving code length with a complete code table is faster, but its space complexity grows too much to meet the hardware storage constraints of embedded systems.
In the audio and video fields, encoding and decoding algorithms based on Huffman data compression are widely used in embedded systems. In the Huffman algorithm, codewords are represented as variable-length binary prefix codes, so the length of a Huffman codeword must be resolved before the codeword itself can be parsed. Traditional code-length resolution algorithms are either too slow or require too much code table data. Reducing the time needed to resolve code length without increasing the code table data volume is therefore of great importance for Huffman decoding.
Summary of the invention
The object of the invention is to provide a Huffman decoding method that resolves code length using an incomplete code table; the method significantly reduces the storage space occupied by the code table and speeds up decoding.
The object of the invention is achieved by the following scheme: a Huffman decoding method based on an incomplete code table, comprising the steps of:
1. Following the level comparison method, build all code tables used for level comparison, including the leaf key and the fixed-length codeword lookup table prefixed by the smallest Huffman codeword of each level;
2. Determine the critical code length L of the incomplete code table: select a value L between the minimum code length and the maximum code length as the critical code length for building the incomplete code table;
3. From the leaf codewords of no more than L bits in all Huffman codeword generation trees contained in the code stream, build an L-bit incomplete code table prefixed by the Huffman codewords;
4. According to the Huffman codeword generation tree to which the part of the current code stream to be parsed belongs, read a code stream value of maximum code length, and retrieve the codeword of level (code length) L+1 from the corresponding fixed-length codeword lookup table prefixed by the smallest Huffman codeword of each level;
5. Compare the code stream value with the fixed-length codeword of level L+1 just retrieved. If the code stream value is less than that codeword, use the first L bits of the code stream value as a new index into the corresponding portion of the incomplete code-length table; the value retrieved is the length of the first codeword in the part of the code stream to be parsed. Otherwise, keep the original code stream value as the comparison object and resolve the length of the first codeword among the levels after level L using the level comparison method;
6. Using the resolved code length, extract the codeword from the current code stream and look up its symbol value in the symbol table corresponding to the codeword; this completes parsing of the first codeword in the code stream;
7. Remove the parsed codeword from the current code stream and repeat steps 4, 5 and 6 on the remaining code stream until all Huffman codes are decoded.
The incomplete code-length table is built one Huffman codeword generation tree at a time, in the same way for each tree contained in the code stream. For each generation tree, every Huffman codeword of no more than L bits is taken as a prefix and the remaining bits are expanded from all 0s to all 1s up to a length of L bits, which gives the indexes of the incomplete code-length table; the value at each index is the code length of the Huffman prefix code.
The portion of the code-length table corresponding to each Huffman codeword generation tree is built as follows. First, for the corresponding generation tree, indexes of L bits running from all 0s to all 1s are created and every index value (that is, every code-length table value) is initialized to 0. Then each leaf codeword of no more than L bits in the generation tree is taken as a prefix, the remaining bits are filled from all 0s to all 1s up to L bits, and the code-length table value of every expanded code with that prefix is set to the code length of the corresponding Huffman prefix code.
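The construction just described can be sketched as follows for a single generation tree. The toy code set, the choice L = 3 and the function name are assumptions made for the example; entries left at 0 correspond to L-bit patterns that begin a codeword longer than L bits.

```python
def build_incomplete_length_table(codes, crit_len):
    """Build the 2**crit_len entry incomplete code-length table for one tree."""
    table = [0] * (1 << crit_len)  # every index value starts at 0
    for c in codes:
        if len(c) > crit_len:
            continue  # longer codewords are left to level comparison
        pad = crit_len - len(c)
        base = int(c, 2) << pad
        # Fill the remaining bits with every pattern from all 0s to all 1s.
        for suffix in range(1 << pad):
            table[base + suffix] = len(c)
    return table


# Toy prefix code (assumption) with a critical code length of 3 bits.
short_table = build_incomplete_length_table(
    ["00", "01", "10", "110", "1110", "1111"], 3
)
print(short_table)
```

Compared with a complete table, only 2^L entries are stored, at the price of a fallback path for codewords longer than L.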
The invention significantly reduces storage space and speeds up decoding. For example, when the maximum code length is N (typically 16), the code-length resolution time complexity of the level comparison method is O([formula not reproduced]), where p_i is the statistical probability of codewords of length i, and the space complexity of the code-length table for each descriptor is N. The complete code table method resolves code length in O(1) time, with a space complexity of 2^N for the complete code-length table of each descriptor. By comparison, the time complexity of the incomplete code table method is O([formula not reproduced]), which approaches that of the complete code table method, while the space complexity for each descriptor is 2^8+N. For the common code length of 16, this is (2^8+16)/2^16, about 0.4%, of the space required by the complete code table method, which greatly saves storage space.
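The storage comparison can be checked with a few lines of arithmetic. The sketch below counts table entries only, taking N = 16 and a critical code length of 8 as in the text; the variable names are illustrative, and the entry width in bytes is left out because the source does not specify it.

```python
N = 16  # maximum code length
L = 8   # critical code length of the incomplete table

level_comparison_entries = N          # one entry per level
complete_entries = 2 ** N             # one entry per N-bit pattern
incomplete_entries = 2 ** L + N       # L-bit table plus the level table

ratio = incomplete_entries / complete_entries
print(level_comparison_entries, complete_entries, incomplete_entries)
print(f"{ratio:.4%}")
```

Under these assumptions the incomplete scheme needs 272 entries per descriptor against 65536 for the complete table.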
Description of drawings
Fig. 1 shows a Huffman codeword generation tree of the prior art;
Fig. 2 is a flow chart of generating the incomplete code-length table for a single Huffman codeword generation tree according to the invention;
Fig. 3 is a flow chart of code-length resolution according to the invention.
Embodiment
First, a code table for retrieving fixed-length codewords is built on the basis of the level comparison method: following the level comparison method of the prior art, all code tables used for level comparison are built, including the leaf key and the fixed-length codeword lookup table prefixed by the smallest Huffman codeword of each level.
Next, an L-bit incomplete code table prefixed by the Huffman codewords is built: from the leaf codewords of no more than L bits in all Huffman codeword generation trees contained in the code stream, an incomplete code-length table is set up. L is the critical code length of the incomplete code table and is chosen between the minimum and maximum code lengths; for a code table with a maximum code length of 16, a critical code length L of 8 is recommended. The index of each entry of the incomplete code-length table has a Huffman codeword of no more than L bits as its prefix, with the remaining bits expanded from all 0s to all 1s up to L bits, and the value of the entry is the code length of the corresponding Huffman prefix code. The table is built one Huffman codeword generation tree at a time, in the same way for each tree contained in the code stream.
The portion of the code-length table corresponding to each Huffman codeword generation tree is built as follows. First, for the corresponding generation tree, indexes of L bits running from all 0s to all 1s are created and every index value (that is, every code-length table value) is initialized to 0. Then each leaf codeword of no more than L bits in the generation tree is taken as a prefix, the remaining bits are filled from all 0s to all 1s up to L bits, and the code-length table value of every expanded code with that prefix is set to the code length of the corresponding Huffman prefix code.
Fig. 2 shows the processing for a single Huffman codeword generation tree when building an 8-bit incomplete code-length table; the leaf position within the current level and the expanded codeword position within the current leaf both start from 0. The steps are as follows:
1. Initialization: set all code lengths to 0 and set the current level to 1;
2. Calculate the total number of leaves on the current level, then set the current leaf position on the current level to 0;
3. Take the codeword corresponding to the leaf position as the prefix code, pad it with 0s to L bits, and calculate the total number of expanded codewords; then set the current expanded codeword position of the current leaf to 0;
4. Add the position to the expanded codeword to form the index, and set the index value to the current level;
5. Compare whether the expanded codeword position is less than the total number of expanded codewords. If yes, add 1 to the expanded codeword position and return to step 4; if no, go to the next step;
6. Add 1 to the prefix code to obtain the next prefix code, and compare whether the current leaf position is less than the total number of leaves on the level. If yes, add 1 to the current leaf position and return to step 3; if no, go to the next step;
7. Check whether the current codeword level is not greater than L. If yes, move to the next level and return to step 2; if no, generation of the code table is complete.
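The level-by-level walk of Fig. 2 might look like the sketch below, which builds the table from the number of leaves on each level instead of from explicit codewords. Two points are assumptions rather than statements from the source: the code is taken to be canonical, so that the first prefix code of the next level is the incremented prefix code shifted left by one bit, and the function name and the `counts` input format are hypothetical.

```python
def build_table_from_level_counts(counts, crit_len):
    """counts[i] is the number of leaves on level i + 1 (canonical code assumed)."""
    table = [0] * (1 << crit_len)          # step 1: all code lengths set to 0
    code = 0                               # prefix code of the first leaf on the level
    for level in range(1, crit_len + 1):   # step 7: continue while level <= L
        n_leaves = counts[level - 1] if level <= len(counts) else 0  # step 2
        n_expanded = 1 << (crit_len - level)                         # step 3
        for _ in range(n_leaves):
            base = code << (crit_len - level)   # prefix code padded with 0s to L bits
            for pos in range(n_expanded):       # steps 4 and 5
                table[base + pos] = level
            code += 1                           # step 6: next prefix code
        code <<= 1  # assumed canonical carry-over to the next level
    return table


# Levels 1..4 hold 0, 3, 1 and 2 leaves; only levels up to L = 3 are tabulated.
fig2_table = build_table_from_level_counts([0, 3, 1, 2], 3)
print(fig2_table)
```

Working from leaf counts keeps the stored description of each tree small, which suits formats that transmit only the number of codewords per length.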
Once the two code tables are built, code length is resolved as follows. According to the Huffman codeword generation tree to which the part of the current code stream to be parsed belongs, a code stream value of maximum code length is read, and the codeword of level (code length) L+1 is retrieved from the corresponding fixed-length codeword lookup table prefixed by the smallest Huffman codeword of each level.
The code stream value is compared with the codeword just retrieved. If the code stream value is less than that codeword, its first L bits are used as a new index into the portion of the incomplete code-length table corresponding to the generation tree of the part of the code stream to be parsed, and the value retrieved is the length of the first codeword in that part. Otherwise, the original code stream value is kept as the comparison object and the level comparison method is applied to the levels after level L of the corresponding generation tree to find the length of the first codeword to be parsed. The bits of the resolved code length are then extracted from the current code stream, and the corresponding data is found in the symbol table of the generation tree to which the current codeword belongs.
The parsed codeword is removed from the current code stream and the remaining stream is parsed in the same way until all Huffman codes are decoded.
Concrete decoding process is as shown in Figure 3, extracts at first that rank is the code word B of L+1 in current bit stream maximum code length bit A and the fixed length code word table, and the size of comparison A, B.If A less than B the preceding L bit number of getting A as the non-perfect code matrix section retrieval of index in correspondence, code length is an index value, and finishes decoding.If A is not less than B, then in the leaf key, there is the next stage rank of code word after the retrieval Huffman code number of words L level; Work as the corresponding code word C of next stage in the fixed length code table search.If A is not less than C, then do not replace current rank with next stage, there is next rank of code word in retrieval in the leaf key, and returns and retrieve code word C again; If A is less than C, then code length is current class value, and finishes decoding.