CN101729076B - Nonperfect code table based Huffman decoding method for analyzing code length - Google Patents

Info

Publication number
CN101729076B
Authority
CN
China
Prior art keywords
code
length
huffman
word
code word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2008102185651A
Other languages
Chinese (zh)
Other versions
CN101729076A (en)
Inventor
裴少芳
苏丹
叶广明
胡胜发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Ankai Microelectronics Co.,Ltd.
Original Assignee
Anyka Guangzhou Microelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anyka Guangzhou Microelectronics Technology Co Ltd
Priority to CN2008102185651A
Publication of CN101729076A
Application granted
Publication of CN101729076B
Legal status: Active
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a Huffman decoding method that parses code length based on a nonperfect code table, comprising the following steps: constructing all code tables used for level comparison and analysis; determining the critical code length L of the nonperfect code table; constructing an L-bit nonperfect code table; reading a code stream value of the maximum code length and looking up the level-(L+1) code word in the corresponding code word search table, which takes the minimum Huffman code word of each level as a prefix; parsing the length of the first code word, either from the nonperfect code table or by level comparison beyond level L; looking up the symbol value corresponding to the first code word to complete the parsing of the first code word in the code stream; and removing parsed code words from the current code stream and repeating these steps to complete the decoding of all Huffman codes. The invention greatly reduces storage space and speeds up code-length parsing; when the maximum code length is 16, the space complexity is only 1/256 of that of the perfect code table method.

Description

A Huffman decoding method that parses code length based on a non-complete (nonperfect) code table
Technical field
The present invention relates to a Huffman decoding method, and in particular to a Huffman decoding method that quickly parses code length based on a non-complete code table.
Background art
The Huffman algorithm encodes and decodes data according to the probability with which each element occurs in the data to be compressed, and losslessly reduces the space the data occupies. Fig. 1 shows an example of a Huffman code word generation tree: the code words are its leaves, and the layer of a leaf is the level of its code word.
To parse a Huffman code, first determine the length of the first code word in the code stream and extract it; the symbol table supplied with the Huffman code then gives the data element corresponding to that code word. Remove the first code word from the code stream and parse the remaining code stream one code word at a time in the same way to complete Huffman decoding. Because Huffman coding is variable-length, determining the length of each Huffman code word is a problem that must be solved throughout decoding. The common solutions for parsing code word length in Huffman decoding are described below:
1. Level comparison and analysis method. Based on the Huffman code word generation tree, build a leaf key that indicates, for each level, the nearest following level that contains Huffman code words (a level is equivalent to a code word length). Take the minimum code word of each level as prefix bits and pad the remaining bits with 0 up to the maximum code length. With the level of the Huffman prefix code as the index and this extended code as the indexed value, build a fixed-length code word search table prefixed by the minimum Huffman code word of each level. To parse a code length, look up in the leaf key the next level after level 0, use that next level to retrieve a code word from the fixed-length code word search table, and compare it with the code stream value taken at the maximum code length. If the code stream value is not less than the code word in the table, replace the current level of the leaf key with the next level, use the new current level to look up the next level that has leaves, retrieve the code word for that level from the fixed-length code word search table, and compare again. Repeat until the code stream value is less than the code word in the table; the current level in the leaf key then equals the length of the code word at the head of the code stream.
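The level comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it presumes a canonical-style code in which longer code words have larger left-justified values (the comparison rule relies on this), and the five-symbol code and the names `build_level_tables` and `code_length_by_levels` are hypothetical, not taken from the patent.

```python
def build_level_tables(codes, n):
    """Build the leaf key and the fixed-length code word search table.

    codes maps symbol -> code word as a bit string; n is the maximum code length.
    """
    lengths = {len(c) for c in codes.values()}
    # Leaf key: next_level[l] is the nearest level above l that has code words.
    next_level = [None] * (n + 1)
    nxt = None
    for level in range(n, -1, -1):
        next_level[level] = nxt
        if level in lengths:
            nxt = level
    # Minimum code word of each level, zero-padded on the right to n bits.
    min_code = {
        lv: min(int(c, 2) for c in codes.values() if len(c) == lv) << (n - lv)
        for lv in lengths
    }
    return next_level, min_code


def code_length_by_levels(a, next_level, min_code, start=0):
    """Return the length of the code word that heads the n-bit stream value a."""
    cur, nxt = start, next_level[start]
    # Climb while the stream value is not less than the next level's minimum.
    while nxt is not None and a >= min_code[nxt]:
        cur, nxt = nxt, next_level[nxt]
    return cur


# Hypothetical example code (an assumption, for illustration only).
CODES = {"a": "0", "b": "10", "c": "110", "d": "1110", "e": "1111"}
NEXT_LEVEL, MIN_CODE = build_level_tables(CODES, 4)
```

The number of comparisons grows with the code length, which is why long code words are the slow case for this method.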
2. Complete (perfect) code table method. Based on the leaf code words of all Huffman code word generation trees contained in the code stream, build a complete code-length table. Each index of this table takes a Huffman code word as prefix and is extended to the maximum code length by a fixed rule; the value stored at the index is the length of the corresponding Huffman prefix code. For the Huffman code stream currently being parsed, select the part of the complete code-length table that corresponds to the Huffman code word generation tree to which the code word belongs. Truncate the code stream at the maximum code word length and use the resulting value as an index into that part of the table; the value retrieved is the length of the first code word in the code stream.
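A minimal Python sketch of such a complete code-length table is given below, to make the time and space trade-off concrete. The example code and the function name `build_complete_table` are hypothetical; the "fixed rule" for extending a code word is assumed here to be filling the remaining bits with every value from all 0s to all 1s.

```python
def build_complete_table(codes, n):
    """Return a 2**n-entry table mapping every n-bit value to a code length.

    codes maps symbol -> code word as a bit string; n is the maximum code length.
    """
    table = [0] * (1 << n)
    for word in codes.values():
        pad = n - len(word)
        base = int(word, 2) << pad          # code word followed by all 0s
        for filler in range(1 << pad):      # remaining bits: all 0s .. all 1s
            table[base + filler] = len(word)
    return table


# Hypothetical example code (an assumption, for illustration only).
CODES = {"a": "0", "b": "10", "c": "110", "d": "1110", "e": "1111"}
COMPLETE = build_complete_table(CODES, 4)

# One lookup yields the code length: the 4-bit value 1011 starts with "10".
length = COMPLETE[0b1011]
```

The lookup is a single array access, but the table has 2**n entries per generation tree, which is 65536 entries when n is 16.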
Once the code length is known, extract the first code word; the symbol table corresponding to the current Huffman code word generation tree yields the data for that code word. Remove the parsed part from the code stream and continue in the same way on the remaining code stream to parse all Huffman codes.
Existing, commonly used Huffman decoding algorithms exploit the code word probabilities embodied in the coding table and optimize the search according to which levels of the Huffman generation tree contain leaves. For most code words, however, the code length still takes several comparisons to determine, and this accounts for a large share of total decoding time. Parsing code length with a complete code table is faster, but its space complexity grows too much to meet the hardware storage constraints of embedded systems.
In the audio and video fields, encoding and decoding algorithms based on Huffman data compression are widely used in embedded systems. In the Huffman algorithm, code words are variable-length binary prefix codes, so parsing a Huffman code word requires first parsing its length. Traditional code-length parsing algorithms are either too slow or need too much code table data. Reducing code-length parsing time without increasing the code table data is therefore of great significance for Huffman decoding.
Summary of the invention
The object of the invention is to provide a Huffman decoding method that parses code length based on a non-complete code table; the method significantly reduces the storage space occupied by code tables and speeds up decoding.
This object is achieved by the following scheme: a Huffman decoding method based on a non-complete code table, comprising the steps of:
1. Following the level comparison and analysis method, build all code tables used for level comparison and analysis, including the leaf key and the fixed-length code word search table prefixed by the minimum Huffman code word of each level;
2. Determine the critical code length L of the non-complete code table: select a value L between the minimum code length and the maximum code length as the critical code length for building the non-complete code table;
3. Based on the leaf code words of no more than L bits (the critical code length) in all Huffman code word generation trees contained in the code stream, build an L-bit non-complete code table prefixed by the Huffman code words;
4. According to the Huffman code word generation tree to which the part of the code stream to be parsed belongs, read a code stream value of the maximum code length and, in the corresponding fixed-length code word search table prefixed by the minimum Huffman code word of each level, retrieve the code word of level (code length) L+1;
5. Compare the code stream value with the level-(L+1) fixed-length code word just retrieved. If the code stream value is less than that code word, use the first L bits of the code stream value as a new index into the corresponding part of the non-complete code-length table; the value retrieved is the length of the first code word of the part of the code stream to be parsed. Otherwise, keep the same code stream value as the comparison object and parse the length of the first code word, which lies beyond level L, by the level comparison and analysis method;
6. Using the parsed code length, extract the code word from the current code stream and look up its symbol value in the symbol table corresponding to the code word, completing the parsing of the first code word in the code stream;
7. Remove the parsed code word from the current code stream and repeat steps 4, 5, and 6 on the remaining code stream to complete the decoding of all Huffman codes.
The non-complete code-length table is built one Huffman code word generation tree at a time, in the same way for each tree contained in the code stream. For each tree, every Huffman code word of no more than L bits is taken as a prefix and its remaining bits are extended to L bits with every value from all 0s to all 1s to form the indexes of the non-complete code-length table; the value stored at each index is the length of the Huffman prefix code.
The part of the code-length table corresponding to each Huffman code word generation tree is built as follows. First, for that tree, create every L-bit index from all 0s to all 1s and initialize all indexed values (the code-length table values) to 0. Then take each leaf code word of the tree that is no longer than L bits as a prefix, fill its remaining bits up to L bits with every value from all 0s to all 1s, and assign to every such extended code the length of the corresponding Huffman prefix code as its code-length table value.
The invention significantly reduces storage space and speeds up code-length parsing. For example, when the maximum code length is N (typically 16), the code-length parsing time complexity of the level comparison and analysis method is O([formula image G2008102185651D00031 not reproduced]), where p_i is the statistical probability of code words of length i, and the space complexity of the code-length table for each descriptor is N. The complete code table method parses code length in O(1) time, and the space complexity of the complete code-length table for each descriptor is 2^N. Compared with these two techniques, the non-complete code table method has a code-length parsing time complexity of O([formula image G2008102185651D00032 not reproduced]), close to that of the complete code table, and a space complexity per descriptor of 2^8 + N. For the common maximum code length of 16, its space complexity is only 1/256 of that of the complete code table method, which greatly saves storage space.
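The space figures quoted above can be checked with a few lines of Python. This is a hedged illustration: the function name `table_entries` is hypothetical, and table size is counted in entries per descriptor, as in the text.

```python
from fractions import Fraction


def table_entries(n, l_crit):
    """Entries per descriptor for maximum code length n and critical length l_crit."""
    return {
        "level_comparison": n,            # one fixed-length code word per level
        "complete": 2 ** n,               # one entry per n-bit stream value
        "non_complete": 2 ** l_crit + n,  # l_crit-bit table plus the level table
    }


sizes = table_entries(16, 8)
table_only = Fraction(2 ** 8, sizes["complete"])              # 1/256
with_levels = Fraction(sizes["non_complete"], sizes["complete"])  # 17/4096
print(sizes, table_only, with_levels)
```

The 2^8-entry table alone is exactly 1/256 of the 2^16-entry complete table; counting the N-entry level table as well gives 272/65536 = 17/4096, roughly 1/241, so the quoted 1/256 is the leading-order figure.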
Description of drawings
Fig. 1 is a Huffman code word generation tree of the prior art;
Fig. 2 is a schematic flowchart of generating the non-complete code-length table corresponding to a single Huffman code word generation tree according to the invention;
Fig. 3 is a schematic diagram of the code-length parsing flow of the invention.
Embodiment
First, build the code tables used to retrieve fixed-length code words: following the level comparison and analysis method of the prior art, build all code tables used for level comparison and analysis, including the leaf key and the fixed-length code word search table prefixed by the minimum Huffman code word of each level.
Next, build an L-bit non-complete code table prefixed by the Huffman code words: based on the leaf code words of no more than L bits in all Huffman code word generation trees contained in the code stream, build a non-complete code-length table. L is the critical code length for building the non-complete code table and is selected between the minimum and maximum code lengths; for a code table whose maximum code length is 16, a critical code length L of 8 is recommended. Each index of the non-complete code-length table takes a Huffman code word of no more than L bits as prefix, with the remaining bits extended to L bits by every value from all 0s to all 1s, and the value stored at the index is the length of the corresponding Huffman prefix code. The table is built one Huffman code word generation tree at a time, in the same way for each tree contained in the code stream.
The part of the code-length table corresponding to each Huffman code word generation tree is built as follows. First, for that tree, create every L-bit index from all 0s to all 1s and initialize all indexed values (the code-length table values) to 0. Then take each leaf code word of the tree that is no longer than L bits as a prefix, fill its remaining bits up to L bits with every value from all 0s to all 1s, and assign to every such extended code the length of the corresponding Huffman prefix code as its code-length table value.
Fig. 2 shows the processing for the part of the table corresponding to a single Huffman code word generation tree when building an 8-bit non-complete code-length table; the current level, the current leaf number, and the position of the current code word within the current level all start from 0. The specific steps are as follows:
1. Initialize: set all code lengths to 0 and set the current level to 1;
2. Compute the total number of leaves at the current level, then set the position of the current leaf within the current level to 0;
3. Take the code word corresponding to the leaf position as the prefix code, pad it with 0s to L bits, and compute the total number of its extended code words; then set the position of the current extended code word of the current leaf to 0;
4. Add the position to the extended code word to obtain the index, and set the value at that index to the current level;
5. Check whether the extended code word position is less than the total number of extended code words. If yes, add 1 to the extended code word position and return to step 4; if no, go to the next step;
6. Add 1 to the prefix code to obtain the next prefix code, and check whether the current leaf position is less than the total number of leaves at this level. If yes, add 1 to the current leaf position and return to step 3; if no, go to the next step;
7. Check whether the current code word level is not greater than L. If yes, return to step 2; if no, code table generation is complete.
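The table construction in the steps above can be sketched in Python as three nested loops over levels, leaves, and extended code words. This is an illustrative sketch rather than the patent's exact flow: the name `build_noncomplete_table` and the example code are hypothetical, and the loops use Python ranges instead of the explicit counters of Fig. 2.

```python
def build_noncomplete_table(codes, l_crit):
    """Return the 2**l_crit-entry code-length table for one generation tree.

    codes maps symbol -> code word as a bit string; l_crit is the critical
    code length L. Entries reached only by longer code words stay 0.
    """
    table = [0] * (1 << l_crit)                   # step 1: all lengths 0
    for level in range(1, l_crit + 1):            # steps 2 and 7: each level <= L
        leaves = sorted(int(w, 2) for w in codes.values() if len(w) == level)
        for prefix in leaves:                     # steps 3 and 6: each leaf
            pad = l_crit - level
            first = prefix << pad                 # prefix padded with 0s to L bits
            for offset in range(1 << pad):        # steps 4 and 5: each extension
                table[first + offset] = level
    return table


# Hypothetical example code (an assumption, for illustration only).
CODES = {"a": "0", "b": "10", "c": "110", "d": "1110", "e": "1111"}
TABLE_L2 = build_noncomplete_table(CODES, 2)
TABLE_L3 = build_noncomplete_table(CODES, 3)
```

With L = 2 the table has four entries, and the entry for the bits 11 stays 0 because only code words longer than 2 bits begin with 11.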
After the two code tables are built, parse the code length. According to the Huffman code word generation tree to which the part of the current code stream to be parsed belongs, read a code stream value of the maximum code length and, in the corresponding fixed-length code word search table prefixed by the minimum Huffman code word of each level, retrieve the code word of level (code length) L+1.
Compare the code stream value with the code word just retrieved. If the code stream value is less than the code word, use its first L bits as a new index into the part of the non-complete code-length table that corresponds to the Huffman code word generation tree of the current part of the code stream; the value retrieved is the length of the first code word of that part. Otherwise, keep the same code stream value as the comparison object and, by the level comparison and analysis method, search the levels after level L of the corresponding generation tree for the length of the first code word to be parsed. Extract from the current code stream the code word of the length just parsed; the symbol table corresponding to the Huffman code word generation tree of the current code word gives its data.
Remove the parsed code word from the current code stream and parse the remaining code stream in the same way to complete the decoding of all Huffman codes.
The concrete decoding flow is shown in Fig. 3. First extract A, the bits of the current bit stream at the maximum code length, and B, the code word of level L+1 in the fixed-length code word table, and compare A with B. If A is less than B, take the first L bits of A as an index into the corresponding part of the non-complete code table; the indexed value is the code length, and code-length parsing ends. If A is not less than B, look up in the leaf key the next level after level L that contains code words, and retrieve the code word C of that next level from the fixed-length code table. If A is not less than C, replace the current level with the next level, look up in the leaf key the next level that contains code words, and retrieve code word C again; if A is less than C, the code length is the current level value, and code-length parsing ends.

Claims (3)

1. A Huffman decoding method that parses code length based on a non-complete code table, characterized in that it comprises the steps of:
(a) following the level comparison and analysis method, building all code tables used for level comparison and analysis, including the leaf key and the fixed-length code word search table prefixed by the minimum Huffman code word of each level;
(b) determining the critical code length L of the non-complete code table: selecting a value L between the minimum code length and the maximum code length as the critical code length for building the non-complete code table;
(c) based on the leaf code words of no more than L bits (the critical code length) in all Huffman code word generation trees contained in the code stream, building an L-bit non-complete code table prefixed by the Huffman code words;
(d) according to the Huffman code word generation tree to which the part of the code stream to be parsed belongs, reading a code stream value of the maximum code length and, in the corresponding fixed-length code word search table prefixed by the minimum Huffman code word of each level, retrieving the code word of level L+1;
(e) comparing the code stream value with the level-(L+1) fixed-length code word just retrieved; if the code stream value is less than the code word just retrieved, using the first L bits of the code stream value as a new index into the corresponding part of the non-complete code-length table, the value retrieved being the length of the first code word of the part of the code stream to be parsed; if the code stream value is not less than the code word just retrieved, taking the code stream value as the comparison object and, by the level comparison and analysis method using said leaf key, parsing the length of the first code word after level L;
(f) using the parsed code length, extracting the code word from the current code stream and looking up its symbol value in the symbol table corresponding to the code word, thereby completing the parsing of the first code word in the code stream;
(g) removing the parsed code word from the current code stream and repeating steps (d), (e), and (f) on the remaining code stream to complete the decoding of all Huffman codes.
2. The Huffman decoding method that parses code length based on a non-complete code table according to claim 1, characterized in that the non-complete code-length table is built one Huffman code word generation tree at a time, in the same way for each tree contained in the code stream; for each Huffman code word generation tree, every Huffman code word of no more than L bits is taken as a prefix and its remaining bits are extended to L bits with every value from all 0s to all 1s to form the indexes of the non-complete code-length table; the value stored at each index is the length of the Huffman prefix code.
3. The Huffman decoding method that parses code length based on a non-complete code table according to claim 1, characterized in that the part of the code-length table corresponding to each Huffman code word generation tree is built as follows: first, for that tree, every L-bit index from all 0s to all 1s is created and all indexed values are initialized to 0; then each leaf code word of the tree that is no longer than L bits is taken as a prefix, its remaining bits are filled up to L bits with every value from all 0s to all 1s, and every such extended code is assigned the length of the corresponding Huffman prefix code as its code-length table value.
CN2008102185651A 2008-10-22 2008-10-22 Nonperfect code table based Huffman decoding method for analyzing code length Active CN101729076B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008102185651A CN101729076B (en) 2008-10-22 2008-10-22 Nonperfect code table based Huffman decoding method for analyzing code length

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008102185651A CN101729076B (en) 2008-10-22 2008-10-22 Nonperfect code table based Huffman decoding method for analyzing code length

Publications (2)

Publication Number Publication Date
CN101729076A CN101729076A (en) 2010-06-09
CN101729076B true CN101729076B (en) 2012-11-21

Family

ID=42449416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008102185651A Active CN101729076B (en) 2008-10-22 2008-10-22 Nonperfect code table based Huffman decoding method for analyzing code length

Country Status (1)

Country Link
CN (1) CN101729076B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9252805B1 (en) * 2015-03-28 2016-02-02 International Business Machines Corporation Parallel huffman decoder
CN104717499B (en) * 2015-03-31 2018-06-05 豪威科技(上海)有限公司 A kind of storage method of huffman table and the Hofmann decoding method for JPEG
CN107204776A (en) * 2016-03-18 2017-09-26 余海箭 A kind of Web3D data compression algorithms based on floating number situation

Also Published As

Publication number Publication date
CN101729076A (en) 2010-06-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 510663 no.301-303, 401-402, area C1, No.182, Science City, Guangzhou hi tech Industrial Development Zone, Guangzhou City, Guangdong Province

Patentee after: Guangzhou Ankai Microelectronics Co.,Ltd.

Address before: 301-303 401-402, zone C1, No. 182, science Avenue, Science City, Guangzhou high tech Industrial Development Zone

Patentee before: ANYKA (GUANGZHOU) MICROELECTRONICS TECHNOLOGY Co.,Ltd.

CP02 Change in the address of a patent holder

Address after: 510555 No. 107 Bowen Road, Huangpu District, Guangzhou, Guangdong

Patentee after: Guangzhou Ankai Microelectronics Co.,Ltd.

Address before: 510663 no.301-303, 401-402, area C1, No.182, Science City, Guangzhou hi tech Industrial Development Zone, Guangzhou City, Guangdong Province

Patentee before: Guangzhou Ankai Microelectronics Co.,Ltd.