CN102244518B - Hardware-implemented system and method for parallel decompression - Google Patents

Publication number
CN102244518B
CN102244518B CN201010167216.9A CN201010167216A CN102244518B CN 102244518 B CN102244518 B CN 102244518B CN 201010167216 A CN201010167216 A CN 201010167216A CN 102244518 B CN102244518 B CN 102244518B
Authority
CN
China
Prior art keywords
data
module
huffman
code
submodule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010167216.9A
Other languages
Chinese (zh)
Other versions
CN102244518A (en)
Inventor
欧阳剑
田甲子
李浩华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201010167216.9A
Publication of CN102244518A
Application granted
Publication of CN102244518B

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a hardware-implemented system and method for parallel decompression. The system comprises: a variable-length bit manipulation module, configured to perform variable-length bit operations on the data to be decompressed and obtain variable-length data; a Huffman code table recovery module, configured to recover a Huffman code table from the variable-length data; a Huffman decoding module, configured to perform Huffman decoding in parallel according to the Huffman code table; and a decoding module, configured to decode according to the result of the Huffman decoding. The invention uses a programmable logic device (FPGA) to implement the Gzip decompression function. By adopting a parallel decompression algorithm and designing a hardware circuit suited to that algorithm, it substantially improves decompression efficiency.

Description

Hardware-implemented system and method for parallel decompression
Technical field
The present invention relates to data decompression technology, and in particular to a hardware-implemented system and method for parallel decompression.
Background
In large-scale Internet data processing, compressing and decompressing data is a very important technique. It can substantially increase the effective capacity of disks and the effective input/output (I/O) bandwidth of read and write operations, thereby reducing the cost of an Internet data center (IDC) and improving the execution speed of application-layer programs.
At present, data compression and decompression generally use the Gzip algorithm, and the data is processed in software. The Gzip algorithm contains a large number of serial bit operations, so processing it in software is inefficient.
Gzip decompression is likewise implemented in software. Specifically, the Gzip decompression algorithm is a multi-level table lookup algorithm: Huffman decoding is completed through multi-level lookups, so decoding one code may require several lookups. The advantages of this method are that it uses little memory and that most lookups complete in a single table access, so it is efficient and widely used in software. However, the algorithm has low parallelism, consumes a large amount of CPU resources in large-scale data processing, and is not suitable for hardware implementation.
For a CPU clocked at 2.66 GHz, the compression bandwidth is 50 Mb/s and the decompression bandwidth is 200 Mb/s. In large-scale data processing, the volume of data to be compressed and decompressed is huge. Using the CPU for compression and decompression therefore consumes a large amount of CPU resources, and the problem is even more pronounced for decompression.
Therefore, how to substantially improve decompression efficiency has become a technical problem that existing Gzip decompression technology urgently needs to solve, in particular whether a corresponding solution can be provided for the existing Gzip decompression algorithm.
Summary of the invention
The technical problem to be solved by the present invention is to provide a hardware-implemented system and method for parallel decompression that effectively improves the decompression efficiency of the existing Gzip decompression algorithm.
One aspect of the present invention provides a hardware-implemented system for parallel decompression. The system comprises: a variable-length bit manipulation module, configured to perform variable-length bit operations on the data to be decompressed and obtain variable-length data; a Huffman code table recovery module, configured to recover a Huffman code table from the variable-length data; a Huffman decoding module, configured to perform Huffman decoding in parallel according to the Huffman code table; and a decoding module, configured to decode according to the result of the Huffman decoding.
In an embodiment of the hardware-implemented system for parallel decompression provided by the invention, the variable-length bit manipulation module further comprises: a data merge submodule, configured to merge the data to be decompressed that is read at the input with the shifted data from the data cache submodule, generating new data to be decompressed; a multiplexer (MUX), configured to select according to the cached data: if the cached data is shorter than a predetermined number of bits, the data merged by the data merge submodule is selected and written to the data cache submodule; otherwise, the data in the data cache submodule is updated with the shifted data; a data cache submodule, configured to cache the new data to be decompressed that is input by the data merge submodule and to output variable-length data to be decompressed to the shift submodule; and a shift submodule, configured to right-shift the data in the data cache submodule in order to discard the data used in the previous cycle, and to output the data newly generated by the shift to the data merge submodule.
In an embodiment of the hardware-implemented system for parallel decompression provided by the invention, the Huffman code table recovery module further comprises: a code length calculation submodule, configured to calculate, from the variable-length data, the code length corresponding to each coded symbol; a code length storage submodule, configured to store the code lengths calculated by the code length calculation submodule; and a Huffman code table recovery submodule, configured to count, from the calculated result, the number of codes of each code length and the symbols corresponding to each code length, sort the symbols, and then recover the Huffman code table according to each code length.
In an embodiment of the hardware-implemented system for parallel decompression provided by the invention, the Huffman decoding module further comprises: a bit conversion submodule, configured to reverse the bit order of the input data in order to restore the input data; a comparison submodule, configured to take data of the maximum code length from the restored data and compare it in parallel with the extended start codes; a leading-1 detection submodule, configured to determine, from the comparison result of the comparison submodule, the position of the leading "1" and thereby the actual code length of the current data; a code table start address submodule, configured to use the actual code length determined by the leading-1 detection submodule as a lookup address to obtain the start address of the Huffman code table corresponding to that code length; and a data truncation submodule, configured to truncate the output data of the bit conversion submodule according to the code length, retaining the low-order bits of code-length width, to obtain the Huffman data to be decoded.
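The decoding steps above can be modelled in software. The following Python sketch is only an illustration, assuming canonical Huffman codes assigned as in DEFLATE (RFC 1951); all function names are hypothetical, and a list comprehension stands in for the comparators that hardware would evaluate in parallel.

```python
def reverse_bits(value, width):
    """Reverse the order of the low `width` bits (models the bit conversion step)."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def build_decoder(lengths):
    """Derive per-length start codes and table base addresses from code lengths."""
    max_len = max(lengths)
    count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            count[n] += 1
    first_code = [0] * (max_len + 1)  # start code of each code length
    base = [0] * (max_len + 1)        # start address of each length in `symbols`
    code = index = 0
    for n in range(1, max_len + 1):
        code = (code + count[n - 1]) << 1
        first_code[n] = code
        base[n] = index
        index += count[n]
    # Symbols ordered by (code length, symbol value), as in a canonical code table.
    symbols = sorted((s for s, n in enumerate(lengths) if n),
                     key=lambda s: (lengths[s], s))
    return max_len, first_code, base, symbols


def decode_one(raw_bits, decoder):
    """Decode one symbol from `max_len` stream bits (first stream bit in bit 0)."""
    max_len, first_code, base, symbols = decoder
    window = reverse_bits(raw_bits, max_len)
    # Compare the window with every start code extended to max_len bits.
    hits = [window >= (first_code[n] << (max_len - n))
            for n in range(1, max_len + 1)]
    # The highest length whose comparison succeeds is the actual code length.
    length = max(n for n, hit in enumerate(hits, start=1) if hit)
    code = window >> (max_len - length)  # keep only the code-length bits
    return symbols[base[length] + code - first_code[length]], length
```

With the code lengths `[3, 3, 3, 3, 3, 2, 4, 4]` used as an example in RFC 1951, `decode_one(0b1111, decoder)` yields symbol 7 with length 4. The sketch assumes a complete code; incomplete codes would need an extra validity check.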
In an embodiment of the hardware-implemented system for parallel decompression provided by the invention, the decoding module is an Lz77 decoding module, configured to perform Lz77 decoding according to the result of the Huffman decoding.
In an embodiment of the hardware-implemented system for parallel decompression provided by the invention, the Lz77 decoding module further comprises: a decoding control submodule, configured to generate, from the Huffman decoding result obtained by the Huffman decoding module, the addresses for reading or writing the history data storage submodule, completing the data matching; a history data storage submodule, configured to store the matched data at the addresses provided by the decoding control submodule; and a check submodule, configured to perform a CRC32 check on the matched data obtained by the decoding control submodule in order to judge whether the decompression is correct.
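As a rough software analogue of the Lz77 decoding and CRC32 check described above, the following Python sketch copies matches out of the already decoded history and verifies the result with `zlib.crc32` from the standard library. The token format (an integer for a literal byte, a `(length, distance)` tuple for a match), the function names and the 32 KiB window are assumptions made for illustration; they are not taken from the patent.

```python
import zlib


def lz77_decode(tokens, window_size=32768):
    """Expand literal bytes and (length, distance) matches into output bytes."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, int):
            out.append(token)  # literal byte
            continue
        length, distance = token
        if not 0 < distance <= min(len(out), window_size):
            raise ValueError("match distance points outside the history")
        start = len(out) - distance  # read address in the history data
        for i in range(length):
            # Byte-by-byte copy so that overlapping matches work.
            out.append(out[start + i])
    return bytes(out)


def crc_matches(data, expected_crc):
    """Return True when the CRC32 of the decoded data equals the expected value."""
    return (zlib.crc32(data) & 0xFFFFFFFF) == expected_crc
```

For instance, `lz77_decode([97, 98, (4, 2)])` expands to `b"ababab"`: the match copies four bytes starting two bytes back, overlapping its own output.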
Another aspect of the present invention provides a hardware-implemented method for parallel decompression. The method comprises: performing variable-length bit operations on the data to be decompressed to obtain variable-length data; recovering a Huffman code table from the variable-length data; performing Huffman decoding in parallel according to the Huffman code table; and performing Lz77 decoding according to the result of the Huffman decoding.
In an embodiment of the hardware-implemented method for parallel decompression provided by the invention, performing variable-length bit operations on the data to be decompressed to obtain variable-length data further comprises: merging the data to be decompressed that is read at the input with the shifted data, generating new data to be decompressed; selecting according to the cached data: if the cached data is shorter than a predetermined number of bits, the merged data is selected and written to the data cache submodule; otherwise, the data in the data cache submodule is updated with the shifted data; caching the new data to be decompressed and outputting variable-length data to be decompressed; and right-shifting the variable-length data to be decompressed in order to discard the data used in the previous cycle, and outputting the data newly generated by the shift to the data merge submodule.
In an embodiment of the hardware-implemented method for parallel decompression provided by the invention, recovering a Huffman code table from the variable-length data further comprises: calculating, from the variable-length data, the code length corresponding to each coded symbol; storing the calculated code lengths; and counting, from the calculated result, the number of codes of each code length and the symbols corresponding to each code length, sorting the symbols, and then recovering the Huffman code table according to each code length.
In an embodiment of the hardware-implemented method for parallel decompression provided by the invention, performing Huffman decoding in parallel according to the Huffman code table further comprises: reversing the bit order of the input data in order to restore the input data; taking data of the maximum code length from the restored data and comparing it in parallel with the extended start codes; determining, from the comparison result, the position of the leading "1" and thereby the actual code length of the current data; using the actual code length as a lookup address to obtain the start address of the Huffman code table corresponding to that code length; and truncating the bit-converted output data according to the code length, retaining the low-order bits of code-length width, to obtain the Huffman data to be decoded.
In an embodiment of the hardware-implemented method for parallel decompression provided by the invention, decoding according to the result of the Huffman decoding further comprises using an Lz77 decoding module to perform Lz77 decoding according to the result of the Huffman decoding.
In an embodiment of the hardware-implemented method for parallel decompression provided by the invention, performing Lz77 decoding according to the result of the Huffman decoding further comprises: generating, from the Huffman decoding result obtained by the Huffman decoding module, the addresses for reading or writing the history data storage submodule, completing the data matching; storing the matched data according to the addresses; and performing a CRC32 check on the matched data in order to judge whether the decompression is correct.
The invention provides a hardware-implemented system and method for parallel decompression that uses a programmable logic device (FPGA) to implement the Gzip decompression function. By adopting a parallel decompression algorithm and designing a hardware circuit suited to that algorithm, it substantially improves decompression efficiency.
Brief description of the drawings
Fig. 1 is a structural diagram of a hardware-implemented system for parallel decompression provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 3 is a structural diagram of another embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 4 is a detailed circuit diagram of the variable-length bit manipulation module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 5 is a flow chart of the prior-art Gzip method for recovering code tables;
Fig. 6 is a structural diagram of another embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 7 is a structural diagram of an embodiment of the Huffman code table recovery module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 8 is a detailed circuit diagram of the Huffman code table recovery module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 9 is a structural diagram of code table storage in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 10 is a structural diagram of another embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 11 is a detailed circuit diagram of the parallel Huffman decoding module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 12 is a structural diagram of another embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 13 is a structural diagram of an embodiment of the Lz77 decoding module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 14 is a detailed circuit diagram of the Lz77 decoding module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention;
Fig. 15 is a flow chart of a hardware-implemented method for parallel decompression provided by an embodiment of the present invention;
Fig. 16 is a flow chart of another embodiment of the hardware-implemented method for parallel decompression provided by the invention;
Fig. 17 is a flow chart of another embodiment of the hardware-implemented method for parallel decompression provided by the invention;
Fig. 18 is a flow chart of another embodiment of the hardware-implemented method for parallel decompression provided by the invention;
Fig. 19 is a flow chart of another embodiment of the hardware-implemented method for parallel decompression provided by the invention;
Fig. 20 is a flow diagram of a specific application example of the hardware-implemented method for parallel decompression provided by the invention.
Detailed description of the embodiments
The present invention is described more fully below with reference to the accompanying drawings, which illustrate exemplary embodiments of the invention.
Fig. 1 is a structural diagram of a hardware-implemented system for parallel decompression provided by an embodiment of the present invention.
As shown in Fig. 1, a hardware-implemented system 100 for parallel decompression comprises: a variable-length bit manipulation module 102, a Huffman code table recovery module 104, a Huffman decoding module 106 and an Lz77 decoding module 108.
The variable-length bit manipulation module (Bitwise_management) 102 is configured to perform variable-length bit operations on the data to be decompressed and obtain variable-length data. For example, module 102 reads new data from the input FIFO module (In_FIFO_if), outputs data of the bit width used in the current cycle, and discards data that has already been used. It is responsible for the bit-level data operations of the whole decompression process and provides efficient output of variable-length data.
The Huffman code table recovery module (Huft_build) 104 is configured to recover a Huffman code table from the variable-length data. For example, it receives the variable-length data output by the variable-length bit manipulation module 102, recovers the Huffman code table from the received data stream, and stores the Huffman code table in random access memory (RAM). The Huffman code table may be recovered with a code table recovery method known in the art or with the code table recovery method provided by the present invention; the details are described in later embodiments.
The Huffman decoding module (Lookup_table) 106 is configured to perform Huffman decoding in parallel according to the Huffman code table. For example, using the Huffman code table recovered by the Huffman code table recovery module 104, it performs Huffman decoding in parallel and recovers the original Huffman-coded data from the input data stream.
The decoding module 108 is configured to decode according to the result of the Huffman decoding. For example, according to the decoding result of the Huffman decoding module, an Lz77 decoding module (Lz77_inv) looks up the corresponding matched data in the history data, outputs the matched data, and updates the history data for use in subsequent decoding matches.
The invention provides a hardware-implemented system for parallel decompression with a hardware circuit designed for the Gzip decompression algorithm. The variable-length bit manipulation module processes variable-length bit fields quickly, the Huffman decoding module decodes in parallel, and the Lz77 decoding module performs Lz77 decoding, which substantially improves decompression efficiency.
Fig. 2 is a structural diagram of an embodiment of the hardware-implemented system for parallel decompression provided by the invention.
As shown in Fig. 2, a hardware-implemented system 200 for parallel decompression comprises: a variable-length bit manipulation module 202, a data header parsing module 203, a Huffman code table recovery module 204, a static code table storage module 205, a Huffman decoding module 206, a dynamic code table storage module 207, an Lz77 decoding module 208 and a master control module 210.
The variable-length bit manipulation module 202 is configured to perform variable-length bit operations on the data to be decompressed and obtain variable-length data. For example, module 202 reads new data from the input FIFO module (In_FIFO_if), outputs data of the bit width used in the current cycle, and discards data that has already been used. It is responsible for the bit-level data operations of the whole decompression process and provides efficient output of variable-length data.
The data header parsing module (Parse_data) 203 is configured to read the data to be decompressed from the variable-length bit manipulation module 202 and parse the compressed data header. It extracts global information from the header of each compressed block (such as the compression type and whether the block is the last data block). The parsed information can be sent to the master control module 210 as one of the control signals used when other modules operate (in Fig. 2, dotted lines represent control signals and solid lines represent the flow of data signals).
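To make the header information concrete, the following Python sketch decodes the 3-bit block header defined by the DEFLATE format that Gzip uses (RFC 1951): one bit marking the last block, followed by a 2-bit block type. The patent does not spell out this layout, so the field order here comes from the RFC, and the function and dictionary names are hypothetical.

```python
# Block type values from RFC 1951; the value 3 is reserved and treated as an error.
BLOCK_TYPES = {0: "stored", 1: "static", 2: "dynamic"}


def parse_block_header(bits3):
    """Split the first 3 bits of a block (first stream bit in bit 0) into fields."""
    is_last = bool(bits3 & 1)      # BFINAL: set on the last block of the stream
    btype = (bits3 >> 1) & 0b11    # BTYPE: selects the code table type
    if btype not in BLOCK_TYPES:
        raise ValueError("reserved block type")
    return is_last, BLOCK_TYPES[btype]
```

A control unit could use the returned type to choose between the static and dynamic code tables, and the last-block flag to decide when decompression ends.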
The Huffman code table recovery module 204 is configured to recover a Huffman code table from the variable-length data. For example, module 204 receives the variable-length data output by the variable-length bit manipulation module 202, recovers the Huffman code table from the received data stream, and stores the Huffman code table in the dynamic code table storage module (Dynamic_table) 207. The Huffman code table may be recovered with a code table recovery method known in the art or with the code table recovery method provided by the present invention; the details are described in later embodiments.
The Huffman decoding module 206 is configured to perform Huffman decoding in parallel according to the Huffman code table. For example, depending on the code table type, it queries the static table stored in the static code table storage module 205 or the dynamic table that the Huffman code table recovery module 204 has stored in the dynamic code table storage module 207, performs Huffman decoding in parallel, and recovers the original Huffman-coded data from the input data stream.
The static code table storage module (Statics_table) 205 and the dynamic code table storage module (Dynamic_table) 207 provide the static table or dynamic table that the Huffman decoding module 206 queries. The static table is fixed and is stored in ROM; the dynamic table is generated dynamically and is stored in RAM. The static code table holds the code table data for static Huffman decoding, which never changes and is programmed into on-chip ROM. If the current data block uses static coding, the Huffman decoding module 206 reads data from the static code table. The dynamic code table is generated from the coding information of the current data block: for each compressed block, the Huffman code table recovery module 204 generates the corresponding dynamic code table and writes it to the dynamic code table storage module 207, and during decoding the Huffman decoding module 206 reads it back from the dynamic code table storage module 207.
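The patent does not list the contents of the static table. As an illustration, the Python sketch below generates the fixed code lengths that the DEFLATE specification (RFC 1951) defines for static blocks, which is the kind of constant data a ROM would hold; the function names are hypothetical. A Kraft sum of exactly 1 confirms that each set of lengths describes a complete prefix code.

```python
from fractions import Fraction


def static_literal_lengths():
    """Code lengths of the 288 literal/length symbols in a static block."""
    return [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8


def static_distance_lengths():
    """Code lengths of the 32 distance symbols in a static block."""
    return [5] * 32


def kraft_sum(lengths):
    """Sum of 2**-length over all used symbols; equals 1 for a complete code."""
    return sum(Fraction(1, 2 ** n) for n in lengths if n)
```

Because these lengths never change, the corresponding code table can be computed once at design time rather than rebuilt for each block.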
The Lz77 decoding module 208 is configured to perform Lz77 decoding according to the result of the Huffman decoding. For example, according to the decoding result of the Huffman decoding module, it looks up the corresponding matched data in the history data, outputs the matched data to the output FIFO module (Out_FIFO_if), and updates the history data for use in subsequent decoding matches.
The master control module (Ctrl_unit) 210 is configured to receive the global information parsed by the data header parsing module 203 as a control signal for the other modules, and to provide the variable-length bit manipulation module 202, the data header parsing module 203, the Huffman code table recovery module 204, the Huffman decoding module 206 and the Lz77 decoding module 208 with the control signals for executing the corresponding operation flow.
The hardware-implemented system for parallel decompression provided by the invention performs the large number of variable-length bit operations in hardware and can decode in parallel, so that one Huffman decoding operation completes within a single cycle, giving higher decoding efficiency.
Throughout Gzip decompression, operations such as parsing the data header, recovering the Huffman code table and Huffman-decoding the original data all need to take variable-length data from the fixed-width input data stream (for example, a 32-bit fixed-length data stream). The difficulty of variable-length bit manipulation lies in processing efficiency: within the current cycle, the bit width used in that cycle must be calculated, the data used in the previous cycle must be discarded, and new data must be fetched for processing. To solve this problem, the present invention provides, as described below, a global variable-length bit manipulation module that handles all variable-length bit operations in the decompression process.
Fig. 3 is a structural diagram of another embodiment of the hardware-implemented system for parallel decompression provided by the invention.
As shown in Fig. 3, a hardware-implemented system 300 for parallel decompression comprises: a variable-length bit manipulation module 302, a data header parsing module 303, a Huffman code table recovery module 304, a static code table storage module 305, a Huffman decoding module 306, a dynamic code table storage module 307, an Lz77 decoding module 308 and a master control module 310. Modules 303, 304, 305, 306, 307, 308 and 310 may have the same or similar structure as the data header parsing module 203, Huffman code table recovery module 204, static code table storage module 205, Huffman decoding module 206, dynamic code table storage module 207, Lz77 decoding module 208 and master control module 210 shown in Fig. 2, respectively; for brevity, their details are not repeated here.
As shown in Fig. 3, the variable-length bit manipulation module 302 further comprises: a data merge submodule 3020, a data cache submodule 3022 and a shift submodule 3024.
The data merge submodule 3020 is configured to merge the data to be decompressed that is read at the input with the shifted data from the data cache submodule 3022, generating new data to be decompressed. For example, the data merge submodule 3020 merges the 32 bits of data to be decompressed that are read at the input with the data remaining after the shift submodule's shift operation, generating new data to be decompressed that updates the data cached in the data cache submodule.
The data cache submodule 3022 is configured to cache the new data to be decompressed that is input by the data merge submodule and to output variable-length data to be decompressed to the shift submodule. For example, the data cache submodule receives and caches 96 bits of data and, in the current cycle, outputs data with a maximum width of 16 bits.
The shift submodule 3024 is configured to right-shift the data in the data cache submodule in order to discard the data used in the previous cycle, and to output the data newly generated by the shift to the data merge submodule. For example, the shift submodule right-shifts the data currently cached in the data cache submodule, discards the 16 bits of data output in the previous cycle, and returns the remaining 32 bits of data to the data merge submodule. If the data remaining after right-shifting by the amount output in the previous cycle is longer than a certain length (for example, 64 bits), the data need not be returned to the data merge submodule for merging.
To solve the above technical problem, the present invention provides an efficient hardware architecture and also gives concrete circuit structures for the functional modules. The circuit structure corresponding to each part of the architecture is described below.
Fig. 4 is a detailed circuit diagram of the variable-length bit manipulation module in an embodiment of the hardware-implemented system for parallel decompression provided by the invention.
As shown in Fig. 4, the microstructure of the variable-length bit manipulation module 400 mainly comprises a data merge unit 402, a multiplexer (MUX) 404, a data cache unit 406 and a shift unit 408. The input of module 400 reads fixed-width input data (for example, 32-bit input data). In each cycle, the MUX 404 selects, according to the cached data, whether the cache is loaded with the remaining data alone or with the remaining data merged with new data. Specifically, if the cached data is shorter than 64 bits, the data merged by the data merge unit 402 is selected and written to the cache; otherwise, the shifted data is written back to the data cache unit 406 for use in the next shift. The data cache unit 406 (for example, a cache built from 96 D flip-flops) stores the data selected by the MUX 404. The data merge unit 402 combines the input data with the data right-shifted by the shift unit 408, and the result is written to the data cache unit 406 through the MUX 404. When the valid data in the data cache unit 406 falls below 64 bits, module 400 reads new data from outside, and the data merge unit 402 merges it before the data cache unit 406 is updated. The shift unit 408 right-shifts the data in the data cache unit 406 by the bit width of the variable-length data that other modules used in the previous cycle (the shift offset input to the shift unit 408 in Fig. 4), thereby discarding the used data, and outputs new variable-length data (for example, 1 to 16 bits) to the corresponding module. After each shift and update, the remaining data is retained in the data cache unit 406.
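The behaviour of this circuit can be approximated in software. The Python sketch below is a behavioural model only, using the example figures from the text (32-bit input words, a 96-bit cache, a refill threshold of 64 bits, outputs of 1 to 16 bits). The class and method names are hypothetical, and placing the next bit to consume at bit 0 is an assumption consistent with discarding used data by right shift.

```python
class BitBuffer:
    """Behavioural model of a variable-length bit manipulation module."""

    REFILL_BELOW = 64  # merge a new input word when fewer valid bits remain
    WORD_BITS = 32     # width of one fixed-length input word

    def __init__(self, words):
        self._words = iter(words)  # stands in for the input FIFO
        self.buf = 0               # cached bits; the next bit to use is bit 0
        self.valid = 0             # number of valid cached bits (at most 95)

    def _refill(self):
        # Data merge: place each new word above the bits that remain.
        while self.valid < self.REFILL_BELOW:
            word = next(self._words, None)
            if word is None:
                break
            self.buf |= (word & 0xFFFFFFFF) << self.valid
            self.valid += self.WORD_BITS

    def peek(self, n):
        """Return the next n bits (1 to 16) without consuming them."""
        if not 1 <= n <= 16:
            raise ValueError("n must be between 1 and 16")
        self._refill()
        return self.buf & ((1 << n) - 1)

    def consume(self, n):
        """Right-shift by n to discard bits that have been used."""
        if n > self.valid:
            raise ValueError("not enough data")
        self.buf >>= n
        self.valid -= n

    def read(self, n):
        """Peek n bits, then discard them."""
        value = self.peek(n)
        self.consume(n)
        return value
```

Separating `peek` from `consume` mirrors the text: a consumer first sees up to 16 bits, then reports how many it actually used, and only that many bits are shifted out.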
The hard-wired system of parallel decompression provided by the invention, the random length bit manipulation module of the overall situation is realized by data combination unit and shift unit, enormously simplify design, all modules all read data from this module, avoid the expense on data sign processing.
In the prior art, Gzip recovers a Huffman code table by extracting the encoded code table content from the input data stream, decoding that content, and then rebuilding the corresponding code table. Gzip compression uses the canonical (standard) Huffman coding method, in which knowing the code lengths of the leaf nodes of the Huffman tree is enough to recover the whole code table. For this reason, the encoder writes only the code length of each code table entry into the compressed data stream. To improve the compression ratio, the code lengths are run-length encoded, and the run-length encoded result is in turn Huffman coded.
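The claim that code lengths alone determine the whole code table can be checked with a short example. The Python sketch below follows the code assignment procedure given in the DEFLATE specification (RFC 1951): count the codes of each length, compute the first code of each length, then number the symbols of equal length in symbol order. The function name is hypothetical.

```python
def build_canonical_codes(lengths):
    """Map each used symbol to (code, length), given only the code lengths."""
    max_len = max(lengths)
    # Step 1: count how many symbols use each code length.
    bl_count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            bl_count[n] += 1
    # Step 2: compute the numerically smallest code of each length.
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    # Step 3: assign consecutive codes to symbols of the same length.
    codes = {}
    for symbol, n in enumerate(lengths):
        if n:
            codes[symbol] = (next_code[n], n)
            next_code[n] += 1
    return codes
```

With the lengths `[3, 3, 3, 3, 3, 2, 4, 4]` from the RFC's own example, symbol 5 receives the 2-bit code `00` and symbol 7 receives the 4-bit code `1111`.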
Object of the present invention will realize Gzip decoding exactly, coding in view of Gzip is fixing, therefore, the decompression that the present invention provides also must carry out code table recovery and decoding according to fixing flow process, and namely the present invention adopts the flow process identical with the method for existing recovery code table.Gzip technology in prior art of introducing subsequently recovers the method for code table.
Fig. 5 illustrates that in prior art, Gzip technology recovers the flow chart of the method for code table.
As shown in Figure 5, flow 500 comprises: step 502, take the code lengths of the bltree table from the data stream; step 504, build the bltree table from those code lengths; step 506, look up the bltree table and perform run-length decoding to obtain the code lengths of the ltree table and the dtree table; step 508, build the ltree table; step 510, build the dtree table. During Gzip decompression, three Huffman code tables must be recovered from the compressed data stream: the bltree table, the ltree table and the dtree table. Because the coding scheme is fixed in advance at the Gzip encoding stage, the code lengths of the bltree table are extracted directly from the input data stream. To recover the other two code tables (the ltree table and the dtree table), the recovered bltree table is first looked up and run-length decoding is then performed, which yields each code length of the ltree table and the dtree table.
Since the flow the present invention uses to recover the Huffman code tables is the same as in the prior art, it is not expanded further here; those skilled in the art can determine the techniques used in the whole Huffman code table recovery process from this description and the prior art.
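The run-length decoding in step 506 can be sketched as follows. The repeat rules used here (symbol 16 repeats the previous length 3 to 6 times, 17 repeats zero 3 to 10 times, 18 repeats zero 11 to 138 times) come from the DEFLATE format that Gzip uses, not from the text above. The function name `expand_code_lengths` and its token format are hypothetical: each token is assumed to be a `(symbol, extra)` pair whose symbol has already been decoded through the bltree table.

```python
def expand_code_lengths(tokens, total):
    """Run-length decode ltree/dtree code lengths.

    tokens: iterable of (symbol, extra) pairs; symbol is 0..18 and extra
            is the value of the extra bits read after it (0 if none).
    total:  number of code lengths expected after expansion.
    """
    lengths = []
    for symbol, extra in tokens:
        if 0 <= symbol <= 15:
            # A literal code length.
            lengths.append(symbol)
        elif symbol == 16:
            # Repeat the previous code length 3..6 times.
            if not lengths:
                raise ValueError("repeat code with no previous length")
            lengths.extend([lengths[-1]] * (3 + extra))
        elif symbol == 17:
            # Repeat a zero length 3..10 times.
            lengths.extend([0] * (3 + extra))
        elif symbol == 18:
            # Repeat a zero length 11..138 times.
            lengths.extend([0] * (11 + extra))
        else:
            raise ValueError("invalid code length symbol")
    if len(lengths) != total:
        raise ValueError("expanded lengths do not match expected count")
    return lengths
```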
Fig. 6 illustrates the structural representation of another embodiment of the hard-wired system of parallel decompression provided by the invention.
As shown in Figure 6, a hard-wired system 600 of parallel decompression comprises: the bit manipulation module 602 of random length, data head parsing module 603, Huffman code table recovery module 604, static code table memory module 605, Huffman decoding module 606, dynamic code table memory module 607, Lz77 decoder module 608 and module total control module 610. The bit manipulation module 602 of random length, data head parsing module 603, static code table memory module 605, Huffman decoding module 606, dynamic code table memory module 607, Lz77 decoder module 608 and module total control module 610 may have the same or similar structure as, respectively, the bit manipulation module 202 of random length, data head parsing module 203, static code table memory module 205, Huffman decoding module 206, dynamic code table memory module 207, Lz77 decoder module 208 and module total control module 210 shown in Fig. 2; for brevity, their technical content is not repeated here.
As shown in Figure 6, Huffman code table recovery module 604 comprises further: code length calculating sub module 6040, code length sub module stored 6042 and Huffman code table recover submodule 6044.
The code length calculating submodule 6040 calculates, from the random-length data, the code lengths required to recover the Huffman code tables. Specifically, the code lengths of bltree are run-length coded, so run-length decoding the bltree data in the compressed data stream is enough to obtain the complete bltree. The code lengths of ltree and dtree are encoded with the bltree code table, so ltree and dtree are recovered from the compressed data stream by a process similar to the Huffman decoding of data discussed below, which is not repeated here.
Code length sub module stored 6042, for storing the code length that code length calculating sub module calculates.
The Huffman code table recovery submodule 6044 uses the calculated result to count the number of codes of each code length, collect the symbols corresponding to each code length, sort the symbols, and then recover the Huffman code table by code length. In the prior art, Huffman coding has 316 symbols to encode, comprising 256 original characters (literal), 30 lengths (length) and 30 distances (distance). The symbols have different code lengths, so what is done here is to build one table for each code length, whose entries store the corresponding symbols.
In the hard-wired system of parallel decompression provided by the invention, the Huffman code table recovery module can process input data within one cycle, covering both the recovery of the three Huffman code tables and the run-length decoding, which maximizes processing efficiency.
Fig. 7 illustrates that in an embodiment of the hard-wired system of parallel decompression provided by the invention, Huffman code table recovers the structural representation of the embodiment of module.
As shown in Figure 7, in the structural block diagram of the Huffman code table recovery module 700, the code length calculating submodule (Calc_code_len) 702 calculates the code length corresponding to each coded item. The code lengths required to recover the three Huffman tables are written into the RAM of the code length storage submodule (Len_ram) 704. The code lengths of the bltree table are extracted directly from the input data stream, while the code lengths of the ltree table and the dtree table are obtained by looking up the bltree table with the random-length data from the input stream and then run-length decoding the looked-up data. The Huffman code table recovery submodule (Build_table) 706 recovers the code tables from the code lengths. The finite state machine (FSM) 708 controls the processing order of the code length calculating submodule 702, the code length storage submodule 704 and the Huffman code table recovery submodule 706 throughout the recovery process.
In this embodiment of the Huffman code table recovery module provided by the invention, the code length calculating submodule 702 first counts the number of codes of each code length and then collects the actual symbols that each code length represents. For example, if characters a, b and c are all encoded with code length 5, the symbols for code length 5 include a, b and c. The symbols are stored in sorted order, and finally one table is built for each code length, whose entries are the corresponding symbols. With this approach, recovering a code table only requires traversing the code length storage submodule 704 twice and then building the corresponding tables. Those skilled in the art will see that the worst-case time is 3 times the maximum number of table entries: the largest table, the ltree table, has at most 286 entries, so the worst case takes 3 × 286 = 858 cycles, which is no more than 900 cycles.
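The two traversals described above can be sketched in software as follows. This is an illustrative model rather than the patented circuit: the function name `build_tables` is hypothetical, and the initial codes are computed with the standard rule for canonical Huffman codes used by DEFLATE, which the text above does not spell out.

```python
MAX_BITS = 15  # longest code length allowed in a Gzip Huffman table


def build_tables(lengths):
    """Recover per-length tables from a list of code lengths.

    lengths[sym] is the code length of symbol sym (0 means unused).
    Returns (len_cnt, base_code, start_offset, symbol_ram).
    """
    # Traversal 1: count how many symbols use each code length.
    len_cnt = [0] * (MAX_BITS + 1)
    for length in lengths:
        if length:
            len_cnt[length] += 1

    # First code of each length and where its table starts in symbol_ram.
    base_code = [0] * (MAX_BITS + 2)
    start_offset = [0] * (MAX_BITS + 2)
    code = 0
    for length in range(1, MAX_BITS + 1):
        code = (code + len_cnt[length - 1]) << 1
        base_code[length] = code
        start_offset[length + 1] = start_offset[length] + len_cnt[length]

    # Traversal 2: write symbols into the RAM grouped by code length.
    # Visiting symbols in increasing order keeps each group sorted.
    symbol_ram = [0] * start_offset[MAX_BITS + 1]
    next_slot = list(start_offset)
    for symbol, length in enumerate(lengths):
        if length:
            symbol_ram[next_slot[length]] = symbol
            next_slot[length] += 1
    return len_cnt, base_code, start_offset, symbol_ram
```

For the code lengths (3, 3, 3, 3, 3, 2, 4, 4) of symbols 0 to 7, this yields initial codes 00, 010 and 1110 for lengths 2, 3 and 4, and the symbol order 5, 0, 1, 2, 3, 4, 6, 7.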
Fig. 8 illustrates that in an embodiment of the hard-wired system of parallel decompression provided by the invention, Huffman code table recovers the particular circuit configurations schematic diagram of module.
The Huffman code table recovery process has four main steps. The first step counts the number of occurrences of each code length. The second step collects the symbols belonging to each code length and writes them into RAM in order. The third and fourth steps build the corresponding table for each code length. Fig. 8 shows the circuit microstructure with which the Huffman code table recovery module implements these four steps. As shown in Figure 8, the register file (Len_cnt_rft) 802 receives the code length corresponding to each coded item in the compressed data stream and records the code length statistics. The register file (Start_offset_rf) 804 records the offset address of each code length in the code table. The random access memory (Symbol_ram) 806 records the characters and their corresponding code lengths. The code length count store (Length_num) 808 stores the number of symbols for each code length; for example, if code length 5 has 3 symbols, the "length_num" of code length 5 is 3. The initial code memory (Base_code) 810 stores the initial code and the number of codes for standard Huffman coding, which are used to compute and recover the code table for each code length (in Huffman coding, the codes of one code length increase consecutively, so all of them can be computed once the initial code and the number of codes are known). In the Cplens/cpdist memory 814, Cplens (Copy lengths for literal codes 257..285) stores the base match lengths for the 29 symbols 257 to 285, and Cpdist (Copy offsets for distance codes 0..29) stores the base distances for the 30 distance symbols 0 to 29. Cplens/cpdist and Cplext/cpdext are the representation of match length and distance specified by Gzip, in which a value is composed of a base length or distance (as the symbol) plus extra bits (extra bit). This allows more lengths (length) and distances (distance) to be represented with fewer symbols. Calctableentry 816 computes the corresponding extra bit information (extra) for each decoded character by looking up the Cplens/cpdist and Cplext/cpdext tables.
Fig. 9 illustrates the structural representation that in an embodiment of the hard-wired system of parallel decompression provided by the invention, code table stores.
After the three Huffman code tables are built, they must be stored in a way that allows efficient lookup. The present invention stores the symbol and extra information using a dedicated memory layout (memory layout). As shown in Figure 9, the code table information is laid out compactly so that no memory space is wasted. For fast lookup, the start address of each table can be kept in a register. One table is built for each code length and the code length is at most 15, so each code table has at most 15 tables, and 15 registers (Len1index, Len2index ... Len15index) are needed in total to record the start address of each table. In the RAM, each entry is 13 bits wide: the low 9 bits hold the symbol value and the high 4 bits hold the extra value. A single query to this memory returns all the required results, which provides the fastest lookup with the smallest storage resources.
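The 13-bit entry layout can be illustrated with a pair of helper functions. The names `pack_entry` and `unpack_entry` are hypothetical; only the field widths (low 9 bits for the symbol, high 4 bits for the extra value) come from the text.

```python
SYMBOL_BITS = 9  # low bits: symbol value
EXTRA_BITS = 4   # high bits: extra value


def pack_entry(symbol, extra):
    """Pack a symbol and its extra value into one 13-bit table entry."""
    if not 0 <= symbol < (1 << SYMBOL_BITS):
        raise ValueError("symbol does not fit in 9 bits")
    if not 0 <= extra < (1 << EXTRA_BITS):
        raise ValueError("extra does not fit in 4 bits")
    return (extra << SYMBOL_BITS) | symbol


def unpack_entry(entry):
    """Split a 13-bit table entry back into (symbol, extra)."""
    symbol = entry & ((1 << SYMBOL_BITS) - 1)
    extra = (entry >> SYMBOL_BITS) & ((1 << EXTRA_BITS) - 1)
    return symbol, extra
```

Because both fields come back from one read, a single lookup gives the decoder the symbol and its extra value together.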
Figure 10 illustrates the structural representation of another embodiment of the hard-wired system of parallel decompression provided by the invention.
As shown in Figure 10, a hard-wired system 1000 of parallel decompression comprises: the bit manipulation module 1002 of random length, data head parsing module 1003, Huffman code table recovery module 1004, static code table memory module 1005, Huffman decoding module 1006, dynamic code table memory module 1007, Lz77 decoder module 1008 and module total control module 1010. The bit manipulation module 1002 of random length, data head parsing module 1003, Huffman code table recovery module 1004, static code table memory module 1005, dynamic code table memory module 1007, Lz77 decoder module 1008 and module total control module 1010 may have the same or similar structure as, respectively, the bit manipulation module 202 of random length, data head parsing module 203, Huffman code table recovery module 204, static code table memory module 205, dynamic code table memory module 207, Lz77 decoder module 208 and module total control module 210 shown in Fig. 2; for brevity, their technical content is not repeated here.
As shown in Figure 10, the Huffman decoding module 1006 further comprises: a bit permutation submodule 10060, a comparison submodule 10062, a leading-1 detection submodule 10064, a code table start address submodule 10066 and a data cutting submodule 10068.
The bit permutation submodule 10060 performs bit inversion (reversing the bit order) on the input data in order to restore the input data.
Comparison sub-module 10062, for taking out the data of maximum code length from the data after recovery, and does parallel comparison with expansion initial code.
Leading 1 detection sub-module 10064, for the comparative result according to comparison sub-module, determines that start bit is the position of " 1 ", thus determines the actual code length of current data.
The code table start address submodule 10066 uses the actual code length determined by the leading-1 detection submodule as an address to look up the initial code/base address (base code) in the symbol table that corresponds to that code length.
The data cutting submodule 10068 truncates the input data according to the code length obtained by the code table start address submodule, retaining the low-order bits of code-length width to obtain the Huffman data to decode. For example, the low-order bits of the data to be decoded, together with the start address corresponding to its code length, are used to look up the corresponding table in the symbol RAM, which yields the decompressed data.
Figure 11 illustrates the particular circuit configurations schematic diagram of Huffman parallel decoding module in an embodiment of the hard-wired system of parallel decompression provided by the invention.
As shown in Figure 11, in an embodiment of the hard-wired system of parallel decompression provided by the invention, the circuit microstructure 1100 of the Huffman parallel decoding module comprises: a bit permutation unit (Bit_rev) 1102; registers 1104 that store the initial codes (base code) extended to fixed length (Base_code_extend1, Base_code_extend2 ... Base_code_extend15); a leading-1 detection unit (Leading one detect) 1106; a code table start address register file (Index_rf) 1108; a unit (Truncate) 1110 that cuts random-length data out of fixed-length data; and tables (table 1 to 15) 1112, each of which stores the codes of one code length and the character corresponding to each code.
Huffman codes are variable-length codes, so the effective bit width of the data currently being decoded is not known until decoding actually completes. The circuit microstructure 1100 of the Huffman parallel decoding module therefore takes data of the maximum code length (for example, 15 bits) from the input data stream each time, and the bit permutation unit 1102 performs bit inversion on it. Bit inversion here means reversing the bit order of the 15-bit input: bit 1 is exchanged with bit 15, bit 2 with bit 14, and so on. This is done because the same inversion was applied to the data during compression, so the data must be restored.
Bit inversion of the original input data yields the restored data. The restored data is compared in parallel with the extended initial codes (Base code extended) of the 15 code lengths stored in the registers 1104, and the comparison result is 15 bits of data. An extended initial code is the initial code of a code length extended to 15 bits: the initial code is shifted left to 15 bits and the low-order bits are filled with 0. If the maximum code length of the current Huffman code table is less than 15, the extended initial codes for code lengths greater than the maximum are set to 0x3fff.
After the parallel comparison, the leading-1 detection unit 1106 examines the comparison result and finds the position of the first bit that is 1. This position gives the actual code length of the current data; in other words, it determines which interval of the extended codes the input data falls in, and hence the effective bit width (the code length) of the current input data.
This code length is then used as an address to look up the code table start address register file (Index_rf) 1108, which returns the start address of the table for that code length. At the same time, the input data whose code length is now known must be truncated: the unit 1110 that cuts random-length data out of fixed-length data removes the invalid high-order bits and retains the low-order bits of code-length width, which are the Huffman coded data. At this point the Huffman code, its code length, and the start address of the table for that code length have all been obtained. The difference between the Huffman code and the initial code of its code length is used as the offset within the table for that code length, and this lookup returns the symbol and the extra bit width corresponding to the Huffman code.
In an embodiment of the hard-wired system of parallel decompression provided by the invention, the circuit microstructure of the Huffman parallel decoding module implements fully parallel Huffman decoding. Within one cycle it looks up multiple tables (at most 15) in parallel, computes the code length of the current input data and finds the corresponding symbol, which ensures that the decoding process above completes in one cycle.
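The decode path above (bit inversion, parallel comparison, leading-1 detection, truncation and table lookup) can be modelled in software as a sketch. Several points are assumptions rather than statements of the patent: the example tables are hard-coded for code lengths (3, 3, 3, 3, 3, 2, 4, 4) of symbols 0 to 7; the first code bit is assumed to land in the top bit of the restored 15-bit window; and unused code lengths are given a hypothetical sentinel larger than any 15-bit value so that their comparators never fire, which differs from the 0x3fff value mentioned in the text because the polarity of the hardware comparison is not specified there.

```python
MAX_BITS = 15
NEVER = 1 << MAX_BITS  # assumed sentinel: larger than any 15-bit window

# Hypothetical example tables for code lengths (3,3,3,3,3,2,4,4).
LEN_CNT = {2: 1, 3: 5, 4: 2}                    # codes per code length
BASE_CODE = {2: 0b00, 3: 0b010, 4: 0b1110}      # initial code per length
START_OFFSET = {2: 0, 3: 1, 4: 6}               # table start in SYMBOL_RAM
SYMBOL_RAM = [5, 0, 1, 2, 3, 4, 6, 7]

# Initial codes shifted left to 15 bits with zero fill.
EXT_BASE = {
    n: (BASE_CODE[n] << (MAX_BITS - n)) if n in LEN_CNT else NEVER
    for n in range(1, MAX_BITS + 1)
}


def reverse_bits(value, width=MAX_BITS):
    """Reverse the bit order of a width-bit value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def decode_one(window):
    """Decode one symbol from a 15-bit window; returns (symbol, length)."""
    restored = reverse_bits(window)
    # The hardware evaluates these 15 comparisons in parallel.
    hits = [restored >= EXT_BASE[n] for n in range(1, MAX_BITS + 1)]
    # Leading-1 detection: the longest code length whose comparator fired.
    length = max((n for n in range(1, MAX_BITS + 1) if hits[n - 1]), default=0)
    if length == 0:
        raise ValueError("no code length matched")
    code = restored >> (MAX_BITS - length)  # keep only the code bits
    offset = code - BASE_CODE[length]       # offset inside that table
    if offset >= LEN_CNT[length]:
        raise ValueError("invalid code")
    return SYMBOL_RAM[START_OFFSET[length] + offset], length
```

The returned length is what a caller would feed back to the bit manipulation module as the shift amount for the next cycle.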
Figure 12 illustrates the structural representation of another embodiment of the hard-wired system of parallel decompression provided by the invention.
As shown in Figure 12, a hard-wired system 1200 of parallel decompression comprises: the bit manipulation module 1202 of random length, data head parsing module 1203, Huffman code table recovery module 1204, static code table memory module 1205, Huffman decoding module 1206, dynamic code table memory module 1207, Lz77 decoder module 1208 and module total control module 1210. The bit manipulation module 1202 of random length, data head parsing module 1203, Huffman code table recovery module 1204, static code table memory module 1205, Huffman decoding module 1206, dynamic code table memory module 1207 and module total control module 1210 may have the same or similar structure as, respectively, the bit manipulation module 202 of random length, data head parsing module 203, Huffman code table recovery module 204, static code table memory module 205, Huffman decoding module 206, dynamic code table memory module 207 and module total control module 210 shown in Fig. 2; for brevity, their technical content is not repeated here.
As shown in Figure 12, the Lz77 decoder module 1208 further comprises: a decoding control submodule 12080, a history data storage submodule 12082 and a check submodule 12084.
The decoding control submodule 12080 generates the addresses for reading or writing the history data storage submodule from the Huffman decoding result obtained by the Huffman decoding module, and completes the data matching.
The history data storage submodule 12082 stores the matched data at the address provided by the decoding control submodule.
The check submodule 12084 performs a CRC32 check on the matched data obtained by the decoding control submodule in order to judge whether the decompression is correct.
Lz77 decoding mainly uses the result produced by the Huffman parallel decoding module to look up the corresponding matched data in the history data window, outputs it, and updates that data segment in the history window so that later decoding results can match against it. The processing efficiency of this part therefore directly affects the efficiency of decompression.
Figure 13 illustrates the structural representation of the embodiment of Lz77 decoder module in an embodiment of the hard-wired system of parallel decompression provided by the invention.
As shown in figure 13, Lz77 decoder module 1300 comprises: lz77 decoding control unit (Lz77_ctrl) 1302, historical data windows units (RAM) 1304, CRC32 verification unit (CRC32) 1306 and output data buffer storage (OUT_FIFO) 1308.
The lz77 decoding control unit 1302 generates the addresses for reading or writing the history data window unit (RAM) 1304 from the characters (decoding results) produced by the Huffman decoding module (Lookup_table), and completes the matching of the data. In addition, the lz77 decoding control unit 1302 controls state information such as the CRC32 check and the output of data.
The present invention uses a 32 Kbyte dual-port RAM (32Kbyte Dual port RAM) as the history data window unit (RAM) 1304, which provides the lz77 decoding control unit 1302 with storage for matched data. This covers both writing matched data directly to the current address and reading data from the corresponding address and then writing it to the current address.
The CRC32 check unit 1306 computes the CRC32 of the original data decoded by the lz77 decoding control unit 1302 and compares it with the original CRC to judge whether the decompression result is correct. Once the decompression result passes the check of the CRC32 check unit, it is passed to the back-end unit modules through the output data buffer (OUT_FIFO) 1308. As shown in Figure 13, after the output data buffer 1308 outputs the decompression result, the "DDR_ctrl" control module writes the corresponding data into the double data rate synchronous DRAM (DDR, Double Data Rate).
Balancing circuit complexity against processing efficiency, the present invention designs a streaming data processing structure for the Lz77 decoder module, so that decompressed data is output in every cycle with no loss of bandwidth.
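In software, the comparison performed by the CRC32 check unit could be sketched as below. The helper name `crc_matches` is hypothetical. The sketch relies on the gzip file format, in which the last 8 bytes of a member hold the CRC32 of the uncompressed data followed by its size, both little-endian; that layout comes from the gzip format itself and is not stated in the text above.

```python
import gzip
import struct
import zlib


def crc_matches(decompressed, trailer):
    """Compare the CRC32 of decompressed data with the gzip trailer.

    trailer: the last 8 bytes of the gzip member (CRC32, then size).
    """
    (expected_crc,) = struct.unpack("<I", trailer[:4])
    return zlib.crc32(decompressed) == expected_crc


# Usage: check a payload against the trailer of its own gzip stream.
payload = b"hello, history window"
blob = gzip.compress(payload)
ok = crc_matches(payload, blob[-8:])
```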
Figure 14 illustrates the particular circuit configurations schematic diagram of Lz77 decoder module in an embodiment of the hard-wired system of parallel decompression provided by the invention.
As shown in Figure 14, the circuit structure 1400 of the Lz77 decoder module first reads decoding result data from the first-in first-out interface (FIFO) of the Huffman decoding module and judges whether the data read is an original character (literal) or a length/distance pair (length/distance). If it is a literal, the data is sent to the write data port of the RAM and the related state information is updated. If it is a length/distance pair, the address offset index is calculated first, then data is read from the corresponding address and written to the current address, and this is repeated until the current match length is exhausted. The module then continues reading Huffman decoding results and repeats the above processing until the whole file has been decompressed.
The meaning of the data at the input and output of each D flip-flop shown in Figure 14 is labeled on the line at its Q output. An input may come from an adder, one operand of which is the output of the D flip-flop and the other of which is 1 or an address offset; this means that the content of the D flip-flop is incremented by one or by the address offset.
Those skilled in the art will see from the Lz77 decoding flow above that keeping the data processing in streaming mode during Lz77 decoding requires efficient storage management of the history string window. Processing the data stream efficiently mainly means keeping the data path fully pipelined and managing the RAM read and write addresses efficiently, that is, handling RAW (Read After Write) hazards so that they do not stall the pipeline. The hazard arises because a write to the RAM takes 2 cycles to take effect; if data that is still being written must be read within those two cycles, a RAW hazard occurs, and this case needs special handling. For example, when the decoding result read in is a length/distance match pair and the difference between the RAM read address and the write address of the previous cycle is greater than 2, the data to be read is already stored in the RAM window, so data of the match length can be read continuously in pipelined fashion and written to the corresponding place in the RAM window. If the difference between the RAM read address and the write address of the previous cycle is less than or equal to 2, the data to be read has not yet been written to that RAM address. In that case the data of the previous cycle is held in a register while it is written to the RAM, and each subsequent write reads its data from the register. This removes the inherent latency of the RAM and improves data processing efficiency.
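The literal and length/distance handling, together with the register bypass for short distances, can be sketched in software as follows. This is a behavioural model only, under stated assumptions: the function name `lz77_decode` and the token format (an integer for a literal, a `(length, distance)` tuple for a match) are hypothetical, and the two-entry `recent` list stands in for the registers that forward data the RAM has not yet committed.

```python
WINDOW_SIZE = 32 * 1024  # 32 Kbyte history window
WRITE_LATENCY = 2        # cycles before a RAM write becomes readable


def lz77_decode(tokens):
    """Expand literals and (length, distance) matches into bytes."""
    window = bytearray(WINDOW_SIZE)  # models the history RAM
    out = bytearray()
    recent = []                      # bypass registers: last written bytes
    pos = 0                          # total number of bytes written so far

    def emit(byte):
        nonlocal pos
        window[pos % WINDOW_SIZE] = byte
        recent.append(byte)
        del recent[:-WRITE_LATENCY]  # keep only the newest entries
        out.append(byte)
        pos += 1

    for token in tokens:
        if isinstance(token, int):
            emit(token)              # literal: write straight to the window
            continue
        length, distance = token
        if not 1 <= distance <= min(pos, WINDOW_SIZE):
            raise ValueError("distance points outside the history window")
        for _ in range(length):
            if distance <= WRITE_LATENCY:
                # RAW hazard: the byte is not in the RAM yet, so it is
                # taken from the bypass registers instead.
                byte = recent[-distance]
            else:
                byte = window[(pos - distance) % WINDOW_SIZE]
            emit(byte)
    return bytes(out)
```

Copying one byte at a time also handles matches whose distance is smaller than their length, since each copied byte becomes available as a source for the next one.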
Figure 15 illustrates the flow chart of the hard-wired method of a kind of parallel decompression that the embodiment of the present invention provides.
As shown in Figure 15, a hard-wired method flow 1500 of parallel decompression comprises step 1502: perform random-length bit manipulation on the data to be decompressed to obtain random-length data. For example, the bit manipulation module 102 of random length reads new data from the first-in first-out module (In_FIFO_if), outputs data of the bit width used in the current cycle, and discards the data already used. It handles the bit operations for the whole decompression process and outputs random-length data efficiently.
Step 1504: recover the Huffman code table from the random-length data. For example, the Huffman code table recovery module 204 receives the random-length data output by the bit manipulation module 202 of random length, recovers the Huffman code table from the received data stream, and stores the Huffman code table in the dynamic code table memory module (Dynamic_table) 207. The Huffman code table may be recovered with a code table recovery method known in the art or with the code table recovery method provided by the invention.
Step 1506: perform Huffman decoding in parallel according to the Huffman code table. For example, according to the type of the Huffman code table recovered by the Huffman code table recovery module 104, the static table stored in the static code table memory module 205 or the dynamic table stored in the dynamic code table memory module 207 is queried, Huffman decoding is performed in parallel, and the original Huffman coded data is recovered from the input data stream.
Step 1508: decode according to the result of the Huffman decoding. For example, based on the result decoded by the Huffman decoding module, the Lz77 decoder module may look up the corresponding matched data in the history data, output the matched data to the first-in first-out module (Out_FIFO_if), and update the history data for matching against subsequent decoding results.
The hard-wired method of parallel decompression provided by the invention performs the large number of random-length bit operations in hardware and can decode in parallel, so that one Huffman decoding completes within one cycle, which gives higher decoding efficiency.
Figure 16 illustrates the flow chart of another embodiment of the hard-wired method of parallel decompression provided by the invention.
As shown in figure 16, the hard-wired method flow 1600 of parallel decompression comprises: step 16020 ~ 16023,1604,1606 and 1608, wherein step 1604,1606 and 1608 can perform and the step 1504 shown in Figure 15,1506 and 1508 same or analogous technology contents respectively, for for purpose of brevity, repeat no more its technology contents here.
Step 16020: the data merging submodule merges the data to be decompressed read from the input with the data left after the shift operation, generating new data to be decompressed. For example, the data merging submodule merges the 32 bits of data to be decompressed read from the input with the data remaining after the shift operation of the shift submodule, and the new data to be decompressed updates the data cached in the data buffer submodule.
Step 16021: select according to the cached data. If the cached data holds fewer than a predetermined number of bits, the data merged by the data merging submodule is gated through and written to the data buffer submodule; otherwise, the data in the data buffer submodule is updated with the shifted data.
Step 16022: cache the new data to be decompressed and output random-length data to be decompressed. For example, the data buffer submodule receives and caches 96 bits of data and outputs data of at most 16 bits in the current cycle.
Step 16023: right-shift the random-length data to be decompressed to discard the data used in the previous cycle, and output the data produced by the shift operation to the data merging submodule. For example, the shift submodule right-shifts the data currently cached in the data buffer submodule, discarding the 16 bits output in the previous cycle, and returns the remaining 32 bits to the data merging submodule. If the data remaining after the right shift for the previous cycle's output is longer than a certain length (for example, 16 bits), no data needs to be returned to the data merging submodule for merging.
In the hard-wired method of parallel decompression provided by the invention, a single global random-length bit manipulation module is built from the data combination unit and the shift unit, which greatly simplifies the design. All modules read data from this module, which avoids the overhead of data sign processing.
Figure 17 illustrates the flow chart of another embodiment of the hard-wired method of parallel decompression provided by the invention.
As shown in figure 17, the hard-wired method flow 1700 of parallel decompression comprises: step 1702 ~ 1706 and 1708, wherein step 1702,1706 and 1708 can perform and the step 1502 shown in Figure 15,1506 and 1508 same or analogous technology contents respectively, for for purpose of brevity, repeat no more its technology contents here.
Step 1703: calculate, from the random-length data, the code lengths required to recover the Huffman code tables. For example, the random-length data output by the bit manipulation module of random length is used to calculate the code lengths for recovering the three Huffman tables.
Step 1704: store the calculated code lengths. For example, the calculated code lengths can be written into the RAM of the code length storage submodule (Len_ram). The code lengths of the bltree table are extracted directly from the input data stream, while the code lengths of the ltree table and the dtree table are obtained by looking up the bltree table with the random-length data from the input stream and then run-length decoding the looked-up data.
Step 1705: using the calculated result, count the number of codes of each code length, collect the symbols corresponding to each code length, sort the symbols, and recover the Huffman code table by code length. For example, the code length calculating submodule first counts the number of codes of each code length and then collects the actual symbols that each code length represents; if characters a, b and c are all encoded with code length 5, the symbols for code length 5 include a, b and c. The symbols are stored in sorted order, and finally one table is built for each code length, whose entries are the corresponding symbols. With this approach, recovering the code tables only requires traversing the code length storage submodule 704 twice and then building the corresponding bltree table, ltree table and dtree table.
In the hard-wired method of parallel decompression provided by the invention, the worst-case time is 3 times the maximum number of table entries: the largest table, the ltree table, has at most 286 entries, so even the worst case takes 3 × 286 = 858 cycles, which is no more than 900 cycles.
Figure 18 illustrates the flow chart of another embodiment of the hard-wired method of parallel decompression provided by the invention.
As shown in figure 18, the hard-wired method flow 1800 of parallel decompression comprises: step 1802,1804,18061 ~ 18065 and 1808, wherein step 1802,1804 and 1808 can perform and the step 1502 shown in Figure 15,1504 and 1508 same or analogous technology contents respectively, for for purpose of brevity, repeat no more its technology contents here.
Step 18060, carries out bit inversion, to restore input data to input data.Such as, take out the data of maximum code length (as 15-bit) from input traffic, position permute unit carries out bit inversion to it at every turn; Specifically, the bit inversion of input data is the 1st and the 15th exchange the 15-bit data of input, the 2nd and the 14th exchange ..., the like, doing like this is because data have done same inversion when compressing, and needs data recovery.
Step 18061: take data of the maximum code length from the restored data and compare it in parallel with the extended base codes. For example, the restored data obtained above is compared in parallel with the extended base codes (Basecodeextended) of the 15 code lengths stored in registers, and the comparison result is 15-bit data. An extended base code is the base code of a code length extended to 15 bits: the base code is shifted left to 15 bits and the low bits are filled with 0. If the maximum code length of the current Huffman code table is less than 15, the extended base codes of the code lengths greater than the maximum code length are assigned 0x3fff.
Step 18062: according to the comparison result of the comparison submodule, determine the position of the leading "1" and thereby the actual code length of the current data. For example, after the parallel comparison, the leading-1 detection unit examines the comparison result and determines the position of the first bit that is 1. This position represents the actual code length of the current data (that is, it determines which interval of the extended codes the input data falls into, which gives the valid bit width of the current input data, i.e. its code length).
Step 18063: use the actual code length determined by the leading-1 detection submodule as an address to look up the start address of the Huffman code table corresponding to that code length. For example, the code length obtained above is used as the address into the code table start address register file (Index_rf), which returns the start address of the table for the corresponding code length.
Step 18064: according to the code length of the input data obtained in the preceding steps, cut the input data, retaining the low-order data of code-length bit width, to obtain the Huffman code to be decoded. For example, the input data whose code length has been obtained must still be cut; that is, the variable-length data unit is cut out of the fixed-length data by removing the invalid high-order bits and retaining the low-order data of code-length bit width. The retained data is the Huffman-coded data.
At this point the Huffman code, its code length, and the start address of the table corresponding to that code length have all been obtained. Using the difference between the Huffman code and the base code of its code length as the offset within the table for that code length, the symbol and the extra bit width corresponding to the Huffman code can be looked up.
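Steps 18061 to 18064 and the final table lookup can be modelled in software roughly as below. This is a sketch under several assumptions, not the circuit itself: the names `build` and `decode_one` are hypothetical; the codes are assumed to be canonical Huffman codes as in RFC 1951; the 15-bit window is assumed to hold the first-received bit in its most significant position, so the valid code sits in the high bits; and code lengths above the longest used one get a never-matching sentinel of `1 << 15`, which is a choice of this sketch rather than the register value quoted in the text. The `for` loop stands in for 15 comparators working in parallel.

```python
MAX_LEN = 15

def build(code_lengths):
    """Derive counts, base codes, table start addresses, symbol RAM and extended base codes."""
    count = [0] * (MAX_LEN + 1)
    for n in code_lengths:
        if n:
            count[n] += 1
    base = [0] * (MAX_LEN + 1)
    start = [0] * (MAX_LEN + 1)
    code = addr = 0
    for n in range(1, MAX_LEN + 1):
        code = (code + count[n - 1]) << 1
        base[n] = code          # first code of length n
        start[n] = addr         # start address of the table for length n
        addr += count[n]
    # Flat symbol memory: grouped by code length, ascending symbol within a group.
    ram = sorted((s for s, n in enumerate(code_lengths) if n),
                 key=lambda s: (code_lengths[s], s))
    longest = max(code_lengths)
    ext = [0] * (MAX_LEN + 1)
    for n in range(1, MAX_LEN + 1):
        if n <= longest:
            ext[n] = base[n] << (MAX_LEN - n)   # left-align, low bits zero
        else:
            ext[n] = 1 << MAX_LEN               # sentinel: never matches (assumption)
    return count, base, start, ram, ext

def decode_one(window, tables):
    """Decode one symbol from a 15-bit window; returns (symbol, code_length)."""
    count, base, start, ram, ext = tables
    mask = 0
    for n in range(1, MAX_LEN + 1):             # parallel comparison, one bit per length
        if window >= ext[n]:
            mask |= 1 << (n - 1)
    length = mask.bit_length()                  # leading-1 detection
    code = window >> (MAX_LEN - length)         # cut the window down to the valid code
    offset = code - base[length]                # offset inside the table of this length
    if not 0 <= offset < count[length]:
        raise ValueError("window does not start with a valid code")
    return ram[start[length] + offset], length
```

The returned code length is what a surrounding bit buffer would use to discard the consumed bits before the next decode.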
Figure 19 shows a flowchart of another embodiment of the hardware-implemented parallel decompression method provided by the invention.
As shown in Figure 19, the method flow 1900 of hardware-implemented parallel decompression comprises steps 1902, 1904, 1906, and 1908 to 1910. Steps 1902, 1904 and 1906 may perform technical content identical or similar to that of steps 1502, 1504 and 1506 shown in Figure 15, respectively; for brevity, it is not repeated here.
Step 1908: according to the Huffman decoding result obtained by the Huffman decoding module, generate the address for reading or writing the history data storage submodule, and complete data matching. For example, the lz77 decoding control unit generates the address for reading or writing the history data window unit (RAM) according to the characters decoded by the Huffman decoding module (Lookup_table), i.e. the decoding result, and completes the matching of the data. In addition, the lz77 decoding control unit handles state information such as CRC32 checking of the data and output control.
Step 1909: store the matched data according to the address. For example, a 32 KB dual-port RAM (32KbyteDualportRAM) is used as the history data window unit (RAM) to provide the lz77 decoding control unit with storage for matched data (either writing the matched data directly to the current address, or reading data from the corresponding address and then writing it to the current address).
Step 1910: perform a CRC32 check on the matched data to judge whether the decompression is correct. For example, the CRC32 check unit computes the CRC32 of the original data decoded by the lz77 decoding control unit and compares it with the original CRC to judge the correctness of the decompression result. After the decompression result passes the CRC32 check, it is passed to the back-end unit modules through the output data buffer (OUT_FIFO).
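Steps 1908 to 1910 can be illustrated with the following software model. It is a sketch only: the token format (an integer for a literal byte, a `(length, distance)` tuple for a match), the function names `lz77_decode` and `crc_ok`, and the use of a circular `bytearray` in place of the dual-port RAM are all assumptions made for the example. `zlib.crc32` from the Python standard library computes the same CRC-32 that Gzip stores in its trailer.

```python
import zlib

WINDOW = 32 * 1024  # 32 KB history window, as in the text

def lz77_decode(tokens):
    """Expand literals and (length, distance) matches using a circular history window."""
    history = bytearray(WINDOW)   # stands in for the dual-port history RAM
    out = bytearray()
    pos = 0                       # current write address (total bytes produced)
    for tok in tokens:
        if isinstance(tok, int):
            pending = 1
            distance = None
        else:
            pending, distance = tok
            if not 1 <= distance <= min(pos, WINDOW):
                raise ValueError("match distance outside the history window")
        for _ in range(pending):
            if distance is None:
                byte = tok                                   # literal: write directly
            else:
                byte = history[(pos - distance) % WINDOW]    # match: read, then write
            history[pos % WINDOW] = byte
            out.append(byte)
            pos += 1
    return bytes(out)

def crc_ok(data, expected_crc):
    """Compare the CRC32 of the decoded data with the CRC carried in the stream."""
    return (zlib.crc32(data) & 0xFFFFFFFF) == expected_crc
```

Copying byte by byte lets a match overlap the data it is producing (distance smaller than length), which is how short repeating patterns are expanded.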
On the basis of balancing circuit complexity and processing efficiency, the hardware-implemented parallel decompression method provided by the invention uses a streaming data-processing structure for the Lz77 decoding module, so that decompressed data is output every cycle with no loss of bandwidth.
Figure 20 shows a schematic flowchart of a specific application example of the hardware-implemented parallel decompression method provided by the invention.
As shown in Figure 20, a method flow 2000 of hardware-implemented parallel decompression comprises step 2002: the data merging submodule merges the data to be decompressed that is read from the input with the data left after the shift operation, generating new data to be decompressed. For example, the data merging submodule merges 32 bits of data to be decompressed read from the input with the data remaining after the shift operation of the shift submodule, thereby generating new data to be decompressed that updates the data held in the data buffer submodule.
Step 2004: buffer the new data to be decompressed and output variable-length data to be decompressed. For example, the data buffer submodule receives and buffers 96 bits of data and outputs 64 bits of data in the current cycle.
Step 2006: perform a right-shift operation on the variable-length data to be decompressed so as to discard the data used in the previous cycle, and output the data newly generated by the shift operation to the data merging submodule. For example, the shift submodule right-shifts the data currently held in the data buffer submodule, discards the 64 bits of data output in the previous cycle, and returns the remaining 32 bits to the data merging submodule. If the data remaining after the right shift is longer than a certain length (e.g. 64 bits), no data needs to be returned to the data merging submodule for merging.
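The interaction of the merge, buffer and shift submodules in steps 2002 to 2006 can be approximated by the small model below. It is a sketch under stated assumptions: the class name `BitBuffer` and its methods are hypothetical, bits are assumed to be consumed from the least significant end (as DEFLATE packs them), and the refill loop may merge more than one 32-bit word at a time, whereas the hardware described in the text merges one word per cycle.

```python
class BitBuffer:
    """Software model of a 96-bit buffer fed by 32-bit input words."""

    def __init__(self, words):
        self.words = iter(words)  # stream of 32-bit input words
        self.buf = 0              # buffered bits; bit 0 is the next bit to consume
        self.nbits = 0            # number of valid bits in `buf`

    def refill(self):
        # Merge: append new words above the leftover bits while fewer than
        # 64 valid bits remain, so the buffer never holds more than 96 bits.
        while self.nbits < 64:
            word = next(self.words, None)
            if word is None:
                break
            self.buf |= (word & 0xFFFFFFFF) << self.nbits
            self.nbits += 32

    def peek(self, n):
        """Return the next n bits without consuming them."""
        self.refill()
        return self.buf & ((1 << n) - 1)

    def drop(self, n):
        """Right-shift to discard the n bits used in the previous step."""
        if n > self.nbits:
            raise ValueError("not enough buffered bits")
        self.buf >>= n
        self.nbits -= n
```

A decoder built on this model would call `peek(15)` to obtain a window, decode one code, and then call `drop` with the detected code length.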
Step 2008: according to the variable-length data, calculate the code length corresponding to each coded datum. For example, according to the variable-length data output by the variable-length bit manipulation module, the code length corresponding to each coded datum is calculated for recovering the three Huffman tables.
Step 2010: store the calculated code lengths. For example, the calculated code lengths may be written into the RAM of the code length storage submodule (Len_ram). The code lengths of the bltree table are extracted directly from the input data stream, whereas the code lengths of the ltree and dtree tables are obtained by looking up the bltree table with the variable-length data in the input code stream and then run-length decoding the looked-up data.
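The run-length decoding mentioned for the ltree and dtree code lengths can be sketched as follows. The repeat rules are those of the DEFLATE specification (RFC 1951), which is assumed here because the text targets Gzip; the function name `expand_code_lengths` and the input format (pairs of a bltree symbol and the value of its extra bits, already extracted) are hypothetical.

```python
def expand_code_lengths(items, total):
    """Run-length decode bltree symbols into a list of `total` code lengths."""
    lengths = []
    for symbol, extra in items:
        if 0 <= symbol <= 15:
            lengths.append(symbol)                          # literal code length
        elif symbol == 16:
            if not lengths:
                raise ValueError("repeat with no previous length")
            lengths.extend([lengths[-1]] * (3 + extra))     # repeat previous 3 to 6 times
        elif symbol == 17:
            lengths.extend([0] * (3 + extra))               # 3 to 10 zeros
        elif symbol == 18:
            lengths.extend([0] * (11 + extra))              # 11 to 138 zeros
        else:
            raise ValueError("invalid bltree symbol")
    if len(lengths) != total:
        raise ValueError("decoded length count does not match the header")
    return lengths
```

The resulting list is what would be written into the code length storage and later traversed to rebuild the ltree and dtree tables.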
Step 2012: according to the calculated result, count the number of codes of each code length, collect the symbols corresponding to each code length, sort the symbols, and then recover the Huffman code table for each code length. For example, the code length calculation submodule first counts how many codes have each code length, then collects the actual symbols represented by each code length (for instance, if the characters a, b and c were all encoded with a code length of 5, then the symbols with code length 5 are a, b and c), and stores the symbols in sorted order. Finally, one table is built for each code length, whose entries are the corresponding symbols. With this approach, recovering the code tables only requires traversing the code length storage submodule 704 twice and then building the corresponding bltree, ltree and dtree tables.
Step 2014: perform bit reversal on the input data to restore it. For example, each time, data of the maximum code length (e.g. 15 bits) is taken from the input data stream and the bit-reversal unit reverses its bits. Specifically, bit reversal of the input data means exchanging bit 1 with bit 15 of the 15-bit input, bit 2 with bit 14, and so on. This is done because the same reversal was applied to the data during compression, so the data must be restored.
Step 2016: take data of the maximum code length from the restored data and compare it in parallel with the extended base codes. For example, the restored data obtained above is compared in parallel with the extended base codes (Basecodeextended) of the 15 code lengths stored in registers, and the comparison result is 15-bit data. An extended base code is the base code of a code length extended to 15 bits: the base code is shifted left to 15 bits and the low bits are filled with 0. If the maximum code length of the current Huffman code table is less than 15, the extended base codes of the code lengths greater than the maximum code length are assigned 0x3fff.
Step 2018: according to the comparison result of the comparison submodule, determine the position of the leading "1" and thereby the actual code length of the current data. For example, after the parallel comparison, the leading-1 detection unit examines the comparison result and determines the position of the first bit that is 1. This position represents the actual code length of the current data (that is, it determines which interval of the extended codes the input data falls into, which gives the valid bit width of the current input data, i.e. its code length).
Step 2020: use the actual code length determined by the leading-1 detection submodule as an address to look up the start address of the Huffman code table corresponding to that code length. For example, the code length obtained above is used as the address into the code table start address register file (Index_rf), which returns the start address of the table for the corresponding code length, i.e. the start address of one of tables 1 to 15 shown in Figure 11.
Step 2022: according to the output data of the bit-reversal submodule and the code length of the data, cut the output data, retaining the low-order data of code-length bit width, to obtain the Huffman data to be decoded. For example, the input data whose code length has been obtained must still be cut; that is, the variable-length data unit is cut out of the fixed-length data by removing the invalid high-order bits and retaining the low-order data of code-length bit width. The retained data is the Huffman-coded data. At this point the Huffman code, its code length, and the start address of the table corresponding to that code length have all been obtained. Using the difference between the Huffman code and the base code of its code length as the offset within the table for that code length, the symbol and the extra bit width corresponding to the Huffman code can be looked up.
Step 2024: according to the Huffman decoding result obtained by the Huffman decoding module, generate the address for reading or writing the history data storage submodule, and complete data matching. For example, the lz77 decoding control unit generates the address for reading or writing the history data window unit (RAM) according to the characters decoded by the Huffman decoding module (Lookup_table), i.e. the decoding result, and completes the matching of the data. In addition, the lz77 decoding control unit handles state information such as CRC32 checking of the data and output control.
Step 2026: store the matched data according to the address. For example, a 32 KB dual-port RAM (32KbyteDualportRAM) is used as the history data window unit (RAM) to provide the lz77 decoding control unit with storage for matched data (either writing the matched data directly to the current address, or reading data from the corresponding address and then writing it to the current address).
Step 2028: perform a CRC32 check on the matched data to judge whether the decompression is correct. For example, the CRC32 check unit computes the CRC32 of the original data decoded by the lz77 decoding control unit and compares it with the original CRC to judge the correctness of the decompression result. After the decompression result passes the CRC32 check, it is passed to the back-end unit modules through the output data buffer (OUT_FIFO).
Huffman decoding requires variable-length bit operations. To guarantee streaming processing of the data, the following three operations must be completed within one cycle: perform Huffman decoding, discard the used data according to the current code length, and fetch the data needed for the next decode. After Huffman decoding, the data passes through Lz77 decoding, which uses the Huffman decoding result to look up the corresponding matched data in the history character window, outputs it, and updates the history character window with it. The hardware-implemented parallel decompression method provided by the invention builds one table for each code length (tablesforeachlength) and searches the 15 tables in parallel in hardware, so that in all cases one Huffman decode completes in one cycle. Compared with the methods used in software, this gives a higher degree of parallelism and higher decoding efficiency.
With reference to the foregoing exemplary description, those skilled in the art can clearly see that the invention has the following advantages:
The invention provides a hardware-implemented system and method for parallel decompression that uses a programmable logic device (FPGA) to implement the Gzip decompression function. By adopting a parallel decompression algorithm and designing a hardware circuit suited to that algorithm, the processing efficiency of decompression is greatly improved.
The invention discloses a hardware-implemented system and method for parallel decompression in which a global variable-length bit manipulation module is implemented with a data merging unit and a shift unit, which greatly simplifies the design. All modules read data from this module, which avoids the overhead of data sign processing.
The invention discloses a hardware-implemented system and method for parallel decompression in which, even in the worst case, the time taken is 3 times the maximum number of table entries. The largest table, the ltree table, has at most 286 entries, so the maximum time taken does not exceed 900 cycles.
The invention discloses a hardware-implemented system and method for parallel decompression which, on the basis of balancing circuit complexity and processing efficiency, uses a streaming data-processing structure for the Lz77 decoding module, so that decompressed data is output every cycle with no loss of bandwidth.
The description of the invention is given for purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the disclosed form. Many modifications and variations will be obvious to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and thus design various embodiments with various modifications suited to particular uses.

Claims (10)

1. A hardware-implemented system for parallel decompression, characterized in that the system comprises:
A variable-length bit manipulation module, configured to perform variable-length bit operations on data to be decompressed to obtain variable-length data;
A Huffman code table recovery module, configured to recover Huffman code tables according to the variable-length data;
A Huffman decoding module, configured to perform Huffman decoding in parallel according to the Huffman code tables; and
A decoding module, configured to perform decoding according to the result of the Huffman decoding;
The Huffman code table recovery module further comprises:
A code length calculation submodule, configured to calculate the code length corresponding to each coded datum according to the variable-length data;
A code length storage submodule, configured to store the code lengths calculated by the code length calculation submodule;
A Huffman code table recovery submodule, configured to, according to the calculated result, count the number of codes of each code length, collect the symbols corresponding to each code length, sort the symbols, and recover the corresponding Huffman code table for each code length, wherein the code length is used as an address to look up the start address of the Huffman code table corresponding to that code length.
2. The system according to claim 1, characterized in that the variable-length bit manipulation module further comprises:
A data merging submodule, configured to merge the data to be decompressed that is read from the input with the shifted data in the data buffer submodule, generating new data to be decompressed;
A multiplexer (MUX), configured to select according to the input buffered data: if the buffered data is shorter than a predetermined number of bits, the data merged by the data merging submodule is selected and written into the data buffer submodule; otherwise, the data in the data buffer submodule is updated with the shifted data;
The data buffer submodule, configured to buffer the new data to be decompressed input by the data merging submodule and to output variable-length data to be decompressed to the shift submodule;
The shift submodule, configured to perform a right-shift operation on the data in the data buffer submodule so as to discard the data used in the previous cycle, and to output the data newly generated by the shift operation to the data merging submodule.
3. The system according to claim 1, characterized in that the Huffman decoding module further comprises:
A bit-reversal submodule, configured to perform bit reversal on input data to restore the input data;
A comparison submodule, configured to take data of the maximum code length from the restored data and compare it in parallel with the extended base codes;
A leading-1 detection submodule, configured to determine, according to the comparison result of the comparison submodule, the position of the leading "1" and thereby the actual code length of the current data;
A code table start address submodule, configured to use the actual code length determined by the leading-1 detection submodule as an address to look up the start address of the Huffman code table corresponding to the actual code length;
A data cutting submodule, configured to cut the output data of the bit-reversal submodule according to that output data and the code length of the data, retaining the low-order data of code-length bit width, to obtain the Huffman data to be decoded.
4. The system according to claim 1, characterized in that the decoding module is an Lz77 decoding module, configured to perform an Lz77 decoding operation according to the result of the Huffman decoding.
5. The system according to claim 4, characterized in that the Lz77 decoding module further comprises:
A decoding control submodule, configured to generate, according to the Huffman decoding result obtained by the Huffman decoding module, the address for reading or writing the history data storage submodule, and to complete data matching;
A history data storage submodule, configured to store the matched data according to the address provided by the decoding control submodule;
A check submodule, configured to perform a CRC32 check on the matched data obtained by the decoding control submodule to judge whether the decompression is correct.
6. A hardware-implemented method for parallel decompression, characterized in that the method comprises:
Performing variable-length bit operations on data to be decompressed to obtain variable-length data;
Recovering Huffman code tables according to the variable-length data;
Performing Huffman decoding in parallel according to the Huffman code tables; and
Performing decoding according to the result of the Huffman decoding;
Wherein recovering the Huffman code tables according to the variable-length data further comprises:
Calculating the code length corresponding to each coded datum according to the variable-length data;
Storing the calculated code lengths; and
According to the calculated result, counting the number of codes of each code length, collecting the symbols corresponding to each code length, sorting the symbols, and recovering the corresponding Huffman code table for each code length, wherein the code length is used as an address to look up the start address of the Huffman code table corresponding to that code length.
7. The method according to claim 6, characterized in that performing variable-length bit operations on the data to be decompressed to obtain variable-length data further comprises:
Merging, by a data merging submodule, the data to be decompressed that is read from the input with the data left after the shift operation, generating new data to be decompressed;
Selecting according to the input buffered data: if the buffered data is shorter than a predetermined number of bits, the data merged by the data merging submodule is selected and written into a data buffer submodule; otherwise, the data in the data buffer submodule is updated with the shifted data;
Buffering the new data to be decompressed and outputting variable-length data to be decompressed; and
Performing a right-shift operation on the variable-length data to be decompressed so as to discard the data used in the previous cycle, and outputting the data newly generated by the shift operation to the data merging submodule.
8. The method according to claim 6, characterized in that performing Huffman decoding in parallel according to the Huffman code tables further comprises:
Performing bit reversal on input data to restore the input data;
Taking data of the maximum code length from the restored data, the data of the maximum code length being compared in parallel with the extended base codes by a comparison submodule;
According to the comparison result of the comparison submodule, determining, by a leading-1 detection submodule, the position of the leading "1" and thereby the actual code length of the current data;
According to the actual code length determined by the leading-1 detection submodule, using, by the bit-reversal submodule, the actual code length as an address to look up the start address of the Huffman code table corresponding to the actual code length;
Cutting the output data of the bit-reversal submodule according to that output data and the code length of the data, retaining the low-order data of code-length bit width, to obtain the Huffman data to be decoded.
9. The method according to claim 6, characterized in that performing decoding according to the result of the Huffman decoding further comprises selecting an Lz77 decoding module to perform an Lz77 decoding operation according to the result of the Huffman decoding.
10. The method according to claim 9, characterized in that performing Lz77 decoding according to the result of the Huffman decoding further comprises:
Generating, according to the Huffman decoding result, the address for reading or writing a history data storage submodule, and completing data matching;
Storing the matched data according to the address; and
Performing a CRC32 check on the matched data to judge whether the decompression is correct.
CN201010167216.9A 2010-05-10 2010-05-10 The hard-wired system and method for parallel decompression Active CN102244518B (en)


Publications (2)

Publication Number Publication Date
CN102244518A CN102244518A (en) 2011-11-16
CN102244518B true CN102244518B (en) 2016-01-20

Family

ID=44962395




