CN202931290U - Compression hardware system based on GZIP - Google Patents

Compression hardware system based on GZIP

Info

Publication number
CN202931290U
Authority
CN
China
Prior art keywords
unit
code word
buffer unit
huffman
matching length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn - After Issue
Application number
CN 201220601511
Other languages
Chinese (zh)
Inventor
汤晓东
狄永清
李冰
李玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUXI XINXIANG ELECTRONIC TECHNOLOGY Co Ltd
Original Assignee
WUXI XINXIANG ELECTRONIC TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUXI XINXIANG ELECTRONIC TECHNOLOGY Co Ltd
Priority to CN 201220601511
Application granted
Publication of CN202931290U
Anticipated expiration
Withdrawn - After Issue (current legal status)

Abstract

The utility model discloses a compression hardware system based on GZIP, comprising: an input buffer unit for buffering input data; an LZ77 coding unit; a dynamic new character/match length Huffman coding frequency statistics control unit; a dynamic back-distance Huffman coding frequency statistics control unit; a dynamic new character/match length Huffman coding unit; a dynamic back-distance Huffman coding unit; a dynamic code word length Huffman coding unit; a static new character/match length Huffman coding unit; a static back-distance Huffman coding unit; a data packing unit; and an output buffer unit. The compression hardware system implements the GZIP compression algorithm, remains compatible with software implementations, raises GZIP compression data throughput, and requires no CPU (Central Processing Unit) intervention during compression.

Description

A compression hardware system based on GZIP
Technical field
The utility model relates to a GZIP-based compression hardware system and belongs to the technical field of data compression.
Background
With the development of cloud computing, the demands of storing and transmitting massive amounts of data are becoming increasingly severe. Lossless data compression is therefore widely used to reduce storage space and improve transmission efficiency. GZIP (GNU ZIP) is a well-known lossless compression algorithm; it is free of patent protection and of moderate complexity, which makes it suitable for implementation on a hardware platform.
In traditional data compression, software-based implementations are widely used; however, software-based implementations occupy too much of the CPU (Central Processing Unit) and of memory.
The utility model provides a new GZIP hardware implementation structure and proposes several acceleration schemes that improve overall system performance and significantly reduce CPU and memory consumption. The high-performance PCIE2.0 system bus serves as the communication bridge between the compression card and the computer. DMA (Direct Memory Access) transfers data from computer memory to the GZIP compression core through the PCIE2.0 interface; once the core has finished compressing, DMA transfers the compressed data back to the computer's memory. No CPU intervention is needed during data transfer and compression.
Summary of the utility model
To address the defects of the prior art, the purpose of the utility model is to provide a GZIP compression hardware system that implements the GZIP compression algorithm, is compatible with software implementations, raises the data throughput of GZIP compression, and requires no CPU intervention during data compression.
To achieve this purpose, the utility model adopts the following technical scheme: a compression hardware system based on GZIP, comprising:
An input buffer unit, used for buffering the input data;
An LZ77 coding unit, used for LZ77-coding the input data;
A dynamic new character/match length Huffman coding frequency statistics control unit, used for counting the new characters and match lengths output by the LZ77 coding unit;
A dynamic back-distance Huffman coding frequency statistics control unit, used for counting the back distances output by the LZ77 coding unit;
A dynamic new character/match length Huffman coding unit, used for dynamic Huffman coding of the new characters and match lengths output by the LZ77 coding unit;
A dynamic back-distance Huffman coding unit, used for dynamic Huffman coding of the back distances output by the LZ77 coding unit;
A dynamic code word length Huffman coding unit, used for coding the information of the dynamic new character/match length Huffman tree and the information of the dynamic back-distance Huffman tree;
A static new character/match length Huffman coding unit, used for static Huffman coding of the new characters/match lengths output by the LZ77 coding unit;
A static back-distance Huffman coding unit, used for static Huffman coding of the back distances output by the LZ77 coding unit;
A data packing unit, used for deciding which of three modes (direct storage, static Huffman coding, or dynamic Huffman coding) to adopt, and for outputting the coded data in the fixed format;
An output buffer unit, used for buffering the compressed data output by the data packing unit.
Preferably, the input buffer unit comprises:
Two data block buffer units, used for storing the raw data to be compressed;
Two data selection units, used for controlling read and write access to the data block buffer units.
Preferably, the LZ77 coding unit comprises:
Two pairs of Head/Prev hash tables, used for fast match searching of the strings being coded by the LZ77 coding unit;
A read-only memory (ROM) unit, used for storing the constant table needed for the CRC32 cyclic redundancy check calculation;
A new character/match length buffer unit, used for storing the new characters or match lengths output by the LZ77 coding unit;
A back-distance buffer unit, used for storing the back distances output by the LZ77 coding unit;
A main state machine unit, used for reading the data in the data block buffer units.
Preferably, the dynamic new character/match length Huffman coding unit comprises:
A new character/match length frequency buffer unit, used for storing the frequencies of the new characters and match lengths output by the LZ77 coding unit;
A new character/match length parent node buffer unit, used for storing the parent node of every node of the new character/match length Huffman tree except the root node;
A new character/match length depth buffer unit, used for storing the depth of each node within the new character/match length Huffman tree;
A new character/match length min-heap buffer unit, used for storing all nodes of the new character/match length Huffman tree contiguously;
A new character/match length code word value buffer unit, used for storing the Huffman code values of all leaf nodes of the new character/match length Huffman tree;
A new character/match length code word length buffer unit, used for storing the effective Huffman code lengths of all nodes of the new character/match length Huffman tree;
Three data selection units, used respectively for controlling access to the new character/match length frequency buffer unit, the new character/match length code word value buffer unit, and the new character/match length code word length buffer unit;
A pipelined multiplier unit, used to help calculate the size of the data block after dynamic new character/match length Huffman coding;
A main state machine unit, which constructs the Huffman tree from the frequency of each character of the data block to be compressed, as stored in the new character/match length frequency buffer unit, using the new character/match length parent node buffer unit, depth buffer unit, and min-heap buffer unit, and stores the tree information in the new character/match length min-heap buffer unit. Once the tree information is available, the main state machine unit traverses the Huffman tree to obtain the code word length of each node and checks each node: if it is a leaf node, the main state machine unit reads that node's frequency from the new character/match length buffer unit and uses the pipelined multiplier unit to calculate the size of that character after Huffman coding. It then calculates the code word value of each node from the code word lengths obtained, checks each node again, and, for each leaf node, stores the code word value in the new character/match length code word value buffer unit.
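The tree-building procedure described above can be sketched in software. The following is only an illustrative model, not the hardware design: the function names are hypothetical, Python's `heapq` stands in for the min-heap buffer unit, the `parent` and `depth` lists stand in for the parent node and depth buffer units, and code values are assigned with the canonical rule from the DEFLATE specification (RFC 1951), which is assumed here to be what the system uses. The sketch does not enforce DEFLATE's 15-bit limit on code lengths.

```python
import heapq
from typing import Dict, List


def huffman_code_lengths(freq: Dict[int, int]) -> Dict[int, int]:
    """Return {symbol: code length} for every symbol with non-zero frequency."""
    symbols = sorted(s for s, f in freq.items() if f > 0)
    if len(symbols) == 1:
        return {symbols[0]: 1}
    # Leaves get ids 0..n-1; internal nodes are appended as they are created.
    parent: List[int] = [-1] * len(symbols)      # the root keeps -1
    heap = [(freq[s], i) for i, s in enumerate(symbols)]
    heapq.heapify(heap)                          # min-heap ordered by weight
    while len(heap) > 1:
        w1, a = heapq.heappop(heap)              # two lightest nodes
        w2, b = heapq.heappop(heap)
        node = len(parent)
        parent.append(-1)
        parent[a] = parent[b] = node
        heapq.heappush(heap, (w1 + w2, node))
    # A parent always has a larger id than its children, so walking ids
    # downward guarantees the parent's depth is known first.
    depth = [0] * len(parent)
    for node in range(len(parent) - 2, -1, -1):
        depth[node] = depth[parent[node]] + 1
    return {s: depth[i] for i, s in enumerate(symbols)}


def canonical_codes(lengths: Dict[int, int]) -> Dict[int, int]:
    """Assign code values from code lengths (canonical ordering, RFC 1951)."""
    codes: Dict[int, int] = {}
    code, prev_len = 0, 0
    for sym in sorted(lengths, key=lambda s: (lengths[s], s)):
        code <<= lengths[sym] - prev_len
        codes[sym] = code
        prev_len = lengths[sym]
        code += 1
    return codes


def encoded_size_bits(freq: Dict[int, int], lengths: Dict[int, int]) -> int:
    """Sum of frequency x code length: the job of the pipelined multiplier."""
    return sum(freq[s] * length for s, length in lengths.items())
```

For example, `huffman_code_lengths({97: 5, 98: 2, 99: 1, 100: 1})` gives the most frequent symbol a 1-bit code and the two rarest symbols 3-bit codes, and `encoded_size_bits` then reports 15 bits for the block body.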
Preferably, the dynamic back-distance Huffman coding unit comprises:
A back-distance frequency buffer unit, used for storing the frequencies of the back distances output by the LZ77 coding unit;
A back-distance parent node buffer unit, used for storing the parent node of every node of the back-distance Huffman tree except the root node;
A back-distance depth buffer unit, used for storing the depth of each node within the back-distance Huffman tree;
A back-distance min-heap buffer unit, used for storing all nodes of the back-distance Huffman tree contiguously;
A back-distance code word value buffer unit, used for storing the Huffman code values of all leaf nodes of the back-distance Huffman tree;
A back-distance code word length buffer unit, used for storing the effective Huffman code lengths of all nodes of the back-distance Huffman tree;
Three data selection units, used respectively for controlling access to the back-distance frequency buffer unit, the back-distance code word value buffer unit, and the back-distance code word length buffer unit;
A pipelined multiplier unit, used to help calculate the size of the data block after dynamic back-distance Huffman coding;
A main state machine unit, which constructs the Huffman tree from the frequency of each character of the data block to be compressed, as stored in the back-distance frequency buffer unit, using the back-distance parent node buffer unit, the back-distance depth buffer unit, and the back-distance min-heap buffer unit, and stores the tree information in the min-heap buffer unit. Once the information of the back-distance Huffman tree is available, the main state machine unit traverses the Huffman tree to obtain the code word length of each node and checks each node: if it is a leaf node, the main state machine unit reads that node's frequency from the new character/match length buffer unit and uses the pipelined multiplier unit to calculate the size of that character after Huffman coding. It then calculates the code word value of each node from the code word lengths obtained, checks each node again, and, for each leaf node, stores the code word value in the back-distance code word value buffer unit.
Preferably, the dynamic code word length Huffman coding unit comprises:
A code word length statistics unit, used for counting how often each code word length occurs in the new character/match length code word length buffer unit and the back-distance code word length buffer unit;
A code word length frequency buffer unit, used for storing the results produced by the code word length statistics unit;
A code word length parent node buffer unit, used for storing the parent node of each node of the code word length Huffman tree;
A code word length depth buffer unit, used for storing the depth of each node of the code word length Huffman tree;
A code word length min-heap buffer unit, used for storing all nodes of the code word length Huffman tree contiguously;
A code word length code word value buffer unit, used for storing the Huffman code value of each leaf node of the code word length Huffman tree;
A code word length code word length buffer unit, used for storing the Huffman code lengths of all nodes of the code word length Huffman tree;
A code word length leaf node buffer unit, used for storing the code word length leaf nodes obtained by traversing the new character/match length code word length buffer unit and the back-distance code word length buffer unit;
A code word length repeat count buffer unit, used for storing the repeat counts of the code word lengths found during the traversal;
Five data selection units, used respectively for controlling access to the code word length frequency buffer unit, the code word length code word value buffer unit, the code word length code word length buffer unit, the code word length leaf node buffer unit, and the code word length repeat count buffer unit;
A pipelined multiplier unit, used for calculating the size of the data block after dynamic code word length Huffman coding;
A code word length main state machine, used for traversing the code word lengths of all leaf nodes in the new character/match length code word length buffer unit and the back-distance code word length buffer unit, storing the results in the code word length leaf node buffer unit and the code word length repeat count buffer unit, and storing the frequency of each leaf node in the code word length frequency buffer unit.
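The traversal that fills the leaf node and repeat count buffers resembles the run-length coding of code lengths in the DEFLATE specification (RFC 1951), where symbol 16 repeats the previous length 3 to 6 times, symbol 17 repeats zero 3 to 10 times, and symbol 18 repeats zero 11 to 138 times. The sketch below assumes that this is the scheme intended; the function name and the `(symbol, count)` output format are hypothetical, and the greedy run splitting is one valid choice rather than the method of the utility model.

```python
from collections import Counter
from typing import List, Tuple


def rle_code_lengths(lengths: List[int]) -> List[Tuple[int, int]]:
    """Run-length code a list of code lengths into (symbol, count) pairs.

    Symbols 0..15 are literal lengths (count is always 1); for symbols
    16, 17 and 18 the count is the number of repetitions they stand for.
    """
    out: List[Tuple[int, int]] = []
    i, n = 0, len(lengths)
    while i < n:
        cur = lengths[i]
        run = 1
        while i + run < n and lengths[i + run] == cur:
            run += 1
        i += run
        if cur == 0:
            while run >= 11:                 # long zero runs: symbol 18
                take = min(run, 138)
                out.append((18, take))
                run -= take
            if run >= 3:                     # short zero runs: symbol 17
                out.append((17, run))
                run = 0
            out.extend([(0, 1)] * run)       # leftovers as literal zeros
        else:
            out.append((cur, 1))             # first occurrence is literal
            run -= 1
            while run >= 3:                  # repeats of previous: symbol 16
                take = min(run, 6)
                out.append((16, take))
                run -= take
            out.extend([(cur, 1)] * run)
    return out


def symbol_frequencies(pairs: List[Tuple[int, int]]) -> Counter:
    """Frequency of each code-length symbol, as input to the third tree."""
    return Counter(symbol for symbol, _ in pairs)
```

In this model the first element of each pair corresponds to what the leaf node buffer holds and the second to the repeat count buffer; `symbol_frequencies` plays the role of the code word length frequency buffer.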
Preferably, the data packing unit comprises:
A variable-length code reading unit, used for reading the corresponding information from the LZ77 coding unit, the dynamic new character/match length Huffman coding unit, the dynamic back-distance Huffman coding unit, and the dynamic code word length Huffman coding unit;
A variable-length code packing unit, which determines the compression mode to adopt for the current data block from the information read by the variable-length code reading unit.
Beneficial effects of the utility model: this compression hardware system and its acceleration methods implement the GZIP compression algorithm, remain compatible with software implementations, raise the data throughput of GZIP compression, and require no CPU intervention during data compression.
Description of drawings
The accompanying drawings, which form part of the specification, are included to describe particular aspects of the utility model. The modules and flows of the system provided by the utility model will be more readily understood with reference to the non-limiting embodiments shown in the drawings. The utility model can be better understood by referring to one or more of the drawings in combination with the description.
Fig. 1 is a structural diagram of a GZIP compression hardware system provided by an embodiment of the utility model;
Fig. 2 is a schematic diagram of the detailed workflow of the GZIP compression hardware system provided by the utility model;
Fig. 3 is a structural schematic diagram of an embodiment of the input buffer unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 4 is a structural schematic diagram of an embodiment of the LZ77 coding unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 5 is a workflow schematic diagram of an embodiment of the dynamic new character/match length Huffman coding frequency statistics control unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 6 is a workflow schematic diagram of an embodiment of the dynamic back-distance Huffman coding frequency statistics control unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 7 is a structural schematic diagram of an embodiment of the dynamic new character/match length Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 8 is a schematic diagram of an embodiment of the dynamic back-distance Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 9 is a structural schematic diagram of an embodiment of the dynamic code word length Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 10 is a structural schematic diagram of an embodiment of the static new character/match length Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 11 is a structural schematic diagram of an embodiment of the static back-distance Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 12 is a structural schematic diagram of an embodiment of the data packing unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 13 is a structural schematic diagram of an embodiment of the output buffer unit in an embodiment of the GZIP compression hardware system provided by the utility model;
Fig. 14 is a structural schematic diagram of an embodiment of the dual Head/Prev acceleration method among the GZIP compression hardware system acceleration methods provided by the utility model;
Fig. 15 is a workflow schematic diagram of an embodiment of the dual Head/Prev acceleration method among the GZIP compression hardware system acceleration methods provided by the utility model;
Fig. 16 is a structural schematic diagram of an embodiment of the Huffman pre-statistics acceleration method among the GZIP compression hardware system acceleration methods provided by the utility model;
Fig. 17 is a workflow schematic diagram of an embodiment of the Huffman pre-clearing acceleration method among the GZIP compression hardware system acceleration methods provided by the utility model;
Fig. 18 is a workflow schematic diagram of an embodiment of the interleaved CRC32 calculation among the GZIP compression hardware system acceleration methods provided by the utility model.
Embodiment
The utility model is described and explained more fully below with reference to the accompanying drawings and exemplary embodiments.
Fig. 1 is a structural diagram of a GZIP compression hardware system provided by an embodiment of the utility model.
As shown in Fig. 1, the GZIP compression hardware system implementation 100 provided by the utility model mainly comprises: input buffer unit 101, new character/match length frequency statistics control unit 102, back-distance frequency statistics control unit 103, LZ77 coding unit 104, dynamic new character/match length Huffman coding unit 105, dynamic back-distance Huffman coding unit 106, static new character/match length Huffman coding unit 107, static back-distance Huffman coding unit 108, dynamic code word length Huffman coding unit 109, data packing unit 110, and output buffer unit 111.
The input buffer unit 101 buffers the data to be compressed. In particular, the raw data passes through the two data storage units of the buffer unit, implementing the ping-pong operation commonly used in digital circuit design to raise data throughput.
The new character/match length frequency statistics control unit 102 mainly receives the new characters/match lengths output by the LZ77 coding unit 104 and examines each one. A new character is output directly. Otherwise the value is a match length, in which case unit 102 looks it up in the dynamic new character/match length Huffman tree character table, maps it to the corresponding range, and outputs the result.
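As an illustration of this mapping, the sketch below assumes the "character table" is the literal/length alphabet of the DEFLATE specification (RFC 1951): new characters (literals) keep their byte value 0 to 255, and match lengths 3 to 258 map to symbols 257 to 285, each covering a range of lengths. The function names are hypothetical, and the extra bits that select a length within its range are omitted.

```python
from bisect import bisect_right
from typing import List

# Smallest match length covered by each length symbol 257..285 (RFC 1951).
LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]


def length_symbol(match_length: int) -> int:
    """Map a match length (3..258) to its symbol in the 257..285 range."""
    if not 3 <= match_length <= 258:
        raise ValueError("match length out of range")
    return 257 + bisect_right(LENGTH_BASE, match_length) - 1


def count_symbol(freq: List[int], is_match: bool, value: int) -> int:
    """Update the frequency table for one LZ77 output and return its symbol.

    A new character is counted under its own value; a match length is first
    mapped to the symbol whose range contains it.
    """
    symbol = length_symbol(value) if is_match else value
    freq[symbol] += 1
    return symbol
```

A frequency table of 286 entries, such as `freq = [0] * 286`, is large enough for this alphabet: `count_symbol(freq, False, 65)` counts the literal 65, while `count_symbol(freq, True, 12)` counts symbol 265, which covers lengths 11 and 12.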
The back-distance frequency statistics control unit 103 mainly receives the back distances output by the LZ77 coding unit 104, looks each one up in the dynamic back-distance Huffman tree character table, maps it to the corresponding range, and outputs the result.
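A comparable sketch for back distances, again assuming the DEFLATE distance alphabet of RFC 1951 (30 symbols covering distances 1 to 32768) is the table being queried. The function name and the returned tuple layout are hypothetical.

```python
from bisect import bisect_right
from typing import Tuple

# Smallest distance covered by each distance symbol 0..29 (RFC 1951).
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193,
             257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145,
             8193, 12289, 16385, 24577]


def distance_symbol(distance: int) -> Tuple[int, int, int]:
    """Map a back distance to (symbol, extra bit count, extra bit value)."""
    if not 1 <= distance <= 32768:
        raise ValueError("distance out of range")
    symbol = bisect_right(DIST_BASE, distance) - 1
    extra_bits = max(0, symbol // 2 - 1)       # symbols 0..3 carry no extra bits
    return symbol, extra_bits, distance - DIST_BASE[symbol]
```

Only the symbol is counted for the frequency statistics; the extra bits are written verbatim after the Huffman code word when the block is packed.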
The LZ77 coding unit 104 mainly performs two tasks. First, it LZ77-codes the data from the input buffer unit 101 and outputs the results to the new character/match length frequency statistics control unit 102 and the back-distance frequency statistics control unit 103 respectively. Second, it performs the CRC32 check calculation on the raw data and feeds the result to the data packing unit 110.
The dynamic new character/match length Huffman coding unit 105 mainly performs dynamic Huffman coding on the new characters/match lengths output by the LZ77 coding unit 104.
The dynamic back-distance Huffman coding unit 106 mainly performs dynamic Huffman coding on the back distances output by the LZ77 coding unit 104.
The static new character/match length Huffman coding unit 107 mainly performs static Huffman coding on the new characters/match lengths output by the LZ77 coding unit 104. For example, in some special compression scenarios, such as audio and video, the statistical properties of the data vary very little. To speed up Huffman coding, such data is therefore Huffman-coded in advance and the result is fixed in ROM. In practice, fluctuations in the statistical properties of the data may cost some compression ratio, but compression speed improves significantly.
The static back-distance Huffman coding unit 108 mainly performs static Huffman coding on the back distances output by the LZ77 coding unit.
The dynamic code word length Huffman coding unit 109 mainly performs dynamic Huffman coding on the code word length information of the dynamic new character/match length Huffman coding and of the dynamic back-distance Huffman coding. This reduces the amount of tree information carried with dynamic Huffman coding and so improves the compression ratio.
The data packing unit 110 decides which of three compression modes (direct storage, dynamic Huffman coding, or static Huffman coding) to use for the data block to be compressed. The decision is based on the raw data size provided by the LZ77 coding unit 104; the size of the block after dynamic Huffman coding, provided by the dynamic new character/match length Huffman coding unit 105, the dynamic back-distance Huffman coding unit 106, and the dynamic code word length Huffman coding unit 109; and the size of the block after static Huffman coding, provided by the static new character/match length Huffman coding unit 107 and the static back-distance Huffman coding unit 108.
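The size comparison can be modeled as picking the smallest of three bit counts. This is a hedged sketch: the names are hypothetical, the stored-block overhead is approximated as 32 bits (the LEN and NLEN fields of a DEFLATE stored block, ignoring header bits and byte alignment), `dynamic_bits` is assumed to already include the bits needed to describe the trees, and breaking ties in favor of the simpler mode is an assumption rather than something stated in the text.

```python
from enum import Enum


class BlockMode(Enum):
    """Block types, using the BTYPE values of the DEFLATE format (RFC 1951)."""
    STORED = 0b00
    STATIC_HUFFMAN = 0b01
    DYNAMIC_HUFFMAN = 0b10


def choose_block_mode(raw_bytes: int, static_bits: int,
                      dynamic_bits: int) -> BlockMode:
    """Pick the mode with the smallest estimated output size in bits."""
    stored_bits = 8 * raw_bytes + 32          # payload plus LEN/NLEN fields
    candidates = [
        (stored_bits, BlockMode.STORED),
        (static_bits, BlockMode.STATIC_HUFFMAN),
        (dynamic_bits, BlockMode.DYNAMIC_HUFFMAN),
    ]
    # min() keeps the first of equal candidates, so ties go to the
    # simpler mode listed earlier (an assumption of this sketch).
    return min(candidates, key=lambda c: c[0])[1]
```

For a 100-byte block whose static coding would take 700 bits and whose dynamic coding would take 650 bits, `choose_block_mode(100, 700, 650)` returns `BlockMode.DYNAMIC_HUFFMAN`.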
The output buffer unit 111 mainly receives the compressed data output by the data packing unit 110.
Fig. 2 shows the detailed workflow of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 2, the detailed workflow 200 of the GZIP compression hardware system provided by the utility model mainly comprises:
Step 201, fill one data buffer unit of the input buffer unit. The unit receives the raw data to be compressed. If the raw data can be divided into multiple data blocks, the two data storage units in the input buffer unit are used alternately so that data transfer and data processing proceed in parallel, raising data throughput. Once the buffer is full, go to step 202.
Step 202, notify the LZ77 coding unit in the GZIP compression unit that the data is ready. On receiving this notification, the LZ77 coding unit selects the corresponding data buffer unit in the input buffer unit. Go to step 203.
Step 203, LZ77 coding. Having obtained control of the corresponding data buffer unit in the input buffer unit, the LZ77 coding unit starts working and also performs the CRC32 check calculation on the current data block. Go to step 204.
Step 204, determine whether LZ77 coding is complete. If it is not, continue LZ77 coding and return to step 203; otherwise prepare to start the subsequent units and go to step 205.
Step 205, the dynamic new character/match length Huffman coding unit and the dynamic back-distance Huffman coding unit start working. After the LZ77 coding unit finishes, the dynamic new character/match length Huffman coding unit is started to perform dynamic Huffman coding on the new characters/match lengths output by LZ77 coding; at the same time, the dynamic back-distance Huffman coding unit is started to perform dynamic Huffman coding on the back distances output by LZ77 coding. Go to step 206.
Step 206, determine whether the dynamic new character/match length Huffman coding unit and the dynamic back-distance Huffman coding unit have finished. If not, continue the dynamic Huffman coding of step 205; otherwise go to step 207.
Step 207, the dynamic code word length Huffman coding unit starts working. Once the dynamic new character/match length Huffman coding unit and the dynamic back-distance Huffman coding unit have both finished, the dynamic code word length Huffman coding unit starts working. Go to step 208.
Step 208, determine whether the dynamic code word length Huffman coding unit has finished. If not, continue dynamic code word length Huffman coding; otherwise go to step 209.
Step 209, start the data packing unit. After the dynamic code word length Huffman coding unit finishes, the data packing unit starts. Based on the results provided by the LZ77 coding unit, the dynamic new character/match length Huffman coding unit, the dynamic back-distance Huffman coding unit, and the dynamic code word length Huffman coding unit, it selects one of the three compression modes (direct storage, dynamic Huffman coding, or static Huffman coding) to compress the current data block. When the current data block has been coded, go to step 210.
Step 210, determine whether the data block just processed is the last one. If so, data compression is complete; otherwise return to step 201 to process the next data block.
Thus the raw data to be compressed is divided into multiple data blocks for compression, which makes the ping-pong operation possible.
Fig. 3 is a structural schematic diagram of an embodiment of the input buffer unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 3, the structure 300 of the embodiment of the input buffer unit further comprises: data selection unit 301, data buffer unit 302, data buffer unit 303, and data selection unit 304.
As shown in Fig. 3, the data selection unit 301 mainly selects which of data buffer unit 302 and data buffer unit 303 is filled with data.
Data buffer unit 302 and data buffer unit 303 mainly buffer the data to be compressed. In operation, one data buffer unit is used for coding while the other is being filled, so that filling and coding proceed in parallel.
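The alternation between the two buffers can be modeled in a few lines. The sketch below is sequential, so it only shows which buffer is filled and which is processed at each step; in hardware the fill and the coding would run concurrently. The function name and the `process` callback are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def ping_pong(blocks: Iterable[bytes],
              process: Callable[[bytes], int]) -> List[Tuple[int, int]]:
    """Alternate two buffers: one is processed while the other is refilled.

    Returns (buffer index, process result) for every block, in order.
    """
    buffers: List[Optional[bytes]] = [None, None]
    results: List[Tuple[int, int]] = []
    source = iter(blocks)
    fill = 0
    buffers[fill] = next(source, None)         # initial fill of buffer 0
    while buffers[fill] is not None:
        work = fill                            # the buffer just filled
        fill ^= 1                              # switch the fill side
        # In hardware this refill overlaps with the processing below.
        buffers[fill] = next(source, None)
        results.append((work, process(buffers[work])))
    return results
```

With three blocks, the buffer indices in the result alternate 0, 1, 0, matching the roles of the two data selection units: one steers the fill, the other steers the read for coding.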
The data selection unit 304 mainly selects which of data buffer unit 302 and data buffer unit 303 is used for coding.
Fig. 4 is a structural schematic diagram of an embodiment of the LZ77 coding unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 4, the structure 400 of the embodiment of the LZ77 coding unit further comprises: Head1 hash lookup table 401, Prev1 hash lookup table 402, Head2 hash lookup table 403, Prev2 hash lookup table 404, LZ77 main state machine unit 405, ROM lookup table 406, new character/match length buffer unit 407, and back-distance buffer unit 408.
The Head1 hash lookup table 401, Prev1 hash lookup table 402, Head2 hash lookup table 403, and Prev2 hash lookup table 404 serve as hash tables that the LZ77 main state machine unit 405 uses for fast match searching.
The LZ77 main state machine unit 405 mainly uses Head1 hash table 401, Prev1 hash table 402, Head2 hash table 403, and Prev2 hash table 404 to LZ77-code the raw data block; it also performs the CRC32 check on the raw data block and counts the size of the original file. In addition, the LZ77 main state machine supplies the dynamic new character/match length Huffman coding unit, the dynamic back-distance Huffman coding unit, and the data packing unit with the information they need, mainly the CRC32 result, the raw data block size, and the size of the file to be compressed.
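How a Head/Prev pair supports match searching can be illustrated with a small software model in the style of common software DEFLATE implementations. It is a simplified sketch, not the hardware design: it uses a single Head/Prev pair rather than the two pairs of the utility model, a Python dictionary keyed by the three-byte string stands in for the hashed Head table, and the names, the greedy matching policy, and the `max_chain` limit are all assumptions.

```python
from typing import Dict, List, Tuple

MIN_MATCH, MAX_MATCH, WINDOW = 3, 258, 32768


def lz77_tokens(data: bytes, max_chain: int = 32) -> List[Tuple]:
    """Greedy LZ77 coding with a Head/Prev hash chain.

    Emits ("literal", byte) for a new character and
    ("match", length, distance) for a back reference.
    """
    head: Dict[bytes, int] = {}        # 3-byte string -> most recent position
    prev: List[int] = [-1] * len(data)  # position -> earlier position, same key
    tokens: List[Tuple] = []
    n = len(data)

    def insert(pos: int) -> None:
        if pos + MIN_MATCH <= n:
            key = data[pos:pos + MIN_MATCH]
            prev[pos] = head.get(key, -1)
            head[key] = pos

    i = 0
    while i < n:
        best_len, best_dist = 0, 0
        if i + MIN_MATCH <= n:
            cand = head.get(data[i:i + MIN_MATCH], -1)
            chain = max_chain
            # Follow the chain of earlier positions that share the key.
            while cand >= 0 and i - cand <= WINDOW and chain > 0:
                length = 0
                while (length < MAX_MATCH and i + length < n
                       and data[cand + length] == data[i + length]):
                    length += 1
                if length > best_len:
                    best_len, best_dist = length, i - cand
                cand = prev[cand]
                chain -= 1
        if best_len >= MIN_MATCH:
            tokens.append(("match", best_len, best_dist))
            for pos in range(i, i + best_len):
                insert(pos)
            i += best_len
        else:
            tokens.append(("literal", data[i]))
            insert(i)
            i += 1
    return tokens
```

For the input `b"abcabcabc"` the model emits three literals followed by one match of length 6 at distance 3; the match overlaps its own output, which the format allows.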
ROM look-up table unit 406 when the 405 pairs of original data block CRC32 verifications in LZ77 major state machine unit, need to utilize the data of solidifying in ROM to calculate.
Fresh character/matching length buffer unit 407 is mainly used in depositing fresh character or the matching length that LZ77 encodes and exports initial data.
Refer to back apart from buffer unit 408, be mainly used in depositing LZ77 the encode finger of output of initial data is returned distance.
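The role of ROM look-up table 406 in the CRC32 calculation can be illustrated with a table-driven CRC32. The following is a minimal Python sketch, assuming the ROM holds the usual 256-entry table for the reflected polynomial 0xEDB88320 used by GZIP; the text does not state the ROM contents, and the names make_crc32_table and crc32_update are hypothetical.

```python
def make_crc32_table():
    """Build the 256-entry table that a ROM could hold (assumed contents)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            # Reflected CRC-32 polynomial used by GZIP
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc32_table()  # stands in for the fixed ROM data


def crc32_update(crc, data):
    """Fold the bytes of one data block into a running CRC32 value."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        # One table look-up per input byte
        crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# The checksum can be accumulated block by block
running = crc32_update(0, b"1234")
running = crc32_update(running, b"56789")
```

Because the running value is carried from block to block, the checksum of a whole file can be accumulated while each data block is being LZ77-encoded.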
Fig. 5 is a workflow diagram of an embodiment of the dynamic literal/match-length Huffman coding frequency statistics control unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 5, in an embodiment of the GZIP compression hardware system provided by the utility model, the workflow 500 of the embodiment of the dynamic literal/match-length Huffman coding frequency statistics control unit further comprises:
Step 501: the main state machine of the dynamic literal/match-length Huffman coding frequency statistics control unit receives one literal/match length from the LZ77 coding unit, then goes to step 502.
Step 502: the main state machine of the dynamic literal/match-length Huffman coding frequency statistics control unit examines the received literal/match length. If it is a match length, go to step 503; otherwise go to step 504.
Step 503: use the received match length to look up the dynamic literal/match-length Huffman tree symbol table, mapping the match length to the corresponding leaf node in that table, then go to step 504.
Step 504: the main state machine of the dynamic literal/match-length Huffman coding frequency statistics control unit increments by 1 the corresponding node entry in the literal/match-length frequency buffer unit, then goes to step 505.
Step 505: determine whether the work has finished. If not, return to step 501 and prepare to receive the next literal/match length; otherwise enter the end state.
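Steps 501 to 505 can be sketched in Python as follows. The text does not give the table used in step 503; this sketch assumes the standard DEFLATE mapping of match lengths 3 to 258 onto symbols 257 to 285 (RFC 1951), which a GZIP-compatible design would need. The token format and the function names are hypothetical.

```python
from bisect import bisect_right

# Smallest match length covered by each DEFLATE length symbol 257..285
LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]


def length_to_symbol(length):
    """Step 503: map a match length to its leaf node (symbol) number."""
    if not 3 <= length <= 258:
        raise ValueError("match length out of range")
    return 257 + bisect_right(LENGTH_BASE, length) - 1


def count_literal_length(tokens):
    """Steps 501-505: tokens are ("lit", byte) or ("len", match_length)."""
    freq = [0] * 286
    for kind, value in tokens:                 # step 501: receive one item
        if kind == "len":                      # step 502: match length?
            symbol = length_to_symbol(value)   # step 503: table look-up
        else:
            symbol = value                     # a literal is its own symbol
        freq[symbol] += 1                      # step 504: increment entry
    return freq                                # step 505: input exhausted


freq = count_literal_length(
    [("lit", 97), ("len", 3), ("len", 258), ("len", 12), ("lit", 97)])
```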
Fig. 6 is a workflow diagram of an embodiment of the dynamic back-reference distance Huffman coding frequency statistics control unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 6, in an embodiment of the GZIP compression hardware system provided by the utility model, the workflow 600 of the embodiment of the dynamic back-reference distance Huffman coding frequency statistics control unit further comprises:
Step 601: the main state machine of the dynamic back-reference distance Huffman coding frequency statistics control unit receives one back-reference distance from the LZ77 coding unit, then goes to step 602. This presupposes that the LZ77 coding unit has found a matching string and has output a back-reference distance; otherwise this unit does not operate.
Step 602: use the received back-reference distance as an index into the dynamic back-reference distance Huffman tree symbol table, mapping the distance to a leaf node in that table, then go to step 603.
Step 603: the main state machine of the dynamic back-reference distance Huffman coding frequency statistics control unit increments by 1 the corresponding entry in the back-reference distance frequency buffer unit, then goes to step 604.
Step 604: determine whether the work has finished. If not, return to step 601 and receive the next back-reference distance output by the LZ77 coding unit; otherwise enter the end state.
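Steps 601 to 604 follow the same pattern as the literal/match-length statistics. The sketch below assumes the standard DEFLATE distance table (30 symbols covering distances 1 to 32768, RFC 1951) for the look-up in step 602; the text does not list the table, and the function names are hypothetical.

```python
from bisect import bisect_right

# Smallest distance covered by each DEFLATE distance symbol 0..29
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193,
             257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145,
             8193, 12289, 16385, 24577]


def distance_to_symbol(distance):
    """Step 602: map a back-reference distance to its leaf node number."""
    if not 1 <= distance <= 32768:
        raise ValueError("distance out of range")
    return bisect_right(DIST_BASE, distance) - 1


def count_distances(distances):
    """Steps 601-604: one frequency entry per distance symbol."""
    freq = [0] * 30
    for distance in distances:                   # step 601
        freq[distance_to_symbol(distance)] += 1  # steps 602 and 603
    return freq                                  # step 604: input exhausted


dist_freq = count_distances([1, 1, 6, 300])
```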
Fig. 7 is a structural diagram of an embodiment of the dynamic literal/match-length Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 7, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 700 of the embodiment of the dynamic literal/match-length Huffman coding unit further comprises:
Data selection unit 701 selects which unit controls literal/match-length frequency buffer unit 702. During the symbol statistics stage, data selection unit 701 selects the dynamic literal/match-length Huffman frequency statistics unit to control literal/match-length frequency buffer unit 702; during the Huffman tree construction stage, it selects dynamic literal/match-length Huffman coding main state machine unit 707 to control literal/match-length frequency buffer unit 702.
Literal/match-length frequency buffer unit 702 stores the frequency of each node in the dynamic literal/match-length Huffman tree, including leaf nodes, intermediate nodes and the root node.
Data selection unit 703 selects which unit controls literal/match-length codeword length buffer unit 704. While the Huffman tree is being built, data selection unit 703 selects dynamic literal/match-length Huffman coding main state machine unit 707 to control literal/match-length codeword length buffer unit 704; once the Huffman tree and the Huffman table have been built, it selects the data packing unit to control literal/match-length codeword length buffer unit 704.
Literal/match-length codeword length buffer unit 704 stores the codeword length of each node in the dynamic literal/match-length Huffman tree.
Data selection unit 705 selects which unit controls literal/match-length codeword value buffer unit 706. While the Huffman tree is being built, data selection unit 705 selects dynamic literal/match-length Huffman coding main state machine unit 707 to control codeword value buffer unit 706; once the codeword lengths and codeword values have been obtained, it selects the data packing unit to control literal/match-length codeword value buffer unit 706.
Literal/match-length codeword value buffer unit 706 stores the codeword value of each leaf node in the literal/match-length Huffman tree.
Dynamic literal/match-length Huffman coding main state machine unit 707: once the frequency of each leaf of the Huffman tree is available in literal/match-length frequency buffer unit 702, main state machine unit 707 builds the Huffman tree from the symbol frequencies stored in that unit, using literal/match-length min-heap buffer unit 709, literal/match-length depth buffer unit 710 and literal/match-length parent node buffer unit 711, and computes the Huffman table. During this process, main state machine unit 707 also uses the symbol frequencies stored in literal/match-length frequency buffer unit 702, together with pipeline multiplier unit 708, to calculate the size of the data to be compressed after dynamic literal/match-length Huffman coding.
Pipeline multiplier unit 708: dynamic literal/match-length main state machine unit 707 uses pipeline multiplier unit 708 to multiply the frequency of each symbol stored in literal/match-length frequency buffer unit 702 by the corresponding Huffman codeword length stored in literal/match-length codeword length buffer unit 704.
Literal/match-length min-heap buffer unit 709: the first half of this unit maintains the symbols that occur in literal/match-length frequency buffer unit 702, so that they are stored contiguously in physical memory and logically form a binary tree that satisfies the following condition: for every node other than a leaf node, the left child and the right child are greater than or equal to that node. The second half of this unit stores the literal/match-length Huffman tree.
Literal/match-length depth buffer unit 710 stores the depth of each node in the literal/match-length Huffman tree; the root node has the greatest depth and leaf nodes have depth 0.
Literal/match-length parent node buffer unit 711 stores the parent node of each node in the literal/match-length Huffman tree, except the root node.
When the literal/match-length Huffman work finishes, the codeword length and codeword value of every leaf node are available in literal/match-length codeword length buffer unit 704 and codeword value buffer unit 706. The dynamic literal/match-length Huffman coding unit then hands control of codeword length buffer unit 704 and codeword value buffer unit 706 to the data packing unit.
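The interplay of the frequency, min-heap, depth and parent node buffers, followed by the multiply-accumulate size estimate, can be sketched in Python as below. This is an illustrative software model, not the state machine of unit 707. It assumes canonical code assignment as in RFC 1951 and does not enforce the 15-bit length limit that DEFLATE requires. All function names are hypothetical.

```python
import heapq


def build_code_lengths(freq):
    """Build a Huffman tree with a min-heap; return one code length per symbol."""
    n = len(freq)
    parent = {}   # node -> parent node (the root has no entry)
    heap = []     # entries: (frequency, depth, node); leaves have depth 0
    for sym, f in enumerate(freq):
        if f > 0:
            heapq.heappush(heap, (f, 0, sym))
    next_node = n                       # internal nodes are numbered after leaves
    while len(heap) > 1:
        f1, d1, a = heapq.heappop(heap)     # two least frequent nodes
        f2, d2, b = heapq.heappop(heap)
        parent[a] = parent[b] = next_node
        heapq.heappush(heap, (f1 + f2, max(d1, d2) + 1, next_node))
        next_node += 1
    lengths = [0] * n
    for sym in range(n):
        if freq[sym] > 0:
            node, hops = sym, 0
            while node in parent:           # walk up to the root
                node, hops = parent[node], hops + 1
            lengths[sym] = max(hops, 1)     # a lone symbol still needs 1 bit
    return lengths


def canonical_codes(lengths):
    """Assign codeword values from code lengths (RFC 1951, section 3.2.2)."""
    max_len = max(lengths)
    bl_count = [0] * (max_len + 1)
    for l in lengths:
        if l:
            bl_count[l] += 1
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    codes = [0] * len(lengths)
    for sym, l in enumerate(lengths):
        if l:
            codes[sym] = next_code[l]
            next_code[l] += 1
    return codes


def encoded_size_bits(freq, lengths):
    """Size estimate: sum of frequency times codeword length (the multiplier)."""
    return sum(f * l for f, l in zip(freq, lengths))


example_freq = [5, 9, 12, 13, 16, 45]
example_lengths = build_code_lengths(example_freq)
example_codes = canonical_codes(example_lengths)
```

The size estimate excludes any extra bits and the cost of transmitting the Huffman table itself.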
Fig. 8 is a structural diagram of an embodiment of the dynamic back-reference distance Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 8, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 800 of the embodiment of the dynamic back-reference distance Huffman coding unit further comprises:
Data selection unit 801 selects which unit controls back-reference distance frequency buffer unit 802. During the symbol statistics stage, data selection unit 801 selects the dynamic back-reference distance Huffman frequency statistics unit to control back-reference distance frequency buffer unit 802; during the Huffman tree construction stage, it selects dynamic back-reference distance Huffman coding main state machine unit 807 to control back-reference distance frequency buffer unit 802.
Back-reference distance frequency buffer unit 802 stores the frequency of each node in the dynamic back-reference distance Huffman tree, including leaf nodes, intermediate nodes and the root node.
Data selection unit 803 selects which unit controls back-reference distance codeword length buffer unit 804. While the Huffman tree is being built, data selection unit 803 selects dynamic back-reference distance Huffman coding main state machine unit 807 to control back-reference distance codeword length buffer unit 804; once the Huffman tree and the Huffman table have been built, it selects the data packing unit to control back-reference distance codeword length buffer unit 804.
Back-reference distance codeword length buffer unit 804 stores the codeword length of each node in the dynamic back-reference distance Huffman tree.
Data selection unit 805 selects which unit controls back-reference distance codeword value buffer unit 806. While the Huffman tree is being built, data selection unit 805 selects the back-reference distance Huffman coding main state machine unit to control codeword value buffer unit 806; once the codeword lengths and codeword values have been obtained, it selects the data packing unit to control back-reference distance codeword value buffer unit 806.
Back-reference distance codeword value buffer unit 806 stores the codeword value of each leaf node in the back-reference distance Huffman tree.
Dynamic back-reference distance Huffman coding main state machine unit 807: once the frequency of each leaf of the Huffman tree is available in back-reference distance frequency buffer unit 802, main state machine unit 807 builds the Huffman tree from the symbol frequencies stored in that unit, using back-reference distance min-heap buffer unit 809, back-reference distance depth buffer unit 810 and back-reference distance parent node buffer unit 811, and computes the Huffman table. During this process, main state machine unit 807 also uses the symbol frequencies stored in back-reference distance frequency buffer unit 802, together with pipeline multiplier unit 808, to calculate the size of the data to be compressed after dynamic back-reference distance Huffman coding.
Pipeline multiplier unit 808: dynamic back-reference distance main state machine unit 807 uses pipeline multiplier unit 808 to multiply the frequency of each symbol stored in back-reference distance frequency buffer unit 802 by the corresponding Huffman codeword length stored in back-reference distance codeword length buffer unit 804.
Back-reference distance min-heap buffer unit 809: the first half of this unit maintains the symbols that occur in back-reference distance frequency buffer unit 802, so that they are stored contiguously in physical memory and logically form a binary tree that satisfies the following condition: for every node other than a leaf node, the left child and the right child are greater than or equal to that node. The second half of this unit stores the back-reference distance Huffman tree.
Back-reference distance depth buffer unit 810 stores the depth of each node in the back-reference distance Huffman tree; the root node has the greatest depth and leaf nodes have depth 0.
Back-reference distance parent node buffer unit 811 stores the parent node of each node in the back-reference distance Huffman tree, except the root node.
When the back-reference distance Huffman work finishes, the codeword length and codeword value of every leaf node are available in back-reference distance codeword length buffer unit 804 and codeword value buffer unit 806. The dynamic back-reference distance Huffman coding unit then hands control of codeword length buffer unit 804 and codeword value buffer unit 806 to the data packing unit.
Fig. 9 is a structural diagram of an embodiment of the dynamic code-length Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 9, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 900 of the embodiment of the dynamic code-length Huffman coding unit further comprises:
Code-length data statistics unit 901 reads the codeword length of each leaf node from literal/match-length codeword length buffer unit 704 in Fig. 7 and from back-reference distance codeword length buffer unit 804 in Fig. 8, and gathers statistics on them.
Data selection unit 902 selects which unit controls code-length frequency buffer unit 903. During the symbol statistics stage, data selection unit 902 selects code-length data statistics unit 901 to control code-length frequency buffer unit 903; during the Huffman tree construction stage, it selects dynamic code-length Huffman coding main state machine unit 910 to control code-length frequency buffer unit 903.
Code-length frequency buffer unit 903 stores the frequency of each node in the dynamic code-length Huffman tree, including leaf nodes, intermediate nodes and the root node.
Data selection unit 904 selects which unit controls code-length codeword length buffer unit 905. While the Huffman tree is being built, data selection unit 904 selects dynamic code-length Huffman coding main state machine unit 910 to control code-length codeword length buffer unit 905; once the Huffman tree and the Huffman table have been built, it selects the data packing unit to control code-length codeword length buffer unit 905.
Code-length codeword length buffer unit 905 stores the codeword length of each node in the dynamic code-length Huffman tree.
Data selection unit 906 selects which unit controls code-length codeword value buffer unit 907. While the Huffman tree is being built, data selection unit 906 selects code-length Huffman coding main state machine unit 910 to control code-length codeword value buffer unit 907; once the codeword lengths and codeword values have been obtained, it selects the data packing unit to control code-length codeword value buffer unit 907.
Code-length codeword value buffer unit 907 stores the codeword value of each leaf node in the code-length Huffman tree.
Data selection unit 908 selects which unit controls code-length leaf node buffer unit 909. During the data statistics stage, data selection unit 908 selects dynamic code-length Huffman coding main state machine unit 910 to control code-length leaf node buffer unit 909; once the codeword lengths and codeword values of the code lengths have been obtained, it selects the data packing unit to control code-length leaf node buffer unit 909.
Code-length leaf node buffer unit 909 stores all leaf nodes of the code-length Huffman tree.
Dynamic code-length Huffman coding main state machine unit 910: once the frequency of each leaf of the Huffman tree is available in code-length frequency buffer unit 903, main state machine unit 910 builds the Huffman tree from the symbol frequencies stored in that unit, using code-length min-heap buffer unit 912, code-length depth buffer unit 913 and code-length parent node buffer unit 914, and computes the Huffman table. During this process, main state machine unit 910 also uses the symbol frequencies stored in code-length frequency buffer unit 903, together with pipeline multiplier unit 911, to calculate the size of the data to be compressed after dynamic code-length Huffman coding.
Pipeline multiplier unit 911: dynamic code-length main state machine unit 910 uses pipeline multiplier unit 911 to multiply the frequency of each symbol stored in code-length frequency buffer unit 903 by the corresponding Huffman codeword length stored in code-length codeword length buffer unit 905.
Code-length min-heap buffer unit 912: the first half of this unit maintains the symbols that occur in code-length frequency buffer unit 903, so that they are stored contiguously in physical memory and logically form a binary tree that satisfies the following condition: for every node other than a leaf node, the left child and the right child are greater than or equal to that node. The second half of this unit stores the code-length Huffman tree.
Code-length depth buffer unit 913 stores the depth of each node in the code-length Huffman tree; the root node has the greatest depth and leaf nodes have depth 0.
Code-length parent node buffer unit 914 stores the parent node of each node in the code-length Huffman tree, except the root node.
Data selection unit 915 selects which unit controls code-length repeat count buffer unit 916. While the Huffman tree is being built, data selection unit 915 selects code-length Huffman coding main state machine unit 910 to control code-length repeat count buffer unit 916; once the code-length Huffman tree and the code-length Huffman table have been built, it selects the data packing unit to control code-length repeat count buffer unit 916.
Code-length repeat count buffer unit 916 stores the repeat count of each leaf node in the code-length Huffman tree.
When the code-length Huffman work finishes, the codeword length and codeword value of every leaf node are available in code-length codeword length buffer unit 905 and codeword value buffer unit 907. The code-length Huffman coding unit then hands control of codeword length buffer unit 905 and codeword value buffer unit 907 to the data packing unit.
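The leaf node and repeat count buffers (909 and 916) suggest that the sequence of codeword lengths is run-length encoded before it is Huffman coded. The following sketch assumes the DEFLATE scheme from RFC 1951, in which symbol 16 repeats the previous length 3 to 6 times, symbol 17 repeats zero 3 to 10 times and symbol 18 repeats zero 11 to 138 times. The text does not spell this out, and the (symbol, repeat count) pair format and function names are hypothetical.

```python
def rle_code_lengths(lengths):
    """Return (symbol, repeat_count) pairs for a list of code lengths 0..15."""
    out = []
    i, n = 0, len(lengths)
    while i < n:
        cur = lengths[i]
        run = 1
        while i + run < n and lengths[i + run] == cur:
            run += 1
        i += run
        if cur == 0:
            while run >= 11:                 # long zero runs: symbol 18
                rep = min(run, 138)
                out.append((18, rep))
                run -= rep
            if run >= 3:                     # short zero runs: symbol 17
                out.append((17, run))
                run = 0
            out.extend([(0, 1)] * run)       # leftover single zeros
        else:
            out.append((cur, 1))             # the length itself, once
            run -= 1
            while run >= 3:                  # then "repeat previous": symbol 16
                rep = min(run, 6)
                out.append((16, rep))
                run -= rep
            out.extend([(cur, 1)] * run)
    return out


def expand_code_lengths(pairs):
    """Inverse of rle_code_lengths, used here only to check the sketch."""
    out = []
    for symbol, rep in pairs:
        if symbol == 16:
            out.extend([out[-1]] * rep)
        elif symbol in (17, 18):
            out.extend([0] * rep)
        else:
            out.append(symbol)
    return out


def count_code_length_symbols(pairs):
    """Frequency of each of the 19 code-length symbols."""
    freq = [0] * 19
    for symbol, _ in pairs:
        freq[symbol] += 1
    return freq


sample = [8] * 10 + [0] * 150 + [5, 5] + [0] * 4
pairs = rle_code_lengths(sample)
```

The resulting 19-entry frequency table is what a code-length Huffman tree would be built from.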
Fig. 10 is a structural diagram of an embodiment of the static literal/match-length Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 10, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 1000 of the embodiment of the static literal/match-length Huffman coding unit further comprises:
Data packing unit 1001 performs static Huffman coding of the literals/match lengths that the LZ77 coding unit has stored in the literal/match-length buffer unit.
Static literal/match-length codeword length constant table unit 1002 stores the codeword length of the Huffman code corresponding to each literal/match length; in the design it can be implemented with ROM (read-only memory).
Static literal/match-length codeword value buffer unit 1003 stores the codeword value of the Huffman code corresponding to each literal/match length; in the design it can be implemented with ROM (read-only memory).
Pipeline multiplier unit 1004 calculates the size of the data after static Huffman coding of the literals/match lengths that the LZ77 coding unit has stored in the literal/match-length buffer unit.
In Fig. 10, Literal_length[8:0] is the literal or match length that is read in; Code[8:0] and Code_length[3:0] are, respectively, the output static Huffman codeword value and codeword length corresponding to that literal or match length; Static_literal_length[31:0] outputs the size of the data to be compressed after static literal/match-length Huffman coding.
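The contents of the two constant tables can be illustrated with the fixed Huffman code defined in RFC 1951, which fits the 9-bit Code and 4-bit Code_length signals named above. This is a sketch under the assumption that the ROMs hold exactly that standard table; the function names are hypothetical, and extra bits for match lengths are left out.

```python
def static_literal_length_code(symbol):
    """Return (codeword value, codeword length) for a symbol 0..287."""
    if not 0 <= symbol <= 287:
        raise ValueError("symbol out of range")
    if symbol <= 143:
        return 0x30 + symbol, 8            # 00110000 .. 10111111
    if symbol <= 255:
        return 0x190 + (symbol - 144), 9   # 110010000 .. 111111111
    if symbol <= 279:
        return symbol - 256, 7             # 0000000 .. 0010111
    return 0xC0 + (symbol - 280), 8        # 11000000 .. 11000111


def static_literal_length_size_bits(freq):
    """Multiply-accumulate of symbol frequency and fixed codeword length."""
    return sum(f * static_literal_length_code(s)[1]
               for s, f in enumerate(freq) if f)


# Two 8-bit literals, one 9-bit literal and one 7-bit end-of-block symbol
demo_freq = [0] * 288
demo_freq[97] = 2
demo_freq[200] = 1
demo_freq[256] = 1
demo_bits = static_literal_length_size_bits(demo_freq)
```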
Fig. 11 is a structural diagram of an embodiment of the static back-reference distance Huffman coding unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 11, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 1100 of the embodiment of the static back-reference distance Huffman coding unit further comprises:
Data packing unit 1101 performs static Huffman coding of the back-reference distances that the LZ77 coding unit has stored in the back-reference distance buffer unit.
Static back-reference distance codeword value buffer unit 1102 stores the codeword value of the Huffman code corresponding to each back-reference distance; in the design it can be implemented with ROM (read-only memory).
Pipeline multiplier unit 1103 calculates the size of the data after static back-reference distance Huffman coding of the back-reference distances that the LZ77 coding unit has stored in the back-reference distance buffer unit.
In Fig. 11, Distance[14:0] is the back-reference distance that is read in; Code[4:0] and Code_length[2:0] are, respectively, the output static Huffman codeword value and codeword length corresponding to that back-reference distance. In static back-reference distance Huffman coding the codeword length is fixed at 5 bits. Static_literal_length[31:0] outputs the size of the data to be compressed after static back-reference distance Huffman coding.
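Because the static distance codeword is always 5 bits, the size calculation reduces to a weighted count. The sketch below assumes the RFC 1951 convention that the 5-bit codeword equals the distance symbol number (0 to 29) and that each symbol is followed by a fixed number of extra bits. Whether the hardware size estimate includes those extra bits is not stated in the text, so that part is an assumption; the names are hypothetical.

```python
# Extra bits that follow each distance symbol 0..29 in DEFLATE:
# 0, 0, 0, 0, 1, 1, 2, 2, ... 13, 13
DIST_EXTRA_BITS = [max(0, (symbol >> 1) - 1) for symbol in range(30)]


def static_distance_code(symbol):
    """Return (codeword value, codeword length) for a distance symbol."""
    if not 0 <= symbol <= 29:
        raise ValueError("distance symbol out of range")
    return symbol, 5          # fixed 5-bit code, value equals the symbol


def static_distance_size_bits(freq):
    """Frequency times (5 code bits + extra bits), summed over all symbols."""
    return sum(f * (5 + DIST_EXTRA_BITS[s]) for s, f in enumerate(freq))


demo_dist_freq = [0] * 30
demo_dist_freq[0] = 2     # two matches at distance 1
demo_dist_freq[4] = 1     # one match at distance 5 or 6 (1 extra bit)
demo_dist_bits = static_distance_size_bits(demo_dist_freq)
```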
Fig. 12 is a structural diagram of an embodiment of the data packing unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 12, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 1200 of the embodiment of the data packing unit further comprises:
Dynamic code-length codeword length buffer unit 1201 stores the codeword length of each node in the dynamic code-length Huffman tree; it is shared with code-length codeword length buffer unit 905 in Fig. 9.
Dynamic code-length codeword value buffer unit 1202 stores the codeword value of each leaf node in the dynamic code-length Huffman tree; it is shared with code-length codeword value buffer unit 907 in Fig. 9.
Static back-reference distance codeword value buffer unit 1203 stores the static Huffman code values of the back-reference distances; it is shared with static back-reference distance codeword value buffer unit 1102 in Fig. 11.
Input data buffer unit 1204 stores the original data to be compressed; it is shared with data buffer unit 303 and unit 304 in Fig. 3.
Literal/match-length buffer unit 1205 stores the literals/match lengths output by the LZ77 coding unit; it is shared with literal/match-length buffer unit 407 in Fig. 4.
Back-reference distance buffer unit 1206 stores the back-reference distances output by the LZ77 coding unit; it is shared with back-reference distance buffer unit 408 in Fig. 4.
Dynamic literal/match-length codeword value buffer unit 1207 stores the codeword values of all leaf nodes in the dynamic literal/match-length Huffman tree; it is shared with literal/match-length codeword value buffer unit 706 in Fig. 7.
Dynamic literal/match-length codeword length buffer unit 1208 stores the codeword lengths of all nodes in the dynamic literal/match-length Huffman tree; it is shared with literal/match-length codeword length buffer unit 704 in Fig. 7.
Static literal/match-length codeword value buffer unit 1209 stores the codeword values of the static Huffman code for literals/match lengths; it is shared with static literal/match-length codeword value buffer unit 1003 in Fig. 10.
Static literal/match-length codeword length buffer unit 1210 stores the codeword lengths of the static Huffman code for literals/match lengths; it is shared with static literal/match-length codeword length constant table unit 1002 in Fig. 10.
Dynamic back-reference distance codeword value buffer unit 1211 stores the codeword values of all leaf nodes in the dynamic back-reference distance Huffman tree; it is shared with back-reference distance codeword value buffer unit 806 in Fig. 8.
Variable-length code packing unit 1212 receives the codeword values and codeword lengths sent by the variable-length code reading unit, packs these variable-length codes into 64-bit-wide words and outputs them to the output buffer unit.
Dynamic back-reference distance codeword length buffer unit 1213 stores the codeword lengths of all nodes in the dynamic back-reference distance Huffman tree; it is shared with back-reference distance codeword length buffer unit 804 in Fig. 8.
Dynamic code-length repeat count buffer unit 1214 stores the repeat counts of all leaf nodes in the dynamic code-length Huffman tree; it is shared with code-length repeat count buffer unit 916 in Fig. 9.
Dynamic code-length leaf node buffer unit 1215 stores all leaf nodes of the dynamic code-length Huffman tree; it is shared with code-length leaf node buffer unit 909 in Fig. 9.
Variable-length code reading unit 1216 evaluates the results sent by the LZ77 coding unit in Fig. 4, the dynamic literal/match-length Huffman coding unit in Fig. 7, the dynamic back-reference distance Huffman coding unit in Fig. 8, the dynamic code-length Huffman coding unit in Fig. 9, the static literal/match-length Huffman coding unit in Fig. 10 and the static back-reference distance Huffman coding unit in Fig. 11. Based on this evaluation, it compresses the data block using one of three compression modes: direct storage, dynamic Huffman coding or static Huffman coding. It then reads data from buffer units 1201-1211 and 1213-1215 in a specific order and passes the variable-length codes it has read to the variable-length code packing unit for output.
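The mode decision and the 64-bit packing can be sketched together in Python. Two points are assumptions rather than statements from the text: the mode with the smallest estimated size is chosen, and bits are packed starting from the least significant bit of each word, as DEFLATE does for its byte stream. BitPacker64 and choose_mode are hypothetical names.

```python
class BitPacker64:
    """Collect variable-length codes and emit them as 64-bit words."""

    def __init__(self):
        self.acc = 0        # bit accumulator
        self.nbits = 0      # number of valid bits in the accumulator
        self.words = []     # completed 64-bit words

    def put(self, value, length):
        """Append the low `length` bits of `value` (assumed LSB-first order)."""
        self.acc |= (value & ((1 << length) - 1)) << self.nbits
        self.nbits += length
        while self.nbits >= 64:
            self.words.append(self.acc & 0xFFFFFFFFFFFFFFFF)
            self.acc >>= 64
            self.nbits -= 64

    def flush(self):
        """Emit any remaining bits as a final, zero-padded word."""
        if self.nbits:
            self.words.append(self.acc)
            self.acc = 0
            self.nbits = 0
        return self.words


def choose_mode(stored_bits, static_bits, dynamic_bits):
    """Pick the mode with the smallest estimated output (assumed criterion)."""
    sizes = {"stored": stored_bits, "static": static_bits,
             "dynamic": dynamic_bits}
    return min(sizes, key=sizes.get)


packer = BitPacker64()
packer.put(0x3, 2)
packer.put(0x1, 62)     # completes the first 64-bit word
packer.put(0x5, 3)
packed_words = packer.flush()
```

Comparing the three size estimates before packing is what lets the encoder fall back to direct storage when Huffman coding would expand the block.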
Fig. 13 is a structural diagram of an embodiment of the output buffer unit in an embodiment of the GZIP compression hardware system provided by the utility model.
As shown in Fig. 13, in an embodiment of the GZIP compression hardware system provided by the utility model, the structure 1300 of the embodiment of the output buffer unit further comprises:
FIFO buffer unit 1301 receives the compressed data sent by data packing unit 1302.
Data packing unit 1302 encodes and outputs the data; it is shared with the data packing unit in Fig. 12.
Fig. 14 is a structural diagram of an embodiment of the dual Head/Prev acceleration method in the GZIP compression hardware system acceleration method provided by the utility model.
As shown in Fig. 14, in the GZIP compression hardware system acceleration method provided by the utility model, the structure of the embodiment of the dual Head/Prev acceleration method further comprises:
Head1 hash look-up table 1401, Prev1 hash look-up table 1402, Head2 hash look-up table 1404 and Prev2 hash look-up table 1405 store the addresses of the characters that occur in the data to be compressed. Each of these tables must be cleared before every use. Head1 hash look-up table 1401, Prev1 hash look-up table 1402, Head2 hash look-up table 1404 and Prev2 hash look-up table 1405 are shared with Head1 hash look-up table 401, Prev1 hash look-up table 402, Head2 hash look-up table 403 and Prev2 hash look-up table 404 in Fig. 4, respectively.
LZ77 main state machine unit 1403 uses Head1 hash look-up table 1401, Prev1 hash look-up table 1402, Head2 hash look-up table 1404 and Prev2 hash look-up table 1405 to perform LZ77 encoding of the data block; it is shared with LZ77 main state machine unit 405 in Fig. 4.
Figure 15 is a workflow diagram of an embodiment of the double Head/Prev acceleration method in the GZIP compression hardware system acceleration method provided by the utility model.
As shown in Figure 15, in the GZIP compression hardware system acceleration method provided by the utility model, the workflow 1500 of the embodiment of the double Head/Prev acceleration method further comprises:
Step 1501: clear Head1 and Prev1; when clearing is complete, go to step 1502.
Step 1502: use the Head1 and Prev1 hash lookup tables cleared in step 1501 to compress the first data block; while the first data block is being compressed, clear the Head2 and Prev2 hash tables; when processing is complete, go to step 1503.
Step 1503: use the Head2 and Prev2 hash tables cleared in step 1502 to compress the second data block; while the second data block is being compressed, clear the Head1 and Prev1 hash tables; when processing is complete, go to step 1504.
Step 1504: use the Head1 and Prev1 hash tables cleared in step 1503 to compress the third data block; while it is being compressed, clear the Head2 and Prev2 hash tables.
These steps repeat until all data blocks have been compressed. The double Head/Prev structure, adopted here for the first time for data compression, significantly improves data throughput.
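The alternation in steps 1501-1504 can be sketched in software as follows. This is only a model of the schedule: in hardware the idle table pair is cleared while the active pair is in use, whereas this sequential sketch clears it right after encoding. The names `HashTables`, `compress_blocks`, and `lz77_encode`, and the table sizes, are assumptions made for illustration and do not come from the source.

```python
HASH_SIZE = 1 << 15    # assumed number of Head entries
WINDOW_SIZE = 1 << 15  # assumed number of Prev entries


class HashTables:
    """One Head/Prev pair (hypothetical software stand-in for the RAMs)."""

    def __init__(self):
        self.head = [0] * HASH_SIZE
        self.prev = [0] * WINDOW_SIZE
        self.clean = False

    def clear(self):
        self.head = [0] * HASH_SIZE
        self.prev = [0] * WINDOW_SIZE
        self.clean = True


def compress_blocks(blocks, lz77_encode):
    """Encode each block with one table pair while the other pair is cleared."""
    tables = [HashTables(), HashTables()]
    tables[0].clear()                      # step 1501
    results = []
    for i, block in enumerate(blocks):
        active = tables[i % 2]             # pair cleared during the previous block
        idle = tables[(i + 1) % 2]         # pair to clear during this block
        assert active.clean, "active pair must be cleared before use"
        active.clean = False
        results.append(lz77_encode(block, active))
        idle.clear()                       # overlaps with encoding in hardware
    return results
```

Here `lz77_encode` is a caller-supplied function taking a block and a table pair; because each block starts with a freshly cleared pair, no block waits for a clearing pass.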
Figure 16 is a structural diagram of an embodiment of the Huffman pre-statistics acceleration method in the GZIP compression hardware system acceleration method provided by the utility model.
As shown in Figure 16, in the GZIP compression hardware system acceleration method provided by the utility model, the structure 1600 of the embodiment of the Huffman pre-statistics acceleration method further comprises:
LZ77 coding unit 1601, which mainly performs LZ77 encoding of the data to be compressed; it reuses LZ77 coding unit 104 in Figure 1.
New character/match length or back distance buffer unit 1602, which mainly stores new characters/match lengths or back distances; it reuses new character/match length buffer unit 407 or back distance buffer unit 408 in Figure 4.
Frequency statistics control unit 1603, which mainly receives the new characters/match lengths or back distances output by the LZ77 coding unit and counts them; it reuses dynamic new character/match length Huffman coding frequency statistics control unit 102 or dynamic back distance Huffman coding frequency statistics control unit 103 in Figure 1.
Frequency buffer unit 1604, which mainly stores the frequencies of all nodes in the dynamic Huffman tree; it reuses new character/match length frequency buffer unit 702 in Figure 7 or back distance frequency buffer unit 802 in Figure 8.
As Figure 16 shows, the method provided by the utility model stores and counts characters at the same time, working fully in parallel, and thereby improves Huffman coding data throughput.
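A minimal sketch of the idea, assuming symbols are small non-negative integers: each LZ77 output symbol is written to the buffer and counted in the same loop iteration, so no second pass over the buffer is needed before the Huffman tree can be built. The function name `store_and_count` and the parameter `alphabet_size` are hypothetical.

```python
def store_and_count(symbols, alphabet_size):
    """Buffer each symbol and update its frequency in a single pass."""
    symbol_buffer = []            # plays the role of buffer unit 1602
    freq = [0] * alphabet_size    # plays the role of frequency buffer unit 1604
    for s in symbols:
        symbol_buffer.append(s)   # store the symbol
        freq[s] += 1              # count it at the same time
    return symbol_buffer, freq
```

In hardware the two writes go to separate memories in parallel; the single loop only mirrors the fact that the statistics are complete as soon as the last symbol is stored.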
Figure 17 is a workflow diagram of an embodiment of the Huffman pre-clearing acceleration method in the GZIP compression hardware system acceleration method provided by the utility model.
As shown in Figure 17, in the GZIP compression hardware system acceleration method provided by the utility model, the workflow 1700 of the embodiment of the Huffman pre-clearing acceleration method further comprises:
Step 1701: before Huffman coding of the first data block, clear the frequency buffer unit; after clearing, go to step 1702.
Step 1702: count the characters that occur in the data to be compressed; when counting is complete, go to step 1705.
Step 1705: build the Huffman tree from the statistics of step 1702; once the tree is built, go to step 1706.
Step 1706: derive the Huffman table from the Huffman tree built in step 1705. Once the Huffman table has been obtained, the data stored in the frequency buffer unit is no longer needed, so steps 1703 and 1704 are then carried out simultaneously.
Step 1703: perform Huffman coding of the data block to be compressed.
Step 1704: clear the frequency buffer unit. Steps 1703 and 1704 run simultaneously; after both are complete, return to step 1702 and begin preparing to process the next data block.
As Figure 17 shows, steps 1703 and 1704 work fully in parallel, which improves Huffman coding data throughput.
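The workflow of Figure 17 can be modelled with two worker threads, one encoding and one clearing. This is a software analogy under stated assumptions, not the hardware design: `process_blocks`, `build_table`, and `encode` are hypothetical names, and `build_table` is assumed to copy whatever it needs out of the frequency buffer, since that buffer is cleared while encoding runs.

```python
from concurrent.futures import ThreadPoolExecutor


def _clear(freq):
    """Reset every counter in place (step 1704)."""
    for i in range(len(freq)):
        freq[i] = 0


def process_blocks(blocks, build_table, encode, alphabet_size=256):
    """Count, build the table, then encode and clear concurrently."""
    freq = [0] * alphabet_size                 # step 1701: starts cleared
    out = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for block in blocks:
            for s in block:                    # step 1702: statistics
                freq[s] += 1
            table = build_table(freq)          # steps 1705 and 1706
            enc = pool.submit(encode, block, table)   # step 1703
            clr = pool.submit(_clear, freq)           # step 1704, in parallel
            out.append(enc.result())
            clr.result()                       # both finished: next block
    return out
```

Because the table is derived before clearing starts, the encoder never reads the frequency buffer, which is what makes the overlap safe.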
Figure 18 is a workflow diagram of an embodiment of the CRC32 interleaved calculation in the GZIP compression hardware system acceleration method provided by the utility model.
As shown in Figure 18, in the GZIP compression hardware system acceleration method provided by the utility model, the workflow 1800 of the embodiment of the CRC32 interleaved calculation further comprises:
Step 1801: read one character in preparation for LZ77 encoding, then go to step 1802.
Step 1802: starting from the current character, search for a matching string; use the time spent reading the hash table to complete the CRC32 check calculation for the current character, then go to step 1803.
Step 1803: determine whether processing has finished; if not, return to step 1801; otherwise enter the end state.
As Figure 18 shows, the method exploits the fact that LZ77 processes characters one at a time and must query the hash table during encoding. The CRC32 check calculation is completed in that time gap, which improves data throughput.
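The interleaving can be sketched as a single loop that performs the match search and the CRC32 update for the same character. The table-driven CRC32 below uses the standard reflected polynomial 0xEDB88320 used by GZIP; the 256-entry table corresponds to the kind of constant table a ROM would hold. The names `lz77_scan_with_crc` and `find_match` are hypothetical, and the match search itself is left to the caller.

```python
def make_crc_table():
    """Build the 256-entry CRC32 constant table (reflected polynomial)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc_table()


def lz77_scan_with_crc(data: bytes, find_match):
    """Run the match search and the CRC32 update once per input character."""
    crc = 0xFFFFFFFF
    matches = []
    for pos, byte in enumerate(data):                      # step 1801
        matches.append(find_match(data, pos))              # step 1802: hash lookup
        crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)  # CRC32 in the same step
    return matches, crc ^ 0xFFFFFFFF                       # step 1803: finished
```

In software the two operations simply run back to back; in the hardware described above, the CRC update occupies cycles that would otherwise be spent waiting on the hash table read.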
From the foregoing exemplary description of the utility model, those skilled in the art will appreciate that the utility model has the following advantages:
The utility model provides a method of implementing a GZIP compression hardware system and realizes the basic functions of GZIP compression on an FPGA.
The utility model provides a method of implementing a GZIP compression hardware system in which the hardware implementation is compatible with the software implementation: data compressed by the hardware can be correctly decompressed by software.
The utility model uses ping-pong operation, the double Head/Prev hash structure, Huffman pre-statistics, Huffman pre-clearing, and CRC32 interleaved calculation to improve the data throughput of GZIP compression. Test results show that the hardware implementation of GZIP compression achieves significantly higher data throughput than the software implementation.
Although the utility model has been illustrated and described herein with reference to specific examples, it is not limited to the details shown, since various improvements and structural changes may be made without departing from the spirit and scope of the utility model and within the scope and range of equivalents of the claims. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the utility model as set forth in the claims.

Claims (7)

1. A compression hardware system based on GZIP, characterized in that the system comprises:
An input buffer unit, used to buffer the input data;
An LZ77 coding unit, used to perform LZ77 encoding of the input data;
A dynamic new character/match length Huffman coding frequency statistics control unit, used to count the new characters and match lengths output by the LZ77 coding unit;
A dynamic back distance Huffman coding frequency statistics control unit, used to count the back distances output by the LZ77 coding unit;
A dynamic new character/match length Huffman coding unit, used to perform dynamic Huffman coding of the new characters and match lengths output by the LZ77 coding unit;
A dynamic back distance Huffman coding unit, used to perform dynamic Huffman coding of the back distances output by the LZ77 coding unit;
A dynamic code word length Huffman coding unit, used to encode the information of the dynamic new character/match length Huffman tree and the information of the dynamic back distance Huffman tree;
A static new character/match length Huffman coding unit, used to perform static Huffman coding of the new characters/match lengths output by the LZ77 coding unit;
A static back distance Huffman coding unit, used to perform static Huffman coding of the back distances output by the LZ77 coding unit;
A data packing unit, used to decide which of three modes (direct storage, static Huffman coding, or dynamic Huffman coding) to adopt and to encode the output in a fixed format;
An output buffer unit, used to buffer the compressed data output by the data packing unit.
2. The compression hardware system based on GZIP according to claim 1, characterized in that the input buffer unit comprises:
Two data block buffer units, used to store the original data to be compressed;
Two data selection units, used for read/write control of the data block buffer units.
3. The compression hardware system based on GZIP according to claim 1, characterized in that the LZ77 coding unit comprises:
Two pairs of Head/Prev hash tables, used for fast match lookup of the strings encoded by the LZ77 coding unit;
A read-only memory (ROM) unit, used to store the constant table for the cyclic redundancy check (CRC32) calculation;
A new character/match length buffer unit, used to store the new characters or match lengths output by the LZ77 coding unit;
A back distance buffer unit, used to store the back distances output by the LZ77 coding unit;
A main state machine unit, used to read the data from the data block buffer units.
4. The compression hardware system based on GZIP according to claim 1, characterized in that the dynamic new character/match length Huffman coding unit comprises:
A new character/match length frequency buffer unit, used to store the frequencies of the new characters and match lengths output by the LZ77 coding unit;
A new character/match length parent node buffer unit, used to store the parent node of each node of the new character/match length Huffman tree, except the root node;
A new character/match length depth buffer unit, used to store the depth of each node of the new character/match length Huffman tree within that tree;
A new character/match length min-heap buffer unit, used to store all nodes of the new character/match length Huffman tree contiguously;
A new character/match length code word value buffer unit, used to store the value of the Huffman code corresponding to every leaf node of the new character/match length Huffman tree;
A new character/match length code word length buffer unit, used to store the effective length of the Huffman code corresponding to every node of the new character/match length Huffman tree;
Three data selection units, used respectively to control the new character/match length frequency buffer unit, the new character/match length code word value buffer unit, and the new character/match length code word length buffer unit;
A pipelined multiplier unit, used to help calculate the size of the data block after dynamic new character/match length Huffman coding;
A main state machine unit, used to construct the Huffman tree from the frequency information of each character in the data block to be compressed, as stored in the new character/match length frequency buffer unit, using the new character/match length parent node buffer unit, the new character/match length depth buffer unit, and the new character/match length min-heap buffer unit, and to store the information of the Huffman tree in the new character/match length min-heap buffer unit; after the information of the new character/match length Huffman tree is obtained, the main state machine unit traverses the Huffman tree to obtain the code word length of each node and examines each node; if the node is a leaf node, the main state machine unit reads the frequency of that node from the new character/match length buffer unit and uses the pipelined multiplier unit to calculate the size of the current character after Huffman coding; the code word value of each node in the Huffman tree is then calculated from the code word lengths obtained; the main state machine unit examines these nodes and, if a node is a leaf node, stores its code word value in the new character/match length code word value buffer unit.
5. The compression hardware system based on GZIP according to claim 1, characterized in that the dynamic back distance Huffman coding unit comprises:
A back distance frequency buffer unit, used to store the frequencies of the back distances output by the LZ77 coding unit;
A back distance parent node buffer unit, used to store the parent node of each node of the back distance Huffman tree, except the root node;
A back distance depth buffer unit, used to store the depth of each node of the back distance Huffman tree within that tree;
A back distance min-heap buffer unit, used to store all nodes of the back distance Huffman tree contiguously;
A back distance code word value buffer unit, used to store the value of the Huffman code corresponding to every leaf node of the back distance Huffman tree;
A back distance code word length buffer unit, used to store the effective length of the Huffman code corresponding to every node of the back distance Huffman tree;
Three data selection units, used respectively to control the back distance frequency buffer unit, the back distance code word value buffer unit, and the back distance code word length buffer unit;
A pipelined multiplier unit, used to help calculate the size of the data block after dynamic back distance Huffman coding;
A main state machine unit, used to construct the Huffman tree from the frequency information of each character in the data block to be compressed, as stored in the back distance frequency buffer unit, using the back distance parent node buffer unit, the back distance depth buffer unit, and the back distance min-heap buffer unit, and to store the information of the Huffman tree in the min-heap buffer unit; after the information of the back distance Huffman tree is obtained, the main state machine unit traverses the Huffman tree to obtain the code word length of each node and examines each node; if the node is a leaf node, the main state machine unit reads the frequency of that node from the new character/match length buffer unit and uses the pipelined multiplier unit to calculate the size of the current character after Huffman coding; the code word value of each node in the Huffman tree is then calculated from the code word lengths obtained; the main state machine unit examines these nodes and, if a node is a leaf node, stores its code word value in the back distance code word value buffer unit.
6. The compression hardware system based on GZIP according to claim 1, characterized in that the dynamic code word length Huffman coding unit comprises:
A code word length statistics unit, used to count the frequency with which each code word length occurs in the new character/match length code word length buffer unit and the back distance code word length buffer unit;
A code word length frequency buffer unit, used to store the results counted by the code word length statistics unit;
A code word length parent node buffer unit, used to store the parent node of each node of the code word length Huffman tree;
A code word length depth buffer unit, used to store the depth of each node of the code word length Huffman tree;
A code word length min-heap buffer unit, used to store all nodes of the code word length Huffman tree contiguously;
A code word length code word value buffer unit, used to store the value of the Huffman code corresponding to each leaf node of the code word length Huffman tree;
A code word length buffer unit of the code word lengths, used to store the code word length of the Huffman code corresponding to every node of the code word length Huffman tree;
A code word length leaf node buffer unit, used to store the code word length leaf nodes obtained by traversing the new character/match length code word length buffer unit and the back distance code word length buffer unit;
A code word length repetition count buffer unit, used to store the repetition counts of the code word lengths after the traversal;
Five data selection units, used respectively to control the code word length frequency buffer unit, the code word length code word value buffer unit, the code word length buffer unit of the code word lengths, the code word length leaf node buffer unit, and the code word length repetition count buffer unit;
A pipelined multiplier unit, used to calculate the size of the data block after dynamic code word length Huffman coding;
A code word length main state machine, used to traverse the code word lengths of all leaf nodes in the new character/match length code word length buffer unit and the back distance code word length buffer unit, to store the resulting statistics in the code word length leaf node buffer unit and the code word length repetition count buffer unit, and to store the frequency information of each leaf node in the code word length frequency buffer unit.
7. The compression hardware system based on GZIP according to claim 1, characterized in that the data packing unit comprises:
A read variable-length code unit, used to read the corresponding information from the LZ77 coding unit, the dynamic new character/match length Huffman coding unit, the dynamic back distance Huffman coding unit, and the dynamic code word length Huffman coding unit;
A variable-length code packing unit, which determines from the information supplied by the read variable-length code unit the compression mode to adopt for the current data block.
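As an illustration of the tree construction described in claims 4 and 5, the following sketch builds a Huffman tree with a min-heap, records parent links, derives each leaf's code word length by walking to the root, and estimates the coded size as the sum of frequency times code word length (the product the pipelined multiplier would compute per leaf). It is a software approximation under stated assumptions: the function name `huffman_code_lengths` is hypothetical, a dictionary stands in for the parent node buffer, and the 15-bit length limit that DEFLATE imposes is not enforced.

```python
import heapq


def huffman_code_lengths(freq):
    """Return (code word lengths, total coded bits) for a frequency table."""
    n = len(freq)
    heap = [(f, i) for i, f in enumerate(freq) if f > 0]  # leaves only
    heapq.heapify(heap)
    parent = {}            # node id -> parent id; the root has no entry
    next_id = n            # internal nodes are numbered after the leaves
    while len(heap) > 1:
        f1, a = heapq.heappop(heap)   # two least frequent nodes
        f2, b = heapq.heappop(heap)
        parent[a] = parent[b] = next_id
        heapq.heappush(heap, (f1 + f2, next_id))
        next_id += 1
    lengths = [0] * n
    for sym in range(n):
        if freq[sym] > 0:
            depth, node = 0, sym
            while node in parent:     # walk up to the root
                node = parent[node]
                depth += 1
            lengths[sym] = max(depth, 1)  # a lone symbol still needs 1 bit
    total_bits = sum(f * l for f, l in zip(freq, lengths))
    return lengths, total_bits
```

For example, `huffman_code_lengths([5, 1, 1, 2])` yields lengths `[1, 3, 3, 2]` and a total of 15 bits; an estimate of this kind is what a packing stage could compare against the stored and static alternatives.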
CN 201220601511 2012-11-14 2012-11-14 Compression hardware system based on GZIP Withdrawn - After Issue CN202931290U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201220601511 CN202931290U (en) 2012-11-14 2012-11-14 Compression hardware system based on GZIP

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201220601511 CN202931290U (en) 2012-11-14 2012-11-14 Compression hardware system based on GZIP

Publications (1)

Publication Number Publication Date
CN202931290U true CN202931290U (en) 2013-05-08

Family

ID=48221199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201220601511 Withdrawn - After Issue CN202931290U (en) 2012-11-14 2012-11-14 Compression hardware system based on GZIP

Country Status (1)

Country Link
CN (1) CN202931290U (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102970043A (en) * 2012-11-14 2013-03-13 无锡芯响电子科技有限公司 GZIP (GNUzip)-based hardware compressing system and accelerating method thereof
CN102970043B (en) * 2012-11-14 2016-03-30 无锡芯响电子科技有限公司 A kind of compression hardware system based on GZIP and accelerated method thereof
CN110728725A (en) * 2019-10-22 2020-01-24 苏州速显微电子科技有限公司 Hardware-friendly real-time system-oriented lossless texture compression algorithm
CN110995753A (en) * 2019-12-19 2020-04-10 中国电力科学研究院有限公司 Combined compression method for remote communication message in electricity consumption information acquisition system

Similar Documents

Publication Publication Date Title
CN102970043B (en) A kind of compression hardware system based on GZIP and accelerated method thereof
CN103236847B (en) Based on the data lossless compression method of multilayer hash data structure and Run-Length Coding
CN102457283B (en) A kind of data compression, decompression method and equipment
CN1183683C (en) Position adaptive coding method using prefix prediction
CN103997346B (en) Data matching method and device based on assembly line
CN104202054A (en) Hardware LZMA (Lempel-Ziv-Markov chain-Algorithm) compression system and method
CN107027036A (en) A kind of FPGA isomeries accelerate decompression method, the apparatus and system of platform
CN103150260B (en) Data de-duplication method and device
CN102244518A (en) System and method for realizing parallel decompression of hardware
CN104348490A (en) Combined data compression algorithm based on effect optimization
CN103427844B (en) A kind of high-speed lossless data compression method based on GPU and CPU mixing platform
CN103384884A (en) File compression method and device, file decompression method and device, and server
CN202931289U (en) Hardware LZ 77 compression implement system
CN104199951B (en) Web page processing method and device
CN103023509A (en) Hardware LZ77 compression implementation system and implementation method thereof
CN107565971A (en) A kind of data compression method and device
CN103916131A (en) Data compression method and device for performing the same
CN103095305A (en) System and method for hardware LZ77 compression implementation
CN116016606B (en) Sewage treatment operation and maintenance data efficient management system based on intelligent cloud
CN100349160C (en) Data compression method by finite exhaustive optimization
CN202931290U (en) Compression hardware system based on GZIP
CN103428494A (en) Image sequence coding and recovering method based on cloud computing platform
CN102983866A (en) Dynamic Huffman encoding hardware implementation system and implementation method thereof
CN114157305B (en) Method for rapidly realizing GZIP compression based on hardware and application thereof
CN105005464B (en) A kind of Burrows Wheeler mapping hardware processing units

Legal Events

Date Code Title Description
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP02 Change in the address of a patent holder

Address after: Room E701 No. 20 building science and Technology Park Liye sensor network university 214000 Jiangsu province Wuxi City District Qingyuan Road

Patentee after: Wuxi Xinxiang Electronic Technology Co., Ltd.

Address before: 214000 Jiangsu Province, Wuxi City District Qingyuan Road Branch Park 530 building A room 512

Patentee before: Wuxi Xinxiang Electronic Technology Co., Ltd.

AV01 Patent right actively abandoned

Granted publication date: 20130508

Effective date of abandoning: 20160330

C25 Abandonment of patent right or utility model to avoid double patenting