CN107483055B - Lossless compression method and system - Google Patents
- Publication number: CN107483055B (application CN201710660141.XA, also published as CN107483055A)
- Authority: CN (China)
- Prior art keywords: dictionary, character string, module, cam, real
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
Abstract
The application discloses a lossless compression method that is truly lossless, compresses in real time, compresses stably, and is easy to maintain and upgrade in software, overcoming the technical bottlenecks and hardware constraints imposed by a dedicated compression chip. In the method, a dynamic dictionary with a depth of 2K is built in Block RAM inside an FPGA, and a character string pair is formed as Precode + Char. If an identical string pair is found in the dictionary, the storage address corresponding to that string is assigned to Precode. If no identical string pair is found, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair. A system employing the method is also provided.
Description
Technical Field
The invention belongs to the technical field of data communication and data processing, and particularly relates to a lossless compression method and a lossless compression system.
Background
At present, lossless data compression is generally implemented with a dedicated compression chip, for example a lossless compression chip from ADI in the United States.
Implementing lossless data compression with a dedicated compression chip has three bottlenecks. First, hardware design becomes harder: for example, the PCB area grows, board power consumption rises, and heat dissipation becomes more difficult. Second, the additional hardware chip raises the design cost. Third, because compression is performed by a dedicated chip, the system's program is difficult to upgrade and maintain later.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art by providing a lossless compression method that compresses in real time, compresses stably, and is easy to maintain and upgrade in software, thereby overcoming the technical bottlenecks and hardware constraints imposed by a dedicated compression chip.
The technical solution of the invention is as follows: in the lossless compression method, a dynamic dictionary with a depth of 2K is built in Block RAM inside an FPGA, and a character string pair is formed as Precode + Char. If an identical string pair is found in the dictionary, the storage address corresponding to that string is assigned to Precode. If no identical string pair is found, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.
A system adopting the lossless compression method is also provided. The system uses Block RAM inside the FPGA to construct a CAM dictionary and a mirror dictionary, and comprises a CAM_LZW module, a Cam_dis module, a Decoder module and a Mirror_dis module. The external data receiving port of the FPGA chip consists of 4 Camera Link high-speed channels, the data storage interface consists of 4 SATA interfaces, and DDR memory chips are connected externally to the FPGA chip. Input data are compressed by the LZW compression algorithm, which is implemented with the double dictionary built from Block RAM inside the FPGA; once the dictionary is full it is overwritten cyclically, and the compressed data are then stored on an SSD.
By constructing a double dictionary in Block RAM inside the FPGA to implement the LZW compression algorithm, and by relying on mature FPGA technology, the invention compresses in real time, compresses stably, and is easy to maintain and upgrade in software, overcoming the technical bottlenecks and hardware constraints imposed by a dedicated compression chip.
Drawings
FIG. 1 is a flow chart of a lossless compression method according to the present invention;
FIG. 2 is a schematic diagram of a lossless compression system according to the present invention;
FIG. 3 is a diagram of a CAM dictionary in accordance with the present invention;
FIG. 4 is a block diagram of a data compression storage system arrangement according to the present invention.
Detailed Description
Lossless compression by Lempel-Ziv-Welch encoding (LZW for short) involves three important objects: a data stream (CharStream), a code stream (CodeStream), and a string table (String Table). During encoding, the data stream is the input (for example, the data sequence of a text file) and the code stream is the output (the compressed, encoded data); during decoding, the code stream is the input and the data stream is the output. The string table is needed for both encoding and decoding. The correspondence between strings and codes is generated dynamically during compression and is implicit in the compressed data; during decompression the table is rebuilt and the data are recovered from it, which is what makes the compression lossless.
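The relationship between the three objects can be illustrated with a small software sketch of textbook LZW. This is a generic Python model written for illustration, not the FPGA design of the invention; the function names `lzw_encode` and `lzw_decode` and the unbounded table are assumptions. It shows that the decoder rebuilds the string table from the code stream alone, so the original data stream is recovered exactly.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Textbook LZW encoder: data stream in, code stream out."""
    # The string table starts with one entry per single byte (codes 0..255).
    table = {bytes([i]): i for i in range(256)}
    codes = []
    prefix = b""
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in table:
            prefix = candidate               # keep extending the match
        else:
            codes.append(table[prefix])      # emit the code of the longest match
            table[candidate] = len(table)    # learn the new string
            prefix = bytes([byte])
    if prefix:
        codes.append(table[prefix])
    return codes


def lzw_decode(codes: list[int]) -> bytes:
    """Textbook LZW decoder: rebuilds the same table from the codes alone."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    prev = table[codes[0]]
    out = bytearray(prev)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry that is about to be created.
            entry = prev + prev[:1]
        else:
            raise ValueError("invalid code stream")
        out += entry
        table[len(table)] = prev + entry[:1]
        prev = entry
    return bytes(out)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
encoded = lzw_encode(sample)
restored = lzw_decode(encoded)
```

Here `restored` equals `sample`, and `encoded` contains fewer codes than `sample` has bytes because repeated substrings are replaced by table codes.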
At present, lossless data compression is implemented with a dedicated compression chip, which raises the design cost, the hardware design difficulty, and the difficulty of maintaining and upgrading the system program. The invention addresses these shortcomings by constructing a double dictionary in Block RAM inside the FPGA to implement the LZW compression algorithm.
In the lossless compression method, a dynamic dictionary with a depth of 2K is built in Block RAM inside an FPGA, and a character string pair is formed as Precode + Char. If an identical string pair is found in the dictionary, the storage address corresponding to that string is assigned to Precode. If no identical string pair is found, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.
Preferably, the modules are developed in the Verilog language.
Preferably, as shown in FIG. 1, the method comprises the following steps:
(1) start;
(2) construct the character string pair Precode + Char and the real character string Real_string, and query the CAM dictionary for the pair Precode + Char by content addressing;
(3) determine whether the string pair exists in the CAM dictionary; if so, execute step (4), otherwise execute step (5);
(4) send the chip-select address corresponding to the string to the decoding module for decoding, then go to step (6);
(5) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(6) the decoding module uses the chip-select address to decode the unique address corresponding to the string in the CAM dictionary, whose address depth is 2K; the unique address is sent to the mirror dictionary module, where the stored real string is fetched and compared with the constructed real string Real_string; if they are equal, execute step (7), otherwise execute step (8);
(7) assign the decoded unique address to the prefix Precode of the string pair, wait for the next original data Char, and go to step (9);
(8) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(9) end.
The character string pair (Precode + Char) is stored in the CAM dictionary, and the real character string (Real_string) is stored in the mirror dictionary.
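The flow of steps (2) to (8) can be approximated by a behavioural software model. The sketch below is a Python illustration under stated assumptions, not the Verilog design: the class name `DualDictEncoder`, the `lookup` map that stands in for the content-addressed search, and the choice to reserve addresses 0 to 255 for literal bytes (`FIRST_ADDR`) are all hypothetical. It shows why the mirror comparison matters once the dictionary wraps around: a stored pair can survive after the entry its Precode points to has been overwritten, and only the real-string comparison detects that.

```python
DEPTH = 2048       # dictionary depth of 2K, as stated in the text
FIRST_ADDR = 256   # assumption: addresses 0..255 stand for literal bytes


class DualDictEncoder:
    """Behavioural model of the CAM dictionary plus mirror dictionary."""

    def __init__(self, depth: int = DEPTH, first_addr: int = FIRST_ADDR):
        self.depth = depth
        self.first_addr = first_addr
        self.next_addr = first_addr
        self.cam = {}      # address -> (precode, char) pair stored at that address
        self.lookup = {}   # (precode, char) -> address, models the content search
        self.mirror = {}   # address -> real string stored at that address

    def _update(self, pair, real_string: bytes) -> None:
        """Steps (5)/(8): write both dictionaries, overwriting cyclically."""
        addr = self.next_addr
        old = self.cam.get(addr)
        if old is not None and self.lookup.get(old) == addr:
            del self.lookup[old]   # the overwritten pair is no longer searchable
        self.cam[addr] = pair
        self.lookup[pair] = addr
        self.mirror[addr] = real_string
        self.next_addr = addr + 1 if addr + 1 < self.depth else self.first_addr

    def encode(self, data: bytes) -> list[int]:
        codes = []
        precode = None
        real = b""
        for char in data:
            if precode is None:                  # first symbol seeds the prefix
                precode, real = char, bytes([char])
                continue
            pair = (precode, char)               # step (2): build pair and real string
            real_string = real + bytes([char])
            addr = self.lookup.get(pair)         # step (3): content-addressed query
            if addr is not None and self.mirror[addr] == real_string:
                precode, real = addr, real_string    # step (7): true match
            else:
                self._update(pair, real_string)      # steps (5)/(8)
                codes.append(precode)                # output the compressed Code
                precode, real = char, bytes([char])
        if precode is not None:
            codes.append(precode)                # flush the final prefix
        return codes


codes = DualDictEncoder().encode(b"TOBEORNOTTOBEORTOBEORNOT")
```

For inputs short enough that the dictionary never wraps, this model emits the same codes as textbook LZW. Passing a small `depth` (for example 260) forces cyclic overwriting and exercises the mirror comparison.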
As shown in FIG. 2, a system using the lossless compression method is also provided. The system uses Block RAM inside the FPGA to construct a CAM dictionary and a mirror dictionary, and comprises a CAM_LZW module, a Cam_dis module, a Decoder module and a Mirror_dis module. The external data receiving port of the FPGA chip consists of 4 Camera Link high-speed channels, the data storage interface consists of 4 SATA interfaces, and DDR memory chips are connected externally to the FPGA chip. Input data are compressed by the LZW compression algorithm, which is implemented with the double dictionary built from Block RAM inside the FPGA; once the dictionary is full it is overwritten cyclically, and the compressed data are then stored on an SSD. In this way, real-time compression and storage of video data can be achieved.
For example, the hardware configuration is as follows:
FPGA: Xilinx XC7K325T
DDR memory chips: MT42L32M16D1FE-25AIT from Micron
SATA connector: Molex high-speed connector
Preferably, in the system:
CAM_LZW module: the top level of the LZW compression algorithm; it instantiates the Cam_dis module, the Decoder module and the Mirror_dis module;
Cam_dis module: queries the storage position (address) of the character string pair Precode + Char in the CAM dictionary by content addressing, traversing the 2K-deep address space in 16 clock cycles to find the storage address corresponding to the pair; the CAM dictionary is built from 32 dual-port Block RAMs inside the FPGA, each a dual-port RAM with a data width of 19 bits and an address depth of 2K;
Decoder module: responsible for address decoding; it decodes the storage address corresponding to the character string pair Precode + Char in the 2K-deep CAM dictionary from the chip-select addresses of the 32 Block RAMs;
Mirror_dis module: built from 1 dual-port Block RAM inside the FPGA and used to store the real character string Real_string. If the character string pair Precode + Char is found in the Cam_dis dictionary, the real strings are compared in this module; the lookup counts as a successful match only when the data strings truly match, otherwise it is treated as a mismatch and the Code value is output.
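The 19-bit data width is consistent with an 11-bit Precode (2^11 = 2048 addresses, matching the 2K depth) concatenated with an 8-bit Char. The text does not spell out the field layout, so the split, the field order, and the helper names `pack_pair` and `unpack_pair` in the following Python sketch are assumptions made for illustration only.

```python
PRECODE_BITS = 11   # 2**11 == 2048 addresses, matching the 2K dictionary depth
CHAR_BITS = 8       # assumption: one byte of raw data per Char


def pack_pair(precode: int, char: int) -> int:
    """Pack a (Precode, Char) pair into one 19-bit word, Precode in the high bits."""
    if not 0 <= precode < (1 << PRECODE_BITS):
        raise ValueError("precode does not fit in 11 bits")
    if not 0 <= char < (1 << CHAR_BITS):
        raise ValueError("char does not fit in 8 bits")
    return (precode << CHAR_BITS) | char


def unpack_pair(word: int) -> tuple[int, int]:
    """Split a 19-bit word back into its (Precode, Char) fields."""
    return word >> CHAR_BITS, word & ((1 << CHAR_BITS) - 1)


word = pack_pair(256, ord("B"))
fields = unpack_pair(word)
```

Under this assumed layout every valid pair maps to a distinct 19-bit word, so comparing stored words is equivalent to comparing pairs.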
The software simulation and hardware debugging of the LZW compression method of the present invention will be described in further detail below.
Software simulation of the LZW compression algorithm: ModelSim simulation software is used to verify the functional correctness of the LZW compression algorithm built on the Block RAM-based double dictionary.
Hardware debugging of the LZW compression algorithm:
First, the LZW compression program built on the Block RAM-based double dictionary is synthesized, constrained and implemented with the Xilinx Vivado tool, a bit file is generated, and the program is downloaded. Second, the internal signal waveforms of the design are observed with the Xilinx on-chip logic analyzer ChipScope Pro to verify that the LZW compression algorithm compresses data correctly.
The invention has the beneficial effects that:
1) video or image data can be compressed in real time (8.7 MB/s);
2) the original data can be compressed to 50%-70% of its size, which suits practical data compression applications;
3) in video data acquisition and storage applications, storage space can be saved;
4) in network data acquisition and transmission applications, the data transmission speed can be improved;
5) the compression algorithm is easy to upgrade and maintain.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiment according to the technical spirit of the present invention still belong to the protection scope of the technical solution of the present invention.
Claims (4)
1. A lossless compression method, characterized in that a dynamic dictionary and a mirror dictionary with a depth of 2K are built in Block RAM inside an FPGA, and a character string pair Precode + Char and a real character string Real_string are constructed from the original data Char; if an identical string pair is found in the dynamic dictionary and a real string identical to Real_string exists in the mirror dictionary, the storage address corresponding to the string is assigned to Precode; if no identical string pair is found in the dynamic dictionary, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.
2. The lossless compression method as claimed in claim 1, wherein the modules are developed in the Verilog language.
3. The lossless compression method as claimed in claim 2, characterized in that the method comprises the following steps:
(1) start;
(2) construct the character string pair Precode + Char and the real character string Real_string, and query the CAM dictionary for the pair Precode + Char by content addressing;
(3) determine whether the string pair exists in the CAM dictionary; if so, execute step (4), otherwise execute step (5);
(4) send the chip-select address corresponding to the string to the decoding module for decoding, then go to step (6);
(5) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(6) the decoding module uses the chip-select address to decode the unique address corresponding to the string in the CAM dictionary, whose address depth is 2K; the unique address is sent to the mirror dictionary module, where the stored real string is fetched and compared with the constructed real string Real_string; if they are equal, execute step (7), otherwise execute step (8);
(7) assign the decoded unique address to the prefix Precode of the string pair, wait for the next original data Char, and go to step (9);
(8) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(9) end.
4. A lossless compression system, characterized in that the system uses Block RAM inside an FPGA to construct a CAM dictionary and a mirror dictionary, and comprises a CAM_LZW module, a Cam_dis module, a Decoder module and a Mirror_dis module; the external data receiving port of the FPGA chip consists of 4 Camera Link high-speed channels, the data storage interface consists of 4 SATA interfaces, and DDR memory chips are connected externally to the FPGA chip; input data are compressed by the LZW compression algorithm implemented with the double dictionary built from Block RAM inside the FPGA, the dictionary is overwritten cyclically once it is full, and the compressed data are then stored on an SSD; in the system:
CAM_LZW module: the top level of the LZW compression algorithm; it instantiates the Cam_dis module, the Decoder module and the Mirror_dis module;
Cam_dis module: queries the storage position (address) of the character string pair Precode + Char in the CAM dictionary by content addressing, traversing the 2K-deep address space in 16 clock cycles to find the storage address corresponding to the pair; the CAM dictionary is built from 32 dual-port Block RAMs inside the FPGA, each a dual-port RAM with a data width of 19 bits and an address depth of 2K;
Decoder module: responsible for address decoding; it decodes the storage address corresponding to the character string pair Precode + Char in the 2K-deep CAM dictionary from the chip-select addresses of the 32 Block RAMs;
Mirror_dis module: built from 1 dual-port Block RAM inside the FPGA and used to store the real character string Real_string; if the character string pair Precode + Char is found in the Cam_dis dictionary, the real strings are compared in this module; the lookup counts as a successful match only when the data strings truly match, otherwise it is treated as a mismatch and the Code value is output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710660141.XA CN107483055B (en) | 2017-08-04 | 2017-08-04 | Lossless compression method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710660141.XA CN107483055B (en) | 2017-08-04 | 2017-08-04 | Lossless compression method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107483055A (en) | 2017-12-15 |
CN107483055B (en) | 2020-06-16 |
Family
ID=60597735
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710660141.XA (Active, CN107483055B (en)) | 2017-08-04 | 2017-08-04 | Lossless compression method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107483055B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108259041A (en) * | 2017-12-29 | 2018-07-06 | 中国电子科技集团公司第二十研究所 | A kind of Big Dipper data expansion method based on modified LZW Coding Compression Technologies |
CN108494409B (en) * | 2018-03-14 | 2021-07-13 | 电子科技大学 | Underground high-speed real-time compression method of neutron logging-while-drilling instrument based on small dictionary |
CN108494408B (en) * | 2018-03-14 | 2021-07-13 | 电子科技大学 | Hash dictionary-based underground high-speed real-time compression method for density logging while drilling instrument |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101572552A (en) * | 2009-06-11 | 2009-11-04 | 哈尔滨工业大学 | High-speed lossless data compression system based on content addressable memory |
CN103326732A (en) * | 2013-05-10 | 2013-09-25 | 华为技术有限公司 | Method for packing data, method for unpacking data, coder and decoder |
CN105183557A (en) * | 2015-08-26 | 2015-12-23 | 东南大学 | Configurable data compression system based on hardware |
CN106407285A (en) * | 2016-08-26 | 2017-02-15 | 西安空间无线电技术研究所 | RLE and LZW-based optimized bit file compression and decompression method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8159374B2 (en) * | 2009-11-30 | 2012-04-17 | Red Hat, Inc. | Unicode-compatible dictionary compression |
US9760593B2 (en) * | 2014-09-30 | 2017-09-12 | International Business Machines Corporation | Data dictionary with a reduced need for rebuilding |
Non-Patent Citations (1)
Title |
---|
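"Implementation of the LZW algorithm based on a CAM dictionary" (《基于CAM字典的LZW算法的实现》); Zhu Baofeng et al.; Electrical Measurement & Instrumentation (《电测与仪表》); 2010-07-31; Vol. 47, No. 535; pp. 69-73 * |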
Also Published As
Publication number | Publication date |
---|---|
CN107483055A (en) | 2017-12-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107483055B (en) | Lossless compression method and system | |
US11463102B2 (en) | Data compression method, data decompression method, and related apparatus, electronic device, and system | |
US9454552B2 (en) | Entropy coding and decoding using polar codes | |
KR101956031B1 (en) | Data compressor, memory system comprising the compress and method for compressing data | |
CN107565971B (en) | Data compression method and device | |
US20090284400A1 (en) | Method and System for Reducing Required Storage During Decompression of a Compressed File | |
CN107027036A (en) | A kind of FPGA isomeries accelerate decompression method, the apparatus and system of platform | |
CN110518917B (en) | LZW data compression method and system based on Huffman coding | |
CN101989443A (en) | Multi-mode encoding for data compression | |
CN103346800B (en) | A kind of data compression method and device | |
CN103236847A (en) | Multilayer Hash structure and run coding-based lossless compression method for data | |
CN111884660B (en) | Huffman coding equipment | |
CN101783788A (en) | File compression method, file compression device, file decompression method, file decompression device, compressed file searching method and compressed file searching device | |
Li et al. | Implementation of LZMA compression algorithm on FPGA | |
CN104125475A (en) | Multi-dimensional quantum data compressing and uncompressing method and apparatus | |
CN109672449B (en) | Device and method for rapidly realizing LZ77 compression based on FPGA | |
CN105279123A (en) | Serial port conversion structure and method of dual-redundancy 1553B bus | |
CN109491854B (en) | SoC prototype verification method based on FPGA | |
CN117040539A (en) | Petroleum logging data compression method and device based on M-ary tree and LZW algorithm | |
CN116894016A (en) | Log compression method and device for rail transit signals | |
CN103210590A (en) | Compression method and apparatus | |
CN108932315A (en) | A kind of method and relevant apparatus of data decompression | |
CN103631983A (en) | Method and system for simulating tactical data messages | |
CN113891088A (en) | PNG image decompression logic circuit and device | |
Yang et al. | A Hardware Implementation of Real Time Lossless Data Compression and Decompression Circuits |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||