CN107483055B - Lossless compression method and system - Google Patents

Info

Publication number
CN107483055B
Authority
CN
China
Prior art keywords
dictionary
character string
module
cam
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710660141.XA
Other languages
Chinese (zh)
Other versions
CN107483055A (en)
Inventor
王怀亮
郭兆军
李焕涛
赵林伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Century Mingchen Technology Co ltd
Original Assignee
Beijing Century Mingchen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Century Mingchen Technology Co ltd
Priority to CN201710660141.XA
Publication of CN107483055A
Application granted
Publication of CN107483055B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method

Abstract

The application discloses a lossless compression method that provides truly lossless compression, real-time operation, stable compression, and easy maintenance and upgrading of the software program, and that overcomes the technical bottlenecks and hardware constraints imposed by a dedicated compression chip. In the method, a dynamic dictionary with a depth of 2K is built in the Block RAM inside an FPGA, and a string pair is formed as Precode + Char. If an identical string pair is found in the dictionary, the storage address corresponding to that string pair is assigned to Precode. If no identical string pair is found in the dictionary, the compressed Code is output, its value being the prefix Precode of the string pair, and the original data Char is then assigned to Precode. A system employing the method is also provided.

Description

Lossless compression method and system
Technical Field
The invention belongs to the technical field of data communication and data processing, and particularly relates to a lossless compression method and a lossless compression system.
Background
Lossless data compression is currently implemented with dedicated compression chips, for example the lossless compression chips from ADI in the United States.
Implementing lossless data compression with a dedicated compression chip has three bottlenecks. First, hardware design becomes more difficult: the PCB area grows, the board's power consumption rises, and heat dissipation becomes harder. Second, the additional hardware chip raises the design cost. Third, because the compression is fixed in a dedicated chip, the system's program is difficult to upgrade and maintain later.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art by providing a lossless compression method that compresses in real time, compresses stably, is easy to maintain and upgrade as a software program, and overcomes the technical bottlenecks and hardware constraints imposed by a dedicated compression chip.
The technical solution of the invention is as follows. In the lossless compression method, a dynamic dictionary with a depth of 2K is built in the Block RAM inside an FPGA, and a string pair is formed as Precode + Char. If an identical string pair is found in the dictionary, the storage address corresponding to that string pair is assigned to Precode. If no identical string pair is found in the dictionary, the compressed Code is output, its value being the prefix Precode of the string pair, and the original data Char is then assigned to Precode.
A system using the lossless compression method is also provided. The system uses the Block RAM inside the FPGA to construct a CAM dictionary and a mirror dictionary, and comprises a CAM_LZW module, a CAM_dis module, a Decoder module, and a Mirror_dis module. The external data receiving port of the FPGA chip is four Camera Link high-speed signal channels, the data storage interface is four SATA interfaces, and the FPGA chip is connected to external DDR memory chips. Input data are compressed by the LZW compression algorithm, implemented with the dual dictionary built from the Block RAM inside the FPGA; once the dictionary is full it is overwritten cyclically, and the compressed data are then stored on an SSD.
By constructing a dual dictionary from the Block RAM inside the FPGA to implement the LZW compression algorithm, and by relying on mature FPGA technology, the invention compresses in real time, compresses stably, is easy to maintain and upgrade as a software program, and overcomes the technical bottlenecks and hardware constraints imposed by a dedicated compression chip.
Drawings
FIG. 1 is a flow chart of a lossless compression method according to the present invention;
FIG. 2 is a schematic diagram of a lossless compression system according to the present invention;
FIG. 3 is a diagram of a CAM dictionary in accordance with the present invention;
FIG. 4 is a block diagram of a data compression storage system arrangement according to the present invention.
Detailed Description
The lossless compression method (Lempel-Ziv-Welch Encoding, LZW for short) has three important objects: a data stream (CharStream), a code stream (codeStream), and a coding table (String Table). During encoding, the data stream is the input (for example, the data sequence of a text file) and the code stream is the output (the encoded data produced by compression); during decoding, the code stream is the input and the data stream is the output. The coding table is used in both encoding and decoding. The correspondence between strings and codes is generated dynamically during compression and is implicit in the compressed data; during decompression the table is rebuilt from that data and used to restore the original, which is what makes the compression lossless.
At present, lossless data compression is implemented with dedicated compression chips, which increases design cost, hardware design difficulty, and the difficulty of maintaining and upgrading the system program. The invention therefore addresses these shortcomings by constructing a dual dictionary from the Block RAM inside the FPGA to implement the LZW compression algorithm.
In the lossless compression method, a dynamic dictionary with a depth of 2K is built in the Block RAM inside an FPGA, and a string pair is formed as Precode + Char. If an identical string pair is found in the dictionary, the storage address corresponding to that string pair is assigned to Precode. If no identical string pair is found in the dictionary, the compressed Code is output, its value being the prefix Precode of the string pair, and the original data Char is then assigned to Precode.
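The Precode + Char rule can be illustrated with a minimal software sketch of textbook LZW. This is not the patent's FPGA implementation: it is plain Python, the dictionary is unbounded rather than 2K deep, and the function names `lzw_encode` and `lzw_decode` are hypothetical. Codes 0-255 are assumed to stand for literal bytes, with learned pairs numbered from 256. The decoder shows how the table is rebuilt from the code stream alone.

```python
def lzw_encode(data: bytes) -> list:
    """Textbook LZW encoder: codes 0-255 are literal bytes, learned pairs start at 256."""
    table = {}          # (Precode, Char) -> code assigned to that pair
    next_code = 256
    codes = []
    precode = None      # code of the string matched so far
    for char in data:
        if precode is None:              # first byte: nothing to pair it with yet
            precode = char
        elif (precode, char) in table:   # pair found: its code becomes the new Precode
            precode = table[(precode, char)]
        else:                            # pair not found: output Precode, learn the pair
            codes.append(precode)
            table[(precode, char)] = next_code
            next_code += 1
            precode = char               # Char becomes the new Precode
    if precode is not None:
        codes.append(precode)            # flush the last pending prefix
    return codes


def lzw_decode(codes: list) -> bytes:
    """Rebuild the same table from the code stream; the table is never transmitted."""
    strings = {}        # code -> byte string it stands for
    next_code = 256
    out = bytearray()
    prev = None
    for code in codes:
        if code < 256:
            cur = bytes([code])
        elif code in strings:
            cur = strings[code]
        else:                            # code not learned yet: previous string + its first byte
            cur = prev + prev[:1]
        if prev is not None:             # learn the pair the encoder added one step earlier
            strings[next_code] = prev + cur[:1]
            next_code += 1
        out += cur
        prev = cur
    return bytes(out)


codes = lzw_encode(b"ABABABA")
print(codes)               # [65, 66, 256, 258]
print(lzw_decode(codes))   # b'ABABABA'
```

Seven input bytes become four codes here; the gain on real data depends on how repetitive the input is.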
Preferably, the modules of the method are developed and designed in the Verilog language.
Preferably, as shown in FIG. 1, the method comprises the following steps:
(1) start;
(2) construct the string pair Precode + Char and the real string Real_string, and look up the string pair Precode + Char in the CAM dictionary by content addressing;
(3) determine whether the string pair exists in the CAM dictionary; if so, go to step (4), otherwise go to step (5);
(4) send the chip-select address corresponding to the string pair to the decoding module for decoding, then go to step (6);
(5) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(6) the decoding module uses the chip-select address to decode the unique address corresponding to the string pair in the CAM dictionary, whose address depth is 2K; the unique address is sent to the mirror dictionary module, where the stored real string is fetched and compared with the constructed real string Real_string; if the two are equal, go to step (7), otherwise go to step (8);
(7) assign the decoded unique address to the prefix Precode of the string pair, wait for the next original data Char, and go to step (9);
(8) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(9) end.
The string pair (Precode + Char) is stored in the CAM dictionary, and the real string (Real_string) is stored in the mirror dictionary.
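The flow of steps (1) to (9) can be modelled in software as a rough sketch. Several details below are assumptions rather than statements from the patent: codes 0-255 are reserved for literal bytes, new entries are written at a pointer that wraps back to address 256 when the dictionary is full (one reading of "overwritten cyclically"), and a Python dict stands in for the parallel CAM search. The names `compress`, `cam_at`, and `FIRST_ADDR` are hypothetical. In this model the mirror comparison of step (6) rejects a CAM hit whose stored real string no longer matches, which can happen after entries have been overwritten. A matching decoder is not shown.

```python
FIRST_ADDR = 256  # assumption: codes 0-255 are reserved for literal bytes


def compress(data: bytes, depth: int = 2048) -> list:
    """Software model of the FIG. 1 flow with a CAM dictionary and a mirror dictionary."""
    cam = {}      # CAM dictionary: (Precode, Char) -> address
    cam_at = {}   # address -> pair, so an overwritten pair can be removed from the CAM
    mirror = {}   # mirror dictionary: address -> real string stored at that address
    write = FIRST_ADDR            # next address to write (wraps when the dictionary is full)
    codes = []
    precode, real = None, b""     # current prefix code and the bytes it stands for

    for char in data:
        if precode is None:                       # first byte starts the first prefix
            precode, real = char, bytes([char])
            continue
        real_string = real + bytes([char])        # step (2): build Real_string
        addr = cam.get((precode, char))           # steps (2)-(4): content-addressed lookup
        if addr is not None and mirror[addr] == real_string:   # step (6): mirror comparison
            precode, real = addr, real_string     # step (7): address becomes the new Precode
            continue
        codes.append(precode)                     # steps (5)/(8): output the Code value
        old_pair = cam_at.get(write)              # pair currently stored at the write address
        if old_pair is not None and cam.get(old_pair) == write:
            del cam[old_pair]                     # it is about to be overwritten
        cam[(precode, char)] = write              # update the CAM dictionary
        cam_at[write] = (precode, char)
        mirror[write] = real_string               # update the mirror dictionary
        write = FIRST_ADDR if write + 1 == depth else write + 1   # cyclic overwrite
        precode, real = char, bytes([char])       # Char becomes the new Precode
    if precode is not None:
        codes.append(precode)                     # flush the last pending prefix
    return codes


print(compress(b"ABABABA"))             # [65, 66, 256, 258]
print(compress(b"a" * 10, depth=258))   # [97, 256, 257, 97, 257, 97]
```

The second call uses a deliberately tiny dictionary (two learned entries) so that the wrap-around happens within ten bytes. On the last byte the CAM still holds the pair, but the mirror entry at that address reads `aaaa` rather than `aaa`, so the lookup is treated as a mismatch and a Code is output instead.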
As shown in FIG. 2, a system using the lossless compression method is also provided. The system uses the Block RAM inside an FPGA to construct a CAM dictionary and a mirror dictionary, and comprises a CAM_LZW module, a CAM_dis module, a Decoder module, and a Mirror_dis module. The external data receiving port of the FPGA chip is four Camera Link high-speed signal channels, the data storage interface is four SATA interfaces, and the FPGA chip is connected to external DDR memory chips. Input data are compressed by the LZW compression algorithm, implemented with the dual dictionary built from the Block RAM inside the FPGA; once the dictionary is full it is overwritten cyclically, and the compressed data are then stored on an SSD. In this way, video data can be compressed and stored in real time.
For example, the hardware configuration is as follows:
FPGA: Xilinx XC7K325T
DDR memory: MT42L32M16D1FE-25AIT from Micron
SATA connector: Molex high-speed connector
Preferably, in the system:
CAM_LZW module: the top level of the LZW compression algorithm; it instantiates the Cam_dis module, the Decoder module, and the Mirror_dis module.
Cam_dis module: uses content addressing to query the storage location (address) of the string pair Precode + Char in the CAM dictionary; it traverses the 2K-deep address space in 16 clock cycles and finds the storage address corresponding to the string pair Precode + Char in the CAM dictionary. The CAM dictionary is built from 32 dual-port Block RAMs inside the FPGA, each with a data width of 19 bits and an address depth of 2K.
Decoder module: responsible for address decoding; it decodes the storage address corresponding to the string pair Precode + Char in the 2K-deep CAM dictionary from the chip-select addresses of the 32 Block RAMs.
Mirror_dis module: built from one dual-port Block RAM inside the FPGA and used to store the real string Real_string. If the string pair Precode + Char is found in the Cam_dis dictionary, the real strings are compared in this module; the dictionary lookup counts as a successful match only when the data strings truly match, and otherwise the lookup is treated as a mismatch and the Code value is output.
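The 19-bit data width is consistent with one simple layout for a string pair, sketched below. The layout itself is an assumption: the patent states only the width and the 2K depth. A 2K-deep dictionary has 2^11 addresses, so a Precode fits in 11 bits, and if Char is taken to be one byte the pair occupies 11 + 8 = 19 bits. Placing Precode in the high bits, and the names `pack_pair` and `unpack_pair`, are illustrative choices.

```python
CHAR_BITS = 8       # assumption: Char is a single byte
PRECODE_BITS = 11   # 2K = 2**11 addresses, so a Precode fits in 11 bits
WORD_BITS = PRECODE_BITS + CHAR_BITS   # 19 bits, matching the stated data width


def pack_pair(precode: int, char: int) -> int:
    """Pack a (Precode, Char) pair into one 19-bit word, Precode in the high bits."""
    if not 0 <= precode < (1 << PRECODE_BITS):
        raise ValueError("Precode does not fit in 11 bits")
    if not 0 <= char < (1 << CHAR_BITS):
        raise ValueError("Char does not fit in 8 bits")
    return (precode << CHAR_BITS) | char


def unpack_pair(word: int) -> tuple:
    """Split a 19-bit word back into (Precode, Char)."""
    return word >> CHAR_BITS, word & ((1 << CHAR_BITS) - 1)


word = pack_pair(256, ord("a"))
print(word, unpack_pair(word))   # 65633 (256, 97)
```

With this layout, a content-addressed lookup reduces to comparing one 19-bit word against the stored words.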
By constructing a dual dictionary from the Block RAM inside the FPGA to implement the LZW compression algorithm, and by relying on mature FPGA technology, the invention compresses in real time, compresses stably, is easy to maintain and upgrade as a software program, and overcomes the technical bottlenecks and hardware constraints imposed by a dedicated compression chip.
The software simulation and hardware debugging of the LZW compression method of the invention are described in further detail below.
Software simulation of the LZW compression algorithm: ModelSim simulation software is used to verify the functional correctness of the LZW compression algorithm implemented with the dual dictionary built from Block RAM.
Hardware debugging of the LZW compression algorithm:
First, the Xilinx tool Vivado is used to synthesize, constrain, and implement the LZW compression program based on the dual dictionary built from Block RAM and to generate a bit file, and the program is downloaded. Second, the internal signal waveforms of the design are observed with the Xilinx online logic analyzer ChipScope Pro, verifying that the LZW compression algorithm compresses data accurately and without error.
The beneficial effects of the invention are:
1) video or image data can be compressed in real time (8.7 MB/s);
2) the original data can be compressed to 50%-70% of its original size, which suits practical data compression applications;
3) in video data acquisition and storage applications, storage space can be saved;
4) in network data acquisition and transmission applications, data transmission speed can be improved;
5) the compression algorithm is easy to upgrade and maintain.
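As a rough worked example of what these figures imply, the sketch below assumes that 8.7 MB/s is the rate of uncompressed input and that "50%-70%" is the ratio of compressed size to original size. Neither reading is confirmed by the text, and `compressed_rate` is a hypothetical helper.

```python
def compressed_rate(input_mb_per_s: float, ratio: float) -> float:
    """Output data rate when the compressed size is `ratio` times the input size."""
    return input_mb_per_s * ratio


INPUT_RATE = 8.7  # MB/s, taken from the text and assumed to be the uncompressed rate

for ratio in (0.5, 0.7):
    out = compressed_rate(INPUT_RATE, ratio)
    print(f"ratio {ratio:.0%}: {out:.2f} MB/s written, {INPUT_RATE - out:.2f} MB/s saved")
```

Under these assumptions the storage side would see roughly 4.35 to 6.09 MB/s instead of 8.7 MB/s.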
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiment according to the technical spirit of the present invention still belong to the protection scope of the technical solution of the present invention.

Claims (4)

1. A lossless compression method, characterized in that a dynamic dictionary and a mirror dictionary with a depth of 2K are built in the Block RAM inside an FPGA, and a string pair Precode + Char and a real string Real_string are constructed from the original data Char; if an identical string pair is found in the dynamic dictionary and a real string identical to the real string Real_string exists in the mirror dictionary, the storage address corresponding to the string pair is assigned to Precode; if no identical string pair is found in the dynamic dictionary, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.
2. The lossless compression method as claimed in claim 1, characterized in that the modules of the method are developed and designed in the Verilog language.
3. The lossless compression method as claimed in claim 2, characterized in that the method comprises the following steps:
(1) start;
(2) construct the string pair Precode + Char and the real string Real_string, and look up the string pair Precode + Char in the CAM dictionary by content addressing;
(3) determine whether the string pair exists in the CAM dictionary; if so, go to step (4), otherwise go to step (5);
(4) send the chip-select address corresponding to the string pair to the decoding module for decoding, then go to step (6);
(5) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(6) the decoding module uses the chip-select address to decode the unique address corresponding to the string pair in the CAM dictionary, whose address depth is 2K; the unique address is sent to the mirror dictionary module, where the stored real string is fetched and compared with the constructed real string Real_string; if the two are equal, go to step (7), otherwise go to step (8);
(7) assign the decoded unique address to the prefix Precode of the string pair, wait for the next original data Char, and go to step (9);
(8) update the CAM dictionary and the mirror dictionary and output the compressed Code value;
(9) end.
4. A lossless compression system, characterized in that the system uses the Block RAM inside an FPGA to construct a CAM dictionary and a mirror dictionary, and comprises a CAM_LZW module, a CAM_dis module, a Decoder module, and a Mirror_dis module; the external data receiving port of the FPGA chip is four Camera Link high-speed signal channels, the data storage interface is four SATA interfaces, and the FPGA chip is connected to external DDR memory chips; input data are compressed by the LZW compression algorithm implemented with the dual dictionary built from the Block RAM inside the FPGA, the dictionary is overwritten cyclically once it is full, and the compressed data are then stored on an SSD; wherein, in the system:
CAM_LZW module: the top level of the LZW compression algorithm; it instantiates the Cam_dis module, the Decoder module, and the Mirror_dis module;
Cam_dis module: uses content addressing to query the storage location (address) of the string pair Precode + Char in the CAM dictionary, traverses the 2K-deep address space in 16 clock cycles, and finds the storage address corresponding to the string pair Precode + Char in the CAM dictionary; the CAM dictionary is built from 32 dual-port Block RAMs inside the FPGA, each with a data width of 19 bits and an address depth of 2K;
Decoder module: responsible for address decoding; it decodes the storage address corresponding to the string pair Precode + Char in the 2K-deep CAM dictionary from the chip-select addresses of the 32 Block RAMs;
Mirror_dis module: built from one dual-port Block RAM inside the FPGA and used to store the real string Real_string; if the string pair Precode + Char is found in the Cam_dis dictionary, the real strings are compared in this module; the dictionary lookup counts as a successful match only when the data strings truly match, and otherwise the lookup is treated as a mismatch and the Code value is output.
CN201710660141.XA 2017-08-04 2017-08-04 Lossless compression method and system Active CN107483055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710660141.XA CN107483055B (en) 2017-08-04 2017-08-04 Lossless compression method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710660141.XA CN107483055B (en) 2017-08-04 2017-08-04 Lossless compression method and system

Publications (2)

Publication Number Publication Date
CN107483055A CN107483055A (en) 2017-12-15
CN107483055B CN107483055B (en) 2020-06-16

Family

ID=60597735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710660141.XA Active CN107483055B (en) 2017-08-04 2017-08-04 Lossless compression method and system

Country Status (1)

Country Link
CN (1) CN107483055B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108259041A (en) * 2017-12-29 2018-07-06 中国电子科技集团公司第二十研究所 A kind of Big Dipper data expansion method based on modified LZW Coding Compression Technologies
CN108494409B (en) * 2018-03-14 2021-07-13 电子科技大学 Underground high-speed real-time compression method of neutron logging-while-drilling instrument based on small dictionary
CN108494408B (en) * 2018-03-14 2021-07-13 电子科技大学 Hash dictionary-based underground high-speed real-time compression method for density logging while drilling instrument

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101572552A (en) * 2009-06-11 2009-11-04 哈尔滨工业大学 High-speed lossless data compression system based on content addressable memory
CN103326732A (en) * 2013-05-10 2013-09-25 华为技术有限公司 Method for packing data, method for unpacking data, coder and decoder
CN105183557A (en) * 2015-08-26 2015-12-23 东南大学 Configurable data compression system based on hardware
CN106407285A (en) * 2016-08-26 2017-02-15 西安空间无线电技术研究所 RLE and LZW-based optimized bit file compression and decompression method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8159374B2 (en) * 2009-11-30 2012-04-17 Red Hat, Inc. Unicode-compatible dictionary compression
US9760593B2 (en) * 2014-09-30 2017-09-12 International Business Machines Corporation Data dictionary with a reduced need for rebuilding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
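"Implementation of the LZW Algorithm Based on a CAM Dictionary" (《基于CAM字典的LZW算法的实现》); 朱宝峰 et al.; Electrical Measurement & Instrumentation (《电测与仪表》); 2010-07-31; Vol. 47 (No. 535); pp. 69-73 *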

Also Published As

Publication number Publication date
CN107483055A (en) 2017-12-15

Similar Documents

Publication Publication Date Title
CN107483055B (en) Lossless compression method and system
US11463102B2 (en) Data compression method, data decompression method, and related apparatus, electronic device, and system
US9454552B2 (en) Entropy coding and decoding using polar codes
KR101956031B1 (en) Data compressor, memory system comprising the compress and method for compressing data
CN107565971B (en) Data compression method and device
US20090284400A1 (en) Method and System for Reducing Required Storage During Decompression of a Compressed File
CN107027036A (en) A kind of FPGA isomeries accelerate decompression method, the apparatus and system of platform
CN110518917B (en) LZW data compression method and system based on Huffman coding
CN101989443A (en) Multi-mode encoding for data compression
CN103346800B (en) A kind of data compression method and device
CN103236847A (en) Multilayer Hash structure and run coding-based lossless compression method for data
CN111884660B (en) Huffman coding equipment
CN101783788A (en) File compression method, file compression device, file decompression method, file decompression device, compressed file searching method and compressed file searching device
Li et al. Implementation of LZMA compression algorithm on FPGA
CN104125475A (en) Multi-dimensional quantum data compressing and uncompressing method and apparatus
CN109672449B (en) Device and method for rapidly realizing LZ77 compression based on FPGA
CN105279123A (en) Serial port conversion structure and method of dual-redundancy 1553B bus
CN109491854B (en) SoC prototype verification method based on FPGA
CN117040539A (en) Petroleum logging data compression method and device based on M-ary tree and LZW algorithm
CN116894016A (en) Log compression method and device for rail transit signals
CN103210590A (en) Compression method and apparatus
CN108932315A (en) A kind of method and relevant apparatus of data decompression
CN103631983A (en) Method and system for simulating tactical data messages
CN113891088A (en) PNG image decompression logic circuit and device
Yang et al. A Hardware Implementation of Real Time Lossless Data Compression and Decompression Circuits

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant