CN107483055B

CN107483055B - Lossless compression method and system

Info

Publication number: CN107483055B
Application number: CN201710660141.XA
Authority: CN
Inventors: 王怀亮; 郭兆军; 李焕涛; 赵林伟
Original assignee: Beijing Century Mingchen Technology Co ltd
Current assignee: Beijing Century Mingchen Technology Co ltd
Priority date: 2017-08-04
Filing date: 2017-08-04
Publication date: 2020-06-16
Anticipated expiration: 2037-08-04
Also published as: CN107483055A

Abstract

The application discloses a lossless compression method that offers truly lossless compression, real-time compression, stable compression and convenient maintenance and upgrading of the software program, and that overcomes the technical bottleneck and hardware limitations imposed by a dedicated compression chip. In the method, a dynamic dictionary with a depth of 2K is built in Block RAM inside an FPGA, and a character string pair is formed as Precode + Char. If the same string pair is found in the dictionary, the storage address corresponding to that string pair is assigned to Precode; if no identical string pair is found in the dictionary, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair. A system employing the method is also provided.

Description

Lossless compression method and system

Technical Field

The invention belongs to the technical field of data communication and data processing, and particularly relates to a lossless compression method and a lossless compression system.

Background

Lossless data compression is currently implemented with a dedicated compression chip, for example a lossless compression chip from ADI of the United States.

Implementing lossless data compression with a dedicated compression chip has three bottlenecks: first, hardware design becomes more difficult, for example the PCB area grows, the power consumption of the circuit board rises and heat dissipation becomes harder; second, the additional hardware chip increases the design cost; third, because the compression is performed by a dedicated chip, the system program is difficult to upgrade and maintain later.

Disclosure of Invention

The technical problem to be solved by the invention is to overcome the defects of the prior art by providing a lossless compression method that offers real-time compression, stable compression and convenient maintenance and upgrading of the software program, and that overcomes the technical bottleneck and hardware limitations imposed by a dedicated compression chip.

The technical solution of the invention is as follows: in the lossless compression method, a dynamic dictionary with a depth of 2K is built in Block RAM inside an FPGA, and a character string pair is formed as Precode + Char; if the same string pair is found in the dictionary, the storage address corresponding to that string pair is assigned to Precode; if no identical string pair is found in the dictionary, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.

A system adopting the lossless compression method is also provided. The system uses Block RAM inside the FPGA to construct a CAM dictionary and a mirror dictionary, and comprises: a CAM_LZW module, a Cam_dis module, a Decoder module and a Mirror_dis module. The external data receiving port of the FPGA chip consists of 4 Camera Link high-speed signal channels, the data storage interface consists of 4 SATA interfaces, and DDR memory chips are connected externally to the FPGA chip. Input data are compressed by the LZW compression algorithm implemented with the dual dictionary built from Block RAM inside the FPGA; once the dictionary is full it is overwritten cyclically; the compressed data are then stored on an SSD.

The invention implements the LZW compression algorithm with a dual dictionary built from Block RAM inside the FPGA. Because it relies on mature FPGA technology, it offers real-time compression, stable compression and convenient maintenance and upgrading of the software program, and it overcomes the technical bottleneck and hardware limitations imposed by a dedicated compression chip.

Drawings

FIG. 1 is a flow chart of a lossless compression method according to the present invention;

FIG. 2 is a schematic diagram of a lossless compression system according to the present invention;

FIG. 3 is a diagram of a CAM dictionary in accordance with the present invention;

FIG. 4 is a block diagram of a data compression storage system arrangement according to the present invention.

Detailed Description

The lossless compression method (Lempel-Ziv-Welch Encoding, LZW for short) involves three important objects: a data stream (CharStream), a code stream (codeStream) and a string table (String Table). During encoding, the data stream is the input (for example, the data sequence of a text file) and the code stream is the output (the compressed, encoded data); during decoding, the code stream is the input and the data stream is the output. The string table is used in both encoding and decoding. The correspondence between strings and codes is generated dynamically during compression and is implicit in the compressed data; during decompression the table is rebuilt and the data are recovered from it, which is what makes the compression lossless.
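The relationship between the three objects can be illustrated with a minimal software sketch of textbook LZW. It is not the FPGA implementation of the invention; the function names `lzw_encode` and `lzw_decode` are hypothetical, the table is unbounded, and codes 0 to 255 are assumed to stand for the single bytes.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Textbook LZW: data stream in, code stream out."""
    # String table, pre-loaded with every single byte (assumption: codes 0..255).
    table = {bytes([i]): i for i in range(256)}
    prefix = b""
    codes = []
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in table:
            prefix = candidate                 # keep extending the match
        else:
            codes.append(table[prefix])        # emit the code of the prefix
            table[candidate] = len(table)      # learn the new string
            prefix = bytes([byte])
    if prefix:
        codes.append(table[prefix])            # flush the last match
    return codes


def lzw_decode(codes: list[int]) -> bytes:
    """Rebuilds the same string table from the code stream alone."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the entry the encoder created one step earlier.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid code {code}")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)
```

The table never travels with the compressed data: the decoder reconstructs it entry by entry, one step behind the encoder, which is why the original data can be recovered exactly.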

At present, implementing lossless data compression with a dedicated compression chip increases the design cost, the difficulty of hardware design, and the difficulty of maintaining and upgrading the system program. The invention addresses these shortcomings by implementing the LZW compression algorithm with a dual dictionary built from Block RAM inside the FPGA.

In the lossless compression method, a dynamic dictionary with a depth of 2K is built in Block RAM inside an FPGA, and a character string pair is formed as Precode + Char; if the same string pair is found in the dictionary, the storage address corresponding to that string pair is assigned to Precode; if no identical string pair is found in the dictionary, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.

Preferably, module development and design for the method are carried out in the Verilog language.

Preferably, as shown in fig. 1, the method comprises the steps of:

(1) starting;

(2) constructing the character string pair Precode + Char and the real character string Real_string, and querying the CAM dictionary for the string pair Precode + Char by content addressing;

(3) judging whether the string pair exists in the CAM dictionary; if so, executing step (4), otherwise executing step (5);

(4) sending the chip-select address corresponding to the string pair to the decoding module for decoding, and jumping to step (6);

(5) updating the CAM dictionary and the mirror dictionary and outputting the compressed Code value;

(6) the decoding module decodes, from the chip-select address, the unique address corresponding to the string pair in the CAM dictionary with an address depth of 2K; the unique address is sent to the mirror dictionary module, where the stored real character string is looked up and compared with the constructed real character string Real_string; if they are equal, executing step (7), otherwise executing step (8);

(7) assigning the decoded unique address to the prefix Precode of the string pair, waiting for the next original data Char, and jumping to step (9);

(8) updating the CAM dictionary and the mirror dictionary and outputting the compressed Code value;

(9) ending.

The string pair (Precode + Char) is stored in the CAM dictionary, and the real character string (Real_string) is stored in the mirror dictionary.
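The flow of steps (1) to (9) can be approximated in software as follows. This is only a behavioural sketch under stated assumptions, not the Verilog design: the function name `encode`, the reservation of addresses 0 to 255 for single bytes, and the way a slot is recycled on cyclic overwrite are all hypothetical. One plausible reading of the mirror dictionary, assumed here, is that it rejects a pair whose Precode address has been overwritten since the pair was stored.

```python
FIRST_FREE = 256  # assumption: addresses 0..255 stand for the single bytes


def encode(data: bytes, depth: int = 2048) -> list[int]:
    """Behavioural model of the dual-dictionary flow (depth 2K by default)."""
    cam = {}                    # CAM dictionary: (Precode, Char) -> address
    slots = [None] * depth      # address -> pair currently stored there
    mirror = [None] * depth     # mirror dictionary: address -> real string
    next_free = FIRST_FREE
    precode, real = None, b""
    codes = []

    def update(pair, real_string):
        """Steps (5)/(8): write both dictionaries, wrapping when full."""
        nonlocal next_free
        old = slots[next_free]
        if old is not None and cam.get(old) == next_free:
            del cam[old]        # the old pair is overwritten cyclically
        cam[pair] = next_free
        slots[next_free] = pair
        mirror[next_free] = real_string
        next_free += 1
        if next_free == depth:
            next_free = FIRST_FREE

    for char in data:
        if precode is None:     # very first Char has no prefix yet
            precode, real = char, bytes([char])
            continue
        pair = (precode, char)                  # step (2): Precode + Char
        real_string = real + bytes([char])      # step (2): Real_string
        address = cam.get(pair)                 # step (3): CAM lookup
        if address is not None and mirror[address] == real_string:
            precode, real = address, real_string    # steps (6)-(7)
        else:
            codes.append(precode)               # output the Code value
            update(pair, real_string)
            precode, real = char, bytes([char]) # Char becomes the new Precode
    if precode is not None:
        codes.append(precode)                   # flush the final prefix
    return codes
```

As long as the dictionary has not wrapped, this model emits the same codes as textbook LZW; for example, `encode(b"ABABABA")` yields `[65, 66, 256, 258]`. Passing a small `depth` exercises the cyclic overwrite path.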

As shown in FIG. 2, a system using the lossless compression method is also provided. The system uses Block RAM inside the FPGA to construct a CAM dictionary and a mirror dictionary, and comprises: a CAM_LZW module, a Cam_dis module, a Decoder module and a Mirror_dis module. The external data receiving port of the FPGA chip consists of 4 Camera Link high-speed signal channels, the data storage interface consists of 4 SATA interfaces, and DDR memory chips are connected externally to the FPGA chip. Input data are compressed by the LZW compression algorithm implemented with the dual dictionary built from Block RAM inside the FPGA; once the dictionary is full it is overwritten cyclically; the compressed data are then stored on an SSD. In this way, video data can be compressed and stored in real time.

For example, the hardware configuration is as follows:

FPGA: Xilinx XC7K325T

DDR memory: MT42L32M16D1FE-25AIT from Micron

SATA connector: Molex high-speed connector

Preferably, in the system:

CAM_LZW module: this module is the top level of the LZW compression algorithm and instantiates the Cam_dis module, the Decoder module and the Mirror_dis module;

Cam_dis module: this module uses content addressing to query the storage position (address) of the string pair Precode + Char in the CAM dictionary; it traverses the 2K-deep address space in 16 clock cycles and finds the storage address corresponding to the string pair Precode + Char in the CAM dictionary; the CAM dictionary is built from 32 dual-port Block RAMs inside the FPGA, each Block RAM being a dual-port RAM with a data width of 19 bits and an address depth of 2K;

Decoder module: this module is responsible for address decoding; from the chip-select addresses of the 32 Block RAMs it decodes the storage address corresponding to the string pair Precode + Char in the 2K-deep CAM dictionary;

Mirror_dis module: this module is built from 1 dual-port Block RAM inside the FPGA and stores the real character string Real_string; if the string pair Precode + Char is found in the Cam_dis dictionary, the real character strings are compared in this module; the dictionary lookup counts as a successful match only when the data strings truly match; otherwise it is treated as a mismatch and the Code value is output.
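The figures given for the Cam_dis module admit a simple consistency check: a 2K-deep dictionary needs an 11-bit address (2^11 = 2048), and if Char is assumed to be one byte, a string pair Precode + Char occupies 11 + 8 = 19 bits, which matches the stated 19-bit data width. Likewise, traversing 2048 addresses in 16 clock cycles implies 2048 / 16 = 128 comparisons per cycle. The sketch below models only this arithmetic; the 8-bit Char width, the field order inside the word, the use of `None` for an empty entry, and the names `pack_pair`, `unpack_pair` and `cam_search` are assumptions, and the mapping of the comparisons onto the 32 Block RAMs is deliberately left out because the text does not specify it.

```python
DEPTH = 2048                    # 2K dictionary addresses (from the text)
CYCLES = 16                     # search time in clock cycles (from the text)
PER_CYCLE = DEPTH // CYCLES     # 128 addresses compared per cycle
ADDR_BITS = 11                  # 2**11 == 2048
CHAR_BITS = 8                   # assumption: Char is one byte


def pack_pair(precode: int, char: int) -> int:
    """Pack Precode + Char into one 19-bit word (assumed field order)."""
    if not 0 <= precode < DEPTH:
        raise ValueError("Precode must fit in 11 bits")
    if not 0 <= char < (1 << CHAR_BITS):
        raise ValueError("Char must fit in 8 bits")
    return (precode << CHAR_BITS) | char


def unpack_pair(word: int) -> tuple[int, int]:
    """Split a 19-bit word back into (Precode, Char)."""
    return word >> CHAR_BITS, word & ((1 << CHAR_BITS) - 1)


def cam_search(words: list, key: int):
    """Content-addressed search: return (address, cycle) of the first match.

    Each outer iteration models one clock cycle; the inner loop stands for
    comparators that would run in parallel in hardware. Empty entries are
    represented by None (assumption; hardware would need a valid flag).
    """
    for cycle in range(CYCLES):
        base = cycle * PER_CYCLE
        for offset in range(PER_CYCLE):
            if words[base + offset] == key:
                return base + offset, cycle
    return None
```

For instance, a pair stored at address 300 is found in the third cycle (cycle index 2), since addresses 256 to 383 are examined together.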

The software simulation and hardware debugging of the LZW compression method of the present invention will be described in further detail below.

Software simulation of the LZW compression algorithm: the functional correctness of the Block RAM-based dual-dictionary LZW compression algorithm is verified with the ModelSim simulation software.

Hardware debugging of the LZW compression algorithm:

First, the Block RAM-based dual-dictionary LZW compression program is synthesized, constrained, implemented and built into a bit file with the Xilinx tool Vivado, and the program is downloaded. Second, the internal signal waveforms of the design are observed with the Xilinx online logic analyzer ChipScope Pro, verifying that the LZW compression algorithm compresses the data accurately and without error.

The invention has the beneficial effects that:

1) video or image data can be compressed in real time (8.7 MB/s);

2) the original data can be compressed to 50%-70% of its size, which makes the method suitable for practical data compression;

3) in video data acquisition and storage applications, storage space can be saved;

4) in network data acquisition and transmission applications, the data transmission speed can be improved;

5) the compression algorithm is convenient to upgrade and maintain.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiment according to the technical spirit of the present invention still belong to the protection scope of the technical solution of the present invention.

Claims

1. A lossless compression method, characterized in that a dynamic dictionary and a mirror dictionary with a depth of 2K are built in Block RAM inside an FPGA, and a character string pair Precode + Char and a real character string Real_string are constructed from the original data Char; if the same string pair is found in the dynamic dictionary and a real character string identical to Real_string exists in the mirror dictionary, the storage address corresponding to the string pair is assigned to Precode; if no identical string pair is found in the dynamic dictionary, the original data Char is assigned to Precode and the compressed Code is output, the Code value being the prefix Precode of the string pair.

2. The lossless compression method as claimed in claim 1, characterized in that module development and design for the method are carried out in the Verilog language.

3. A lossless compression method as claimed in claim 2, characterized in that the method comprises the steps of:

(1) starting;

(9) ending.

4. A lossless compression system, characterized in that the system uses Block RAM inside an FPGA to construct a CAM dictionary and a mirror dictionary, and the system comprises: a CAM_LZW module, a Cam_dis module, a Decoder module and a Mirror_dis module; the external data receiving port of the FPGA chip consists of 4 Camera Link high-speed signal channels, the data storage interface consists of 4 SATA interfaces, and DDR memory chips are connected externally to the FPGA chip; input data are compressed by the LZW compression algorithm implemented with the dual dictionary built from Block RAM inside the FPGA, the dictionary is overwritten cyclically once it is full, and the compressed data are then stored on an SSD; in the system,

Cam_dis module: this module uses content addressing to query the storage position (address) of the string pair Precode + Char in the CAM dictionary; it traverses the 2K-deep address space in 16 clock cycles and finds the storage address corresponding to the string pair Precode + Char in the CAM dictionary; the CAM dictionary is built from 32 dual-port Block RAMs inside the FPGA, each Block RAM being a dual-port RAM with a data width of 19 bits and an address depth of 2K;