CN113824449A - Static Huffman parallel coding method, system, storage medium and equipment - Google Patents


Info

Publication number: CN113824449A (application CN202111112010.0A; granted as CN113824449B)
Authority: CN (China)
Prior art keywords: data, byte, static, parallel, bytes
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 秦臻, 刘宇豪, 王振
Current and original assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd

Classifications

    • H — ELECTRICITY
    • H03 — ELECTRONIC CIRCUITRY
    • H03M — CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 — Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 — Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 — Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention provides a static Huffman parallel coding method, system, storage medium and device, wherein the method comprises the following steps: inputting the data to be processed into a FIFO buffer; translating the code words of the data to be processed one by one via table lookup, based on the static code table specified in RFC 1951, and performing 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7; performing a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7; and summing the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes. The method optimizes the traditional static Huffman coding algorithm so that it meets parallel design requirements when implemented in hardware, allowing the parallel computing capability of the hardware to be exploited to the fullest and improving the efficiency of the hardware circuit.

Description

Static Huffman parallel coding method, system, storage medium and equipment
Technical Field
The invention relates to the technical field of data compression, and in particular to a static Huffman parallel coding method, system, storage medium and device.
Background
In recent years, with the continuous development of science and technology, the amount of data has grown explosively. With the rise of cloud computing, the advance of artificial intelligence and the arrival of the big-data era, new workloads keep increasing, and the transmission and storage of massive data draw constant attention: how to transmit massive data efficiently without placing an excessive load on the processor is a problem that urgently needs to be solved. If a file can be compressed in advance before transmission, data traffic can be saved and transmission time reduced; if a file is compressed before being stored on disk, the speed of reading it can be improved. In short, data compression offers a new way to improve computer performance, and research on it is of great practical significance.
Disclosure of Invention
In view of the above, the present invention provides a static Huffman parallel coding method, system, storage medium and device, so as to overcome the shortcomings of existing non-parallel hardware circuits.
Based on the above purpose, the present invention provides a static huffman parallel coding method, which comprises the following steps:
inputting the data to be processed into a FIFO buffer;
translating the code words of the data to be processed one by one via table lookup, based on the static code table specified in RFC 1951, and performing 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7;
performing a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7; and
summing the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes.
In one or more embodiments of the invention, the method further comprises:
registering the static compression result for one clock cycle through a D flip-flop before output, ensuring that the output result is free of glitches, and passing it on to subsequent processing.
In one or more embodiments of the present invention, the one-by-one table-lookup translation of the code words of the data to be processed comprises static code table lookups for literal characters (literal), lengths (length) and distances (distance).
In one or more embodiments of the present invention, during the 8-way parallel encoding, each byte of the translated literal and length data is converted into 7- to 9-bit data, and each byte of the translated distance data is converted into 5-bit data.
In one or more embodiments of the invention, during the 7-way parallel shift, Byte1 is left-shifted by the number of code bits in the encoded data of Byte0; Byte2 by the total number of code bits in the encoded data of Byte0 and Byte1; Byte3 by that of Byte0 to Byte2; Byte4 by that of Byte0 to Byte3; Byte5 by that of Byte0 to Byte4; Byte6 by that of Byte0 to Byte5; and Byte7 by the total number of code bits in the encoded data of Byte0 to Byte6.
In one or more embodiments of the invention, performing 8-way parallel encoding consumes a single clock cycle.
In one or more embodiments of the invention, performing a 7-way parallel shift takes a single clock cycle.
According to another aspect of the present invention, there is also provided a static huffman parallel coding system, comprising:
an input module, configured to input the data to be processed into a FIFO buffer;
a parallel encoding module, configured to translate the code words of the data to be processed one by one via table lookup, based on the static code table specified in RFC 1951, and to perform 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7;
a parallel shift module, configured to perform a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7; and
an output data integration module, configured to sum the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes.
In one or more embodiments of the invention, the system is further configured to:
register the static compression result for one clock cycle through a D flip-flop before output, ensuring that the output result is free of glitches, and pass it on to subsequent processing.
In one or more embodiments of the present invention, the one-by-one table-lookup translation of the code words of the data to be processed comprises static code table lookups for literal characters (literal), lengths (length) and distances (distance).
In one or more embodiments of the present invention, during the 8-way parallel encoding, each byte of the translated literal and length data is converted into 7- to 9-bit data, and each byte of the translated distance data is converted into 5-bit data.
In one or more embodiments of the invention, during the 7-way parallel shift, Byte1 is left-shifted by the number of code bits in the encoded data of Byte0; Byte2 by the total number of code bits in the encoded data of Byte0 and Byte1; Byte3 by that of Byte0 to Byte2; Byte4 by that of Byte0 to Byte3; Byte5 by that of Byte0 to Byte4; Byte6 by that of Byte0 to Byte5; and Byte7 by the total number of code bits in the encoded data of Byte0 to Byte6.
In one or more embodiments of the invention, performing 8-way parallel encoding consumes a single clock cycle.
In one or more embodiments of the invention, performing a 7-way parallel shift takes a single clock cycle.
According to yet another aspect of the present invention, there is also provided a computer readable storage medium storing computer program instructions which, when executed, implement any of the methods described above.
According to yet another aspect of the present invention, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, performs any of the methods described above.
The invention has at least the following beneficial technical effects:
1. The invention optimizes the existing algorithm and designs a parallel-structure hardware circuit suited to the Huffman coding function of the static deflate format. The parallel structure can carry out parallel encoding of the input data stream, and the parallel design makes good use of the computing performance of the hardware. By studying how the algorithm invokes the computing units, computations without data dependencies are carried out in parallel lanes. The aim is that source data can be input in every clock cycle and, on average, statically encoded data can be output in every clock cycle;
2. The invention provides a parallel hardware architecture based on the deflate-format static Huffman coding algorithm. Compared with non-parallel hardware implementations, the traditional deflate-format static Huffman coding algorithm is optimized so that it meets parallel design requirements when implemented in hardware, allowing the parallel computing capability of the hardware to be exploited to the fullest and improving the efficiency of the hardware circuit.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can derive other embodiments from these drawings without creative effort.
Fig. 1 is a schematic diagram of a static huffman parallel coding method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a static huffman parallel coding system according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a computer-readable storage medium for implementing a static huffman parallel coding method according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a hardware structure of a computer device for performing a static huffman parallel coding method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two non-identical entities with the same name or different parameters; "first" and "second" are used merely for convenience of expression and should not be construed as limiting the embodiments of the present invention. Furthermore, the terms "comprises" and "comprising", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article or apparatus that comprises a list of steps or elements is not limited to those steps or elements, but may include other steps or elements not expressly listed or inherent to it.
In view of the above, a first aspect of the embodiments of the present invention provides an embodiment of a static huffman parallel coding method. Fig. 1 is a schematic diagram illustrating an embodiment of a static huffman parallel coding method provided by the present invention. As shown in fig. 1, the embodiment of the present invention includes the following steps:
Step S10: input the data to be processed into a FIFO buffer;
Step S20: translate the code words of the data to be processed one by one via table lookup, based on the static code table specified in RFC 1951, and perform 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7;
Step S30: perform a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7;
Step S40: sum the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes.
Huffman coding is an important step in the classic lossless compression standard DEFLATE. Static Huffman coding is a secondary compression of data already compressed by LZ77 and further improves the compression ratio. It counts the number of occurrences of each character and performs variable-length coding after table lookup, so that characters that occur more often are replaced by shorter codes while characters that occur rarely are replaced by longer codes. This reduces the overall size of the data and achieves the purpose of compression.
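The fixed code table used for this purpose follows a simple piecewise rule. As a point of reference, the sketch below (Python, illustrative only — the patent performs this lookup with a hardware table, not software) reproduces the fixed literal/length codes of RFC 1951, section 3.2.6.

```python
def fixed_litlen_code(symbol: int) -> tuple[int, int]:
    """Return (code, nbits) for a literal/length symbol 0..287,
    per the fixed Huffman table of RFC 1951, section 3.2.6."""
    if 0 <= symbol <= 143:       # literals 0-143: 8-bit codes 0x30..0xBF
        return 0x30 + symbol, 8
    if 144 <= symbol <= 255:     # literals 144-255: 9-bit codes 0x190..0x1FF
        return 0x190 + (symbol - 144), 9
    if 256 <= symbol <= 279:     # end-of-block and length codes: 7-bit 0x00..0x17
        return symbol - 256, 7
    if 280 <= symbol <= 287:     # remaining length codes: 8-bit 0xC0..0xC7
        return 0xC0 + (symbol - 280), 8
    raise ValueError("literal/length symbol out of range")

# A byte 'A' (65) encodes in 8 bits; the end-of-block symbol 256 in 7.
assert fixed_litlen_code(65) == (0x30 + 65, 8)
assert fixed_litlen_code(256) == (0, 7)
```

This piecewise structure is why each translated literal/length byte occupies 7 to 9 bits of output, as the description notes elsewhere.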
The FIFO (First In First Out) buffer used in this embodiment is a first-in first-out data buffer: the data that enters first is read out of the FIFO first, and, unlike a RAM, it has no external read/write address lines.
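A minimal software model of that first-in first-out behavior (illustrative only; the patent's FIFO is a hardware buffer, and the class name here is a hypothetical stand-in):

```python
from collections import deque

class ByteFIFO:
    """Software model of a FIFO: bytes are read out in exactly
    the order they were written; no external address lines."""
    def __init__(self) -> None:
        self._q = deque()

    def write(self, data: bytes) -> None:
        self._q.extend(data)

    def read(self, n: int) -> bytes:
        """Pop up to n bytes in first-in first-out order."""
        return bytes(self._q.popleft() for _ in range(min(n, len(self._q))))

fifo = ByteFIFO()
fifo.write(b"abcdefgh")                 # one 8-byte input word
assert fifo.read(8) == b"abcdefgh"      # read back in arrival order
```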
In this embodiment, based on the static code table specified in RFC 1951 (see Table 1), the code words of the data to be processed are translated one by one via table lookup, and 8-way parallel encoding is performed to obtain encoded data for 8 bytes, Byte0 to Byte7.
[Table 1] (the static code table is reproduced only as images in the original publication and is not recoverable from this text)
In some preferred embodiments, the table-lookup translation of the code words of the data to be processed comprises static code table lookups for literal characters (literal), lengths (length) and distances (distance).
In some embodiments, during the 8-way parallel encoding, each byte of the translated literal and length data is converted into 7- to 9-bit data, and each byte of the translated distance data is converted into 5-bit data.
As compression generally uses the byte as the basic encoding character, a literal can represent 256 possibilities; the minimum value of length is 3, and the number of length values, like that of literals, is 256.
The static Huffman coding of this embodiment adopts the static code table defined in RFC 1951 above; the same set of tables is also used for decompression, so no tree information needs to be transferred. Take distance as an example: its maximum value is 32768. When a somewhat large file is encoded, the probability of large distance values is high; in that case the tree would be large, and both the amount of computation and the memory occupied would be large. Therefore the distance is processed by dividing it into a number of ranges, each range being assigned one code value. When a distance falls into a range, the code of that range represents it; the range division rules are shown in Tables 2 and 3 below. As can be seen, some ranges contain several distances, which are distinguished only by appending extra bits to the code; the extra bits map to the individual values within the range. Based on the fact that smaller distances occur more frequently and larger distances less frequently, the longer distances are divided into fewer ranges, so the ranges are not equally spaced but grow sparser.
[Table 2] First distance ranges: 1 | 2 | 3 | 4 | 5,6 | 7,8 | 9-12 | 13-16 | 17-24 | ……
[Table 3] (the full range table is reproduced only as an image in the original publication and is not recoverable from this text)
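The range-plus-extra-bits mapping sketched in Tables 2 and 3 can be modeled in a few lines. The base distances and extra-bit counts below are the 30 distance codes defined in RFC 1951, section 3.2.5; the function name and return layout are illustrative assumptions, not the patent's circuit.

```python
# The 30 distance codes of RFC 1951, section 3.2.5: base distance
# of each range, and the number of extra bits used inside the range.
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97,
             129, 193, 257, 385, 513, 769, 1025, 1537, 2049, 3073,
             4097, 6145, 8193, 12289, 16385, 24577]
DIST_EXTRA = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6,
              7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13]

def distance_code(dist: int) -> tuple[int, int, int]:
    """Map a match distance 1..32768 to (code, extra_bits, extra_value):
    the code names the range, the extra bits select the distance in it."""
    if not 1 <= dist <= 32768:
        raise ValueError("distance out of range")
    # The code is the last range whose base does not exceed the distance.
    code = max(c for c, base in enumerate(DIST_BASE) if base <= dist)
    return code, DIST_EXTRA[code], dist - DIST_BASE[code]

# Distances 9-12 share code 6 and are told apart by 2 extra bits.
assert distance_code(9) == (6, 2, 0)
assert distance_code(12) == (6, 2, 3)
```

The ranges widen (and the extra-bit counts grow) with distance, matching the observation above that larger distances occur less often and are therefore grouped into sparser ranges.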
The embodiment of the invention optimizes the existing algorithm and provides a parallel coding method suited to the Huffman coding function of the static deflate format. It can carry out parallel encoding of the input data stream, and the parallel design makes good use of the computing performance of the hardware. By studying how the algorithm invokes the computing units, computations without data dependencies are carried out in parallel lanes, the aim being that source data is input in every clock cycle and, on average, statically encoded data is output in every clock cycle. Compared with non-parallel hardware implementations, the embodiment of the invention optimizes the traditional deflate-format static Huffman coding algorithm so that it meets parallel design requirements when implemented in hardware, allowing the parallel computing capability of the hardware to be exploited to the fullest and improving the efficiency of the hardware circuit.
In some embodiments, the method further comprises: registering the static compression result for one clock cycle through a D flip-flop before output, ensuring that the output result is free of glitches (Glitch), and passing it on to subsequent processing.
In this embodiment, after the encoded data of Byte0 is summed with the shifted encoded data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes, the result is registered through a D flip-flop before output, which ensures that the output contains no glitches; the result is then passed on to subsequent processing. Typically, this step consumes a single clock cycle.
In some embodiments, during the 7-way parallel shift, Byte1 is left-shifted by the number of code bits in the encoded data of Byte0; Byte2 by the total number of code bits in the encoded data of Byte0 and Byte1; Byte3 by that of Byte0 to Byte2; Byte4 by that of Byte0 to Byte3; Byte5 by that of Byte0 to Byte4; Byte6 by that of Byte0 to Byte5; and Byte7 by the total number of code bits in the encoded data of Byte0 to Byte6.
Specifically, in this embodiment the shift is carried out in 7 parallel lanes rather than 8, because lane 1 (i.e., the encoded data of Byte0) forms the lowest bits of the output result and needs no shift. Lane 2 carries the encoded data of Byte1, shifted left by the number of code bits of Byte0's encoded data; lane 3 carries Byte2, shifted left by the total code bits of Byte0 and Byte1; lane 4 carries Byte3, shifted left by the total code bits of Byte0 to Byte2; lane 5 carries Byte4, shifted left by the total code bits of Byte0 to Byte3; lane 6 carries Byte5, shifted left by the total code bits of Byte0 to Byte4; lane 7 carries Byte6, shifted left by the total code bits of Byte0 to Byte5; and lane 8 carries Byte7, shifted left by the total code bits of Byte0 to Byte6. These operations complete the parallel shift that follows encoding.
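Viewed in software, the shift amounts just described form a prefix sum of the per-lane code lengths, and the final summation is equivalent to a bitwise OR because the shifted fields never overlap. The sketch below is only an illustrative model: the `(code, nbits)` lane representation and the LSB-first packing order are assumptions, not the patent's circuit.

```python
from itertools import accumulate

def pack_lanes(encoded):
    """Combine per-lane encoder outputs, each a (code, nbits) pair.
    Lane 0 stays at the lowest bits; lane k is left-shifted by the
    total code length of lanes 0..k-1; the shifted fields are summed
    (equivalently OR-ed, since they never overlap)."""
    lengths = [nbits for _, nbits in encoded]
    shifts = [0] + list(accumulate(lengths))[:-1]   # prefix sums of lengths
    word = 0
    for (code, _), sh in zip(encoded, shifts):
        word += code << sh
    return word, sum(lengths)   # packed word and its total bit length

# Three hypothetical lanes of 3, 2 and 4 code bits:
word, total = pack_lanes([(0b101, 3), (0b01, 2), (0b1111, 4)])
assert total == 9
assert word == 0b101 | (0b01 << 3) | (0b1111 << 5)
```

In the hardware of this embodiment the eight lane lengths are all available within the same cycle, so the seven shifts are applied concurrently rather than in this sequential loop.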
In some embodiments, performing 8-way parallel encoding and 7-way parallel shifting each takes a single clock cycle.
In a second aspect of the embodiments of the present invention, a static Huffman parallel coding system is further provided. Fig. 2 is a schematic diagram of an embodiment of the static Huffman parallel coding system provided by the present invention. As shown in fig. 2, the static Huffman parallel coding system comprises: an input module 10, configured to input the data to be processed into a FIFO buffer; a parallel encoding module 20, configured to translate the code words of the data to be processed one by one via table lookup, based on the static code table specified in RFC 1951, and to perform 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7; a parallel shift module 30, configured to perform a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7; and an output data integration module 40, configured to sum the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes.
The static Huffman parallel coding system of the embodiment of the invention provides a parallel-structure hardware circuit suited to the Huffman coding function of the static deflate format. The parallel structure can carry out parallel encoding of the input data stream, and the parallel design makes good use of the computing performance of the hardware. By studying how the algorithm invokes the computing units, computations without data dependencies are carried out in parallel lanes. The aim is that source data can be input in every clock cycle and, on average, statically encoded data can be output in every clock cycle, thereby improving the efficiency of the hardware circuit.
In this embodiment, after the encoded data of Byte0 is summed with the shifted encoded data of Byte1 to Byte7 to obtain the static compression result for the 8 input bytes, the result is registered through a D flip-flop before output, which ensures that the output contains no glitches; the result is then passed on to subsequent processing. Typically, this step consumes a single clock cycle.
In some embodiments, during the 7-way parallel shift, Byte1 is left-shifted by the number of code bits in the encoded data of Byte0; Byte2 by the total number of code bits in the encoded data of Byte0 and Byte1; Byte3 by that of Byte0 to Byte2; Byte4 by that of Byte0 to Byte3; Byte5 by that of Byte0 to Byte4; Byte6 by that of Byte0 to Byte5; and Byte7 by the total number of code bits in the encoded data of Byte0 to Byte6.
Specifically, in this embodiment the shift is carried out in 7 parallel lanes rather than 8, because lane 1 (i.e., the encoded data of Byte0) forms the lowest bits of the output result and needs no shift. Lane 2 carries the encoded data of Byte1, shifted left by the number of code bits of Byte0's encoded data; lane 3 carries Byte2, shifted left by the total code bits of Byte0 and Byte1; lane 4 carries Byte3, shifted left by the total code bits of Byte0 to Byte2; lane 5 carries Byte4, shifted left by the total code bits of Byte0 to Byte3; lane 6 carries Byte5, shifted left by the total code bits of Byte0 to Byte4; lane 7 carries Byte6, shifted left by the total code bits of Byte0 to Byte5; and lane 8 carries Byte7, shifted left by the total code bits of Byte0 to Byte6. These operations complete the parallel shift that follows encoding.
In some embodiments, performing 8-way parallel encoding and 7-way parallel shifting each takes a single clock cycle.
In a third aspect of the embodiment of the present invention, a computer-readable storage medium is further provided, and fig. 3 is a schematic diagram of a computer-readable storage medium implementing a static huffman parallel coding method according to an embodiment of the present invention. As shown in fig. 3, the computer-readable storage medium 3 stores computer program instructions 31, the computer program instructions 31 being executable by a processor. The computer program instructions 31 when executed implement the method of any of the embodiments described above.
It is to be understood that all embodiments, features and advantages set forth above with respect to the static huffman parallel coding method according to the invention apply equally, without conflict with each other, to the static huffman parallel coding system and to the storage medium according to the invention.
In a fourth aspect of the embodiments of the present invention, there is further provided a computer device, including a memory 402 and a processor 401, where the memory stores a computer program, and the computer program, when executed by the processor, implements the method of any one of the above embodiments.
Fig. 4 is a schematic hardware structure diagram of an embodiment of a computer device for performing a static huffman parallel coding method according to the present invention. Taking the computer device shown in fig. 4 as an example, the computer device includes a processor 401 and a memory 402, and may further include: an input device 403 and an output device 404. The processor 401, the memory 402, the input device 403 and the output device 404 may be connected by a bus or other means, and fig. 4 illustrates an example of a connection by a bus. The input device 403 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the static huffman parallel coding system. The output device 404 may include a display device such as a display screen.
The memory 402, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions/modules corresponding to the static huffman parallel coding method in the embodiments of the present application. The memory 402 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by use of a static huffman parallel coding method, or the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 402 may optionally include memory located remotely from processor 401, which may be connected to local modules via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 401 executes various functional applications of the server and data processing, i.e., implementing the static huffman parallel coding method of the above-described method embodiments, by running the non-volatile software programs, instructions and modules stored in the memory 402.
Finally, it should be noted that the computer-readable storage medium (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with the following components designed to perform the functions herein: a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that "and/or" as used herein includes any and all possible combinations of one or more of the associated listed items. The numbering of the embodiments disclosed herein is merely for description and does not indicate the relative merits of the embodiments.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the spirit of the embodiments of the invention, technical features in the above embodiments or in different embodiments may also be combined, and many other variations of the different aspects of the embodiments exist that are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (10)

1. A static Huffman parallel coding method, characterized by comprising the following steps:
inputting data to be processed into a FIFO buffer;
based on the static code table specified in RFC 1951, translating the codewords of the data to be processed one by one via table lookup, and performing 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7;
performing a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7; and
summing the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8-byte input.
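The one-cycle encode/shift/merge pipeline recited in claim 1 can be sketched in software. This is a minimal illustration only, under two assumptions not stated in the patent: the input consists of literal bytes only (no LZ77 length/distance symbols), and DEFLATE's MSB-first bit packing within output bytes is ignored for clarity. The names `fixed_code` and `encode_8_bytes` are illustrative, not from the patent.

```python
# Software sketch of the claimed 8-way encode / 7-way shift / merge pipeline.
# Assumptions (not from the patent): literal bytes only; DEFLATE's within-byte
# bit ordering is ignored for clarity.

def fixed_code(byte):
    """RFC 1951 fixed-Huffman (code, bit length) for a literal value 0-255."""
    if byte <= 143:
        return 0b00110000 + byte, 8           # literals 0-143: 8-bit codes
    return 0b110010000 + (byte - 144), 9      # literals 144-255: 9-bit codes

def encode_8_bytes(block):
    """Encode one 8-byte block: 8 parallel lookups, cumulative shifts, merge."""
    assert len(block) == 8
    codes = [fixed_code(b) for b in block]    # the 8-way parallel table lookup
    result, shift = 0, 0
    for code, length in codes:
        result |= code << shift               # Byte_k shifted left by the sum of
        shift += length                       # the bit lengths of Byte0..Byte_(k-1)
    return result, shift                      # packed bits and total bit count
```

Because the shifted fields never overlap, the bitwise OR here is equivalent to the summation recited in the claim; in the claimed hardware, the 8 lookups and the 7 shifts each complete in a single clock cycle (claims 6 and 7).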
2. The method of claim 1, further comprising:
registering the static compression result for one clock cycle with a D flip-flop before output, ensuring that the output result is glitch-free, and passing the output result to subsequent processing.
3. The method of claim 1,
the table-lookup translation of the codewords of the data to be processed comprises static code table lookups for characters, static code table lookups for lengths, and static code table lookups for distances.
4. The method of claim 3,
during the 8-way parallel encoding, each byte of the translated character data and length data is converted into 7- to 9-bit data, and each byte of the translated distance data is converted into 5-bit data.
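The bit widths in claim 4 match the fixed Huffman code tables of RFC 1951, which can be written out directly. The sketch below assumes DEFLATE symbol indices as inputs and omits the extra bits that follow length and distance codes; the function names are illustrative, not from the patent.

```python
# RFC 1951 fixed-Huffman tables behind the 7-9 bit and 5-bit widths of claim 4.
# Inputs are DEFLATE symbol indices; the extra bits that follow length and
# distance codes are omitted. Function names are illustrative.

def literal_code(sym):
    """Literal symbols 0-255: 8- or 9-bit codes."""
    if sym <= 143:
        return 0b00110000 + sym, 8
    return 0b110010000 + (sym - 144), 9

def length_code(sym):
    """Length symbols 256-287: 7- or 8-bit codes."""
    if sym <= 279:
        return sym - 256, 7
    return 0b11000000 + (sym - 280), 8

def distance_code(sym):
    """Distance symbols 0-29: always 5-bit codes."""
    return sym, 5
```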
5. The method of claim 1,
during the 7-way parallel shift, Byte1 is shifted left by the bit length of the encoded data of Byte0; Byte2 is shifted left by the sum of the bit lengths of the encoded data of Byte0 and Byte1; Byte3 is shifted left by the sum of the bit lengths of the encoded data of Byte0 through Byte2; Byte4 by the sum for Byte0 through Byte3; Byte5 by the sum for Byte0 through Byte4; Byte6 by the sum for Byte0 through Byte5; and Byte7 by the sum for Byte0 through Byte6.
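The seven shift amounts enumerated in claim 5 form an exclusive prefix sum of the per-byte encoded bit lengths, which hardware can compute with an adder tree within the claimed single cycle. A minimal sketch (the function name is illustrative, not from the patent):

```python
# The shift amounts of claim 5 as an exclusive prefix sum of code lengths:
# Byte_k is shifted left by the total bit length of Byte0 .. Byte_(k-1).
from itertools import accumulate

def shift_amounts(lengths):
    """Exclusive prefix sum of the 8 per-byte encoded bit lengths."""
    return [0] + list(accumulate(lengths))[:-1]
```

For example, per-byte lengths of [8, 8, 9, 8, 7, 8, 8, 9] yield shifts [0, 8, 16, 25, 33, 40, 48, 56], with Byte0 needing no shift.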
6. The method of claim 1, wherein performing 8-way parallel encoding takes a single clock cycle.
7. The method of claim 1, wherein performing the 7-way parallel shift takes a single clock cycle.
8. A static Huffman parallel coding system, characterized by comprising:
an input module configured to input data to be processed into a FIFO buffer;
a parallel encoding module configured to translate the codewords of the data to be processed one by one via table lookup based on the static code table specified in RFC 1951, and to perform 8-way parallel encoding to obtain encoded data for 8 bytes, Byte0 to Byte7;
a parallel shift module configured to perform a 7-way parallel shift on the encoded data of Byte1 to Byte7 to obtain shifted data for Byte1 to Byte7; and
an output data integration module configured to sum the encoded data of Byte0 with the shifted data of Byte1 to Byte7 to obtain the static compression result for the 8-byte input.
9. A computer-readable storage medium, characterized in that computer program instructions are stored which, when executed, implement the method according to any one of claims 1-7.
10. A computer device comprising a memory and a processor, characterized in that the memory has stored therein a computer program which, when executed by the processor, performs the method according to any one of claims 1-7.
CN202111112010.0A 2021-09-18 2021-09-18 Static Huffman parallel coding method, system, storage medium and equipment Active CN113824449B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111112010.0A CN113824449B (en) 2021-09-18 2021-09-18 Static Huffman parallel coding method, system, storage medium and equipment


Publications (2)

Publication Number Publication Date
CN113824449A true CN113824449A (en) 2021-12-21
CN113824449B CN113824449B (en) 2024-08-09

Family

ID=78915154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111112010.0A Active CN113824449B (en) 2021-09-18 2021-09-18 Static Huffman parallel coding method, system, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN113824449B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115987296A (en) * 2023-03-20 2023-04-18 北京优天下科技股份有限公司 Traffic energy data compression transmission method based on Huffman coding
CN117811588A (en) * 2024-01-08 2024-04-02 北京新数科技有限公司 Log compression access method, system, equipment and readable storage medium based on Huffman coding and LZ77

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5818364A (en) * 1996-06-19 1998-10-06 Hewlett-Packard Company High bit-rate huffman decoding
US20030128140A1 (en) * 2001-10-09 2003-07-10 Yuan Xie Code compression algorithms and architectures for embedded systems
US7737870B1 (en) * 2007-09-04 2010-06-15 Nortel Networks Limited Bit-stream huffman coding for data compression
CN103326730A (en) * 2013-06-06 2013-09-25 清华大学 Data parallelism compression method
US10333548B1 (en) * 2018-04-09 2019-06-25 International Business Machines Corporation Efficient software closing of hardware-generated encoding context
CN113366765A (en) * 2019-02-14 2021-09-07 国际商业机器公司 Reducing latch count to save hardware area for dynamic Huffman table generation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LUTHFI FIRMANSAH: "Data audio compression lossless FLAC format to lossy audio MP3 format with Huffman Shift Coding algorithm", 2016 4th International Conference on Information and Communication Technology (ICoICT), 31 December 2016, pages 1-4 *
LIN LI: "IP Core Design of a Huffman Encoder Based on VHDL", Modern Electronics Technique, 31 December 2004, pages 91-93 *


Also Published As

Publication number Publication date
CN113824449B (en) 2024-08-09

Similar Documents

Publication Publication Date Title
US10268380B2 (en) Methods, devices and systems for semantic-value data compression and decompression
US9223765B1 (en) Encoding and decoding data using context model grouping
CN106560010B (en) VLSI efficient Huffman coding apparatus and method
CN105207678B (en) A kind of system for implementing hardware of modified LZ4 compression algorithms
CN113824449B (en) Static Huffman parallel coding method, system, storage medium and equipment
US20190034091A1 (en) Methods, Devices and Systems for Compressing and Decompressing Data
WO2023045204A1 (en) Method and system for generating finite state entropy coding table, medium, and device
US11722148B2 (en) Systems and methods of data compression
CN108886367B (en) Method, apparatus and system for compressing and decompressing data
CN113300715B (en) Data processing method, device, hardware compression equipment and medium
US10897270B2 (en) Dynamic dictionary-based data symbol encoding
CN102880703B (en) Chinese web page data encoding, coding/decoding method and system
CN110363291B (en) Operation method and device of neural network, computer equipment and storage medium
Gao et al. FPGA bitstream compression and decompression based on LZ77 algorithm and BMC technique
Jacob et al. Comparative analysis of lossless text compression techniques
Ginzburg et al. Short Message Compression Scheme for Wireless Sensor Networks
Jamro et al. FPGA implementation of the dynamic Huffman encoder
Howard et al. Parallel lossless image compression using Huffman and arithmetic coding
Ambadekar et al. Advanced data compression using J-bit Algorithm
Mehboob et al. High speed lossless data compression architecture
CN111181568A (en) Data compression device and method, data decompression device and method
Huang et al. Lossless compression algorithm for multi-source sensor data research
Akil et al. FPGA-based architecture for hardware compression/decompression of wide format images
Al-Bahadili et al. A bit-level text compression scheme based on the HCDC algorithm
Rani et al. An Enhanced Text Compression System Based on ASCII Values and Huffman Coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant