CN108886367B

CN108886367B - Method, apparatus and system for compressing and decompressing data

Info

Publication number: CN108886367B
Application number: CN201780015236.4A
Authority: CN
Inventors: 安耶洛斯·阿雷拉基斯; 佩尔·斯滕斯特伦
Original assignee: Zeropoint Technologies AB
Current assignee: Zeropoint Technologies AB
Priority date: 2016-01-29
Filing date: 2017-01-30
Publication date: 2022-01-11
Anticipated expiration: 2037-01-30
Also published as: WO2017131579A1; CN108702160A; CN108886367A; CN108702160B; WO2017131578A1

Abstract

Methods, devices and systems enhance compressors and decompressors for encoding and decoding data in a cache/memory/data transfer subsystem of a computer system or in a communication network. An example variable length compressor and decompressor are extended with new features that: compress more densely when a particular value occurs in a particular location in a data block; improve compression and decompression latency when specific values occur frequently in the data block; and improve decompression latency by recording the lengths of the variable length encoded values of the compressed data block. The compressor and decompressor are further extended to support common compression scenarios used in conjunction with variable length compression, in order to improve compressibility in a cache/memory/data transfer subsystem of a computer system or in a communication network.

Description

Method, apparatus and system for compressing and decompressing data

Technical Field

The present disclosure relates generally to the field of data compression and decompression, such as data compression and decompression in a cache/memory subsystem and/or in a data transmission subsystem of a computer system or in a data communication system.

Background

Data compression is a relatively mature technique for reducing the size of data. It is applied to data stored in a memory subsystem of a computer system to increase storage capacity. Data compression is also used when data is transferred between different subsystems within a computer system, or more generally when data is transferred between two points in a data communication system that includes a communication network.

Data compression requires two basic operations: 1) compression (also called encoding), which takes uncompressed data as input and converts it into compressed data by replacing data values with corresponding codewords (also called encodings, codings or codes in the literature); and 2) decompression (also called decoding), which takes compressed data as input and converts it into uncompressed data by replacing the codewords with the corresponding data values. Data compression may be lossless or lossy, depending on whether the data values after decompression are identical to the original data values before compression (lossless), or whether the data values after decompression differ from the original data values and the original values cannot be recovered (lossy). Compression and decompression may be implemented in software, in hardware, or in a combination of software and hardware, to realize corresponding methods, apparatus and systems.

FIG. 1 depicts an example of a computer system 100. Computer system 100 includes one or more processing units P1 … Pn connected to a memory hierarchy 110 using a communication means such as an interconnection network. Each processing unit includes a processor (or core), and each processing unit may be a CPU (central processing unit), a GPU (graphics processing unit), or generally a block that performs computations. Memory hierarchy 110, on the other hand, constitutes a storage subsystem of computer system 100 and includes cache memory 120, which may be organized in one or several levels (levels, tiers), L1-L3, and memory 130 (a.k.a. primary memory). The memory 130 may also be connected to a secondary storage device (e.g., a hard disk drive, solid state drive, or flash memory). Memory 130 may be organized into levels, such as fast main memory (e.g., DDR) and flash memory. Cache memory 120 in the present example includes three levels, where L1 and L2 are private caches in that each processing unit P1-Pn is connected to a designated L1/L2 cache, and L3 is shared among all processing units P1-Pn. With different numbers of processing units and, in general, different combinations between processing units and memory subsystems, alternative examples may implement different cache hierarchies with more, fewer, or even no cache levels, with or without dedicated or shared designated caches, implementing various memory levels, all as readily implemented by the skilled person.

Data compression may be applied to a computer system in different ways. FIG. 2 depicts an example 200 of a computer system, such as system 100 of FIG. 1, in which data is compressed in a memory, such as main memory, of such a computer system. This means that the data is compressed before being saved in memory and decompressed when leaving memory, using the corresponding compression and decompression operations described above.

In the alternative example 300 of the computer system shown in FIG. 3, data compression may be applied to the L3 cache of the cache system. Similar to the previous example, the data needs to be compressed before it is stored in the cache, and the data needs to be decompressed before it leaves the cache (e.g., to other cache levels (L2) or memory 330 where the data is uncompressed). In an alternative example, the data may be stored in compressed form in any level of the cache hierarchy.

Data may also be compressed only when it is being transferred between different subsystems in a computer system. In the alternative example 400 of the computer system shown in FIG. 4, data is compressed when transferred between the L3 cache and the memory 430 using the corresponding communication means. Similar to the previous examples, compression and decompression need to be present at the two ends of the communication means, in order to compress the data before it is transmitted and to decompress it when it is received at the other end.

In an alternate example of a computer system 500, data compression may be applied in a combination of subsystems as depicted in FIG. 5. In this example, the data is compressed while it is held in memory 530 and while it is being transferred between memory 530 and cache hierarchy 520. Thus, when data is moved from the cache hierarchy 520 to the memory 530, it may only be necessary to compress the data before it is transferred from the L3 cache. Alternatively, compressed data leaving memory 530 destined for cache hierarchy 520 may only need to be decompressed when it is received to the other end of the communication device connecting memory 530 with cache hierarchy 520. Any example is possible with respect to applying compression to a combination of different subsystems in a computer system, and may be implemented by one skilled in the art.

Data transmission may also take place between two arbitrary points within a communication network. FIG. 6 depicts an example of a data communication system 600 including a communication network 605 between two points, where data is transmitted by a transmitter 610 and received by a receiver 620. In such instances, the points may be two intermediate nodes in the network, or a source node and a destination node of a communication link, or a combination of these. Data compression may be applied to such a data communication system, as in the example system 700 depicted in FIG. 7. Compression needs to be applied before the data is transmitted over the communication network 705 by the transmitter 710, and decompression needs to be applied after the data is received by the receiver 720.

There are various algorithms for implementing data compression. One family of data compression algorithms is the family of statistical compression algorithms, which are data dependent and can provide compression efficiency close to entropy, because they assign variable length (also called variable width) codes based on the statistical properties of the data values: short codewords are used to encode data values that occur frequently, while longer codewords encode data values that occur less frequently. Huffman coding is a known statistical compression algorithm.

A known variant of Huffman coding that accelerates decompression is canonical Huffman coding. In this variant, the codewords have the numerical sequence property, which means that codewords of the same length are consecutive integers.
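By way of non-limiting illustration, the numerical sequence property can be sketched as follows. The function name `canonical_codes`, the example symbols and their code lengths are assumptions introduced here, and the assignment order shown (shorter codewords receive numerically smaller values) is one common convention, not necessarily the one used by the prior art schemes referred to in this disclosure.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codewords from a {symbol: code_length} map.

    Symbols are ordered by (length, symbol). Within one length, each
    codeword is the previous codeword plus one, so codewords of the same
    length are consecutive integers.
    """
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= (length - prev_len)  # widen the code when the length grows
        codes[sym] = (code, length)
        code += 1
        prev_len = length
    return codes


# Hypothetical code lengths, e.g. as produced by a Huffman tree builder.
lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
codes = canonical_codes(lengths)
for sym, (cw, n_bits) in codes.items():
    print(sym, format(cw, f"0{n_bits}b"))
```

In this sketch the two codewords of length 3 are the consecutive integers 6 and 7 (binary 110 and 111), which is the property a decompressor can exploit to locate a value by simple arithmetic.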

Examples of canonical Huffman-based compression and decompression schemes are presented in the prior art. Such compression and decompression mechanisms may be used in the foregoing examples to achieve Huffman-based compression and decompression.

FIG. 9 shows an example of a prior art compressor 900 that implements Huffman coding, e.g. canonical Huffman coding. The compressor takes as input an uncompressed block, which is a string of data values comprising one or more data values, generally denoted throughout this disclosure as v1, v2, …, vn. Unit 910, which may be a storage unit or an extractor that extracts data values from the uncompressed block, provides the data values to the variable length coding unit 920. The variable length coding unit 920 includes a Code Table (CT) 922 and a Codeword (CW) selector 928. CT 922 is a table that may be implemented as a look-up table (LUT) or as a computer cache memory (of any arbitrary associativity) and contains one or more entries; each entry includes a value 923 that can be compressed using a codeword, the codeword CW 925, and a codeword length (cL) 927. Because the codewords used by statistical compression algorithms are of variable length, they must be padded with zeros when stored in CT 922, where each entry has a fixed-size width (codeword 925). The codeword length 927 holds the actual length (e.g., in bits) of the variable length codeword. CW selector 928 uses cL to identify the actual CW and discard the padded zeros. The encoded value is then concatenated with the rest of the compressed values, which together form the compressed block. An exemplary flow chart of a compression method that follows the compression steps described above is depicted in FIG. 27.
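The code table look-up and codeword selection described above may be sketched in software as follows. This is a minimal sketch under stated assumptions: the table contents, the 8-bit field width `CW_WIDTH`, and the choice to keep the codeword in the low-order bits of the field (with the zero padding in the high-order bits) are all hypothetical and are not taken from FIG. 9.

```python
CW_WIDTH = 8  # assumed fixed width, in bits, of the codeword field of a CT entry

# Hypothetical code table: value -> (zero-padded codeword, codeword length cL).
CODE_TABLE = {
    0x00: (0b00000000, 1),  # codeword "0"
    0xFF: (0b00000010, 2),  # codeword "10"
    0x01: (0b00000110, 3),  # codeword "110"
    0x7F: (0b00000111, 3),  # codeword "111"
}


def select_codeword(padded_cw, cl):
    """CW selector: keep the cl low-order bits and discard the padded zeros."""
    return format(padded_cw & ((1 << cl) - 1), f"0{cl}b")


def compress_block(values):
    """Look up each value in the code table and concatenate the codewords."""
    return "".join(select_codeword(*CODE_TABLE[v]) for v in values)


block = [0x00, 0xFF, 0x00, 0x01, 0x7F]
print(compress_block(block))  # 40 input bits become a 10-bit string
```

The bit string is used here only for readability; a hardware compressor would shift the selected bits into an output register instead.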

An example of a prior art decompressor 1000 is shown in FIG. 10. Canonical Huffman decompression can be divided into two steps: codeword detection and value retrieval. Each of these steps is carried out by one of the following units: (1) a Codeword Detection Unit (CDU) 1020 and (2) a Value Retrieval Unit (VRU) 1030. The purpose of CDU 1020 is to find valid codewords within the compressed sequence (i.e., the sequence of codewords of the compressed data values). The CDU 1020 includes a set of comparators 1022 and a priority encoder 1024. Each comparator 1022a, 1022b, 1022c compares a potential bit sequence to a known codeword, which in this example is the first assigned (at code generation time) canonical Huffman codeword (FCW) of a particular length. In an alternative implementation, the last assigned canonical Huffman codeword could be used instead, but the exact comparison made would then be different. The bit sequences to be compared may be stored in a storage unit 1010 (e.g. implemented as a FIFO or as flip-flops). The maximum size of the bit sequences to be compared, which determines the number of comparators and the width of the widest of them, depends on the maximum length of a valid Huffman codeword (mCL), which is determined at code generation. However, depending on the chosen implementation of such a decompressor (e.g. in software or in hardware), the maximum length may be limited to a specific value at design, compilation, configuration or run time. The outputs of the comparators 1022 are fed into a priority-encoder-like structure 1024, which outputs the length of the matched codeword (referred to as "matched length" in FIG. 10). Based on this, the detected valid codeword (matched codeword) is extracted from the bit sequence stored in the storage unit 1010; the bit sequence is then shifted by as many positions as the "matched length" defines, and the vacated part is loaded with the next bits of the compressed sequence so that the CDU 1020 can determine the next valid codeword.

The Value Retrieval Unit (VRU) 1030, in turn, includes an offset table 1034, a subtractor unit 1036, and a decompression look-up table (DeLUT) 1038. The "matched length" from the previous step is used to determine an offset value (stored in the offset table 1034), which must be subtracted (1036) from the arithmetic value of the matched codeword, also determined in the previous step, to obtain the address in the DeLUT 1038 from which the original data value corresponding to the detected codeword can be retrieved and appended to the rest of the decompressed values stored in the decompressed block 1040. The operation of the decompressor is repeated until all values held in compressed form in the input compressed sequence (referred to as the compressed block in FIG. 10) have been restored as the uncompressed data values v1, v2, …, vn.

Fig. 28 depicts an exemplary flow chart of a decompression method following the decompression steps previously described.
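The two decompression steps described above, codeword detection followed by value retrieval, can be sketched as follows. The tables `FCW`, `OFFSET` and `DELUT`, the four example values and the comparison rule (the window is compared against each first codeword aligned to the left, and the longest matching length wins) are assumptions chosen for a code in which shorter codewords are numerically smaller; as noted above, other assignments require a different comparison.

```python
MCL = 3  # maximum codeword length (mCL), fixed at code generation

# Assumed canonical code: a=0, b=10, c=110, d=111.
FCW = {1: 0b0, 2: 0b10, 3: 0b110}  # first codeword of each length
OFFSET = {1: 0, 2: 1, 3: 4}        # FCW minus the DeLUT index of that codeword
DELUT = ["a", "b", "c", "d"]       # original values in codeword order


def detect_codeword(bits, pos):
    """CDU: return (matched codeword, matched length) at bit position pos."""
    window = bits[pos:pos + MCL].ljust(MCL, "0")
    w = int(window, 2)
    # One "comparator" per length: is the window >= the left-aligned FCW?
    matches = [ln for ln in FCW if w >= (FCW[ln] << (MCL - ln))]
    length = max(matches)  # "priority encoder": the longest match wins
    return int(window[:length], 2), length


def retrieve_value(codeword, length):
    """VRU: subtract the per-length offset to obtain the DeLUT address."""
    return DELUT[codeword - OFFSET[length]]


def decompress(bits, n):
    """Decode n values; each step must finish before the next can start."""
    out, pos = [], 0
    for _ in range(n):
        cw, length = detect_codeword(bits, pos)
        out.append(retrieve_value(cw, length))
        pos += length  # shift by the matched length
    return out


print(decompress("0100110111", 5))
```

The loop makes the sequential dependency visible: the start position of each codeword is only known once the length of the previous codeword has been matched.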

The aforementioned compressor and decompressor may quickly and efficiently compress and decompress data blocks compressed using variable length canonical huffman coding. However, the codeword detection phase during decompression is inherently sequential, making decompression slow; although the prior art teaches how to parallelize it, the implementation is complex and adds a lot of area overhead or requires a lot of computational resources. This is a challenge when applying statistical compression to the computer systems or communication networks of the foregoing examples. Furthermore, while the aforementioned compressors with statistical compression algorithms can effectively compress the most frequent values, the decompressors all operate at the same speed regardless of whether the compressed data is frequent or infrequent. In addition, the compressor is less efficient when the frequent values occur in the standard positions of the stream of values. The present inventors have recognized that there is room for improvement in the art of data compression and decompression.

Disclosure of Invention

It is an object of the present invention to provide improvements in the field of data compression and decompression.

The present disclosure generally discloses methods, devices and systems for compressing blocks of data values and decompressing compressed blocks of data values when compression is applied to, for example, a cache subsystem and/or a memory subsystem and/or a data transmission subsystem in a computer system and/or a data communication system. There are various ways to compress data efficiently in these subsystems using entropy-based variable length coding, and one such way is Huffman coding. An existing compressor may be used to compress a block of data values using Huffman coding, while an existing decompressor may be used to decompress a block of data compressed using Huffman coding. However, decompression of variable length encoded sequences is inherently sequential, and locating the Huffman encoded data values in such a compressed block is slow because the Huffman encoded values are variable length codewords, so their boundaries are not known in advance. Furthermore, the speed of existing decompressors is the same regardless of whether the compressed block includes frequent or infrequent compressed values, and existing compressors are also less efficient when the most frequent values occur in the standard locations in the block. The methods, devices and systems disclosed herein enhance existing compressors and decompressors that use variable length coding with the following new features: denser compression when a particular value occurs in a particular location in a data block; improved compression and decompression latency when specific values occur frequently in a data block; and improved decompression latency by recording the lengths of the variable length encoded values of the compressed data block. Furthermore, the proposed methods, devices and systems further enhance the compressor and decompressor by combining them with other aggressive compressors and decompressors, respectively, for common compression scenarios in the computer system and communication system.

A first aspect of the present invention is a data compression apparatus for compressing an uncompressed data block including n data values into a compressed data block, the data compression apparatus comprising:

a compressor configured to compress data values of uncompressed data blocks into corresponding variable length codewords;

a detector configured to detect a presence of at least one particular data value in an uncompressed data block; and

a compressed data block generator coupled with the compressor and the detector and configured to generate a compressed data block by combining:

a data value mask containing n mask positions, wherein each mask position indicates whether a corresponding data value in an uncompressed data block is equal to any one of the at least one particular data value detected by the detector; and

for data values in the uncompressed data block that are not equal to the at least one particular data value, the corresponding variable length codeword resulting from compression by the compressor,

wherein the compressed data block comprises a data value mask and m variable length codewords, wherein m ≦ n, and wherein no variable length codewords for data values in the uncompressed data block that are equal to any one of the at least one particular data value are included in the compressed data block.

In some embodiments, the detector is configured to detect the presence of a single particular data value in the uncompressed data block, wherein each of the n mask positions of the data value mask comprises a single bit. In other embodiments, the detector is configured to detect the presence of a plurality of different particular data values in the uncompressed data block, wherein each of the n mask positions of the data value mask contains a fixed-size bit combination that is capable of encoding any one of the plurality of particular data values.

In some embodiments, the or each particular data value is a frequently occurring data value which would be encoded with the fewest bits if variable length coding were used instead. Advantageously, such a particular data value is 0 (zero). Alternatively, the or each particular data value may be a data value which, when present, requires very fast decompression.
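A minimal sketch of the single-bit variant of the first aspect follows. The variable length code in `CODEWORDS`, the polarity of the mask bits ("1" marks the particular value) and the placement of the mask ahead of the codewords are assumptions made for illustration only.

```python
# Hypothetical variable length code for the values that are not special.
CODEWORDS = {5: "0", 7: "10", 9: "11"}
SPECIAL = 0  # the particular data value (zero, as suggested in the text)


def compress_with_mask(values):
    """Return (mask, codewords) for a block of n values.

    The mask has one bit per value; '1' marks a value equal to SPECIAL.
    No codeword is emitted for such a value, so m <= n codewords result.
    """
    mask = "".join("1" if v == SPECIAL else "0" for v in values)
    codewords = [CODEWORDS[v] for v in values if v != SPECIAL]
    return mask, codewords


block = [0, 5, 0, 0, 7, 9, 0, 5]
mask, cws = compress_with_mask(block)
compressed = mask + "".join(cws)
print(mask, cws, compressed)
```

For this block, n = 8 mask bits are followed by only m = 4 codewords, since the four zero values are represented by the mask alone.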

Further features of preferred embodiments of the data compression apparatus according to the first aspect of the invention are described with reference to FIGS. 16 and 18 and are furthermore defined in the appended dependent claims 7-11 filed herewith.

A second aspect of the present invention is a corresponding data compression method for compressing an uncompressed data block including predetermined n data values into a compressed data block, the data compression method including:

compressing data values of the uncompressed data blocks into corresponding variable length codewords;

detecting a presence of at least one particular data value in an uncompressed data block; and

generating a compressed data block by combining:

a data value mask containing n mask positions, wherein each mask position indicates whether a respective data value in the uncompressed data block is equal to any one of the at least one particular data value; and

for data values in the uncompressed data block that are not equal to any of the at least one particular data value, the corresponding variable length codeword,

The data compression method according to the second aspect of the present invention may comprise any or all of the functional features of the data compression apparatus according to the first aspect of the present invention including embodiments thereof.

A third aspect of the present invention is a data decompression apparatus for decompressing a compressed data block into a decompressed data block, the decompressed data block including n data values at respective data value positions, the data decompression apparatus comprising:

a decompressor configured to decompress a variable length codeword of a compressed data block into a corresponding decompressed data value; and

a decompressed data block generator configured to:

reading, from the compressed data block, a data value mask containing n mask positions, wherein each mask position indicates whether a corresponding data value in the uncompressed data block was equal to any one of the at least one particular data value prior to the data compression that resulted in the compressed data block; and

generating a decompressed data block by combining the decompressed data value from the decompressor with the at least one particular data value indicated by the corresponding mask position of the data value mask based on the data value mask,

wherein an order of data values of the generated decompressed data block is the same as an order in which the data values appear in an uncompressed data block prior to data compression.

In some embodiments, each of the n mask positions of the data value mask comprises a single bit that indicates or does not indicate a particular data value. In other embodiments, each of the n mask positions of the data value mask contains a fixed-size bit combination that is capable of decoding any one of a plurality of particular data values. Advantageously, such a specific data value is 0 (zero).
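The corresponding decompression can be sketched as follows, again for the single-bit case. The layout (n mask bits followed by the codewords), the prefix-free code in `DECODE` and the mask polarity are assumptions; the sketch only illustrates that values flagged in the mask are restored without any codeword being decoded, and that the original order of values is preserved.

```python
DECODE = {"0": 5, "10": 7, "11": 9}  # hypothetical prefix-free code
SPECIAL = 0                          # the particular data value
N = 8                                # number of values per block


def decompress_with_mask(compressed, n=N):
    """Rebuild n values from an n-bit mask followed by m codewords."""
    mask, payload = compressed[:n], compressed[n:]
    out, pos = [], 0
    for bit in mask:
        if bit == "1":
            out.append(SPECIAL)  # restored from the mask alone
            continue
        end = pos + 1
        while payload[pos:end] not in DECODE:  # grow until a codeword matches
            if end >= len(payload):
                raise ValueError("truncated or invalid payload")
            end += 1
        out.append(DECODE[payload[pos:end]])
        pos = end
    return out


print(decompress_with_mask("10110010010110"))
```

Only four of the eight positions go through the variable length decoder here, which is where the latency benefit for frequent particular values comes from.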

Further features of preferred embodiments of the data decompression apparatus according to the third aspect of the invention are described with reference to FIGS. 17 and 19 and are furthermore defined in the appended dependent claim 17 filed herewith.

A fourth aspect of the present invention is a corresponding data decompression method for decompressing a compressed data block into a decompressed data block, the decompressed data block comprising n data values at respective data value positions, the data decompression method comprising:

decompressing the variable length codewords of the compressed data block into corresponding decompressed data values;

generating a decompressed data block by combining the decompressed data value with the at least one particular data value indicated by the corresponding mask position of the data value mask based on the data value mask,

The data decompression method according to the fourth aspect of the present invention may comprise any or all of the functional features of the data decompression apparatus according to the third aspect of the present invention including embodiments thereof.

A fifth aspect of the present invention is a data compression apparatus for compressing an uncompressed data block including n data values into a compressed data block, the data compression apparatus comprising:

a compressed data block generator coupled to the compressor and the detector and configured to generate a data value mask comprising n mask positions, wherein each mask position indicates whether a respective data value in an uncompressed data block is equal to any one of the at least one particular data value detected by the detector,

wherein the compressed data block generator comprises a mask code generator configured to:

analyzing the generated data value mask, including determining whether it matches any of a plurality of mask patterns; and

generating a mask code to represent the results of the analysis;

wherein the compressed data block generator is further configured to generate the compressed data block by combining at least:

the generated mask code; and

for data values in the uncompressed data block that are not equal to any one of the at least one particular data value, the corresponding variable length codeword resulting from compression by the compressor,

wherein the compressed data block comprises the generated mask code and m variable length codewords, wherein m ≦ n, and wherein the variable length codewords for data values in the uncompressed data block that are equal to any one of the at least one particular data value are not included in the compressed data block, and

wherein the compressed data block further comprises the data value mask unless the result of said analysis by the mask code generator is that the generated data value mask indicates a predetermined repeating pattern of particular data values in the uncompressed data block.
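The role of the mask code can be sketched as follows. The three mask patterns in `PATTERNS`, the 2-bit code values and the use of "11" to signal that the full mask follows are hypothetical; the disclosure only requires that a mask code represents the result of matching the mask against a plurality of mask patterns and that the mask is omitted for a predetermined repeating pattern.

```python
SPECIAL = 0
CODEWORDS = {5: "0", 7: "10", 9: "11"}  # hypothetical variable length code

# Hypothetical 2-bit mask codes, each standing for a repeating mask pattern.
PATTERNS = {
    "00": lambda n: "1" * n,         # every value is the particular value
    "01": lambda n: "0" * n,         # no value is the particular value
    "10": lambda n: ("10" * n)[:n],  # particular value at every even position
}
EXPLICIT = "11"  # no known pattern matched: the full mask follows the code


def compress_with_mask_code(values):
    """Emit a mask code, the mask only when needed, then the codewords."""
    n = len(values)
    mask = "".join("1" if v == SPECIAL else "0" for v in values)
    payload = "".join(CODEWORDS[v] for v in values if v != SPECIAL)
    for code, pattern in PATTERNS.items():
        if mask == pattern(n):
            return code + payload  # the mask itself is omitted
    return EXPLICIT + mask + payload


print(compress_with_mask_code([0, 0, 0, 0]))  # all zeros: mask code only
print(compress_with_mask_code([0, 5, 0, 7]))  # alternating pattern
print(compress_with_mask_code([0, 5, 5, 0]))  # irregular: mask is included
```

A block consisting only of the particular value shrinks to the mask code alone (m = 0), while an irregular block pays the mask code in addition to the n-bit mask.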

Further features of preferred embodiments of the data compression apparatus according to the fifth aspect of the invention are described with reference to fig. 18 and are furthermore defined in the appended dependent claims 21-24 filed herewith.

A sixth aspect of the present invention is a data compression method for compressing an uncompressed data block including predetermined n data values into a compressed data block, the data compression method including:

detecting a presence of at least one particular data value in an uncompressed data block;

generating a data value mask comprising n mask locations, wherein each mask location indicates whether a respective data value in the uncompressed data block is equal to any one of the at least one particular data value;

analyzing the generated data value mask, including determining whether it matches any of a plurality of mask patterns;

generating a mask code to represent the results of the analysis; and

generating a compressed data block by combining at least:

the generated mask code; and

for data values in the uncompressed data block that are not equal to any of the at least one particular data value, the corresponding variable length codeword resulting from compression by the compressor,

wherein the compressed data block comprises the generated mask code and m variable length codewords, wherein 0 ≦ m ≦ n, and wherein the variable length codewords for data values in the uncompressed data block that are equal to any one of the at least one particular data value are not included in the compressed data block, and

wherein the compressed data block further comprises the data value mask unless the result of the analyzing step is that the generated data value mask indicates a predetermined repeating pattern of particular data values in the uncompressed data block.

The data compression method according to the sixth aspect of the present invention may comprise any or all of the functional features of the data compression apparatus according to the fifth aspect of the present invention, including embodiments thereof.

A seventh aspect of the present invention is a data decompression apparatus for decompressing a compressed data block into a decompressed data block, the decompressed data block including n data values at respective data value positions, the data decompression apparatus comprising:

a decompressed data block generator configured to:

reading a mask code from the compressed data block, the mask code representing any one of a plurality of mask patterns;

reading a data value mask containing n mask positions from the compressed data block when the mask pattern represented by the read mask code does not indicate a predetermined repeating pattern of any of the at least one particular data value in the uncompressed data block prior to the data compression that resulted in the compressed data block, wherein each mask position indicates whether a corresponding data value in the uncompressed data block was equal to any of the at least one particular data value prior to data compression; and

generating the decompressed data block by combining, based on the mask code and, where applicable, the data value mask, the decompressed data values from the decompressor with the at least one particular data value indicated either by the predetermined repeating pattern represented by the mask code or, where applicable, by the respective mask positions of the data value mask,

Further features of preferred embodiments of the data decompression apparatus according to the seventh aspect of the invention are described with reference to FIG. 19 and are furthermore defined in the appended dependent claims 27-30 filed herewith.

An eighth aspect of the present invention is a corresponding data decompression method for decompressing a compressed data block into a decompressed data block, the decompressed data block comprising n data values at respective data value positions, the data decompression method comprising:

generating a decompressed data block by combining, based on the mask code and, where applicable, the data value mask, the decompressed data values with the at least one particular data value indicated either by the predetermined repeating pattern represented by the mask code or, where applicable, by the respective mask positions of the data value mask,

The data decompression method according to the eighth aspect of the present invention may include any or all of the functional features of the data decompression apparatus according to the seventh aspect of the present invention including embodiments thereof.

A ninth aspect of the present invention is a data compression apparatus for compressing an uncompressed data block including n data values into a compressed data block, the data compression apparatus comprising:

a compressor configured to compress data values of uncompressed data blocks into corresponding variable length codewords and output the variable length codewords and their respective code lengths;

a compressed data block generator coupled to the compressor and comprising a length mask register having n storage locations, one for each of the n data values of the uncompressed data block, the length mask register being configured to store a length mask of n positions,

wherein the compressed data block generator is configured to store the respective code lengths of the variable length codewords provided by the compressor at respective positions of the length mask, and

wherein the compressed data block generator is configured to generate the compressed data block by combining:

the length mask stored in the length mask register; and

compressed data values in the form of variable length codewords provided by the compressor.

In an advantageous embodiment, the data compression apparatus further comprises a detector coupled with the compressor and configured to detect data values in the uncompressed data block that cannot be compressed by the compressor, wherein the compressed data block generator is configured, when the detector has detected that a data value of the uncompressed data block cannot be compressed, to store a code length having a special value at the corresponding position in the length mask and to store the uncompressed data value in the compressed data block. Advantageously, the special value of the code length is 0.
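The length mask of the ninth aspect, including the special code length 0 for values that cannot be compressed, may be sketched as follows. The field widths `VALUE_BITS` and `LEN_BITS`, the code table and the placement of the length mask ahead of the codewords are assumptions made for this illustration.

```python
VALUE_BITS = 8  # assumed width of an uncompressed data value
LEN_BITS = 2    # assumed width of one length mask entry (lengths 0..3)
CODEWORDS = {0: "0", 255: "10", 1: "110"}  # hypothetical code table


def compress_with_length_mask(values):
    """Return (length mask entries, compressed bit string)."""
    length_mask, payload = [], []
    for v in values:
        cw = CODEWORDS.get(v)
        if cw is None:             # value cannot be compressed
            length_mask.append(0)  # special code length 0
            payload.append(format(v, f"0{VALUE_BITS}b"))  # stored as-is
        else:
            length_mask.append(len(cw))
            payload.append(cw)
    mask_bits = "".join(format(ln, f"0{LEN_BITS}b") for ln in length_mask)
    return length_mask, mask_bits + "".join(payload)


lengths, compressed = compress_with_length_mask([0, 255, 42, 1])
print(lengths, compressed)
```

Here the value 42 has no codeword, so its length mask entry is 0 and its eight raw bits are placed in the compressed block among the codewords.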

Further features of preferred embodiments of the data compression apparatus according to the ninth aspect of the invention are described with reference to fig. 23 and are furthermore defined in the appended dependent claims 33-39 filed herewith.

A tenth aspect of the present invention is a corresponding data compression method for compressing an uncompressed data block comprising n data values into a compressed data block, the data compression method comprising:

compressing data values of an uncompressed data block into corresponding variable length codewords and outputting the variable length codewords and their respective code lengths;

storing a length mask of n locations in a length mask register having n storage locations, one for each of the n data values of the uncompressed data block,

storing respective code lengths of the variable length codewords at respective positions of the length mask, and

generating a compressed data block by combining:

the length mask stored in the length mask register; and

compressed data values in the form of variable length codewords.

The data compression method according to the tenth aspect of the present invention may comprise any or all of the functional features of the data compression apparatus according to the ninth aspect of the present invention including embodiments thereof.
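The steps of the tenth aspect can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the disclosed hardware: the codebook `CODEBOOK`, the 2-bit width of each length mask entry (`LEN_BITS`), the 8-bit width of an uncompressed value (`VALUE_BITS`) and the function name are all hypothetical, and bits are modelled as text strings for readability. Code length 0 is used as the special value for a data value that cannot be compressed.

```python
# Hypothetical prefix-free codebook (data value -> codeword bits); not from the source.
CODEBOOK = {0: "0", 1: "10", 2: "110", 3: "111"}
LEN_BITS = 2    # assumed width of one length mask entry (code lengths 0..3)
VALUE_BITS = 8  # assumed width of an uncompressed data value


def compress_with_length_mask(block):
    """Return (length_mask, compressed_bits) for a block of n data values."""
    length_mask = []  # one code length per data value position
    payload = []      # codewords, or raw values where no codeword exists
    for value in block:
        codeword = CODEBOOK.get(value)
        if codeword is None:
            # Special code length 0: the value is stored in uncompressed form.
            length_mask.append(0)
            payload.append(format(value, f"0{VALUE_BITS}b"))
        else:
            length_mask.append(len(codeword))
            payload.append(codeword)
    mask_bits = "".join(format(n, f"0{LEN_BITS}b") for n in length_mask)
    # Combine the length mask with the variable length codewords.
    return length_mask, mask_bits + "".join(payload)


lengths, bits = compress_with_length_mask([0, 2, 200, 1])
print(lengths)  # [1, 3, 0, 2]
print(bits)     # 0111001001101100100010
```

Because every code length is known before any codeword is decoded, a decompressor can locate each codeword boundary directly from the mask.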

An eleventh aspect of the present invention is a data decompression apparatus for decompressing a compressed data block into a decompressed data block, the decompressed data block including n data values at respective data value positions, the data decompression apparatus comprising:

a decompressor configured to decompress a variable length codeword of a compressed data block into a corresponding decompressed data value;

an extractor mechanism for: reading a length mask of n positions from the compressed data block, determining respective code lengths of variable length codewords in the compressed data block according to the length mask, extracting respective variable length codewords from the compressed data block based on the determined respective code lengths, and providing the extracted respective variable length codewords to a decompressor; and

a decompressed data block generator configured to generate a decompressed data block from the decompressed data values from the decompressor.

In an advantageous embodiment, the decompressed data block generator is configured to:

for one or more locations in the length mask, detecting one or more code lengths having a special value indicating that one or more corresponding data values are included in uncompressed form in the compressed data block; and

based on the detected one or more code lengths having the special value, generating the decompressed data block by combining the decompressed data values from the decompressor with the one or more corresponding data values that are included in uncompressed form in the compressed data block. Advantageously, the special value of the code length is 0.

A twelfth aspect of the present invention is a data decompression method for decompressing a compressed data block into a decompressed data block, the decompressed data block including n data values at respective data value positions, the data decompression method comprising:

reading a length mask of n locations from the compressed data block;

determining a corresponding code length of a variable length codeword in the compressed data block according to the length mask;

extracting respective variable length codewords from the compressed data block based on the determined respective code lengths;

decompressing the extracted variable length codewords into corresponding decompressed data values; and

generating a decompressed data block from the decompressed data values.

The data decompression method according to the twelfth aspect of the present invention may comprise any or all of the functional features of the data decompression apparatus according to the eleventh aspect of the present invention, including embodiments thereof.
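A minimal Python sketch of the twelfth aspect follows, including the optional special code length 0 for values kept in uncompressed form. The decode table `DECODE`, the 2-bit mask entries (`LEN_BITS`), the 8-bit raw values (`VALUE_BITS`) and the function name are assumptions made for illustration only; bits are again modelled as a text string.

```python
# Hypothetical decode table (codeword bits -> data value); not from the source.
DECODE = {"0": 0, "10": 1, "110": 2, "111": 3}
LEN_BITS = 2    # assumed width of one length mask entry
VALUE_BITS = 8  # assumed width of an uncompressed data value


def decompress_with_length_mask(bits, n):
    """Rebuild the n data values of a block from its compressed bit string."""
    mask_end = n * LEN_BITS
    # Read the length mask of n positions from the front of the block.
    lengths = [int(bits[i:i + LEN_BITS], 2) for i in range(0, mask_end, LEN_BITS)]
    values = []
    pos = mask_end
    for length in lengths:
        if length == 0:
            # Special code length 0: the value follows in uncompressed form.
            values.append(int(bits[pos:pos + VALUE_BITS], 2))
            pos += VALUE_BITS
        else:
            # The code length gives the codeword boundary without bit-by-bit search.
            values.append(DECODE[bits[pos:pos + length]])
            pos += length
    return values


print(decompress_with_length_mask("0111001001101100100010", 4))  # [0, 2, 200, 1]
```

Since the start offset of every codeword is a running sum of the mask entries, the offsets could also be computed up front, which is what makes extracting several codewords at once conceivable.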

Another aspect of the invention is a system comprising one or more memories, a data compression apparatus according to the first, fifth or ninth aspect described above, and a data decompression apparatus according to the third, seventh or eleventh aspect described above.

Yet another aspect of the present invention is a computer program product comprising code instructions which, when loaded and executed by a processing device, cause the performance of a method according to the second, sixth or tenth aspect described above.

Yet another aspect of the present invention is a computer program product comprising code instructions which, when loaded and executed by a processing device, cause the performance of a method according to the fourth, eighth or twelfth aspect described above.

Other aspects, objects, features and advantages of the disclosed embodiments will appear from the following detailed disclosure, from the attached dependent claims as well as from the drawings. In general, all terms used in the claims should be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein.

All references to "a/an/the [ element, device, component, means, step, etc ]" are to be interpreted openly as referring to at least one instance of the element, device, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

Drawings

Examples in the background and embodiments of aspects of the invention are described with reference to the following drawings:

FIG. 1 illustrates a block diagram of a computer system including n processing cores, each coupled to a cache hierarchy having three levels and a main memory.

FIG. 2 illustrates the block diagram of FIG. 1, wherein the main memory holds data in compressed form.

FIG. 3 illustrates the block diagram of FIG. 1, wherein an L3 cache holds data in compressed form. Other cache levels may also store data in compressed form.

FIG. 4 illustrates the block diagram of FIG. 1, wherein data is compressed in a communication means, such as when transferred between the memory and the cache hierarchy.

FIG. 5 illustrates the block diagram of FIG. 1, wherein compression may be applied to the main memory and the links connecting the memory to the cache hierarchy. In general, compression may be applied to any combination of portions like a cache hierarchy, a transport (e.g., a link connecting a memory to a cache subsystem), and main memory.

Fig. 6 illustrates a block diagram of a data transfer link connecting two points in a communication network. These points may be two intermediate nodes in the network or a source node and a destination node in a communication link or a combination of these scenarios.

Fig. 7 illustrates a block diagram of the data transfer link of fig. 6, wherein the data being transmitted is in compressed form, so they may need to be compressed in the transmitter and decompressed in the receiver.

Fig. 8 illustrates on the left a block of uncompressed data values and on the right the same block in compressed form using a variable length code that has been generated using Huffman coding. All data values of the uncompressed block are replaced by corresponding Huffman codewords.

Fig. 9 illustrates a compressor used to compress (or encode) a block as illustrated in fig. 8 using Huffman coding.

Fig. 10 illustrates a decompressor used to decode (or decompress) a block compressed using canonical Huffman coding.

Fig. 11 illustrates a block of uncompressed data values (i.e., an uncompressed data block) on the left and the same block in compressed form (i.e., a compressed data block) on the right, using a variable length code generated using Huffman coding. All data values of the uncompressed data block are replaced by corresponding Huffman codewords in the compressed data block.

Fig. 12 illustrates an uncompressed data block on the left and the same block in compressed form on the right in an alternative way, comprising a bit mask indicating which data values are zero values and which are non-zero values, and a variable length code (i.e. a bit sequence encoded by the variable length code) comprising the compressed non-zero data values.

Fig. 13 illustrates on the left a block of uncompressed data values and on the right the same block in compressed form using a variable length code that has been generated using Huffman coding. All data values of the uncompressed data block are replaced by corresponding Huffman codewords in the compressed data block.

Fig. 14 illustrates an uncompressed data block on the left and the same block in compressed form on the right in an alternative way, comprising a mask encoding, which indicates whether a bit mask is included in the compressed block or whether the zero data values and non-zero data values occur in a specific order or follow a specific pattern, and a variable length code (i.e. a bit sequence encoded by the variable length code) comprising the compressed non-zero data values.

Fig. 15 illustrates an uncompressed data block on the left and the same block in compressed form on the right in an alternative way, comprising a mask encoding indicating whether a bit mask follows, a bit mask indicating which values are zero values and which are non-zero values, and a variable length code (i.e. a bit sequence encoded by the variable length code) comprising the compressed non-zero data values.

FIG. 16 illustrates a data compression device based on the compressor of FIG. 9 but modified and expanded to be able to compress the uncompressed data blocks of FIG. 12.

Fig. 17 illustrates a data decompression apparatus based on the decompressor of fig. 10 but modified and extended to be able to decompress the compressed data blocks of fig. 12.

Fig. 18 illustrates a data compression device based on the compression device of fig. 16 but modified and expanded to be able to compress the uncompressed data blocks of fig. 14 and 15.

Fig. 19 illustrates a data decompression apparatus based on the decompression apparatus of fig. 17 but modified and expanded to be able to decompress the compressed data blocks of fig. 14 and 15.

Fig. 20 illustrates on the left side a block of uncompressed data values (i.e. an uncompressed data block) and on the right side the same block in compressed form (i.e. a compressed data block) using a variable length code generated using Huffman coding.

Fig. 21 illustrates an uncompressed data block on the left and the same block in compressed form on the right in an alternative way comprising a bit mask holding the length of the variable length encoding of the compressed data values and a variable length encoding (i.e. a bit sequence encoded by the variable length encoding) comprising a mixture of compressed data values and uncompressed data values.

Fig. 22 illustrates an uncompressed data block on the left and the same block in compressed form on the right in an alternative way comprising a bit mask holding the variable length encoded length of the compressed values and a variable length encoding (i.e. a bit sequence encoded by the variable length encoding) comprising a mixture of compressed data values and uncompressed data values, but wherein the uncompressed data values are stored at the end of the compressed data block in the reverse order of their appearance in the original uncompressed data block.

FIG. 23 illustrates a data compression device based on the compressor of FIG. 9 but modified and expanded to be able to compress the uncompressed data blocks of FIG. 21.

Fig. 24 illustrates a data decompression apparatus based on the decompressor of fig. 10 but modified and expanded to be able to decompress the compressed data blocks of fig. 21.

Fig. 25 illustrates a data decompression apparatus based on the decompressor of fig. 10 but modified and extended to be able to decompress the compressed data blocks of fig. 22.

Fig. 26 illustrates a decompression apparatus based on the decompression apparatus of fig. 24 but modified and expanded to be able to decompress 2 compressed data values in parallel.

Fig. 27 illustrates an example flow diagram of a compression method for compressing uncompressed data blocks using variable length coding (e.g., Huffman).

Fig. 28 illustrates an example flow diagram of a decompression method for decompressing a compressed data block compressed using variable length coding (e.g., canonical Huffman).

Fig. 29 illustrates an example flow diagram of a new method that builds on the compression method of fig. 27 and follows similar steps as the compression device of fig. 18 to enable the compression of the uncompressed data blocks of fig. 14 and 15.

Fig. 30 illustrates an example flow chart of a new method that builds on the method of fig. 28 and follows steps similar to the decompression apparatus of fig. 19 to enable decompression of the compressed data blocks of fig. 14 and 15.

Fig. 31 illustrates an example flow diagram of a new method that builds on the compression method of fig. 27 and follows steps similar to the compression apparatus of fig. 23 (skipping the steps required to determine uncompressed values and mix the compressed and uncompressed values) to enable compression of the uncompressed data blocks of fig. 21.

Fig. 32 illustrates an example flow diagram of a new method that builds on the decompression method of fig. 28 and follows steps similar to the decompression apparatus of fig. 24 (skipping the steps required to determine the uncompressed value and mix the compressed and uncompressed values) to enable decompression of the compressed data block of fig. 21.

Detailed Description

The present disclosure discloses methods, devices and systems for compressing one or more blocks of data values and decompressing one or more compressed blocks of data values when compression is applied to a cache subsystem and/or a memory subsystem and/or a data transmission subsystem in a computer system and/or in a communication network. The disclosed methods, devices and systems extend and optimize baseline compression and decompression methods, devices and systems so that they handle data compression scenarios that are common in the aforementioned systems with better compressibility and higher compression and decompression speed.

A data block includes one or more data values and may be of any size. In embodiments of a computer system as depicted in FIG. 1, a block of data values may alternatively be referred to as 1) a cache line, a cache set, or a cache sector when the block of data is stored in a cache hierarchy within such a computer system; or 2) a cache line, a memory page, or a memory sector when the block of data is stored in a memory within such a computer system or transmitted by a communication means within such a computer system. On the other hand, in the embodiment of a transmission link within a communication network as depicted in fig. 6, a data block may also refer to a packet, a flit, a payload, a header, etc.

Entropy-based variable length compression, such as Huffman compression, may be applied to the block of data values shown on the left side of fig. 8 in the context of the cache/memory/data transmission subsystem of the example computer system as depicted in fig. 2, 3, 4, 5 or the example communication link as depicted in fig. 7. The block comprises 8 data values; however, as previously mentioned, it may be of any size. Using an example set of canonical Huffman codewords and a prior art Huffman compressor such as the example embodiment of fig. 9, all data values in the block are compressed (or encoded) to form a compressed block, as depicted on the right side of fig. 8. Furthermore, the example variable length compressed block of data values depicted on the right of fig. 8 may be decompressed by a prior art canonical Huffman decompressor such as the example embodiment of fig. 10.

When data compression is applied in a cache/memory subsystem or a transport network subsystem or a peer-to-peer communication network, a common scenario is that a particular data value 0 occurs very frequently within a data block, but the data block is not completely filled with the particular data value 0. Assuming that data value 0 is encoded with a 1-bit codeword (the best case), a conventional representation of such an uncompressed data block and the corresponding data block after compression using a variable length code (e.g., Huffman coding) is depicted in fig. 11. In accordance with an embodiment of the present invention, an alternative representation of the uncompressed data block of fig. 11 is presented in fig. 12, where the compressed data block includes a mask having a width of X bits, where X equals the number of data values contained in the block (here 8), followed by the variable length code. The mask, referred to as a Z-value mask, encodes the occurrences of the frequent particular data value (here data value 0) in the block, so that the Huffman codeword of that value is omitted from the variable length code whenever such a zero data value is present in the uncompressed data block. In this way, the compressed data block is 2 bits larger (14 bits in total) than the one of fig. 11 (12 bits in total); however, with such an encoding, decompression of the compressed block may become faster.
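The 12-bit and 14-bit totals can be checked with a short calculation. The sketch below is illustrative only: the individual codeword lengths of the two non-zero values (3 bits each), the non-zero values themselves and their positions are assumptions chosen so that the totals match the figures quoted in the text; only the 1-bit codeword for value 0 and the block size of 8 come from the description.

```python
def conventional_size(values, code_length):
    """Bits needed when every value, zero or not, is replaced by its codeword."""
    return sum(code_length[v] for v in values)


def z_mask_size(values, code_length, special=0):
    """Bits needed with an n-bit Z-value mask plus codewords for the other values."""
    return len(values) + sum(code_length[v] for v in values if v != special)


# Assumed codeword lengths: 1 bit for value 0, 3 bits for each non-zero value.
CODE_LENGTH = {0: 1, 41: 3, 97: 3}
BLOCK = [0, 0, 0, 0, 41, 97, 0, 0]

print(conventional_size(BLOCK, CODE_LENGTH))  # 12
print(z_mask_size(BLOCK, CODE_LENGTH))        # 14
```

With a 1-bit codeword for 0, the mask costs exactly one extra bit per non-zero value, which accounts for the 2-bit difference in this example.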

A block diagram of an example data compression apparatus 1600 capable of forming the compressed data block of fig. 12 is depicted in fig. 16. The example data compression device includes a compressor in the form of a variable length coding unit 1620, a detector in the form of a comparator 1630, and a compressed data block generator 1640-1670 that includes a Z-value mask register 1640 to store a Z-value mask, a storage unit 1650 to store a variable length codeword 1625 received from the variable length coding unit (compressor) 1620, and a concatenator 1670. As in the compressor embodiment of fig. 9, the data compression device 1600 takes as input an uncompressed data block 1610, which is a stream of one or more data values v1, v2, …, vn and which may be retrieved from a storage unit 1605 or from an extractor that extracts data values from the uncompressed data block. However, each data value of the uncompressed data block 1610 is supplied not only to the variable length coding unit 1620 but also to the comparator (detector) 1630, which checks whether the value is a zero value (i.e., the specific data value). If the value is non-zero, it is encoded by unit 1620, the encoded value is placed in the accumulated variable length code 1635, and the result of comparator 1630 (a "0" indicating no match) is stored in the corresponding location in the Z-value mask register 1640 (the first location in the Z-value mask corresponds to the first value in the block, and so on). If the value is zero, the result of comparator 1630 (a "1" indicating a match) disables storage unit 1650 so that the encoded value is omitted from the variable length code 1635. Further, a "1" is written in the corresponding location in the Z-value mask register 1640 to indicate a zero value. When the entire block has been compressed (which one skilled in the art may determine using a counter that is compared to the number of values included in the uncompressed block), the Z-value mask is read from the Z-value mask register 1640 and prepended to the variable length code 1635 by the concatenator 1670, forming compressed block 1690. The storage unit 1650 may be implemented in different ways by those skilled in the art, for example, using a set of tri-state buffers.
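The data flow through the comparator, the Z-value mask register and the concatenator can be mimicked in software as follows. This is a behavioural sketch under stated assumptions rather than a model of the circuit: the codebook `CODEBOOK`, the example block and the function name `compress_z_mask` are hypothetical, and bits are modelled as text strings.

```python
# Hypothetical prefix-free codebook (data value -> codeword bits); not from the source.
CODEBOOK = {0: "0", 7: "100", 12: "101", 3: "11"}


def compress_z_mask(block, special=0):
    """Return the compressed block: Z-value mask followed by the variable length code."""
    z_mask = []   # plays the role of the Z-value mask register
    vl_code = []  # plays the role of the accumulated variable length code
    for value in block:
        match = (value == special)            # comparator result
        z_mask.append("1" if match else "0")  # "1" marks the specific data value
        if not match:
            # The codeword is kept only when the comparator reports no match.
            vl_code.append(CODEBOOK[value])
    # Concatenator: the mask is placed before the variable length code.
    return "".join(z_mask) + "".join(vl_code)


print(compress_z_mask([0, 0, 0, 0, 7, 12, 0, 0]))  # 11110011100101
```

The output has 8 mask bits followed by the two 3-bit codewords, so no codeword is emitted for any zero value.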

Thus, in the data compression apparatus 1600 in FIG. 16, the aforementioned detector comprises a comparator 1630 having a first input 1631a configured to receive a data value v1-vn at a corresponding data value position in an uncompressed data block 1610, a second input 1631b configured to receive a particular data value 1632, and an output 1633 configured to output a comparison result between the particular data value 1632 and a data value v1-vn at a corresponding data value position in the uncompressed data block 1610.

Further, in the data compression device 1600 of FIG. 16, the aforementioned compressed data block generator 1640-1670 comprises the mask register 1640. The output 1633 of comparator 1630 is coupled to the mask register 1640 such that the mask register 1640 is updated at a respective one of its n storage locations with the result of the comparison between the particular data value 1632 and the data value v1-vn at the respective data value position in uncompressed data block 1610.

Also, in the data compression apparatus 1600 of FIG. 16, the aforementioned compressed data block generator 1640-1670 comprises the storage unit 1650, which stores the variable length codeword 1625 received from the compressor 1620. The storage unit 1650 also has an output 1653 configured to output the stored variable length codeword 1625 into the accumulated variable length code 1655, and a second input 1631b configured to receive a control signal dependent on the output 1633 of the comparator 1630. The storage unit 1650 is configured to: when the comparison result of the comparator 1630 indicates a match between the particular data value 1632 and a data value v1-vn at the corresponding data value position in the uncompressed data block 1610, inhibit the stored variable length codeword 1625 from being output into the variable length code 1655.

In addition, in the data compression apparatus 1600 in fig. 16, the aforementioned compressed data block generator 1640-1670 comprises the concatenator 1670, which is configured to: when all n data values v1-vn of uncompressed data block 1610 have been processed, generate the compressed data block 1690 by concatenating the data value mask (Z-value mask) from mask register 1640 with the accumulated variable length code 1655.

In the disclosed embodiment of the data compression apparatus 1600 according to fig. 16, the data value mask Z value mask is placed before the accumulated variable length code 1655 in the compressed data block 1690. In other embodiments, the data value mask Z value mask may instead be appended at the end of the accumulated variable length code 1655 in the compressed data chunk 1690.

The above disclosure of fig. 16 may alternatively be viewed as a device that compresses an uncompressed data block comprising one or more data values into a compressed block (i.e., a data compression device); wherein the compressed data chunk comprises a bit mask and a variable length bit sequence further comprising one or more compressed data values encoded using variable length coding, wherein the number of compressed data values is less than or equal to the number of uncompressed data values; wherein the apparatus comprises: a compressor for compressing a data value with a variable length codeword corresponding to the data value; a first mechanism that detects one or more particular data values that are not to be compressed by the compressor for compressing data values using variable length codewords; a second mechanism to update a bit mask indicating the specific value using the detection information of the first mechanism.

A block diagram of an example data decompression apparatus 1700 capable of decompressing the compressed blocks of fig. 12 is depicted in fig. 17. The data decompression apparatus 1700 is built on the decompressor 1000 of fig. 10 and includes a storage unit 1705 that holds a part of the compressed data block 1710 (the size of the storage unit 1705 is at least the maximum of the uncompressed value length and the maximum codeword length), a codeword detection unit 1720 (similar to the codeword detection unit 1020 of the decompressor 1000 of fig. 10), a value retrieval unit 1730 (similar to the value retrieval unit 1030 of the decompressor 1000 of fig. 10), and additional logic (extension logic) 1740-1780 forming a decompressed data block generator. The codeword detection unit 1720 and the value retrieval unit 1730 thus form a decompressor.

The decompressed data block generator includes a register 1750 for storing the Z-value mask retrieved from the compressed data block 1710, to which it is attached, a Variable Length (VL) value position generator 1740, a Variable Length (VL) value position assigner 1760, and a selector 1780. The Z-value mask is used in two ways: 1) each Z-value mask bit is used to generate a control signal for the selector 1780, so that if a data value has been encoded as a zero value (i.e., the specific data value), a zero is selected to be written in the decompressed block 1790 instead of a decoded value; 2) all Z-value mask bits are used by the VL-value position generator 1740. The VL-value position generator generates the mask indices of all non-zero values (indicated by the zero bits in the Z-value mask). For example, in the embodiment of fig. 12, the non-zero values are marked at Z-value mask indices 5 and 6. The VL-value position assigner 1760 uses this information to decide where to place the respective (non-zero) values (v5 and v6 in the embodiment of fig. 12) decoded by the decompressor.

Thus, when a portion of the data values of the compressed data block are data value 0 (or another most frequent data value), the data decompression apparatus 1700 of fig. 17 speeds up decompression of the compressed data block, since only the non-zero data values, which were encoded using variable length coding, need to be decompressed.
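The roles of the Z-value mask, the value position generator and the selectors can be illustrated with the following Python sketch. It is a software analogy under stated assumptions: the decode table `DECODE`, the function name `decompress_z_mask` and the example input are hypothetical, and the mask is assumed to be placed before the variable length code.

```python
# Hypothetical decode table (codeword bits -> data value); not from the source.
DECODE = {"0": 0, "100": 7, "101": 12, "11": 3}


def decompress_z_mask(bits, n, special=0):
    """Rebuild an n-value block from a Z-value mask followed by variable length code."""
    z_mask = bits[:n]
    # Position generation: indices of the zero bits, i.e. of the encoded values.
    positions = [i for i, bit in enumerate(z_mask) if bit == "0"]
    # Selection: every position starts as the specific data value.
    block = [special] * n
    pos = n
    for index in positions:
        codeword = ""
        while codeword not in DECODE:  # extend until a complete codeword is seen
            codeword += bits[pos]
            pos += 1
        block[index] = DECODE[codeword]  # position assignment
    return block


print(decompress_z_mask("11110011100101", 8))  # [0, 0, 0, 0, 7, 12, 0, 0]
```

Only two codewords are decoded for the 8-value block in this example; the six zero values are produced directly from the mask.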

As can be appreciated from the above, in the data decompression apparatus 1700 of fig. 17, the aforementioned decompressed data block generator 1740-1780 comprises a value location generator 1740, a value location assignor 1760 and a plurality of selectors 1780, one for each of the n data value locations of the decompressed data block 1790. The value location generator 1740 is configured to control the value location assigner 1760 and the plurality of selectors 1780 such that when a respective mask location of the data value mask Z-value mask indicates a particular data value 1782, the particular data value 1782 is received from the respective selector and included at the respective data value location in the decompressed data block 1790, and such that when the respective mask location of the data value mask Z-value mask does not indicate the particular data value 1782, a corresponding decompressed data value is received from the decompressor 1720-1730 and included at the respective data value location in the decompressed data block 1790.

The disclosure of fig. 17 above may alternatively be viewed as a device that decompresses a compressed data block into one or more data values (i.e., a data decompression device); wherein the compressed data chunk comprises a bit mask and a variable length bit sequence, further comprising one or more compressed data values encoded with a variable length code and one or more specific values encoded with a bit mask; wherein the apparatus comprises: a decompressor for decompressing a variable length codeword to reconstruct a data value corresponding to the variable length codeword; a first mechanism to read the bitmask to determine whether a particular value occurs in an uncompressed block during compression and to generate a value position in the uncompressed block; and a second mechanism to recreate the particular data value using the indication from the first mechanism such that the order of the values in the decompressed data block is the same as the order of the values in the original data block prior to compression.

In an alternative embodiment, the skilled person may encode another very frequent specific data value in the same way as data value 0. In yet another alternative embodiment, several frequent data values may be encoded with a mask that uses a fixed size encoding. For example, the 3 most frequent data values may be encoded as 00, 01 and 10, leaving 11 to represent infrequent (or less frequent) data values, which are encoded using variable length coding. A plurality of said specific data values may be detected by an alternative embodiment in which the comparators 1630 or 1830 of the data compression devices 1600 and 1800, respectively, are replaced by a small dictionary containing the plurality of said specific values. Such alternative embodiments can be implemented by those skilled in the art.
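The fixed size mask variant can be sketched as follows. This is an illustration under stated assumptions: the function name `build_value_mask`, the choice of frequent values and the use of a Python dictionary in place of the small hardware dictionary are all hypothetical. Values that receive the code 11 are returned separately so that they could be passed on to a variable length coder.

```python
def build_value_mask(block, frequent):
    """Encode up to 3 frequent values as 00, 01, 10 and mark all others as 11.

    Returns (mask_bits, remaining_values), where remaining_values are the
    less frequent values that would still need variable length coding.
    """
    if len(frequent) > 3:
        raise ValueError("a 2-bit mask entry can name at most 3 frequent values")
    # Small dictionary replacing the single comparator: value -> 2-bit code.
    codes = {value: format(i, "02b") for i, value in enumerate(frequent)}
    mask_bits = "".join(codes.get(value, "11") for value in block)
    remaining = [value for value in block if value not in codes]
    return mask_bits, remaining


mask, rest = build_value_mask([0, 255, 0, 9, 1, 0], frequent=[0, 1, 255])
print(mask)  # 001000110100
print(rest)  # [9]
```

The mask now costs 2 bits per data value instead of 1, so this variant pays off only when the additional frequent values would otherwise need codewords longer than the extra mask bits.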

Fig. 13 shows a representation of an uncompressed data block (on the left in fig. 13) and a corresponding compressed block (on the right in fig. 13) encoded using variable length Huffman coding. The uncompressed block of fig. 13 is a common case where the particular data value 0 appears in the even index positions of the block (assuming that the index of the first value is 0, the index of the second value is 1, and so on). This may occur when the data values of the block are small integer values declared as large integers. Similarly, the particular data value 0 may consistently occur in the odd index positions of the block. According to an embodiment of the invention, fig. 14 shows an alternative way of compressing the uncompressed block of fig. 13, in which the scenario of a 0 value at every second data value (where the first data value of the block is 0) is encoded using 2 bits (10), and the remaining non-zero data values are compressed as described previously. In this way, compression is improved by 2 bits (16 bits in total) when compared to the representation of the compressed block of fig. 13 (18 bits in total). In order to be able to encode such different compression scenarios, as well as the uncompressed block of the embodiment of fig. 12, a mask encoding must be placed before the mask (if any). For example, with a 2-bit mask encoding, the following pattern scenarios can be encoded: 00 → no mask follows (no zero value in the data block); 01 → a value of 0 occurs at every second data value and the data block does not begin with a value of 0 (zero values are in odd positions); 10 → a value of 0 occurs at every second data value and the data block starts with a value of 0 (zero values are in even positions); 11 → the value 0 occurs one or more times in a way that cannot be described by one of these patterns, so the mask follows. Thus, the 2-bit mask encoding can either add bits to the compressed data block or result in better compression (the 01 and 10 patterns), but in general it can speed up decompression at the cost of a small amount of additional resources.
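Read from the decompression side, the four mask encodings listed above determine the Z-value mask as sketched below. The function name `expand_mask_code` and its parameters are hypothetical; positions are indexed from 0, and a 1 in the returned list marks a zero data value, following the convention of the Z-value mask.

```python
def expand_mask_code(code, n, following_bits=""):
    """Return the Z-value mask (list of 0/1) implied by a 2-bit mask encoding.

    For code "11" the explicit mask is read from the n bits that follow.
    """
    if code == "00":   # no zero value in the block, no mask follows
        return [0] * n
    if code == "01":   # zero values in the odd positions
        return [i % 2 for i in range(n)]
    if code == "10":   # zero values in the even positions
        return [1 - i % 2 for i in range(n)]
    if code == "11":   # no pattern: the explicit mask follows
        if len(following_bits) < n:
            raise ValueError("explicit Z-value mask is shorter than n bits")
        return [int(bit) for bit in following_bits[:n]]
    raise ValueError("mask encoding must be one of 00, 01, 10, 11")


print(expand_mask_code("10", 8))          # [1, 0, 1, 0, 1, 0, 1, 0]
print(expand_mask_code("11", 4, "1001"))  # [1, 0, 0, 1]
```

For the two pattern cases the n-bit mask never has to be stored, which is where the saving relative to an explicit mask comes from.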

A block diagram of an example data compression apparatus 1800 capable of forming the compressed blocks of fig. 14 and 15 is depicted in fig. 18. Like the compression device of fig. 16, it includes a compressor in the form of a variable length coding unit 1820, a Z-value mask register 1840 for storing a Z-value mask, other logic 1830 and 1850, a concatenator 1870, and a mask encoding generator 1860. Compression of zero data values (even if they appear to follow a pattern such as in fig. 14) and non-zero data values is performed in the same manner as in the data compression apparatus 1600 of fig. 16 until compression of the block is complete. At that point, the Z-value mask register 1840 supplies the Z-value mask to the mask encoding generator 1860. In this example embodiment, the generator 1860 includes Boolean logic that identifies whether a particular pattern of zero values (Z-even or Z-odd) is present, or whether any zero value (any-Z) is present. Mask encoder 1867 uses these signals to generate the corresponding mask encoding 1868. Note that if all data values in a data block (e.g., 8 values, i.e., n = 8) are zero values, the data block is compressed to "1111111111", where the first 2 bits are the mask encoding and the remaining 8 bits are the Z-value mask. In that case, logic 1864 and 1866 turns Z-odd and Z-even to "0"; otherwise they remain set or reset based on logic 1861 and 1862. If no zero values are included in the data block, Z-even, Z-odd and any-Z will all be "0", generating the "00" mask encoding 1868. In addition, the concatenator 1870 checks the mask encoding 1868: if it is "00", "01" or "10", the mask encoding 1868 is placed directly before the variable length code 1835; otherwise the mask encoding 1868 is placed before the Z-value mask, which in turn is placed before the variable length code 1835. Components 18nn of the data compression apparatus 1800 in fig. 18 may be the same as the corresponding components 16nn of the data compression apparatus 1600 in fig. 16 that share the same last two digits nn in their reference numerals, except for the mask encoding generator 1860 and its subcomponents, and the additional inputs to the concatenator 1870 and its additional functionality as described above.
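The behaviour described above can be sketched end to end in Python. The sketch mirrors the Z-even, Z-odd and any-Z signals as Boolean variables but does not model the individual logic gates; the function name `compress_block`, the example codebook and the bit-string representation are assumptions for illustration.

```python
def compress_block(block, codebook):
    """Return mask encoding + optional Z-value mask + variable length code as bits."""
    z_mask = [1 if value == 0 else 0 for value in block]
    any_z = any(z_mask)
    # Zero values exactly in the even (0, 2, ...) or odd (1, 3, ...) positions.
    z_even = any_z and all(bit == 1 - i % 2 for i, bit in enumerate(z_mask))
    z_odd = any_z and all(bit == i % 2 for i, bit in enumerate(z_mask))
    if z_even:
        mask_code = "10"
    elif z_odd:
        mask_code = "01"
    elif any_z:
        mask_code = "11"
    else:
        mask_code = "00"
    vl_code = "".join(codebook[value] for value in block if value != 0)
    # The explicit Z-value mask is included only for the "11" case.
    mask_bits = "".join(str(bit) for bit in z_mask) if mask_code == "11" else ""
    return mask_code + mask_bits + vl_code


# Hypothetical prefix-free codebook for the non-zero values; not from the source.
CODEBOOK = {7: "10", 3: "110"}

print(compress_block([0] * 8, CODEBOOK))       # 1111111111
print(compress_block([0, 7, 0, 3], CODEBOOK))  # 1010110
print(compress_block([0, 0, 7, 3], CODEBOOK))  # 11110010110
```

The first call reproduces the all-zero case from the text: neither alternating pattern matches, so the "11" mask encoding is followed by the full 8-bit Z-value mask.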

As can be understood from the above, in the data compression apparatus 1800 in fig. 18, the aforementioned mask encoding generator 1860 may be configured to: when the generated data value mask (Z-value mask) indicates a predetermined repeating pattern in the uncompressed data block 1810 in which a particular data value (e.g., a zero data value) is located at every other data value position, generate a mask encoding 1868 having a first mask encoding value, wherein only the mask encoding 1868 and not the data value mask (Z-value mask) is included in the generated compressed data block 1890.

Mask encoding generator 1860 may also be configured to: when the generated data value mask (Z-value mask) indicates a predetermined repeating pattern in the uncompressed data block 1810 in which a particular data value (e.g., a zero data value) is located at every other data value position (however offset by one position relative to the predetermined repeating pattern of the first mask encoding value), generate a mask encoding 1868 having a second mask encoding value, wherein only the mask encoding 1868 and not the data value mask (Z-value mask) is included in the generated compressed data block 1890.

Mask encoding generator 1860 may also be configured to: when the generated data value mask indicates that a particular data value is present at at least one data value position in the uncompressed data block, generate a mask encoding 1868 having a third mask encoding value, wherein both the mask encoding 1868 and the data value mask (Z-value mask) are included in the generated compressed data block 1890. Alternatively, mask encoding generator 1860 may be configured to: when the generated data value mask (Z-value mask) indicates a predetermined repeating pattern in the uncompressed data block 1810 in which a particular data value (e.g., a zero data value) is located at each data value position, generate a mask encoding 1868 having a third mask encoding value, wherein only the mask encoding 1868 and not the data value mask (Z-value mask) is included in the generated compressed data block 1890.

Mask encoding generator 1860 may also be configured to: when the generated data value mask (Z-value mask) indicates that the particular data value is not present at any data value position in the uncompressed data block 1810, generate a mask encoding 1868 having a fourth mask encoding value, wherein only the mask encoding 1868 and not the data value mask (Z-value mask) is included in the generated compressed data block 1890. Alternatively, the mask encoding generator 1860 may be configured to: when the generated data value mask (Z-value mask) does not indicate any of the predetermined repeating patterns of the first, second, or third mask encoding values, generate a mask encoding 1868 having a fourth mask encoding value, wherein both the mask encoding 1868 and the data value mask (Z-value mask) are included in the generated compressed data block 1890.

Advantageously, mask encoding 1868 may be placed before the m variable length codewords in the compressed data block 1890, and the data value mask (Z-value mask), if present, may be placed after mask encoding 1868 and before the m variable length codewords in the compressed data block 1890. However, other relative orders between mask encoding 1868, the m variable length codewords and the data value mask (Z-value mask), if present, are also possible in the compressed data block 1890.
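For illustration only, the selection of the mask encoding and the assembly of the compressed block described above can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the disclosed hardware: the names `mask_code`, `compress_block` and `CODE_TABLE` are hypothetical, the particular data value is 0, the block holds an even number of values (e.g., n = 8), and the toy code table stands in for whatever variable length code is actually used.

```python
# Hypothetical prefix-free code table for non-zero values (an assumption,
# not the code of the figures).
CODE_TABLE = {1: "0", 2: "10", 3: "110", 4: "111"}


def mask_code(z_mask):
    """Return the 2-bit mask encoding for a Z-value mask (1 marks a zero value)."""
    n = len(z_mask)
    # Z-odd: zeros exactly at indices 1, 3, 5, ... (counting from 0).
    z_odd = all(z_mask[i] == (i % 2) for i in range(n))
    # Z-even: zeros exactly at indices 0, 2, 4, ...
    z_even = all(z_mask[i] == 1 - (i % 2) for i in range(n))
    if z_odd:
        return "01"
    if z_even:
        return "10"
    if any(z_mask):
        return "11"  # zeros present but no pattern: Z-value mask must be stored
    return "00"      # no zeros at all


def compress_block(values):
    """Build mask encoding + optional Z-value mask + variable length codewords."""
    z_mask = [1 if v == 0 else 0 for v in values]
    code = mask_code(z_mask)
    codewords = "".join(CODE_TABLE[v] for v in values if v != 0)
    if code == "11":
        return code + "".join(str(b) for b in z_mask) + codewords
    return code + codewords
```

Under these assumptions, an all-zero block of 8 values matches neither alternating pattern, so it falls through to "11" followed by the 8-bit mask, giving the 10-bit sequence mentioned above.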

The disclosure of fig. 18 above may alternatively be viewed as a device that compresses an uncompressed data block comprising one or more data values into a compressed block (i.e., a data compression device); wherein the compressed data chunk comprises a mask code, a bit mask, and a variable length bit sequence further comprising one or more compressed data values encoded using a variable length code, wherein the number of compressed data values is less than or equal to the number of uncompressed data values; wherein the apparatus comprises: a compressor for compressing a data value with a variable length codeword corresponding to the data value; a first mechanism that detects one or more particular data values that are not to be compressed by the compressor for compressing data values using variable length codewords; a second mechanism that updates a bit mask indicating the specific value using the detection information of the first mechanism; a third mechanism to look up a particular pattern for the particular data value and generate the mask code; and a fourth mechanism to use the information from the second mechanism and the third mechanism to generate a compressed block using mask encoding, bit masking, and variable length encoding.

A block diagram of an example data decompression apparatus 1900 capable of decompressing the compressed data blocks of fig. 14 and 15 is depicted in fig. 19. The data decompression apparatus 1900 is constructed based on the decompressor 1000 of fig. 10, and is similar to the data decompression apparatus 1700 of fig. 17. Like data decompression apparatus 1700, data decompression apparatus 1900 includes a storage unit 1905 that holds a portion of a compressed data block 1910 (the size of storage unit 1905 is at least the maximum of the uncompressed value length and the maximum codeword length), a codeword detection unit 1920, a value retrieval unit 1930, and additional logic 1940-1960 and 1980 included in a decompressed data block generator 1940-1980, which also includes a mask decoder 1970. The codeword detection unit 1920 and the value retrieval unit 1930 thus form a decompressor.

The decoder 1970 includes a comparator 1974 and a selector 1978, and generates a control signal of the selector 1980 based on the mask code 1915. Unlike the data decompression apparatus 1700, the data decompression apparatus 1900 starts by checking the first two bits of a block (i.e., mask encoding 1915). If "11," the next 8 bits of the compressed block are fetched from storage 1905 and copied to the Z value mask 1950. The content of the mask is used for a VL-value position generator 1940 (as in the apparatus of fig. 17) and is used for a selector 1978 which feeds a decoding unit 1970. The distributor 1960 and selector 1980 work as follows:

If the mask encoding 1915 is "00" (Z-value mask not present), the decoded value 1935 output from the decompressor's unit 1930 is fed to a selector 1980 whose control signals ctrl0', ctrl1', etc. are set to "0". This occurs because in this case, useZ-mask (use the Z-value mask), isEven (even pattern), and isOdd (odd pattern) in the decoding unit 1970 are set to "0". As a result, selector 1978 will select the first input (isEven or isOdd), which is "0".

If the mask encoding 1915 is "01" (Z-odd, Z-value mask not present), the decoded value 1935 output from the decompressor's unit 1930 is fed to the selectors 1980 in the even positions (positions 0, 2, 4, etc., holding v1, v3, v5, etc.), while the selectors 1980 in the odd positions (with control signals ctrl1', ctrl3', etc.) select the value 0 (i.e., the particular data value 1982). This means that ctrl0', ctrl2', etc. are set to "0" and ctrl1', ctrl3', etc. are set to "1". This occurs because in this case, in the decoding unit 1970, useZ-mask is "0", isEven is "0", and isOdd is "1". As a result, selector 1978 will select the first input (isEven or isOdd), resulting in a "0" for ctrl0', ctrl2', etc., and a "1" for ctrl1', ctrl3', etc.

If the mask encoding 1915 is "10" (Z-even, Z-value mask not present), the decoded value 1935 output from the decompressor's unit 1930 is fed to the selectors 1980 in the odd positions (positions 1, 3, 5, etc., holding v2, v4, v6, etc.), while the selectors 1980 in the even positions (with control signals ctrl0', ctrl2', etc.) select the value 0 (i.e., the particular data value 1982). This means that ctrl0', ctrl2', etc. are set to "1" and ctrl1', ctrl3', etc. are set to "0". This occurs because in this case, in the decoding unit 1970, useZ-mask is "0", isEven is "1", and isOdd is "0". As a result, selector 1978 will select the first input (isEven or isOdd), resulting in a "1" for ctrl0', ctrl2', etc., and a "0" for ctrl1', ctrl3', etc.

If the mask encoding 1915 is "11" (Z-value mask present), the decoded value 1935 output from the decompressor's unit 1930 is fed to the selector 1980 in the position determined by the unit 1940, as in the decompression apparatus 1700, where the control signals ctrl0', ctrl1', etc. of the selector 1980 are equal to the control signals ctrl0, ctrl1, etc. This occurs because in this case, in the decoding unit 1970, useZ-mask is "1", isEven is "0", and isOdd is "0". As a result, all selectors 1978 will select the second input (ctrl0, ctrl1, etc.), which is the output of the Z-value mask.
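The four cases above can be summarised by a small software model of how the selector control signals are derived and then used to rebuild the block. This is only a sketch under assumptions: the function names `selector_controls` and `rebuild_block` are hypothetical, positions are counted from 0 as in the control signal names, and the decoded values are taken as already produced by the decompressor.

```python
def selector_controls(mask_code, z_mask=None, n=8):
    """Derive the per-position control signals (1 = output the special value)."""
    use_z_mask = mask_code == "11"
    is_even = mask_code == "10"
    is_odd = mask_code == "01"
    ctrl = []
    for i in range(n):
        # First selector input: isEven for even positions, isOdd for odd ones.
        pattern_bit = is_even if i % 2 == 0 else is_odd
        # Second selector input: the stored Z-value mask bit (only if present).
        ctrl.append(int(z_mask[i]) if use_z_mask else int(pattern_bit))
    return ctrl


def rebuild_block(ctrl, decoded_values, special=0):
    """Interleave decoded values with the special value according to ctrl."""
    remaining = iter(decoded_values)
    return [special if c else next(remaining) for c in ctrl]
```

For example, with mask encoding "01" the controls alternate 0, 1, 0, 1, ..., so four decoded values fill positions 0, 2, 4 and 6 and zeros fill the rest.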

As can be understood from the above, in the data decompression apparatus 1900 in fig. 19, the aforementioned decompressed data block generator 1940-1980 may be configured to: when the read mask encoding 1915 has the first mask encoding value, generate the decompressed data block 1990 by combining the decompressed data values 1935 from the decompressor 1920-1930 with the at least one particular data value 1982 (e.g., a zero data value) in a predetermined repeating pattern in which the particular data value is located at every other data value position in the decompressed data block 1990.

Further, the decompressed data block generator 1940-1980 may be configured to: when the read mask encoding 1915 has the second mask encoding value, generate the decompressed data block 1990 by combining the decompressed data values 1935 from the decompressor 1920-1930 with the at least one particular data value 1982 (e.g., a zero data value) in a predetermined repeating pattern in which the particular data value is located at every other data value position in the decompressed data block 1990, yet offset by one position relative to the predetermined repeating pattern of the first mask encoding value.

Also, the decompressed data block generator 1940-1980 may be configured to: when the read mask code 1915 has the third mask code value, a decompressed data block 1990 is generated that contains the at least one particular data value 1982 (e.g., a zero data value) at each data value position in the decompressed data block 1990.

Additionally, the decompressed data block generator 1940-1980 may be configured to: when the read mask code 1915 has the fourth mask code value, a decompressed data block 1990 is generated by combining the decompressed data values 1935 from the decompressor 1920-1930 with the at least one particular data value 1982 (e.g., a zero data value) as indicated by the respective mask position of the data value mask Z-value mask.

Alternative embodiments of this new decompressor embodiment can be implemented by those skilled in the art. Alternative embodiments may also be implemented by those skilled in the art if another value, which occurs more frequently than the value 0, is used instead of the value 0.

The disclosure of fig. 19 above may alternatively be viewed as a device that decompresses a compressed data block into one or more data values (i.e., a data decompression device); wherein the compressed data chunk comprises a mask code, a bit mask, and a variable length bit sequence, further comprising one or more compressed data values encoded with a variable length code and one or more specific values encoded with a bit mask; wherein the apparatus comprises: a decompressor for decompressing the variable length code to reconstruct the data value corresponding to the variable length codeword; a first mechanism to decode the mask code; a second mechanism to read the bitmask to determine whether a particular value occurs in the uncompressed block during compression and to generate a value position in the uncompressed block; a third mechanism to use the indications from the first and second mechanisms to recreate the particular data value such that the order of the values in the decompressed data block is the same as the order of the values in the original data block prior to compression.

The foregoing embodiments of the data decompression apparatus (1000, 1700 and 1900) have in common that a codeword of unknown length needs to be found and extracted from the variable length bit sequence constituting the compressed data block. This is done by codeword detection units 1020, 1720 and 1920. Codeword detection is an inherently sequential process because decoding one compressed data value in a bit sequence (which constitutes a compressed data block) requires decompressing (decoding) all previously occurring compressed data values in the sequence. An example decompressor may address this problem by constructing a hierarchy of codeword detection units aligned at different bit positions of the incoming compressed bit sequence, or a part thereof, such that a following codeword is selected among a number of predicted codewords based on the results of the codeword detection units of a previous level. Such an implementation requires significant additional resources because of the area cost and power consumption of the multiple sets of comparators and priority encoders.

On the other hand, if the length of each variable length codeword is known in advance, the codeword detection step can be avoided completely. A representation of a data block resulting from compression using variable length Huffman coding is depicted in fig. 20. An alternative representation according to an embodiment of the invention is depicted in fig. 21, where a mask of the length values of the variable length codewords (L-mask) is stored before the sequence of variable length codewords (variable length encoding). Each length value has a fixed size that depends on the maximum codeword length. As previously mentioned in the background section of this document, the maximum codeword length may be bounded to a particular length at design, assembly, configuration, or run time. In addition, a length of 0 indicates that the data value is uncompressed. In the example embodiment of the compressed block of fig. 21, the first compressed data value in the variable length bit sequence has a length of 1 (the corresponding codeword is "0"), the second compressed data value in the variable length bit sequence has a length of 2 (the corresponding codeword is "11"), and so on. In this way, value retrieval of multiple compressed data values can be parallelized by a simplified decompressor, but at the expense of increasing the size of the compressed data block and thus potentially reducing compression efficiency.
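To illustrate why recorded lengths remove the need for codeword detection, the following sketch computes the start and end bit offset of every codeword directly from the L-mask by a running sum, after which each codeword can be sliced out independently of the others. The function names are hypothetical, and the sketch assumes that every value is compressed (no length-0 entries); the bit string and lengths are the example values given in connection with fig. 21.

```python
from itertools import accumulate


def codeword_offsets(lengths):
    """Return (start, end) bit offsets of each codeword from its recorded length."""
    ends = list(accumulate(lengths))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


def split_codewords(bits, lengths):
    """Slice the variable length bit sequence into codewords using the L-mask."""
    return [bits[start:end] for start, end in codeword_offsets(lengths)]


# Example values from the text: L-mask 1, 2, 6, 1, 4.
codewords = split_codewords("01111110101110", [1, 2, 6, 1, 4])
# codewords == ["0", "11", "111101", "0", "1110"]
```

The cost noted above is visible here as well: with, for instance, 8 entries of 4 bits each, the L-mask adds 32 bits to every compressed block.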

A block diagram of an example data compression apparatus 2300 capable of forming the compressed data block of fig. 21 is depicted in fig. 23. The data compression apparatus 2300 includes a compressor in the form of a variable length encoding unit 2320, a register 2350 for storing an L-mask ("L" for code length), and logic 2340, 2360, 2370 that forms a compressed data block generator. Unlike the device of fig. 9, unit 2320 outputs not only the encoded data value 2325 but also cL (code length), which is recorded in the L-mask. The number of entries included in the L-mask is determined at design, assembly, configuration, or run time (depending on the implementation) based on the block size and the number of its data values. For example, if the uncompressed form of the data block includes 8 data values (i.e., n = 8), the L-mask register 2350 includes 8 entries. The entry width is determined by the maximum codeword length, which can be bounded as described in the previous paragraph, so it is log2(max_cw_length) bits, rounded up. For each value that is being compressed (not shown in fig. 23, but one skilled in the art may implement a counter to track which value is being compressed), cL is stored in the corresponding location of the L-mask register 2350. When all values of the data block have been compressed, a compressed block 2390 is formed using a concatenator unit 2370 that places the L-mask before the variable length encoding 2355. Thus, in the compressed data block 2390 of the disclosed embodiment, the length mask (L-mask) is placed before the variable length codewords 2325; however, in other embodiments, the reverse order is possible.

In addition, if the variable length coding unit 2320 cannot compress all possible data values, the comparator 2330 and the selectors 2340 and 2360 are required so that it can be determined whether a data value will remain uncompressed, in which case its size is recorded as 0 in the L-mask. Thus, in an alternative embodiment where all values are compressed, this logic may be omitted and cL coupled directly to the L-mask register 2350.

Thus, the data compression device 2300 of fig. 23 also includes a detector in the form of a comparator 2330 coupled to the compressor 2320 and configured to detect data values in the uncompressed data block 2310 that cannot be compressed by the compressor 2320. The compressed data block generator 2340-2370 is configured to: when the detector (comparator) 2330 has detected that a data value of the uncompressed data block 2310 cannot be compressed, store a code length cL having a special value at the corresponding position in the length mask (L-mask) and store this uncompressed data value in the compressed data block 2390. In the disclosed embodiment, the special value of the code length is 0.

Further, in the data compression apparatus 2300 of fig. 23, the detector 2330 is configured to detect data values in the uncompressed data block 2310 that cannot be compressed by the compressor 2320 as one or more of:

the data values not present in the code table 2322 of the compressor 2320,

data values that are present in the code table of the compressor but have no codeword 2325 in the code table,

a data value that exists in the code table with codeword 2325 in the code table but is indicated as invalid in the code table of the compressor.

Also, in the data compression apparatus 2300 of fig. 23, the compressed data block generator 2340-2370 is configured to: generate the compressed data block 2390 by including the compressed data values in the form of variable length codewords 2325 provided by compressor 2320 and the uncompressed data values detected by detector 2330 in the order of the data values v1-vn of the uncompressed data block 2310.

In an alternative embodiment of the data compression apparatus 2300 where some data values remain uncompressed, the apparatus may choose not to mix the uncompressed data values with the codewords of the compressed data values. Instead, all uncompressed data values are held in a separate buffer, arranged in reverse order and attached by the concatenator 2370 to the end of the compressed data block 2390. Although the order of the compressed data values and uncompressed data values in the compressed data block is thereby changed compared to the uncompressed data block, the original order is preserved in the L-mask. An example of such a compressed data block with uncompressed data values stored at its end is illustrated in fig. 22. The uncompressed data value "500" appears in the 6th position of the uncompressed data block. A code length of 0 is recorded in the 6th position of the L-mask, and the uncompressed data value "500" is placed at the end of the compressed data block. Example embodiments of the alternative data compression apparatus may be implemented by those skilled in the art.

Accordingly, in an alternative embodiment of the data compression apparatus 2300, the compressed data block generator 2340-2370 is configured to: generate the compressed data block 2390 by first including the compressed data values in the form of variable length codewords 2325 provided by compressor 2320, in the order of data values v1-vn in the uncompressed data block 2310, and then including the uncompressed data values detected by detector 2330, or vice versa. Advantageously, the uncompressed data values are stored in the generated compressed data block 2390 in reverse order compared to their order in the uncompressed data block 2310. Since the order of all the corresponding code lengths cL in the length mask (L-mask) follows the order of all the data values v1-vn of the uncompressed data block 2310, all the data values of the uncompressed data block 2310 can still be reconstructed in the original order during decompression of the compressed data block 2390.
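The two block layouts discussed above (uncompressed values mixed in place, or collected at the end in reverse order) can be sketched as follows. This is an illustrative model under assumptions, not the disclosed circuit: `compress_with_l_mask`, `CODE_TABLE`, `ENTRY_BITS` and `VALUE_BITS` are hypothetical names, the L-mask entries are taken to be 4 bits wide and uncompressed values 32 bits wide, and a value is treated as incompressible simply when it is missing from the toy code table.

```python
# Hypothetical code table and widths (assumptions for this sketch only).
CODE_TABLE = {1: "0", 2: "11"}
ENTRY_BITS = 4    # width of one L-mask entry
VALUE_BITS = 32   # width of one uncompressed data value


def compress_with_l_mask(values, tail_uncompressed=False):
    """Return (code lengths, compressed block as a bit string)."""
    lengths, body, tail = [], [], []
    for v in values:
        codeword = CODE_TABLE.get(v)
        if codeword is None:
            # Incompressible value: record the special code length 0.
            lengths.append(0)
            raw = format(v, "0{}b".format(VALUE_BITS))
            (tail if tail_uncompressed else body).append(raw)
        else:
            lengths.append(len(codeword))
            body.append(codeword)
    l_mask = "".join(format(cl, "0{}b".format(ENTRY_BITS)) for cl in lengths)
    # In the alternative layout, uncompressed values follow the codewords
    # in reverse order of appearance.
    return lengths, l_mask + "".join(body) + "".join(reversed(tail))
```

In both layouts the L-mask is identical, which is what allows the original value order to be recovered during decompression.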

The disclosure of fig. 23 above may alternatively be viewed as a device that compresses an uncompressed data block comprising one or more data values into a compressed block (i.e., a data compression device); wherein the compressed data chunk comprises a bit mask and a variable length bit sequence, the variable length bit sequence further comprising one or more compressed data values encoded using variable length encoding; wherein the apparatus comprises: a compressor for compressing data values using variable length coding corresponding to the data values; a mechanism to record with the bit mask an encoding length of the compressed one or more data values encoded with variable length encoding.

A block diagram of an example data decompression device 2400 capable of decompressing the compressed data blocks of fig. 21 is depicted in fig. 24. The data decompression apparatus 2400 includes a storage unit called a compression buffer 2405 that holds a portion of the compressed data block 2410 (the size of the storage unit 2405 is at least the maximum of the uncompressed value length and the maximum codeword length), a codeword extractor 2420, an L-mask storage unit 2430, a value retrieval unit 2440 (similar to the value retrieval unit 1030 of the decompressor 1000 of fig. 10), and additional logic 2450-2470 that forms a decompressed data block generator 2450-2470. The L-mask storage unit 2430 holds the first N bits of the compressed data block 2410 and is fully compatible with the L-mask of the data compression apparatus 2300. For example, for a data block of 8 data values and a maximum codeword length of 15 bits, N = 4 x 8 = 32 bits, and the L-mask storage unit 2430 includes 8 entries. The decompressed data block generator 2450-2470 comprises a comparator 2450 that checks whether the current data value is stored in compressed or uncompressed form (in the latter case decompression is not required), a selector 2460 that selects between the decoded data value 2445 and the uncompressed data value based on the result of the comparator 2450, and a decompressed data block store 2470 that holds the formed decompressed data block 2490 with data values v1 … vn.

Block decompression is divided into execution steps. At the beginning of each execution step, the L-mask entry indicating the length of a particular codeword is used to discard the codeword to be decompressed in that execution step, so that the codeword is no longer part of the bit sequence stored in the compression buffer 2405 in the next execution step. The number of bits discarded is referred to as the "shift amount" in fig. 24. If the length in a particular L-mask entry is 0, the shift amount is the size of the uncompressed value (again, depending on the implementation, decided at design, assembly, configuration, or run time), e.g., 32 bits. In the same execution step, a CW (codeword) extractor 2420 takes as input the compressed sequence stored in the compression buffer 2405 and the codeword length provided by the L-mask storage unit 2430 and extracts a codeword from the compressed sequence. In this embodiment, codeword extractor 2420 includes a subtractor that subtracts the codeword length value from the buffer width and a shifter that right-shifts the input to unit 2420 by the amount calculated by the subtractor. The shift is to the right when the leftmost portion of the buffer contains the beginning of the compressed sequence. The extracted codeword is then supplied to a value retrieval unit 2440, which uses the codeword to determine the associated uncompressed data value. The codeword length from the L-mask storage unit 2430 is also used to select the appropriate offset among the offsets 2444 in the value retrieval unit 2440.

For example, the variable length encoding of the compressed data block 01111110101110 … as depicted in fig. 21 includes codewords 0, 11, 111101, etc., as indicated by the L-mask "1, 2, 6, 1, 4 …" held at the beginning of the compressed data block. Therefore, to retrieve the data value associated with codeword 0, the trailing bit sequence 1111110101110 … must first be discarded (which is performed by CW extractor 2420). In the same step, codeword 0 must also be discarded from the compression buffer 2405 in order to be able to retrieve the second data value, associated with the second codeword 11, in the second execution step.
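The subtract-and-shift extraction described above can be modelled at the bit level as follows. This is a simplified sketch under assumptions: the function name `extract_codewords` and the 16-bit buffer width are hypothetical, the whole variable length sequence is assumed to fit in the buffer (a real buffer would be refilled as bits are consumed), and all code lengths are non-zero.

```python
def extract_codewords(bits, lengths, buffer_width=16):
    """Model the CW extractor: subtract, shift right, then discard the codeword."""
    assert len(bits) <= buffer_width, "sketch assumes the sequence fits the buffer"
    # Left-align the compressed sequence in the buffer.
    buf = int(bits.ljust(buffer_width, "0"), 2)
    full_mask = (1 << buffer_width) - 1
    extracted = []
    for cl in lengths:
        shift = buffer_width - cl            # subtractor: buffer width minus length
        codeword = buf >> shift              # shifter: drops all trailing bits
        extracted.append(format(codeword, "0{}b".format(cl)))
        buf = (buf << cl) & full_mask        # discard the codeword ("shift amount" cl)
    return extracted


# Example values from the text: L-mask 1, 2, 6, 1, 4.
result = extract_codewords("01111110101110", [1, 2, 6, 1, 4])
# result == ["0", "11", "111101", "0", "1110"]
```

Each step uses only a subtraction and two shifts, with no comparison against candidate codewords.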

Comparing the example data decompression apparatus 2400 of fig. 24 with the decompression apparatus 1000 of fig. 10, the former can be faster than the latter because the logical depth of the codeword detection unit 1020 (of the decompression apparatus 1000), which includes comparators, a priority encoder and a shifter, is larger than the logical depth of the CW extractor 2420 (of the data decompression apparatus 2400), which includes only an arithmetic (subtraction) operation and a shift operation.

As is apparent from the above, in the data decompression apparatus 2400 in fig. 24, the decompressed data block generator 2450-2470 is configured to:

for one or more locations in the length mask L-mask, detecting one or more code lengths having a special value that indicates that one or more corresponding data values are included in uncompressed form in compressed data block 2410; and

based on the detected one or more code lengths with special values, decompressed data block 2490 is generated by combining the decompressed data values from decompressor 2440 with one or more corresponding data values from uncompressed form of compressed data block 2410. Again, the special value of the code length may advantageously be 0.

As explained above for the data compression apparatus 2300, on the compressor side, uncompressed data values need not be mixed with compressed data values in the variable length encoding 2355. An alternative embodiment of the data decompression apparatus 2400 that can decompress data blocks that have been so compressed is illustrated in fig. 25. Unlike the data decompression device 2400, the data decompression device 2500 further includes an uncompressed value extraction unit 2580. The difference between the devices 2400 and 2500 is that the former extracts the uncompressed value from the compression buffer 2405, while the latter extracts the uncompressed value from the storage unit 2584 containing the compressed data block 2510 in the reverse order (i.e., first uncompressed data value, second uncompressed data value, …, last uncompressed data value, variable length encoding in reverse order). When the L-mask entry contains a length of 0, a counter holding the number of uncompressed data values is incremented and the corresponding uncompressed data value is read from storage unit 2584 using selector 2588. In this data decompression apparatus 2500, when the L-mask entry is 0, the "shift amount" applied to the content of the compression buffer 2505 is 0, unlike the apparatus 2400 in which the "shift amount" is equal to the width of the uncompressed data value. Thus, the compression buffer 2505 may have a narrower width (i.e., equal to the maximum codeword length), further reducing the logical depth of the disclosed data decompression apparatus.
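A software model of this alternative decompression is sketched below, purely as an illustration. The function name `decompress_tail_layout` and the decode table `DECODE` are hypothetical, the L-mask is taken as an already parsed list of code lengths, and the uncompressed values are passed as a list in the order in which they are stored at the end of the block (i.e., reversed relative to their original order).

```python
# Hypothetical decode table (codeword -> data value); an assumption of this sketch.
DECODE = {"0": 1, "11": 2}


def decompress_tail_layout(lengths, vl_bits, tail_values):
    """Rebuild the block when uncompressed values are stored at the end, reversed."""
    out = []
    pos = 0            # current bit position in the variable length encoding
    uncompressed = 0   # counter of uncompressed values seen so far
    for cl in lengths:
        if cl == 0:
            # Length 0: no shift of the codeword stream; read the next
            # uncompressed value, counting from the end of the block.
            uncompressed += 1
            out.append(tail_values[-uncompressed])
        else:
            out.append(DECODE[vl_bits[pos:pos + cl]])
            pos += cl
    return out


# Original block [1, 2, 500, 1, 700]; the tail is stored as [700, 500].
values = decompress_tail_layout([1, 2, 0, 1, 0], "0110", [700, 500])
# values == [1, 2, 500, 1, 700]
```

Because a length-0 entry never advances the codeword position, the codeword stream only ever moves by at most the maximum codeword length per step, which is the reason the compression buffer can be narrower in this variant.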

The aforementioned data decompression devices 2400 and 2500 have improved performance compared to the decompression device 1000; however, the compressed data values are still decompressed sequentially. An alternative embodiment of the data decompression device 2400 that can decompress 2 values in the same execution step is illustrated in fig. 26, where a data decompression device 2600 decompresses in parallel (in each execution step) from two streams of the same compressed data block. This is done by duplicating nearly all units of device 2400, except that: 1) there is a single, shared value retrieval unit 2640 (offset 2644 and DeLUT 2648 are mapped to a 2-port memory); and 2) the L-mask is divided into two pieces, 2630a and 2630b: 2630a contains the first half of the L-mask and 2630b contains the second half thereof. The duplicated units are denoted "a" and "b", e.g., CW extractor 2620a and CW extractor 2620b, and data decompression apparatus 2600 essentially comprises two decompressors: one decompressor includes all "a" units and the other decompressor includes all "b" units. There is also an additional unit, adder 2690. In an alternative embodiment, the value retrieval unit may be replicated instead.

In data decompression device 2600, the decompression works as follows: half of the L-mask is placed in storage 2630a and the other half is placed in 2630 b. The length values of the first half of the L-mask are then accumulated (2690) to calculate the position of the beginning of the second portion within the compressed data block to be decompressed by decompressor "b" (measured in the number of bits skipped from the variable length coded beginning of the compressed data block). The two decompressors each hold a decompressed sub-block of values in 2670a and 2670b, which are concatenated when all of the compressed data values are decompressed.
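The role of the accumulated length in locating the second stream can be sketched as follows. This is an illustrative model only: the function names and the decode table used in the usage example are hypothetical, no uncompressed (length-0) entries are assumed, and the two streams are decoded one after the other here although they do not depend on each other and could run in the same execution step in hardware.

```python
def decode_stream(bits, start, lengths, decode_table):
    """Decode consecutive codewords beginning at bit offset `start`."""
    out = []
    pos = start
    for cl in lengths:
        out.append(decode_table[bits[pos:pos + cl]])
        pos += cl
    return out


def decompress_two_streams(bits, lengths, decode_table):
    """Split the L-mask in two halves and decode each half as its own stream."""
    half = len(lengths) // 2
    first_half, second_half = lengths[:half], lengths[half:]
    # Adder: sum of the first-half lengths gives the bit offset of stream "b".
    start_b = sum(first_half)
    sub_block_a = decode_stream(bits, 0, first_half, decode_table)
    sub_block_b = decode_stream(bits, start_b, second_half, decode_table)
    # The two decompressed sub-blocks are concatenated at the end.
    return sub_block_a + sub_block_b
```

As a usage example under the assumed table {"0": 10, "10": 20, "110": 30, "111": 40}, the bit sequence "010110111" followed by "0011110" with lengths 1, 2, 3, 3, 1, 1, 3, 2 places the start of stream "b" at bit offset 9.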

In an alternative embodiment where 2 values are decompressed in the same execution step, the L-mask may be maintained as one and adder 2690 is not needed if the 2 values decompressed in the same execution step are consecutive instead. In such an example embodiment, an additional shifter is required before the CW extractor "b" to discard the codeword to be extracted by the CW extractor "a".

When implemented in hardware, the additional area/logic added due to the parallelization in the data decompression apparatus is small compared to the additional logic required for parallelizing the decompression apparatus 1000, since the codeword detection unit 1020 has a more complex design than the corresponding codeword extractor 2620.

In alternative embodiments where multiple compressed data values are decompressed in parallel, corresponding modifications to the decompressor design can be made by those skilled in the art.

The disclosure of fig. 24-26 above may alternatively be viewed as a device that decompresses a compressed data block into one or more data values (i.e., a data decompression device); wherein the compressed data chunk comprises a bit mask and a variable length bit sequence, further comprising one or more compressed data values encoded with a variable length code and one or more code length values encoded with a bit mask; wherein the apparatus comprises: a decompressor for decompressing the variable length codes to reconstruct data values corresponding to the variable length codes; a first mechanism to read the bitmask to determine a length of a codeword within the variable length bit sequence; and a second mechanism using the indication from the first mechanism to extract the codewords from the variable length bit sequence and to supply to the decompressor; and a third mechanism to form a decompressed block.

In the aforementioned embodiments of the data compression device and/or the data decompression device, delay elements such as flip-flops may be inserted by a person skilled in the art, so that the compression of the data values of one block and/or the decompression of the values of one compressed block may be pipelined in multiple stages to reduce the clock cycle time and increase processing (compression and/or decompression) throughput.

Furthermore, the alternative embodiments of the data compression device and/or the data decompression device disclosed in the present disclosure may be parallelized by simultaneously compressing data values of a plurality of blocks or/and decompressing data values of a plurality of compressed blocks by a person skilled in the art and in accordance with teachings known per se. In such a case, corresponding modifications to the decompressor design are required by those skilled in the art.

The respective data compression devices 1600, 1800, 2300 in FIGS. 16, 18 and 23 may be implemented, for example, in hardware, e.g., as digital circuits in an integrated circuit, as a special purpose device (e.g., a memory controller), or as a programmable processing device (e.g., a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), etc.). The functions of the respective data compression methods described in the present disclosure may be performed, for example, by any of the respective data compression devices 1600, 1800, 2300 being suitably configured, or by a respective computer program product comprising code instructions that, when loaded and executed by a general purpose processing device such as a CPU or DSP, cause the respective methods to be performed.

The respective data decompression devices 1700, 1900, 2400, 2500, 2600 in FIGS. 17, 19, 24, 25 and 26 may be implemented, for example, in hardware, e.g., as digital circuits in an integrated circuit, as a special purpose device (e.g., a memory controller), or as a programmable processing device (e.g., a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), etc.). The functions of the respective data decompression methods described in this disclosure may be performed, for example, by any of the respective data decompression devices 1700, 1900, 2400, 2500, 2600, or by a corresponding computer program product comprising code instructions which, when loaded and executed by a general-purpose processing device such as a CPU or DSP (e.g., processing unit P1 … Pn of any of FIGS. 1 to 5), cause the corresponding method to be performed.

Example embodiments disclosed herein propose methods, devices and systems for data block compression and decompression, i.e., for storing or transmitting information more compactly: data block compression and decompression in or for a cache/memory subsystem of a computer system, data block compression and decompression in or for a data transmission subsystem of a computer system, or data block compression and decompression in or for a communication network.

FIG. 33 illustrates a general system 3300 in accordance with the present invention. The system includes one or more memories 3310, a data compression device 3320 (such as, for example, any of the data compression devices 1600, 1800, 2300), and a data decompression device 3330 (such as, for example, any of the data decompression devices 1700, 1900, 2400, 2500, 2600). Advantageously, system 3300 is a computer system (such as any of computer systems 100 and 500 of FIGS. 1-5), and the one or more memories 3310 are one or more cache memories (such as any of cache memories L1-L3 of FIGS. 1-5), one or more random access memories (such as any of memories 130 and 530 of FIGS. 1-5), or one or more secondary memories. Advantageously, the system 3300 is a data communication system (such as the communication networks 600, 700 of FIGS. 6-7), wherein the one or more memories 3310 may be data buffers (such as the transmitters 610, 710 and receivers 620, 720 of FIGS. 6-7) associated with transmitting and receiving nodes in the data communication system.
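As an illustration of how a compression device and a decompression device cooperate in such a system, the following minimal sketch performs a software round trip using a data value mask in the style of devices 1600 and 1700. The toy prefix code (`ENCODE_TABLE`), the choice of 0 as the particular data value (`SPECIAL`), the bit-string representation and the function names are assumptions for illustration only, not the disclosed hardware.

```python
# Hypothetical prefix-free toy code for the non-special values; not from the disclosure.
ENCODE_TABLE = {7: "0", 255: "10", 42: "11"}
DECODE_TABLE = {cw: v for v, cw in ENCODE_TABLE.items()}
SPECIAL = 0  # assumed particular data value tracked by the data value mask


def compress(values: list[int]) -> str:
    """Emit an n-bit mask, then codewords only for values that differ from SPECIAL."""
    mask = "".join("1" if v == SPECIAL else "0" for v in values)
    codes = "".join(ENCODE_TABLE[v] for v in values if v != SPECIAL)
    return mask + codes  # mask placed before the accumulated variable length code


def decompress(block: str, n: int) -> list[int]:
    """Rebuild the n values in their original order from the mask and the codewords."""
    mask, bits = block[:n], block[n:]
    out, pos = [], 0
    for flag in mask:
        if flag == "1":  # mask position marks the particular value: no codeword stored
            out.append(SPECIAL)
            continue
        end = pos + 1
        while bits[pos:end] not in DECODE_TABLE:  # grow until a full codeword matches
            if end >= len(bits):
                raise ValueError("truncated or invalid codeword")
            end += 1
        out.append(DECODE_TABLE[bits[pos:end]])
        pos = end
    return out


values = [0, 7, 0, 0, 255, 0, 42, 0]
block = compress(values)
print(block)                 # 1011010101011
print(decompress(block, 8))  # [0, 7, 0, 0, 255, 0, 42, 0]
```

In this example the block holds n = 8 mask bits and m = 3 codewords, so the five occurrences of the particular value cost one bit each and need no variable length decoding.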

While aspects of the invention have been described using example embodiments, they are not limited to the disclosed embodiments, and cover alternative embodiments that can be implemented by those skilled in the art.

Claims

1. A data compression apparatus (1600; 1800) for compressing an uncompressed data block (1610; 1810) comprising n data values (v1-vn) into a compressed data block (1690; 1890), the data compression apparatus comprising:

a compressor (1620; 1820) configured to compress data values of the uncompressed data blocks into corresponding variable-length codewords (1625; 1825);

a detector (1630; 1830) configured to detect a presence of at least one particular data value (1632; 1832) in the uncompressed data block (1610; 1810); and

a compressed data block generator (1640-1670; 1840-1870) coupled to the compressor and the detector and configured to generate the compressed data block (1690; 1890) by combining:

a data value mask (Z-value mask), said data value mask containing n mask positions, wherein each mask position indicates whether a corresponding data value in the uncompressed data block (1610; 1810) is equal to any one of the at least one particular data value (1632; 1832) detected by the detector (1630; 1830); and

for data values in the uncompressed data block (1610; 1810) that are not equal to the at least one particular data value (1632; 1832), the corresponding variable length codeword resulting from compression by the compressor,

wherein the compressed data block (1690; 1890) comprises the data value mask (Z-value mask) and m variable length codewords, wherein m ≤ n, and wherein a variable length codeword for a data value in the uncompressed data block (1610; 1810) that is equal to any one of the at least one particular data value (1632; 1832) is not included in the compressed data block (1690; 1890).

2. The data compression device according to claim 1, wherein the detector (1630; 1830) is configured to detect the presence of a particular data value (1632; 1832) in the uncompressed data block (1610; 1810), and wherein each of the n mask positions of the data value mask (Z-value mask) comprises a single bit.

3. The data compression device of claim 1, wherein the detector (1630; 1830) is configured to detect the presence of a plurality of different particular data values (1632; 1832) in the uncompressed data block (1610; 1810), and wherein each of the n mask positions of the data value mask (Z-value mask) contains a fixed-size bit combination capable of encoding any one of the plurality of particular data values (1632; 1832).

4. The data compression device of any preceding claim, wherein the or each particular data value of the at least one particular data value (1632; 1832) is a frequently occurring data value which may be encoded with the fewest bits if variable length coding is used instead.

5. A data compression device as claimed in any one of claims 1 to 3, wherein the or each particular data value of the at least one particular data value (1632; 1832) is a data value which, when present, requires very fast decompression.

6. The data compression device according to any of claims 1-3, wherein the particular data value, or one of the at least one particular data value (1632; 1832), is 0.

7. The data compression device according to claim 2, wherein the detector (1630; 1830) comprises a comparator (1630; 1830) having:

a first input (1631 a, 1831 a) configured to receive data values (v 1-vn) at respective data value positions in the uncompressed data block (1610; 1810);

a second input (1631 b, 1831 b) configured to receive the particular data value (1632; 1832); and

an output (1633; 1833) configured to output a result of a comparison between the particular data value (1632; 1832) and the data value (v 1-vn) at the corresponding data value position in the uncompressed data block (1610; 1810).

8. The data compression device as claimed in claim 7,

wherein the compressed data block generator (1640-1670; 1840-1870) comprises a mask register (1640; 1840) having n one-bit storage locations, each for a respective one of the n mask positions of said data value mask (Z-value mask), and

wherein the output (1633; 1833) of the comparator (1630; 1830) is coupled to the mask register (1640; 1840) such that the result of the comparison between the particular data value (1632; 1832) and the data value (v1-vn) at the respective data value position in the uncompressed data block (1610; 1810) causes the mask register (1640; 1840) to be updated at a corresponding one of its n storage locations.

9. The data compression device as claimed in claim 8, wherein the compressed data block generator (1640-1670; 1840-1870) comprises a storage unit (1650; 1850) having:

a first input (1651 a, 1851 a) configured to receive a variable length codeword (1625; 1825) from the compressor (1620; 1820) corresponding to the data value (v 1-vn) at a respective data value position in the uncompressed data block (1610; 1810) and to store the received variable length codeword in the storage unit (1650; 1850);

an output (1653; 1853) configured to output the stored variable length code word (1625; 1825) into an accumulated variable length code (1655; 1855); and

a second input (1631 b, 1831 b) configured to receive a control signal dependent on an output (1633; 1833) of the comparator (1630; 1830),

wherein the storage unit (1650; 1850) is configured to: when the result of the comparison by the comparator (1630; 1830) indicates a match between the particular data value (1632; 1832) and the data value (v1-vn) at the corresponding data value position in the uncompressed data block (1610; 1810), disable the stored variable length codeword (1625; 1825) from being output into the accumulated variable length code (1655; 1855).

10. The data compression device as recited in claim 9, wherein the compressed data block generator (1640-1670; 1840-1870) is configured to: when all n data values (v1-vn) of the uncompressed data block (1610; 1810) have been processed, generate the compressed data block (1690; 1890) by concatenating the data value mask (Z-value mask) from the mask register (1640; 1840) with the accumulated variable length code (1655; 1855).

11. The data compression apparatus as claimed in claim 10, wherein said data value mask (Z-value mask) is placed before said accumulated variable length code (1655; 1855) in said compressed data block (1690; 1890).

12. A data compression method for compressing an uncompressed data block (1610; 1810) comprising n data values (v1-vn) into a compressed data block (1690; 1890), the data compression method comprising:

compressing data values of the uncompressed data blocks into corresponding variable length codewords (1625; 1825);

detecting the presence of at least one particular data value (1632; 1832) in the uncompressed data block (1610; 1810); and

generating the compressed data block (1690; 1890) by combining:

a data value mask (Z-value mask), said data value mask containing n mask positions, wherein each mask position indicates whether a corresponding data value in the uncompressed data block (1610; 1810) is equal to any one of the at least one particular data value (1632; 1832); and

for data values in the uncompressed data block (1610; 1810) that are not equal to any of the at least one particular data value (1632; 1832), a corresponding variable length codeword,

wherein the compressed data block (1690; 1890) comprises the data value mask (Z-value mask) and m variable length codewords, wherein m ≤ n, and wherein the compressed data block (1690; 1890) does not include a variable length codeword for a data value in the uncompressed data block (1610; 1810) that is equal to any one of the at least one particular data value (1632; 1832).

13. A data decompression apparatus (1700; 1900) for decompressing a compressed data block (1710; 1910) into a decompressed data block (1790; 1990), the decompressed data block comprising n data values (v1-vn) at respective data value positions, said data decompression apparatus comprising:

a decompressor (1720-1730; 1920-1930) configured to decompress a variable length codeword (1625; 1825) of the compressed data block into a corresponding decompressed data value (1735; 1935); and

a decompressed data block generator (1740-1780; 1940-1980) configured to:

reading from the compressed data block (1710; 1910) a data value mask (Z-value mask) comprising n mask positions, wherein each mask position indicates whether a corresponding data value in an uncompressed data block (1610; 1810), prior to the data compression resulting in the compressed data block (1690/1710; 1890/1910), is equal to any one of at least one particular data value (1632; 1832); and

generating the decompressed data block (1790; 1990), based on the data value mask (Z-value mask), by combining the decompressed data values from the decompressor (1720-1730; 1920-1930) with the at least one particular data value (1782; 1982) indicated by the respective mask positions of the data value mask (Z-value mask),

wherein an order of data values of the generated decompressed data block (1790; 1990) is the same as an order in which the data values occurred in the uncompressed data block (1610; 1810) prior to the data compression.

14. The data decompression device according to claim 13, wherein each of the n mask positions of the data value mask (Z-value mask) comprises a single bit that indicates or does not indicate a particular data value (1782; 1982).

15. The data decompression device according to claim 13, wherein each of the n mask positions of the data value mask (Z-value mask) comprises a fixed-size bit combination capable of encoding any one of a plurality of particular data values (1782; 1982).

16. The data decompression device of any one of claims 13 to 15, wherein the or one of the at least one specific data value (1782; 1982) is 0.

17. The data decompression apparatus as claimed in claim 14, wherein the decompressed data block generator (1740-1780; 1940-1980) comprises:

a value position generator (1740; 1940);

a value position assigner (1760; 1960); and

a plurality of selectors (1780; 1980), each for a respective one of the n data value positions of said decompressed data block (1790; 1990),

wherein the value position generator (1740; 1940) is configured to control the value position assigner (1760; 1960) and the plurality of selectors (1780; 1980), such that when the corresponding mask position of the data value mask (Z-value mask) indicates the particular data value (1782; 1982), receiving the particular data value (1782; 1982) from the respective selector and including the particular data value in the decompressed data block (1790; 1990) at the respective data value position, and such that when the corresponding mask position of the data value mask (Z-value mask) does not indicate the particular data value (1782; 1982), corresponding decompressed data values are received from the decompressor (1720-1730; 1920-1930) and included in the decompressed data blocks (1790; 1990) at respective data value locations.

18. A data decompression method for decompressing a compressed data block (1710; 1910) into a decompressed data block (1790; 1990) comprising n data values (v1-vn) at respective data value positions, the data decompression method comprising:

decompressing variable length codewords (1625; 1825) of the compressed data block into corresponding decompressed data values (1735; 1935);

reading from the compressed data block (1710; 1910) a data value mask (Z-value mask) comprising n mask positions, wherein each mask position indicates whether a corresponding data value in an uncompressed data block (1610; 1810), prior to the data compression resulting in the compressed data block (1690/1710; 1890/1910), is equal to any one of at least one particular data value (1632; 1832); and

generating the decompressed data block (1790; 1990) by combining decompressed data values with the at least one particular data value (1782; 1982) indicated by the corresponding mask position of the data value mask (Z-value mask) based on the data value mask (Z-value mask),

19. A data compression apparatus (1800) for compressing an uncompressed data block (1810) comprising n data values (v1-vn) into a compressed data block (1890), the data compression apparatus comprising:

a compressor (1820) configured to compress data values of the uncompressed data block into corresponding variable-length codewords (1825);

a detector (1830) configured to detect a presence of at least one particular data value (1832) in the uncompressed data block (1810); and

a compressed data block generator (1840-1870) coupled to the compressor and the detector and configured to generate a data value mask (Z-value mask) comprising n mask positions, wherein each mask position indicates whether a corresponding data value in the uncompressed data block (1810) is equal to any one of the at least one particular data value (1832) detected by the detector (1830),

wherein the compressed data block generator (1840-1870) comprises a mask code generator (1860) configured to:

analyzing the generated data value mask (Z-value mask), including determining whether the data value mask matches any of a plurality of mask patterns; and

generating a mask code (1868) to represent results of the analysis;

wherein the compressed data block generator (1840-1870) is further configured to generate the compressed data block (1890) by combining at least:

the generated mask encoding (1868); and

for data values in the uncompressed data block (1810) that are not equal to any of the at least one particular data value (1832), the corresponding variable length codeword resulting from compression by the compressor,

wherein the compressed data block (1890) comprises the generated mask code (1868) and m variable length codewords, wherein m ≤ n, and wherein a variable length codeword for a data value in the uncompressed data block (1810) equal to any one of the at least one particular data value (1832) is not included in the compressed data block (1890), and

wherein the compressed data block (1890) further includes the data value mask (Z-value mask) unless a result of the analysis by the mask code generator (1860) is that the generated data value mask (Z-value mask) indicates a predetermined repeating pattern of particular data values in the uncompressed data block (1810).

20. The data compression device as claimed in claim 19, wherein the mask code generator (1860) is configured to: generating the mask encoding (1868) with a first mask encoding value when the generated data value mask (Z value mask) indicates a predetermined repeating pattern in the uncompressed data block (1810) in which a particular data value is located at every other data value position, and wherein only the mask encoding (1868) is included in the generated compressed data block (1890) and no data value mask (Z value mask) is included.

21. The data compression device as claimed in claim 20, wherein the mask code generator (1860) is configured to: generating the mask encoding (1868) with a second mask encoding value when the generated data value mask (Z-value mask) indicates a predetermined repeating pattern in the uncompressed data block (1810) in which a particular data value is located at every other data value position, but offset by one position relative to the predetermined repeating pattern of the first mask encoding value, and wherein only the mask encoding (1868) and not the data value mask (Z-value mask) is included in the generated compressed data block (1890).

22. The data compression device as claimed in claim 21, wherein the mask code generator (1860) is configured to: generating the mask encoding (1868) having a third mask encoding value when the generated data value mask (Z-value mask) indicates that a particular data value is present in at least one data value position in the uncompressed data block (1810), and wherein the mask encoding (1868) and the data value mask (Z-value mask) are included in the generated compressed data block (1890).

23. The data compression device as claimed in claim 22, wherein the mask code generator (1860) is configured to: generating the mask encoding (1868) with a fourth mask encoding value when the generated data value mask (Z-value mask) indicates that no particular data value is present in any data value position in the uncompressed data block (1810), and wherein only the mask encoding (1868) is included in the generated compressed data block (1890) and no data value mask (Z-value mask) is included.

24. The data compression device according to any of claims 19-23, wherein, in said compressed data block (1890), said mask encoding (1868) is placed before the m variable length codewords, and wherein, in said compressed data block (1890), said data value mask (Z-value mask), if any, is placed after said mask encoding (1868) and before the m variable length codewords.

25. A data compression method for compressing an uncompressed data block (1810) comprising n data values (v1-vn) into a compressed data block (1890), the data compression method comprising:

compressing data values of the uncompressed data blocks into corresponding variable length codewords (1825);

detecting a presence of at least one particular data value (1832) in the uncompressed data block (1810);

generating a data value mask (Z-value mask) containing n mask positions, wherein each mask position indicates whether a respective data value in the uncompressed data block (1810) is equal to any one of the at least one particular data value (1832);

analyzing the generated data value mask (Z-value mask), including determining whether the data value mask matches any of a plurality of mask patterns;

generating a mask code (1868) to represent results of the analysis; and

generating the compressed data block (1890) by combining at least:

the generated mask encoding (1868); and

for data values in the uncompressed data block (1810) that are not equal to any of the at least one particular data value (1832), the corresponding variable length codeword resulting from compression by a compressor,

wherein the compressed data block (1890) comprises the generated mask code (1868) and m variable length codewords, wherein 0 ≤ m ≤ n, and wherein a variable length codeword for a data value in the uncompressed data block (1810) equal to any one of the at least one particular data value (1832) is not included in the compressed data block (1890), and

wherein the compressed data block (1890) further includes the data value mask (Z-value mask) unless the result of the analyzing step is that the generated data value mask (Z-value mask) indicates a predetermined repeating pattern of particular data values in the uncompressed data block (1810).

26. A data decompression apparatus (1900) for decompressing a compressed data block (1910) into a decompressed data block (1990), the decompressed data block comprising n data values (v1-vn) at respective data value positions, said data decompression apparatus comprising:

a decompressor (1920-1930) configured to decompress the variable length code word (1825) of the compressed data block into a corresponding decompressed data value (1935); and

a decompressed data block generator (1940-1980) configured to:

reading a mask encoding (1915) from the compressed data block (1910), the mask encoding (1915) representing any of a plurality of mask patterns;

when the mask pattern represented by the read mask encoding (1915) does not indicate a predetermined repeating pattern of any particular data value of at least one particular data value in an uncompressed data block (1810) prior to the data compression resulting in the compressed data block (1890/1910), reading from the compressed data block (1910) a data value mask (Z-value mask) comprising n mask positions, wherein each mask position indicates whether a respective data value in the uncompressed data block (1810) prior to data compression is equal to any one of the at least one particular data value (1832); and

generating the decompressed data block (1990), based on the mask encoding (1915) and, where applicable, the data value mask (Z-value mask), by combining the decompressed data values from the decompressor (1920-1930) with the at least one particular data value (1982) indicated either by the predetermined repeating pattern represented by the mask encoding (1915) or, where applicable, by the respective mask positions of the data value mask (Z-value mask),

wherein an order of data values of the generated decompressed data block (1990) is the same as an order in which the data values occurred in the uncompressed data block (1810) prior to the data compression.

27. A data decompression apparatus as claimed in claim 26, wherein the decompressed data block generator (1940-1980) is configured to: when the read mask code (1915) has the first mask code value, generating the decompressed data block (1990) by combining the decompressed data values from the decompressor (1920-1930) with the at least one particular data value (1982) in a predetermined repeating pattern in which a particular data value is located at every other data value position in the decompressed data block (1990).

28. A data decompression apparatus as claimed in claim 27, wherein the decompressed data block generator (1940-1980) is configured to: when the read mask code (1915) has the second mask code value, generating the decompressed data block (1990) by combining the decompressed data values from the decompressor (1920-1930) with the at least one particular data value (1982) in a predetermined repeating pattern in which a particular data value is located at every other data value position in the decompressed data block (1990), but offset by one position relative to the predetermined repeating pattern of the first mask code value.

29. A data decompression apparatus according to claim 28, wherein the decompressed data block generator (1940-1980) is configured to: when the read mask code (1915) has a third mask code value, generating the decompressed data block (1990) by combining the decompressed data values from the decompressor (1920-1930) with the at least one particular data value (1982) indicated by the corresponding mask position of the data value mask (Z-value mask).

30. A data decompression apparatus according to claim 29, wherein the decompressed data block generator (1940-1980) is configured to: when the read mask code (1915) has the fourth mask code value, generating the decompressed data block (1990) containing all decompressed data values from the decompressor (1920-1930).

31. A data decompression method for decompressing a compressed data block (1910) into a decompressed data block (1990), the decompressed data block comprising n data values (v1-vn) at respective data value positions, the data decompression method comprising:

decompressing a variable length codeword (1825) of the compressed data block into a corresponding decompressed data value (1935);

generating the decompressed data block (1990), based on the mask encoding (1915) and, where applicable, the data value mask (Z-value mask), by combining the decompressed data values with the at least one particular data value (1982) indicated either by the predetermined repeating pattern represented by the mask encoding (1915) or, where applicable, by the respective mask positions of the data value mask (Z-value mask),

32. A data compression apparatus (2300) for compressing an uncompressed data block (2310) comprising n data values (v1-vn) into a compressed data block (2390), the data compression apparatus comprising:

a compressor (2320) configured to compress data values of the uncompressed data blocks into corresponding variable-length codewords (2325), and to output the variable-length codewords (2325) and their respective code lengths (cL);

a compressed data block generator (2340-2370) coupled to the compressor and comprising a length mask register (2350) having n storage locations, each for a respective one of the n data values (v1-vn) of said uncompressed data block (2310), the length mask register (2350) being configured to store a length mask (L-mask) of n positions,

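wherein the compressed data block generator (2340-2370) is configured to store the respective code lengths (cL) of the variable length codewords (2325) at respective positions in the length mask (L-mask), and

wherein the compressed data block generator (2340-2370) is configured to generate the compressed data block (2390) by combining: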

the length mask (L-mask) stored in the length mask register (2350); and

compressed data values in the form of variable length codewords (2325) provided by the compressor (2320).

33. The data compression apparatus as defined in claim 32, wherein the length mask (L-mask) is placed before the variable length codeword (2325) in the compressed data block (2390).

34. The data compression device as claimed in claim 32 or 33 further comprising a detector (2330) coupled to the compressor (2320) and configured to detect data values in the uncompressed data block (2310) that cannot be compressed by the compressor (2320),

wherein the compressed data block generator (2340-2370) is configured to: when the detector (2330) has detected that a data value of the uncompressed data block (2310) cannot be compressed, store a code length (cL) having a special value at the corresponding position in the length mask (L-mask) and store the data value in uncompressed form in the compressed data block (2390).

35. The data compression apparatus as claimed in claim 34, wherein the special value of the code length is 0.

36. The data compression device as defined in claim 34, wherein the detector (2330) is configured to detect data values in the uncompressed data block (2310) that cannot be compressed by the compressor (2320) as one or more of:

data values not present in a code table (2322) of the compressor (2320),

a data value present in the code table (2322) of the compressor (2320) but having no codeword (2325) in the code table,

a data value present in the code table (2322) and having a codeword (2325) in the code table, but indicated as invalid in the code table of the compressor.

37. A data compression apparatus as claimed in claim 34, wherein the compressed data block generator (2340-2370) is configured to generate the compressed data block (2390) by including, in the order of the data values (v1-vn) in the uncompressed data block (2310), both compressed data values in the form of variable length codewords (2325) provided by the compressor (2320) and uncompressed data values detected by the detector (2330).

38. A data compression apparatus as claimed in claim 34, wherein the compressed data block generator (2340-2370) is configured to generate the compressed data block (2390) by first including compressed data values in the form of variable length codewords (2325) provided by said compressor (2320), in the order of said data values (v1-vn) in said uncompressed data block (2310), and then including uncompressed data values detected by said detector (2330), or vice versa, wherein the order of all corresponding code lengths (cL) in said length mask (L-mask) follows the order of all data values (v1-vn) of said uncompressed data block (2310), thereby allowing all data values of said uncompressed data block (2310) to be reconstructed in the original order during decompression of said compressed data block (2390).

39. The data compression apparatus of claim 38, wherein the uncompressed data values are stored in the generated compressed data block (2390) in a reverse order compared to their order in the uncompressed data block (2310).

40. A data compression method for compressing an uncompressed data block (2310) comprising n data values (v1-vn) into a compressed data block (2390), the data compression method comprising:

compressing data values of the uncompressed data block into corresponding variable length codewords (2325), and outputting the variable length codewords (2325) and their respective code lengths (cL);

storing a length mask (L-mask) of n positions in a length mask register (2350) having n storage locations, each for a respective one of the n data values (v1-vn) of said uncompressed data block (2310),

storing respective code lengths (cL) of the variable length codewords (2325) at respective positions in the length mask (L-mask), and

generating the compressed data block (2390) by combining:

the length mask (L-mask) stored in the length mask register (2350); and

compressed data values in the form of the variable length codewords (2325).
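
The steps of claim 40 can be sketched as below. This is a minimal model under stated assumptions, not the claimed hardware: the toy codebook `CODEBOOK`, the 3-bit mask entry width `LEN_BITS`, and the name `compress_block` are hypothetical, and bits are represented as Python strings.

```python
# Hypothetical toy prefix code: data value -> variable length codeword.
CODEBOOK = {0: "0", 1: "10", 255: "110", 7: "111"}
LEN_BITS = 3  # assumed width of one length-mask entry


def compress_block(values):
    """Compress n data values into one bit string: length mask + codewords."""
    # Step 1: compress each value and obtain its code length (cL).
    codewords = [CODEBOOK[v] for v in values]
    lengths = [len(cw) for cw in codewords]
    # Step 2: store each code length at its position in the length mask.
    l_mask = "".join(format(n, f"0{LEN_BITS}b") for n in lengths)
    # Step 3: combine the length mask with the codewords.
    return l_mask + "".join(codewords), lengths


block, lengths = compress_block([0, 1, 255, 0])
```

Placing the mask ahead of the codewords is a design choice of this sketch; the claim only requires that the two be combined in the compressed data block.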

41. A data decompression apparatus (2400; 2500; 2600) for decompressing a compressed data block (2410; 2510; 2610) into a decompressed data block (2490; 2590; 2690) comprising n data values (v1-vn) at respective data value positions, said data decompression apparatus comprising:

a decompressor (2440; 2540; 2640) configured to decompress a variable length codeword of the compressed data block into a corresponding decompressed data value (2445; 2545; 2645);

an extractor mechanism (2420-: reading, from the compressed data block (2410; 2510; 2610), a length mask (L-mask) of n positions; determining, from the length mask, respective code lengths of variable length codewords in the compressed data block (2410; 2510; 2610); extracting respective variable length codewords from the compressed data block (2410; 2510; 2610) based on the determined respective code lengths; and providing the extracted respective variable length codewords to the decompressor (2440; 2540; 2640); and

a decompressed data block generator (2450-,

wherein an order of data values of the generated decompressed data block (2490; 2590; 2690) is the same as an order in which the data values occurred in the uncompressed data block (2310) prior to data compression.

42. The data decompression apparatus according to claim 41, wherein the decompressed data block generator (2450-:

for one or more positions in the length mask (L-mask), detecting one or more code lengths having a special value indicating that one or more corresponding data values are included in uncompressed form in the compressed data block (2410; 2510, 2584; 2610); and

based on the detected one or more code lengths having the special value, generating the decompressed data block (2490; 2590; 2690) by combining decompressed data values from the decompressor (2440; 2540; 2640) with the one or more corresponding data values from the uncompressed form of the compressed data block (2410; 2510; 2610).

43. A data decompression apparatus according to claim 42, wherein the special value of code length is 0.
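
A hedged sketch of the behaviour described in claims 41 to 43 follows. It assumes a block layout of length mask, then codewords in original order, then uncompressed values stored in reverse order at the end of the block (the arrangement of claim 39). The inverse codebook `DECODE`, the widths `LEN_BITS` and `VALUE_BITS`, and the name `decompress_block` are hypothetical; bits are represented as Python strings.

```python
# Hypothetical inverse of a toy prefix code: codeword -> data value.
DECODE = {"0": 0, "10": 1, "110": 2}
LEN_BITS = 4    # assumed width of one length-mask entry
VALUE_BITS = 8  # assumed width of one uncompressed data value


def decompress_block(block, n):
    """Rebuild n data values, in their original order, from a bit string."""
    # Read the length mask of n positions from the front of the block.
    lengths = [int(block[i * LEN_BITS:(i + 1) * LEN_BITS], 2) for i in range(n)]
    head = n * LEN_BITS  # next codeword bit
    tail = len(block)    # end of the next uncompressed value
    values = []
    for length in lengths:
        if length == 0:
            # Special code length 0: the value is stored uncompressed.
            # Reverse storage means it is read backwards from the block end.
            values.append(int(block[tail - VALUE_BITS:tail], 2))
            tail -= VALUE_BITS
        else:
            # Extract exactly `length` bits and decode them.
            values.append(DECODE[block[head:head + length]])
            head += length
    return values


block = "0001000000100000" + "010" + "0000100111001000"
values = decompress_block(block, 4)
```

Because the mask is ordered like the original data values, the decoder interleaves decoded and uncompressed values back into their original positions even though the two groups are stored separately.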

44. A data decompression method for decompressing a compressed data block (2410; 2510; 2610) into a decompressed data block (2490; 2590; 2690), the decompressed data block comprising n data values (v1-vn) at respective data value positions, the data decompression method comprising:

reading, from the compressed data block (2410; 2510; 2610), a length mask (L-mask) of n positions;

determining a respective code length of a variable length codeword in the compressed data block (2410; 2510; 2610) in dependence on the length mask (L-mask);

extracting respective variable length codewords from the compressed data block (2410; 2510; 2610) based on the determined respective code lengths;

decompressing the extracted variable length codeword into a corresponding decompressed data value (2445; 2545; 2645); and

generating the decompressed data block (2490; 2590; 2690) from the decompressed data value,
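
The determining and extracting steps of claim 44 can be illustrated with a running sum over the code lengths. This is a sketch under assumptions rather than the claimed mechanism: the names `codeword_offsets` and `extract_codewords` are hypothetical, the codeword part is modelled as a Python bit string, and the prefix-sum formulation is chosen for illustration.

```python
from itertools import accumulate


def codeword_offsets(lengths, payload_start=0):
    """Return (start bit, length) for every codeword, given its code length."""
    # The start of codeword i is the sum of the lengths of codewords 0..i-1.
    starts = [payload_start + s for s in accumulate([0] + lengths[:-1])]
    return list(zip(starts, lengths))


def extract_codewords(payload, lengths):
    """Slice each variable length codeword out of the codeword bit string."""
    return [payload[s:s + n] for s, n in codeword_offsets(lengths)]


codewords = extract_codewords("0101100", [1, 2, 3, 1])
```

Once the lengths are known from the mask, every codeword boundary is known before any codeword is decoded, so the codewords need not be parsed one after another to find where the next one begins. This is consistent with the abstract's statement that recording the code lengths improves decompression delay.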

45. A data compression and decompression system (3300) comprising one or more memories (3310), a data compression apparatus (3320; 1600; 1800) according to any one of claims 1-11, 19-24 or 32-39, and a data decompression apparatus (3330; 1700, 1900) according to any one of claims 13-17, 26-30 or 41-43.

46. The data compression and decompression system (3300) of claim 45, wherein the system is a computer system (100; 200; 300; 400; 500), and wherein the one or more memories (3310) are from the group consisting of:

a cache memory (L1-L3),

a random access memory (130; 230; 330; 430; 530), and

a secondary memory.

47. The data compression and decompression system (3300) of claim 45, wherein the system is a data communication system (600; 700), and wherein the one or more memories (3310) are data buffers.

48. A computer readable medium comprising code instructions which when loaded and executed by a processing device cause performance of the method of claim 12, 25 or 40.

49. A computer readable medium comprising code instructions which when loaded and executed by a processing device cause performance of the method of claim 18, 31 or 44.