EP1941617A1 - Method and system for compressing data - Google Patents
Method and system for compressing data
Info
- Publication number
- EP1941617A1 (application number EP06799795A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- elements
- tables
- sequence
- subsequence
- length
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3086—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3088—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing the use of a dictionary, e.g. LZ78
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/42—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
Definitions
- the present invention relates to a lossless compression method which allows for block-wise random access of the compressed data.
- the memory size of the firmware (pre-installed software) in embedded computer devices is rapidly increasing.
- the memory unit is one of the most expensive components, greatly affecting the bill of material.
- it is important to reduce the size of the firmware in such devices.
- the firmware in embedded devices consists of machine code, that is, the actual microprocessor instructions. Embedded into the code is also constant, literal data, i.e. data that remains unchanged (the value of literal data is written at compile-time and read-only at run time), such as integers and strings, which typically occupies a few percent of the total code size.
- Firmware is normally stored in non-volatile memory such as NOR or NAND flash. There is a trend to replace NOR flash with NAND flash since the latter memory type is several times cheaper. The drawback with NAND flash is that it is a block-based memory, like a hard disk, which means that it can only be read and written in blocks, typically 512 bytes or 2 kilobytes in size.
- the microprocessor in an embedded device is usually a RISC (Reduced Instruction Set Computer) processor.
- This type of processor is characterized by having a relatively small number of instructions of fixed length, typically 16- or 32-bits.
- CISC (Complex Instruction Set Computer) processors are normally found in personal computers and have a much larger instruction set.
- An advantage of RISC processors is that they allow for a simpler and cheaper design, but a disadvantage is that more instructions are required to accomplish a given function, compared to CISC processors.
- the relatively low code density of RISC processors is known by embedded system manufacturers and they have addressed this concern in various ways.
- IBM has developed the CodePack code compression technique for the 32-bit RISC PowerPC processors.
- compression is performed by a software utility that creates compressed application code images from standard PowerPC executable files.
- Decompression is performed by an ASIC core that is placed between the processor and the memory controller. The core decompresses instructions on-the-fly as needed by the processor.
- the technique is described in the IBM Technical White Paper "CodePack: Code Compression for PowerPC Processors" by M. Game and A. Booker. Compression is done by splitting each 32-bit instruction into two 16-bit halves. Each half is then substituted with a variable-length (entropy) code, representing an index into one of two decode tables, holding 512 16-bit entries each.
- An element is normally an 8-bit byte but the method easily generalizes to elements of other lengths.
- An element is either encoded as a literal byte, i.e. data that is represented "as is" in the compressed data, or as a backward reference to a repeated element sequence.
- the reference is specified by the length, in bytes, of the repeated sequence and by the backward distance, also in bytes, to the most recent repeated sequence.
- the maximal distance allowed by the encoding scheme defines a "window" which defines how far back, relative to the current element, a reference can be made.
- a popular and patent-free version of the LZ77 method is the DEFLATE algorithm, specified in the RFC (Request for Comments) 1951 internet standard and implemented in software as the zlib library (www.zlib.net).
- This algorithm specifies how to encode literal bytes and backward, length-distance references as uniquely decodeable and variable length bit sequences. It further employs Huffman entropy coding of the literals and length references to maximize the compression ratio.
- the DEFLATE algorithm's encoding method therefore uniquely defines the decompression method. Decompression interprets the compressed data as a bit stream and outputs either literal bytes or a sequence of bytes which is a repetition occurring before the most recent encoded byte.
- Although the LZ77 method performs well on many types of data, it has the drawback of not allowing random access to different parts of the compressed data. To decompress a certain part, all data before that part has to be decompressed first.
- One way to achieve random access in LZ77 to individual blocks, in a block based memory, is to compress each block separately, concatenate the compressed data of all blocks, and store pointers to each compressed block in a table. But the problem with this approach is that the compression ratio decreases as the block size gets smaller.
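The block-wise approach described above can be sketched with Python's standard `zlib` module (an implementation of DEFLATE). This is only an illustration of the prior-art idea, not the method of the invention; the function names `compress_blocks` and `read_block` and the byte-offset table are assumptions made for the example.

```python
import zlib

def compress_blocks(data: bytes, block_size: int = 512):
    """Compress each block on its own and record where each one starts."""
    chunks = [zlib.compress(data[i:i + block_size])
              for i in range(0, len(data), block_size)]
    offsets = [0]
    for chunk in chunks:
        offsets.append(offsets[-1] + len(chunk))  # byte offset of the next block
    return b"".join(chunks), offsets

def read_block(blob: bytes, offsets: list, k: int) -> bytes:
    """Decompress block k (0-based) without decompressing any other block."""
    return zlib.decompress(blob[offsets[k]:offsets[k + 1]])

data = b"the quick brown fox jumps over the lazy dog. " * 200
blob, offsets = compress_blocks(data, 512)
small_total = len(blob)                          # total size with 512-byte blocks
large_total = len(compress_blocks(data, 4096)[0])  # total size with 4096-byte blocks
```

Comparing `small_total` with `large_total` on repetitive input illustrates the drawback stated above: each block starts with no history, so smaller blocks give a worse overall ratio.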
- Another drawback of the LZ77 method is that it is not optimal for compressing code consisting of 16- and 32-bit RISC instructions since they have rather different characteristics than, for example, text files. This also explains, partly, why IBM has chosen a different approach in their CodePack technique.
- IBM has also disclosed US Patent 5,001,478, which is a method for encoding compressed data.
- the compression method operates by transforming the input data into a sequence of history, lexicon, and literal references.
- the history references are of the same type as the backward length-distance references in the LZ77 method.
- the lexicon references are also history-type references, but refer to a string in a lexicon which works like a buffer, holding the most recent history references. In this way, the compression ratio is increased since lexicon references require shorter binary codes.
- the object of the invention is achieved by a novel method for lossless data compression of an input sequence of elements. The method comprises the steps of transforming the input sequence of data elements into a processed sequence of symbols, each symbol representing one of a literal element, a backward reference to a previously occurring subsequence of elements, or a table reference to a subsequence in one or more tables of frequently occurring subsequences of elements.
- the method further includes the step of encoding the processed sequence of symbols into a uniquely decodeable bit stream for compressing the original sequence.
- the bit stream is stored in storage means.
- the one or more tables of frequently occurring subsequences of elements is a pre-defined table or a predefined set of tables. (In the following description and claims a pre-defined table or pre-defined set of tables is defined as a table or set of tables wherein the content of the tables is pre-defined).
- the method may include a step of building the one or more tables of frequently occurring subsequences of elements as a pre-processing step. This has the advantage that an efficient lossless compression method is obtained in a simple manner.
- the tables may be adapted for the specific data to be compressed, and the step of building the tables can be carried out every time a new set of data is to be compressed.
- the tables may be given beforehand, as the same set of tables may be applicable on different sequences of elements.
- the tables may be built once for a particular type of data sets and then re-used each time a particular set of data of that type is to be compressed.
- Pre-built tables have the advantage that, e.g. in case of transmission of the bit stream to a receiver via a wireless interface, the receiver can be provided with the tables beforehand, which reduces the amount of data to be transmitted.
- the method may further include the step of building a table of bit positions referring to the starts of each compressed block in the bit stream.
- the compression method of the present invention operates on the original data as a block-partitioned sequence of elements, transforms the elements into a sequence of symbols, and outputs a bit stream, possibly together with a set of tables.
- The corresponding decompression method operates on the bit stream, decodes it into a sequence of symbols, and outputs the original data as a sequence of elements.
- the decompression method also needs access to the tables produced during the compression phase in order to identify the beginning of each block and in order to transform the table reference symbols into the corresponding subsequences. Either the tables containing the subsequences are pre-stored in the decoding device, or alternatively, the tables are included in the output bit stream.
- Compression means that fewer bits, or bytes, are needed to represent the compressed data than the original data, and lossless means that the original data can be reconstructed exactly from the compressed data. Compression is achieved only if the sum of the size of the bit stream and the size of all the tables is less than the size of the original data.
- the pre-processing step of building one or more tables of commonly occurring subsequences and referencing to them comprises a method to address this limitation and results in a higher compression ratio.
- Shorter subsequences are commonly repeated more frequently than longer ones.
- the tables therefore contain short and frequent subsequences of elements.
- one or two tables are built in the pre-processing step, one table consisting of the most common pairs, or bigrams, of elements and/or one table consisting of the most common subsequences of three elements, or trigrams, respectively.
- the present invention is not limited to this particular choice of tables.
- the compression method processes the original data as a sequence of elements.
- An element is usually understood to mean an 8-bit byte, but the present invention is not limited to an element being a byte and it would certainly be possible to let an element belong to any finite alphabet. In the sequel, the terms element and byte are therefore used interchangeably.
- the original data is further assumed to be divided into blocks whose sizes typically range from 512 bytes to 4 kilobytes. These block sizes do not have to be equal to the physical block size of the block-based memory, or storage device. In most applications, the blocks are all of the same size but this is no limitation of the present invention and they could even be of variable length.
- the method first transforms this sequence into a new representation which consists of a sequence of literal bytes, length-distance backward references, or table-index references.
- the length-distance references are of the same type as in the LZ77 method, that is, references to sequences of bytes that have occurred previously in the original byte sequence. If there are several repeated subsequences found, the one with the longest length is chosen and, if there are two or more repetitions of the same length, the one with the shortest backward distance is chosen.
- The lengths are limited to be at least three bytes, with a maximum length defined by the number of bits used to encode the lengths. To allow for random access to individual blocks, the distance references cannot refer beyond the most recent block boundary.
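The selection rule for backward references (longest match first, shortest distance on ties, never crossing the block boundary) can be sketched as a brute-force search. The function name `find_longest_match` is hypothetical, the default `max_len=34` is an assumed cap corresponding to a 5-bit length field, and allowing a match to overlap the current position is an assumption borrowed from common LZ77 practice rather than something the text states.

```python
def find_longest_match(block: bytes, pos: int, min_len: int = 3, max_len: int = 34):
    """Return (length, distance) of the best earlier match for block[pos:], or None.

    Only positions inside the same block are searched, so a reference can
    never reach beyond the most recent block boundary.
    """
    best = None
    # Scan the nearest candidates first; a later candidate only wins if it is
    # strictly longer, so ties are resolved in favour of the shortest distance.
    for start in range(pos - 1, -1, -1):
        length = 0
        while (length < max_len and pos + length < len(block)
               and block[start + length] == block[pos + length]):
            length += 1
        if length >= min_len and (best is None or length > best[0]):
            best = (length, pos - start)
    return best

block = b"this is miss"
match_at_5 = find_longest_match(block, 5)   # the second "is " in the string
```

A production encoder would normally replace the quadratic scan with a hash chain or similar index; the sketch keeps the scan simple so the tie-breaking rule is easy to see.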
- the table-index references also refer to sequences of bytes. However, these sequences are stored in one or more tables.
- the first parameter in the ordered pair table-index is a number which specifies in which table the sequence is found and the second parameter is an index number which uniquely identifies the sequence in this particular table.
- one table is built containing only two-byte sequences and one table is built containing only three-byte sequences, but this is no limitation of the present invention. If each table only contains sequences of a fixed length, less memory is needed to represent the tables.
- A fixed set of tables that is optimized for a particular type of data, for example machine code for a particular RISC processor, is used.
- The firmware, or ROM image, from a number of different embedded devices with the same processor could be used to find the most common two- and three-byte sequences. The same set of tables could then be used when compressing different sets of data without having to go through the pre-processing step.
- Fig. 1 is a flow chart depicting an exemplary compression method of the present invention.
- Fig. 2 schematically illustrates the exemplary compression process, resulting in a compressed bitstream and a set of tables, and the decompression process for decompressing an arbitrary block, given the bit stream and the tables.
- Fig. 3 is a table showing an exemplary encoding of a sample text string.
- the memory size of the software in computer devices is rapidly increasing.
- the memory unit is one of the most expensive components, and, accordingly, it is important to reduce the size of the software (in particular firmware) in such devices.
- firmware is normally stored in non-volatile memory such as NOR or NAND flash.
- the present invention provides a data compression method that is particularly suitable for use when compressing firmware of such devices.
- the present invention is applicable on any type of data in any application wherein a lossless compression of the data is desired.
- A flow chart of an exemplary compression method according to the present invention is depicted in Fig. 1.
- the data to be compressed will be processed as a sequence of bytes but it should be understood that elements from any finite alphabet could be used as well, and the present invention is not limited to the case of an element being a byte.
- When the compression process starts, it is assumed that the original data to be compressed and the block boundaries are given. It is further assumed that the table of bit positions, which at the end of the process refers to the starts of each compressed block, and the bitstream, which at the end of the process contains the compressed data, have been initialized.
- The initial step of the compression method consists of building two- and three-byte tables, as shown in step 100.
- These tables should contain frequently occurring subsequences in the original data.
- one table is defined to consist of the 512 most common two-byte sequences found in the original data and one table to consist of the 512 most common three-byte sequences.
- the size of the two tables adds to 2,560 bytes.
- The sort and count methods needed to determine the most common subsequences are familiar to those skilled in the art of the invention and will therefore not be discussed in more detail.
- the choice of the number of tables and their content and sizes is not limited to these particular values and is no limitation of the present invention.
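A minimal sketch of the table-building step (step 100), assuming that overlapping n-grams are counted over the whole input and that ties in frequency may be broken arbitrarily; neither detail is specified in the text. The helper name `build_table` is hypothetical.

```python
from collections import Counter

def build_table(data: bytes, n: int, size: int = 512) -> list:
    """Return up to `size` of the most frequent n-byte subsequences in `data`."""
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    return [seq for seq, _ in counts.most_common(size)]

sample = b"this is miss this is miss"
two_byte_table = build_table(sample, 2)     # most common bigrams
three_byte_table = build_table(sample, 3)   # most common trigrams

# Storage needed for two full 512-entry tables, as stated in the text.
table_bytes = 512 * 2 + 512 * 3
```

With 512 entries each, the tables occupy 1,024 + 1,536 = 2,560 bytes, matching the figure given above.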
- the transformation of the original byte sequence, given the tables, into a sequence of literal bytes, table or backward references now proceeds as follows.
- the original data is processed block-by-block and as a sequence of bytes within each block.
- the current block is set to the first block, step 101, and then the current byte pointer is set to point to the first byte of the current block to be processed, step 102.
- the bytes in each block are then transformed as follows. First, a search is made to find the longest repetition within the current block and before the current byte pointer, step 103. If a backward reference of length at least four bytes is found (test condition 104), the corresponding length-distance reference symbol is output, step 105. At the beginning of each block it is, of course, not possible to find a length-distance backward reference, meaning that condition 104 leads to step 106. If a backward reference of length at least four bytes is not found, a search for the next three bytes is also made in the three-byte table, step 106.
- the symbol with the shortest bit length is output, that is, either a length-distance or a table-index symbol.
- the binary encoding of these symbols is defined later.
- the three-byte table symbol will always have the shortest encoding and will therefore always be output. This assumption is made in the flow chart of fig. 1, but this is no limitation of the present invention. Therefore, in this embodiment, if a match is found (test condition 107) in the three-byte table, the corresponding three-byte table symbol is directly output (step 108).
- A length-distance reference symbol is output (step 110) if the length is equal to three (test condition 109). If no backward reference or three-byte table reference can be found, the next two bytes are searched for in the two-byte table, as shown in step 111. If they are found in the table, the corresponding table-index symbol is output, step 113. Finally, if no backward or table references can be found, a literal byte symbol is output, step 114, and the byte pointer is moved to point to the next byte.
- As shown in step 115 in Fig. 1, the prefix and parameter encoding of each symbol is preferably appended to the bit stream after each symbol has been generated and not after all the symbols have been generated for all of the data to be compressed.
- the step corresponding to step 115 also incorporates the forward movement of the current byte pointer.
- Alternatively, the bit stream may be generated after all of the symbols have been generated.
- test condition 116 determines if all bytes in the original data have been compressed. If this is the case, test condition 116 is true and the last entry of the table of bit pointers is set to refer to one bit past the last bit of the bitstream, step 117, and the compression process is completed. If there are more bytes to be compressed, another test (118) is made to determine whether the current byte is at the beginning of a new block. If it is, the table of bit pointers is updated with a new entry referring to the first bit of the next block, and the current block is moved one step forward, step 119. Then step 102 is repeated, that is, the current byte pointer is set to the first byte of the new current block. The remaining bytes within each block are then transformed in the same way until no bytes remain, starting at step 103. The step corresponding to step 103 is also carried out if test condition 118 fails, that is, the current byte is not at the start of the next block.
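The per-block decision order of the flow chart (steps 103 to 115) can be sketched as follows. This is a simplified reading of the described embodiment, not a definitive implementation: the tuple representation of symbols, the function names, and the `max_len=34` cap are assumptions, and the tables are plain Python lists searched linearly.

```python
def longest_match(block: bytes, pos: int, max_len: int = 34):
    """Longest earlier match inside the block; nearest match wins ties."""
    best_len, best_dist = 0, 0
    for start in range(pos - 1, -1, -1):
        n = 0
        while (n < max_len and pos + n < len(block)
               and block[start + n] == block[pos + n]):
            n += 1
        if n > best_len:
            best_len, best_dist = n, pos - start
    return best_len, best_dist

def transform_block(block: bytes, two_byte: list, three_byte: list) -> list:
    """Turn one block into literal, backward-reference and table symbols."""
    symbols, pos = [], 0
    while pos < len(block):
        length, dist = longest_match(block, pos)            # step 103
        if length >= 4:                                     # test 104
            sym, step = ("ref", length, dist), length       # step 105
        elif block[pos:pos + 3] in three_byte:              # step 106, test 107
            sym, step = ("tab3", three_byte.index(block[pos:pos + 3])), 3  # step 108
        elif length == 3:                                   # test 109
            sym, step = ("ref", 3, dist), 3                 # step 110
        elif block[pos:pos + 2] in two_byte:                # step 111
            sym, step = ("tab2", two_byte.index(block[pos:pos + 2])), 2    # step 113
        else:
            sym, step = ("lit", block[pos]), 1              # step 114
        symbols.append(sym)
        pos += step                                         # pointer advance, step 115
    return symbols

# Placeholder entries pad the three-byte table so that "mis" has index 5.
two_byte = [b"th", b"is"]
three_byte = [b"aaa", b"bbb", b"ccc", b"ddd", b"eee", b"mis"]
symbols = transform_block(b"this is miss", two_byte, three_byte)
```

Because the search never looks before position 0 of the block passed in, calling `transform_block` once per block keeps every backward reference inside its own block.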
- the next step of the method according to the invention consists of encoding the symbol sequence of all the blocks into a bit stream.
- the encoding step also uniquely defines the decoding step of transforming the bit stream into a sequence of symbols.
- the whole decompression process is defined by transforming the sequence of symbols into the original sequence of bytes, or elements.
- the symbol type and table index numbers are encoded together in a two-bit prefix code.
- a literal byte symbol is encoded as binary "00", a length-distance backward reference symbol as binary "01", a two-byte table symbol as binary "10", and a three-byte table symbol as binary "11".
- the other parameters of each symbol are encoded into binary form.
- The binary encoding of a symbol and its parameters is appended to the bit stream after each symbol, and not after the complete element sequence has been transformed into a sequence of symbols, as depicted in the flow chart in Fig. 1 and in particular block 115.
- a literal byte symbol is encoded with another 8 bits. These 8 bits simply constitute the binary representation of the byte.
- the parameters of the length-distance reference symbols are encoded as follows. First a fixed number of bits following the prefix code are used for encoding the distance.
- the distance is advantageously encoded with the same number of bits that are needed to represent the current block length. For example, in case the block length is 512 bytes, 9 bits are used to code the distance. It has further been found that the maximum length allowed by encoding the length with 5 bits works well. In a preferred embodiment, the minimum length is 3 bytes, which means that these 5 bits can encode lengths from 3 through 34 bytes. With these choices, the length-distance references are encoded with a total of 16 bits, including the prefix code.
- the table index number is encoded with a fixed number of bits. In case both the two-and three-byte tables contain 512 entries each, 9 bits are used to encode the table index.
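The fixed-width encoding can be sketched as a function that maps one symbol to a bit string. The field widths (9-bit distance, 5-bit length, 9-bit table index) follow the example values in the text; storing the distance minus one and the length minus the minimum length is the convention used in the worked example of this document. The tuple symbol format and the name `encode_symbol` are assumptions made for illustration.

```python
PREFIX = {"lit": "00", "ref": "01", "tab2": "10", "tab3": "11"}

def encode_symbol(sym: tuple, dist_bits: int = 9, len_bits: int = 5,
                  index_bits: int = 9, min_len: int = 3) -> str:
    """Encode one symbol as a string of '0' and '1' characters."""
    kind = sym[0]
    if kind == "lit":                       # ("lit", byte_value)
        return PREFIX[kind] + format(sym[1], "08b")
    if kind == "ref":                       # ("ref", length, distance)
        _, length, dist = sym
        return (PREFIX[kind]
                + format(dist - 1, f"0{dist_bits}b")        # distance field first
                + format(length - min_len, f"0{len_bits}b"))
    return PREFIX[kind] + format(sym[1], f"0{index_bits}b")  # ("tab2"/"tab3", index)

literal_bits = encode_symbol(("lit", 32))
reference_bits = encode_symbol(("ref", 3, 3))
table_bits = encode_symbol(("tab3", 5))
```

With these widths a literal costs 10 bits, a table reference 11 bits, and a length-distance reference 16 bits, including the two-bit prefix.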
- Alternatively, the table index numbers may be represented with Huffman coding, resulting in a variable number of bits and, on average, fewer bits than with a fixed code length.
- Huffman codes are preferably built as part of the pre-processing step when building the tables. The count of each subsequence is then used to compute the Huffman code, but all these counts do not have to be stored in the tables. There are more memory efficient methods of representing these Huffman codes, but these methods are familiar to those skilled in the art of the invention and will therefore not be described herein. In this embodiment, the Huffman code must also be made available to the decoder.
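As an illustration of deriving variable-length codes from subsequence counts, the sketch below uses the textbook Huffman construction with a priority queue. The text does not specify which construction or code representation is used, so the function `huffman_code` and the sample counts are assumptions.

```python
import heapq
from itertools import count

def huffman_code(counts: dict) -> dict:
    """Map each table index to a prefix-free bit string based on its count."""
    if len(counts) == 1:
        return {sym: "0" for sym in counts}
    tiebreak = count()  # unique sequence numbers keep heap comparisons well defined
    heap = [(c, next(tiebreak), {sym: ""}) for sym, c in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, low = heapq.heappop(heap)    # two least frequent subtrees
        c2, _, high = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in low.items()}
        merged.update({s: "1" + code for s, code in high.items()})
        heapq.heappush(heap, (c1 + c2, next(tiebreak), merged))
    return heap[0][2]

# Hypothetical counts for four table entries (index -> occurrences).
counts = {0: 50, 1: 25, 2: 15, 3: 10}
codes = huffman_code(counts)
average_bits = sum(counts[s] * len(codes[s]) for s in counts) / sum(counts.values())
```

For these sample counts the average code length is below the 2 bits a fixed-length code would need for four entries, which is the effect described above.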
- the step of encoding the symbols into a bit stream is completed by concatenating the bit stream of each compressed block into one single bit stream, representing all the compressed blocks.
- the generated bit stream is stored in storage means, such as NAND or NOR flash or any other type of volatile or non-volatile memory units. If the bit stream is intended to be transmitted to another device, e.g. by a wireless transmission, the bit stream may be stored in storage means in form of a buffer prior to transmission.
- a table of bit positions referring to the start of each compressed block in the bit stream is built.
- This table is built incrementally as soon as the encoding of the current block is completed, as shown in steps 118 and 119.
- a 32-bit integer number is used to represent the bit position of the start, or first bit, of each compressed block.
- the table is then represented as an array of integers.
- the bit position of the first compressed block will always be zero, so this position is not stored in the array, or table.
- one plus the position of the last bit of the last block is recorded in the last element of the array. In this way the length of the array is the same as the number of blocks, and the first and last bit position of each compressed block can be computed from the array. This information is needed during decompression to determine where to start and end the decoding of each block in the bit stream.
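The layout of the bit-position array can be sketched as follows: entry k holds one plus the last bit position of block k, which is also the start of block k+1, and the start of the first block is implicitly zero. The function names and the sample block lengths are assumptions for illustration; blocks are numbered from 0 here.

```python
from itertools import accumulate

def build_end_positions(block_bit_lengths: list) -> list:
    """Running total of compressed block sizes in bits; one entry per block."""
    return list(accumulate(block_bit_lengths))

def block_bit_range(end_positions: list, k: int) -> tuple:
    """Return (first_bit, one_past_last_bit) of compressed block k (0-based)."""
    start = 0 if k == 0 else end_positions[k - 1]
    return start, end_positions[k]

# Hypothetical compressed sizes, in bits, of three blocks.
end_positions = build_end_positions([69, 40, 120])
first_block = block_bit_range(end_positions, 0)
last_block = block_bit_range(end_positions, 2)
```

The array has exactly as many entries as there are blocks, and both boundaries of any block are available with at most two lookups.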
- Schematic diagrams of the compression and decompression processes are depicted in fig. 2.
- the upper diagram shows the compression process which takes as input a sequence of n blocks.
- the result of the compression is the compressed data, or the bitstream, and a set of tables.
- The set of tables includes the table of bit pointers and, in an exemplary embodiment, a two- and a three-byte table.
- the lower diagram shows the decompression process. Given the bitstream and the set of tables, it can decompress any block k between 1 and n.
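Decompression of a single block can be sketched as a loop that reads a two-bit prefix and then the fields belonging to that symbol type. The sketch assumes fixed-width fields (9-bit distance stored minus one, 5-bit length stored minus three, 9-bit table index), represents the bit stream as a Python string of '0' and '1' characters for readability, and uses the hypothetical name `decode_block`.

```python
def decode_block(bits: str, start: int, end: int, two_byte: list, three_byte: list,
                 dist_bits: int = 9, len_bits: int = 5, index_bits: int = 9,
                 min_len: int = 3) -> bytes:
    """Decode the symbols stored in bits[start:end] back into bytes."""
    out = bytearray()
    pos = start

    def take(n: int) -> int:
        """Read the next n bits as an unsigned integer."""
        nonlocal pos
        value = int(bits[pos:pos + n], 2)
        pos += n
        return value

    while pos < end:
        prefix = take(2)
        if prefix == 0b00:                      # literal byte
            out.append(take(8))
        elif prefix == 0b01:                    # length-distance backward reference
            dist = take(dist_bits) + 1
            length = take(len_bits) + min_len
            for _ in range(length):             # byte-wise copy handles overlap
                out.append(out[-dist])
        elif prefix == 0b10:                    # two-byte table reference
            out += two_byte[take(index_bits)]
        else:                                   # three-byte table reference
            out += three_byte[take(index_bits)]
    return bytes(out)

two_byte = [b"th", b"is"]
three_byte = [b"aaa", b"bbb", b"ccc", b"ddd", b"eee", b"mis"]  # "mis" at index 5
stream = ("10" + "000000000" + "10" + "000000001" + "00" + "00100000"
          + "01" + "000000010" + "00000" + "11" + "000000101" + "00" + "01110011")
decoded = decode_block(stream, 0, len(stream), two_byte, three_byte)
```

Since `start` and `end` come from the table of bit positions and backward references never leave the block, any block can be decoded without reading the blocks before it.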
- the example sequence is the text string "this is miss", that is, the elements are characters which are represented with bytes, encoded with the ASCII standard.
- It is assumed that this string is contained in one block and that the current byte pointer is located at the initial character "t".
- the two-byte table contains the strings "th" and "is", with indices 0 and 1, respectively.
- the three-byte table contains the string "mis" whose index is 5.
- the first symbol encodes the initial string "th" as a reference to the corresponding string in the two-byte table, under the assumption that the string "thi" is not in the three-byte table.
- the second symbol encodes the string "is", also with a reference to the two-byte table. Similarly, this is under the assumption that the string "is " is not contained in the three-byte table.
- the third symbol encodes the following space as a literal byte.
- the fourth symbol encodes the string "is " as a backward reference of length 3 and distance 3.
- the fifth symbol encodes the string "mis" as a reference to the three-byte table.
- the sixth symbol encodes the single letter "s" as a literal byte.
- the table in fig. 3 displays the encoded symbols and their type, binary encoding, and the binary length, respectively.
- the spaces in the binary strings are inserted just for illustrative purposes.
- the two literals, the space and letter "s" are encoded into binary form using their decimal ASCII codes 32 and 115, respectively.
- the length-distance backward reference encodes the distance 3 as the binary form of decimal 2, that is, the distance minus one, since the minimum distance is 1.
- the corresponding length 3 is encoded as the binary form of decimal 0, that is, the length minus 3, since, in this example, the minimum length is 3.
- This particular sequence is therefore encoded with a total of 69 bits, which should be compared to the 96 bits (12 times 8 bits) needed to encode the original string.
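The total of 69 bits can be checked by adding up the per-symbol costs for the six symbols of the example. The dictionary below simply restates the field widths used in the example (two-bit prefix, 8-bit literal, 9-bit distance, 5-bit length, 9-bit table index); the variable names are illustrative.

```python
# Bit cost of each symbol type: prefix plus parameter fields.
BITS = {
    "lit": 2 + 8,        # prefix + byte value
    "ref": 2 + 9 + 5,    # prefix + distance + length
    "tab2": 2 + 9,       # prefix + two-byte table index
    "tab3": 2 + 9,       # prefix + three-byte table index
}

# Symbols for "this is miss": "th", "is", " ", "is ", "mis", "s".
symbol_types = ["tab2", "tab2", "lit", "ref", "tab3", "lit"]

compressed_bits = sum(BITS[t] for t in symbol_types)   # 11+11+10+16+11+10
original_bits = 8 * len("this is miss")
```

The sum is 11 + 11 + 10 + 16 + 11 + 10 = 69 bits against 96 bits for the uncompressed string, not counting the storage needed for the tables themselves.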
- the present invention further relates to a device for compressing data.
- the device comprises means for transforming the original sequence of data elements into a new sequence of symbols representing one of a literal element, a backward reference to a previously occurring subsequence of elements, or a table reference to a subsequence in one of the tables.
- the device also comprises means for encoding the new sequence of symbols into a uniquely decodeable bit stream for compressing the original sequence, and means for building a table of bit positions referring to the start of each compressed block in the bit stream.
- the device may also comprise means for decompressing arbitrary blocks, given the bitstream and tables. Further, the device may comprise means for building one or more tables of frequently occurring subsequences of elements.
- The present invention further relates to a device for decoding a set of symbols into a sequence of elements given a set of tables of subsequences of elements. This device comprises means for decoding the symbols into one of a literal element, a backward reference to a previously occurring subsequence of elements, or a table reference to a subsequence in one of the given tables, and means for transforming the decoded symbols into the original sequence of elements.
- The compression and decompression process described above may, for example, be applied to compress part of the firmware, for example all applications and/or the operating system, in a mobile device.
- the firmware might be stored in NAND flash memory and loaded to RAM by the boot code at system startup.
- decompression is initiated by the boot code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE0502351A SE530081C2 (en) | 2005-10-24 | 2005-10-24 | Method and system for data compression |
PCT/SE2006/001198 WO2007050018A1 (en) | 2005-10-24 | 2006-10-23 | Method and system for compressing data |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1941617A1 (en) | 2008-07-09 |
EP1941617A4 (en) | 2012-09-19 |
Family
ID=37968054
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06799795A Ceased EP1941617A4 (en) | 2005-10-24 | 2006-10-23 | Method and system for compressing data |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1941617A4 (en) |
SE (1) | SE530081C2 (en) |
WO (1) | WO2007050018A1 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8983388B2 (en) | 2008-09-30 | 2015-03-17 | Google Technology Holdings LLC | Method and apparatus to facilitate preventing interference as between base stations sharing carrier resources |
US8996018B2 (en) | 2008-10-30 | 2015-03-31 | Google Technology Holdings LLC | Method and apparatus to facilitate avoiding control signaling conflicts when using shared wireless carrier resources |
US8165597B2 (en) | 2009-03-25 | 2012-04-24 | Motorola Mobility, Inc. | Method and apparatus to facilitate partitioning use of wireless communication resources amongst base stations |
EP2712089A1 (en) * | 2012-09-20 | 2014-03-26 | Alcatel-Lucent | Method for compressing texts and associated equipment |
US12050557B2 (en) | 2017-05-19 | 2024-07-30 | Takashi Suzuki | Computerized systems and methods of data compression |
US10387377B2 (en) | 2017-05-19 | 2019-08-20 | Takashi Suzuki | Computerized methods of data compression and analysis |
US11741121B2 (en) | 2019-11-22 | 2023-08-29 | Takashi Suzuki | Computerized data compression and analysis using potentially non-adjacent pairs |
CN113868206A (en) * | 2021-10-08 | 2021-12-31 | 八十一赞科技发展(重庆)有限公司 | Data compression method, decompression method, device and storage medium |
CN116665836B (en) * | 2023-07-26 | 2023-10-27 | 国仪量子(合肥)技术有限公司 | Editing and storing method, reading and playing method and electronic equipment for sequence data |
CN118018032B (en) * | 2024-04-09 | 2024-06-14 | 西安西驰信息技术有限公司 | Remote control data transmission method of intelligent switch cabinet |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1998006028A1 (en) * | 1996-08-06 | 1998-02-12 | Reynar Jeffrey C | A lempel-ziv data compression technique utilizing a dicionary pre-filled with fequent letter combinations, words and/or phrases |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4730348A (en) * | 1986-09-19 | 1988-03-08 | Adaptive Computer Technologies | Adaptive data compression system |
US5010344A (en) * | 1989-12-28 | 1991-04-23 | International Business Machines Corporation | Method of decoding compressed data |
US5001478A (en) * | 1989-12-28 | 1991-03-19 | International Business Machines Corporation | Method of encoding compressed data |
US5488365A (en) * | 1994-03-01 | 1996-01-30 | Hewlett-Packard Company | Method and apparatus for compressing and decompressing short blocks of data |
US5729228A (en) * | 1995-07-06 | 1998-03-17 | International Business Machines Corp. | Parallel compression and decompression using a cooperative dictionary |
JP3277792B2 (en) * | 1996-01-31 | 2002-04-22 | 株式会社日立製作所 | Data compression method and apparatus |
- 2005-10-24: SE, application SE0502351A (SE530081C2), not active (IP Right Cessation)
- 2006-10-23: WO, application PCT/SE2006/001198 (WO2007050018A1), active (Application Filing)
- 2006-10-23: EP, application EP06799795A (EP1941617A4), not active (Ceased)
Non-Patent Citations (4)
Title |
---|
AHN E ET AL: "EFFECTIVE ALGORITHMS FOR CACHE-LEVEL COMPRESSION", PROCCEDINGS 2001 ELEVENTH GREAT LAKES SYMPOSIUM ON VLSI. GLSVLSI 2001. WEST LAFAYETTE, IN, MARCH 22 - 23, 2001; [GREAT LAKES SYMPOSIUM ON VLSI. (GLSVLSI)], NEW YORK, NY : ACM, US, 22 March 2001 (2001-03-22), pages 89-92, XP002318792, DOI: 10.1145/368122.368872 ISBN: 978-1-58113-351-6 * |
BELL T ET AL: "MODELING FOR TEXT COMPRESSION", ACM COMPUTING SURVEYS, ACM, NEW YORK, NY, US, US, vol. 21, no. 4, 1 December 1989 (1989-12-01), pages 557-591, XP000972666, ISSN: 0360-0300, DOI: 10.1145/76894.76896 * |
OGIHARA T: "COMPRESSION METHOD USING LZW CODING AND A SLIDING WINDOW", ELECTRONICS & COMMUNICATIONS IN JAPAN, PART III - FUNDAMENTAL ELECTRONIC SCIENCE, WILEY, HOBOKEN, NJ, US, vol. 77, no. 8, 1 August 1994 (1994-08-01), pages 1-12, XP000503751, ISSN: 1042-0967 * |
See also references of WO2007050018A1 * |
Also Published As
Publication number | Publication date |
---|---|
EP1941617A4 (en) | 2012-09-19 |
WO2007050018A1 (en) | 2007-05-03 |
SE0502351L (en) | 2007-04-25 |
SE530081C2 (en) | 2008-02-26 |
WO2007050018A8 (en) | 2007-10-04 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20080327 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR |
DAX | Request for extension of the european patent (deleted) | |
A4 | Supplementary search report drawn up and despatched | Effective date: 20120821 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: H03M 7/30 20060101AFI20120815BHEP; Ipc: H03M 7/40 20060101ALI20120815BHEP |
17Q | First examination report despatched | Effective date: 20140821 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: APPLE INC. |
REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
18R | Application refused | Effective date: 20170607 |