KR20080026772A - Method for a compression compensating restoration rate of a lempel-ziv compression method - Google Patents


Publication number
KR20080026772A
Authority
KR
South Korea
Prior art keywords
data
compression
lempel
ziv
pattern
Prior art date
Application number
KR1020060091759A
Other languages
Korean (ko)
Inventor
문경기
박수홍
Original Assignee
인하대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 인하대학교 산학협력단 filed Critical 인하대학교 산학협력단
Priority to KR1020060091759A priority Critical patent/KR20080026772A/en
Publication of KR20080026772A publication Critical patent/KR20080026772A/en

Classifications

    • H ELECTRICITY
    • H03 BASIC ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3091 Data deduplication

Abstract

The present invention relates to a compression method that compensates for the restoration speed of the Lempel-Ziv compression method. To shorten the time required for restoration, a key-address scheme is applied: regardless of the size of the data, the information needed for restoration is retrieved with O(1) performance by referencing key values held in a hash table built during compression. Because no information is lost, the method is lossless and can be applied efficiently to data that requires both a high restoration speed and lossless compression, such as documents, video, spatial data, and binary formats such as MP3. The method compresses and restores raw data using Lempel-Ziv and comprises: searching the pattern information of the raw data for duplicate patterns; if a duplicate pattern is found in the search, constructing a key-address hash table for the duplicated pattern; and losslessly compressing the raw data by adding to it the data for restoring the duplicated pattern recorded in the hash table.

Description

Method for a compression compensating restoration rate of a Lempel-Ziv compression method

FIG. 1 is a block diagram schematically illustrating compression and reconstruction.

FIG. 2 is a flowchart illustrating a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention.

FIG. 3 is a graph showing the restoration time according to the number of data.

FIG. 4 is a graph showing the restoration speed according to the number of points.

FIG. 5A is a diagram illustrating an algorithm according to a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention.

FIG. 5B is a diagram illustrating an algorithm according to a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention.

FIG. 6 is a diagram illustrating a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention.

The present invention relates to a compression method that compensates for the restoration speed of the Lempel-Ziv compression method. More particularly, to compensate for that restoration speed, a key-address scheme is applied so that, regardless of the size of the data, the information needed for restoration is retrieved with O(1) performance by referencing key values held in a hash table built during compression. Because no information is lost during compression, the method can be applied efficiently to data that requires both a high restoration speed and lossless compression, such as documents, video, spatial data, and binary formats such as MP3.

In general, home computers used text-based applications to handle data. However, with the development of online and multimedia environments, the quantity and quality of multimedia material are increasing, and data compression has been presented as a way to solve the resulting storage space problem. Compression is a scientific technique for expressing information in a compact form, with the aim of representing the original information using the minimum number of bits. [Khalid Sayood, "Introduction to Data Compression", Morgan Kaufmann Publishers, 2000]

FIG. 1 is a block diagram schematically illustrating compression and reconstruction. As shown in the figure, a data compression technique consists of a compression algorithm that converts given original data χ into a compressed form χc, and a reconstruction algorithm that reconstructs the original form γ from the compressed data χc. [Khalid Sayood, "Introduction to Data Compression", Morgan Kaufmann Publishers, 2000]

Compression is divided into lossless compression, in which the original data χ and the reconstructed data γ match exactly, and lossy compression, in which χ and γ differ. Generally, lossy compression achieves higher compression performance than lossless compression. [Lee Dong-heon, "A Vector Data Compression Method Using Clustering Technique", Master's Thesis, Inha University Graduate School: 2-29, 2005]

Here, the lossless compression technique can reconstruct the original data exactly, without losing information, through the compression and reconstruction process. It is used where any loss of information would make the whole data unusable, for example text data, or satellite image data in which individual pixel values significantly affect the results. [Chun Woo Je, "A Study on Compression Technique Considering Efficient Update of Vector Data", Master's Thesis, Graduate School of Inha University: 7-9, 2005]

Lossy compression is a compression technique in which the original data cannot be reconstructed exactly after compression and reconstruction. It is used when some loss and distortion of information is acceptable and the reconstructed data can still convey the content. An example is voice data, whose size can be reduced by lowering the sound quality.

Therefore, lossless and lossy compression techniques can be applied in various ways according to how the data is used. A database system that can store spatial data types, process spatial data and related queries, and optimize them using spatial indexes and query processing is defined as a spatial database [Lee Min-woo, "Spatial Indexing Method for Object-Relational DBMS of Embedded System", Inha University Master's Thesis: 26-29, 2005]. The OGC (Open Geospatial Consortium) defines the spatial data model and spatial operators for a spatial database in the Simple Feature Specification for SQL v1.1 (OGC, 1999), and presents the schema of the spatial database as a standard.

The Lempel-Ziv method (1977) is a dictionary-based compression algorithm: while compressing, it constructs a dictionary of the code values that make up the data, and it compresses the data when the information currently being read already exists in that dictionary. The Lempel-Ziv compression method can be implemented in various ways, but implementations can be divided into the static dictionary method and the dynamic dictionary method according to how the dictionary is constructed.

The static dictionary method builds a dictionary in advance from the code values expected to appear, and compresses by referring to that dictionary when a stored code value occurs again. It can therefore be applied efficiently when the contents of the file to be compressed are foreseeable. The dynamic dictionary method constructs the dictionary while reading the data, so references are limited to code values that have already appeared. It has the disadvantage of slower compression, because the dictionary must be built while the data is read, but it achieves a good compression rate for arbitrary data.

In the Lempel-Ziv compression method, the data is compressed using a special character string, in the form of a single code or a double code, that carries the reconstruction information for a compressed code value. The special string is used to locate the information needed for restoration, so it must be a code that does not otherwise occur in the file. The following is an example of a Lempel-Ziv compression method that uses a special string. [Table 1] shows the original data to be compressed, read in binary mode and written as a series of codes.

In this case, the original data is shown in hexadecimal, and the code "01 8d 01 bf" is repeated three times. Compressing this data with the Lempel-Ziv compression method gives the result shown in [Table 2].

"aa ff" is a special string that marks the start of compressed data. It is a two-byte double code, and it indicates that the code "04 08" in the bytes that follow is needed for restoration. The data structure of the Lempel-Ziv compression method uses a circular queue by default, so the rear offset is 10 when the first character 'aa' is read; subtracting the offset information '04' gives 6. Therefore, with reference to '08', 8 bytes of existing data are copied starting from position 6 and stored.

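[Table 1] Original data
Hexa code     04 5b 6d 08 2c d5 01 8d 01 bf 01 8d 01 bf 01 8d 01 bf

[Table 2] Data compressed by the Lempel-Ziv compression method
Hexa code     04 5b 6d 08 2c d5 01 8d 01 bf aa ff 04 08

[Table 3] Restored data
Hexa code     04 5b 6d 08 2c d5 01 8d 01 bf 01 8d 01 bf 01 8d 01 bf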

[Table 3] shows the compressed data restored to the original data by the Lempel-Ziv compression method. The code values of the restored data exactly match the original data. The Lempel-Ziv compression method can therefore be classified as a lossless compression method, because compressed data can be restored exactly to the original data.
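The arithmetic of this example can be checked with a short sketch. It assumes only what the example states: a two-byte special string "aa ff" followed by one byte of backward offset and one byte of length. The function name `restore_with_marker` is hypothetical, and a real implementation would also have to guarantee that the special string never occurs in the literal data.

```python
MARKER = bytes([0xAA, 0xFF])  # the two-byte special string from the example


def restore_with_marker(compressed: bytes) -> bytes:
    """Restore data by scanning for the special string from start to end."""
    out = bytearray()
    i = 0
    while i < len(compressed):
        if compressed[i:i + 2] == MARKER and i + 3 < len(compressed):
            back = compressed[i + 2]      # '04': how far back from the rear
            length = compressed[i + 3]    # '08': how many bytes to copy
            start = len(out) - back       # rear offset 10 - 4 = 6 in the example
            for k in range(length):
                # Byte-by-byte copy so the source may overlap the bytes
                # that this same copy is producing.
                out.append(out[start + k])
            i += 4
        else:
            out.append(compressed[i])     # literal byte, copied unchanged
            i += 1
    return bytes(out)


original = bytes.fromhex("04 5b 6d 08 2c d5 01 8d 01 bf 01 8d 01 bf 01 8d 01 bf")
compressed = bytes.fromhex("04 5b 6d 08 2c d5 01 8d 01 bf aa ff 04 08")
assert restore_with_marker(compressed) == original
```

The copy length (8) is larger than the backward offset (4), so the copy overlaps its own output; this is how one 4-byte pattern expands into the two further repetitions of "01 8d 01 bf".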

However, in the Lempel-Ziv compression method, data can be restored only by finding the special string, which requires reading the compressed data from beginning to end. That is, the Lempel-Ziv compression method searches for the special string with O(N) performance, so the time required for restoration increases in proportion to the size of the data.

SUMMARY OF THE INVENTION The present invention has been made to solve the above-mentioned problem. Its object is to provide a compression method that compensates for the restoration speed of the Lempel-Ziv compression method: a key-address scheme is applied to shorten the time required for restoration, so that, regardless of the size of the data, restoration information is retrieved with O(1) performance by referencing the key values held in the hash table built during compression. Because the method is lossless, the original data is preserved exactly, and the method can therefore be applied efficiently to data requiring a high restoration speed and lossless compression, such as documents, video, spatial data, and binary formats such as MP3.

In order to achieve the above object, the present invention provides a method of compressing and restoring raw data using Lempel-Ziv, the method comprising: searching for duplicate patterns in the pattern information while searching for pattern information of the raw data; If a duplicate pattern is found in the search, constructing a hash table of a key-address method for the duplicated pattern; And lossless compressing the raw data by adding data for restoring a duplicated pattern configured in the hash table to the raw data.

The search used in the duplicate pattern search step is the basic Lempel-Ziv compression method, which uses a hash table and a hash chain.
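A hash table with hash chains is a common way to search for duplicate patterns in Lempel-Ziv implementations. The following is a minimal sketch of that general technique, not the patent's own algorithm: the function name `find_duplicates`, the 4-byte minimum pattern length, and the use of the raw bytes as the hash key are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_MATCH = 4  # assumed minimum pattern length worth recording


def find_duplicates(data: bytes, min_match: int = MIN_MATCH):
    """Return (position, source, length) for each duplicate pattern found."""
    chains = defaultdict(list)  # hash table: key bytes -> chain of earlier positions
    matches = []
    n = len(data)
    i = 0
    while i + min_match <= n:
        key = data[i:i + min_match]
        best_len, best_src = 0, -1
        for src in reversed(chains[key]):  # walk the chain, newest position first
            length = 0
            while i + length < n and data[src + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, src
        if best_len >= min_match:
            # Index the positions covered by the match so later data can refer to them.
            for p in range(i, i + best_len):
                if p + min_match <= n:
                    chains[data[p:p + min_match]].append(p)
            matches.append((i, best_src, best_len))
            i += best_len
        else:
            chains[key].append(i)
            i += 1
    return matches


data = bytes.fromhex("04 5b 6d 08 2c d5 01 8d 01 bf 01 8d 01 bf 01 8d 01 bf")
print(find_duplicates(data))  # [(10, 6, 8)]
```

On the data of [Table 1] the sketch reports one duplicate at position 10 whose source is position 6 with length 8, which corresponds to the offset '04' and length '08' in [Table 2].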

However, to supplement the Lempel-Ziv compression method, the locations of the duplicated patterns found by the Lempel-Ziv compression method are collected in a linked list and then stored in the header part of the raw data, where they constitute a hash table. The hash table stores, as a key value, the storage location of the data for restoring each duplicate pattern found.

In addition, the data for restoring the overlapping pattern may include data about a location where the data of the corresponding overlapping pattern is stored and the size of the data of the overlapping pattern.
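A minimal sketch of this idea follows, under loudly stated assumptions: the header layout (a 2-byte entry count followed by 12-byte entries) and the names `compress_with_header` and `restore_with_header` are hypothetical and are not the byte layout of the patent, which the text does not fully specify. Each entry holds the location of a duplicate pattern, the location of its source data, and its size; restoration looks entries up by key instead of scanning for a special string.

```python
import struct


def compress_with_header(data: bytes, matches) -> bytes:
    """matches: (position, source, length) tuples in ascending position order."""
    literals = bytearray()
    entries = []
    cursor = 0
    for pos, src, length in matches:
        literals += data[cursor:pos]        # bytes that are stored as they are
        entries.append((pos, src, length))  # restoration data goes to the header
        cursor = pos + length
    literals += data[cursor:]
    header = struct.pack(">H", len(entries))
    for entry in entries:
        header += struct.pack(">III", *entry)
    return header + bytes(literals)


def restore_with_header(blob: bytes) -> bytes:
    (count,) = struct.unpack_from(">H", blob, 0)
    table = {}                              # hash table keyed by output position
    offset = 2
    for _ in range(count):
        pos, src, length = struct.unpack_from(">III", blob, offset)
        table[pos] = (src, length)
        offset += 12
    literals = blob[offset:]
    out = bytearray()
    lit = 0
    while lit < len(literals) or len(out) in table:
        entry = table.get(len(out))         # average O(1) lookup, no marker scan
        if entry is not None:
            src, length = entry
            for k in range(length):
                out.append(out[src + k])    # overlapping copy, byte by byte
        else:
            out.append(literals[lit])
            lit += 1
    return bytes(out)


data = bytes.fromhex("04 5b 6d 08 2c d5 01 8d 01 bf 01 8d 01 bf 01 8d 01 bf")
blob = compress_with_header(data, [(10, 6, 8)])
assert restore_with_header(blob) == data
```

The fixed-width entries make this toy header larger than the data it describes; they were chosen for readability only. Note also that the lookup of restoration information is O(1) on average, while the copy itself still writes every output byte.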

Here, when spatial data is compressed and restored using the Lempel-Ziv-based compression and decompression method, the method further comprises extracting the differential vectors and the starting point coordinates from the spatial data and converting them into vector data, and the pattern information of the converted vector data is searched for duplicate patterns.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 2 is a flowchart illustrating a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention. As shown, spatial data is used as an example, and the figure shows the process of compressing vector data and the process of reconstructing the original data through the inverse process.

The vector data compression process begins with selecting the data model to be used in the design. Using the selected model, each object is divided into starting point coordinates, which represent the absolute position of the object in the spatial coordinate system, and differential vectors, which represent positions relative to the starting point (S10).

In step S10, because the data compression process uses an 8-byte structure for storing spatial data, the geometry model of the Open Geospatial Consortium (OGC) was modified and used so that only the minimum information needed for the spatial data is stored.
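The split into a starting point and differential vectors can be sketched as follows. This is an illustration under an assumption: each differential vector is taken relative to the preceding vertex, which is one reading of "relative to the starting point" and is not confirmed by the text. The function names and the integer coordinates are hypothetical.

```python
def to_differential(points):
    """Split a vertex list into its starting point and differential vectors."""
    start = points[0]
    diffs = [(x2 - x1, y2 - y1)
             for (x1, y1), (x2, y2) in zip(points, points[1:])]
    return start, diffs


def from_differential(start, diffs):
    """Inverse step: rebuild absolute coordinates from start point and vectors."""
    points = [start]
    x, y = start
    for dx, dy in diffs:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


line = [(100, 200), (110, 200), (120, 200), (130, 210)]
start, diffs = to_differential(line)
print(start, diffs)  # (100, 200) [(10, 0), (10, 0), (10, 10)]
assert from_differential(start, diffs) == line
```

Under this reading, evenly spaced vertices produce repeated differential vectors such as (10, 0), and such repetitions are the kind of duplicate pattern the later Lempel-Ziv search step can find.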

Next, pattern information is searched for in the differential vectors calculated by the vector data conversion (S20).

The Lempel-Ziv dynamic dictionary method is then applied to the calculated pattern information to search for repeated patterns (S30), and the data is recompressed (S40).

At this time, to improve the speed of restoration, the offset at which each pattern is located is stored in a hash table, instead of inserting a pad character at the compressed position as the conventional Lempel-Ziv compression method does. This is what distinguishes the compression method of the present invention, which compensates for the decompression speed of the Lempel-Ziv compression method: it increases decompression speed by retrieving the compression information with O(1) performance.

Because the cost caused by redundant storage must be minimized, recompression is performed if the differential vectors contain duplicate pattern information (S40). To minimize the cost from recompression through restoration, the restoration information is stored in a hash table instead of a pad character (S50).

In addition, the method searches for repeated pattern information over the length of the differential vectors and compresses it, and it manages the information for improving restoration speed as header information using the key-address method.

Through this series of processes, if the compressed data of the differential vectors, which are calculated from the starting point that represents the absolute position of the spatial object, contains duplicate information, that duplicate information is removed to produce the final compressed data.

The proposed compression of step S40 searches for repeated data while reading the data and recompresses it. Because this is a linear search, compression itself can be slower. However, since duplicate information is found and the restoration information is managed in a hash table, the compression information can be looked up at O(1) speed, which increases the speed of restoration.

The compressed data can be reconstructed into the original data through the reverse of the compression process (S70, S80, S90). In the restoration step, the compression position information stored in the hash table is found with O(1) performance.

In this process, the repeated information is retrieved and restored according to its position and length. Then, using the differential vectors and the starting point that holds the absolute position coordinates of the spatial object (S80), the spatial object can be reconstructed (S90).

The following describes an experimental example of the present invention.

To evaluate quantitatively the performance of the compression method that complements the restoration speed of the Lempel-Ziv compression method, and to assess its validity and applicability through comparison with the prior art, two data sets that differ in the number of data were selected as suitable for the experiment: the entire data set, in which all map sheets are constructed, and the map sheet data of a specific region. The prior-art compression method and the compression method of the present invention were applied to both data sets, and the compression rate and query processing time of each method were compared to derive the results.

The compression method that complements the decompression speed of the Lempel-Ziv compression method of the present invention replaces the pad character used by the Lempel-Ziv compression method with a key-address hash table. In this way, the restoration lookup of the Lempel-Ziv compression method, which has O(N) performance, is given O(1) performance regardless of the size of the data.

Figure 112006068330279-PAT00001

Table 12 shows data produced by the compression method that compensates for the restoration speed of the Lempel-Ziv compression method. In the structure of Table 12, the header stores the location at which the pattern information starts. The header code "0b" is 11 in decimal, so it points to the 11th position of the record part, where the data is read in 2-byte units: "00 07" gives the position where the pattern starts, and "00 08", the data that follows "00 07", is then used to restore the original data.

Here, because of the header portion that stores the hash keys, the compressed data is 1 to 2 bytes larger than with the conventional Lempel-Ziv compression method, but the stored information can be found with O(1) performance.

The compression method that complements the restoration speed of the Lempel-Ziv compression method was applied to the same experimental data as the Lempel-Ziv compression method.

Finally, the results are analyzed for the experiments using the Lempel-Ziv compression method and the compression method complementing the restoration speed of the Lempel-Ziv compression method of the present invention.

The compression rates of the Lempel-Ziv compression method and of the compression method of the present invention, which compensates for the restoration speed of the Lempel-Ziv compression method, are compared.

Here, when a spatial query is made for the area of Incheon Metropolitan City and the metropolitan area, the time required to restore the compressed data to the original data is analyzed according to the number of points and the number of data.

FIG. 3 is a graph showing the restoration time according to the number of data. As shown, it represents the restoration time according to the number of data searched when a query occurs. Calculating the restoration speed from the restoration time shows that, for the conventional Lempel-Ziv compression method, the query processing time required up to restoration grows more steeply as the number of data increases.

In the compression method of the present invention, which compensates for the restoration speed of the Lempel-Ziv compression method, the query processing time also increases as the number of data increases, but the restoration time is lower than that of the conventional Lempel-Ziv compression method.

FIG. 4 is a graph showing the restoration time according to the number of points. As shown, it represents the restoration time according to the number of points searched when a query occurs. Calculating the restoration speed from the restoration time shows that the compression method of the present invention, which compensates for the restoration speed of the Lempel-Ziv compression method, improves the restoration speed for queries by an average of 70% compared with the Lempel-Ziv compression method.

Hereinafter, a process of a compression method that compensates for the restoration speed of the Lempel-Ziv compression method according to the present embodiment will be described.

FIGS. 5A and 5B are diagrams illustrating an algorithm according to a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention. As shown in the figures, in applying the Lempel-Ziv compression method, header information is generated so that a hash table can be constructed without using a special string.

Here, referring to FIG. 5A, as shown in row 1, since the compression method that compensates for the decompression speed of the Lempel-Ziv compression method is a memory-based compression method, data and length information are required to compress vector data. In other words, it is necessary to take as arguments a variable to store the compressed data and a variable to store the length of the compressed data.

Here, as shown in row 2, since each object constituting the vector data has a different number of repeated items, a linked list is initialized so that the header information can be built dynamically. As shown in row 3, the variable that stores the pattern information and the variable that stores its length are set to 0 and initialized.

As shown in row 4, the first byte of the data to be compressed is stored in the queue, and as shown in row 5, a loop is executed for the size of the data to be compressed, which is passed as a parameter. As shown in rows 6 to 7, if a value of 0 is obtained when calculating the position of the current queue, the layer value is increased by one.

If the queue used to retrieve pattern information about the data overflows, as in rows 9 to 12, the value of the currently stored pattern is compared with the front of the queue. If it is the same, the data is compressed.

Then, as in line 13, the value at the front of the queue is deleted from the hash table, and then from the queue, as in line 14.
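The bookkeeping described here, where the front of a full queue is removed from the hash table before it leaves the queue, can be sketched as below. This is an illustration only and not the algorithm of FIG. 5A: the class name `SlidingWindow`, the per-byte index, and the use of `collections.deque` in place of a hand-written circular queue are assumptions.

```python
from collections import deque


class SlidingWindow:
    """Fixed-capacity queue of recent bytes with a hash index over its contents."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.queue = deque()  # (offset, value) pairs, oldest at the front
        self.index = {}       # hash table: byte value -> offsets still in the queue

    def push(self, offset: int, value: int) -> None:
        if len(self.queue) == self.capacity:
            # Queue is full: drop the front entry from the hash table first,
            # then from the queue, so the index never points at evicted data.
            old_offset, old_value = self.queue[0]
            positions = self.index[old_value]
            positions.remove(old_offset)
            if not positions:
                del self.index[old_value]
            self.queue.popleft()
        self.queue.append((offset, value))
        self.index.setdefault(value, []).append(offset)

    def offsets_of(self, value: int):
        """Offsets inside the window where this byte value was seen."""
        return list(self.index.get(value, []))


window = SlidingWindow(capacity=4)
for offset, value in enumerate(bytes.fromhex("01 8d 01 bf 2c")):
    window.push(offset, value)
print(window.offsets_of(0x01))  # [2]
```

After the fifth byte is pushed, the entry at offset 0 has left both the hash table and the queue, so only offset 2 remains for the value 01.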

Next, while the data is read byte by byte and stored in the queue, it must be determined whether the value of the byte currently read already exists in the queue. For this purpose, as in line 17, the offset of the queue data to compare against is obtained, and as in line 18, the values stored in the queue are checked for the byte currently read. If the byte currently read does not exist among the previous bytes, it is stored in the hash table and the queue, as in line 20.

On the other hand, if a pattern corresponding to the byte currently read is found among the previously read bytes, the byte expected to be read next must be looked up in the queue entry being compared. The values are added, the information retrieved from the queue is stored, the data containing the expected byte is stored in the queue, as in line 24, and the loop is rerun.

If, as in line 25, the previously read byte is information stored in the queue, the bytes stored in the previous queue are checked for duplicate storage, as in line 26. As shown in lines 27 to 31 of FIG. 5B, the result is then processed in the same way as when the matching length is 0.

In addition, as shown in rows 33 to 37 of FIG. 5B, to form the hash table structure, header information is generated using key values for the locations of the compressed information. For this purpose, the location information needed for restoration must be stored in the linked list during data compression.

As shown in row 45 of FIG. 5B, the header information is identified from the compressed data to form a hash table, and as shown in row 46, the data is moved according to the size of the header information. In addition, the index information is calculated and used as the key value of the hash table to retrieve the restoration information for the key value of the hash table.

Finally, using the position at which the repeating pattern is located and the length repeated from that position, the data at the position indicated by the key value of the hash table is copied to the current byte position for the repeated length.

FIG. 6 is a diagram schematically illustrating a compression method that compensates for the restoration speed of the Lempel-Ziv compression method of the present invention. As shown in the figure, when the Lempel-Ziv compression method is applied, the key-address scheme is added to increase restoration speed. However large the compressed data is, the key values held in the hash table are referenced so that the information needed for restoration is retrieved with O(1) performance.

As described above, the present invention applies a key-address scheme to compensate for the restoration speed of the Lempel-Ziv compression method and thereby shortens the time required for restoration: regardless of the size of the data, restoration information is retrieved with O(1) performance by referencing the key values held in the hash table built during compression. Because the method is lossless, the original data is preserved exactly, and the method can therefore be applied efficiently to data requiring a high restoration speed and lossless compression, such as documents, video, spatial data, and binary formats such as MP3.

Although the preferred embodiments of the present invention have been described above by way of example, the scope of the present invention is not limited to these specific embodiments, and those skilled in the art may make appropriate changes within the scope described in the claims of the present invention.

Claims (5)

  1. A method of compressing and restoring raw data using Lempel-Ziv, the method comprising:
    searching the pattern information of the raw data using an existing Lempel-Ziv compression method to find duplicate patterns in the pattern information;
    if a duplicate pattern is found in the search, constructing a key-address hash table for the duplicated pattern; and
    losslessly compressing the raw data by adding to the raw data the data for restoring the duplicated pattern recorded in the hash table.
  2. The method of claim 1,
    wherein, in the duplicate pattern search step, a hash table and a hash chain are formed and the duplicate pattern is searched for using a circular queue.
  3. The method of claim 1,
    wherein the key-address hash table is stored in a header portion of the raw data, and the hash table stores, as a key value, the storage location of the data for restoring the retrieved duplicated pattern.
  4. The method of claim 1,
    wherein the data for restoring the duplicate pattern includes the location where the data of the duplicate pattern is stored and the size of the data of the duplicate pattern.
  5. The method of claim 1,
    wherein, when spatial data is compressed and restored using the compression and decompression method using Lempel-Ziv,
    the method further comprises extracting the differential vectors and the starting point coordinates from the spatial data and converting them into vector data, and
    the pattern information of the converted vector data is searched for duplicate patterns.

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020060091759A KR20080026772A (en) 2006-09-21 2006-09-21 Method for a compression compensating restoration rate of a lempel-ziv compression method

Publications (1)

Publication Number Publication Date
KR20080026772A true KR20080026772A (en) 2008-03-26

Family

ID=39414007

Country Status (1)

Country Link
KR (1) KR20080026772A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011007956A3 (en) * 2009-07-17 2011-03-24 주식회사 이스트소프트 Data compression method
KR101049699B1 (en) * 2009-07-17 2011-07-15 (주)이스트소프트 data compression method
WO2013148582A1 (en) 2012-03-29 2013-10-03 Intel Corporation System, method, and computer program product for decompression of block compressed images
US8687902B2 (en) 2012-03-29 2014-04-01 Intel Corporation System, method, and computer program product for decompression of block compressed images
KR20140130196A (en) * 2012-03-29 2014-11-07 인텔 코오퍼레이션 System, method, and computer program product for decompression of block compressed images
EP2831838A4 (en) * 2012-03-29 2015-12-02 Intel Corp System, method, and computer program product for decompression of block compressed images
KR101403356B1 (en) * 2012-10-22 2014-06-05 (주)티베로 Device and method of data compression and computer-readable recording medium thereof

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E601 Decision to refuse application