WO2004051863A1 - Automated method for lossless data compression and decompression of a binary string - Google Patents
- Publication number
- WO2004051863A1 (application PCT/IT2002/000762)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- string
- binary
- words
- value
- word
- Prior art date
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
Definitions
- the present invention concerns an automated method for compressing binary strings without information loss, or lossless compression method, and a related automated decompression method, which make it possible to minimise the size of binary strings losslessly in a simple, fast and inexpensive way.
- the present invention further concerns the instruments necessary to perform the automated methods and the apparatus performing them.
- the compression methods can be classified according to the congruence between decompressed data and original data, dividing the concerned compression methods into methods having no information loss or "lossless" methods, when the data reconstructed or decompressed from the compressed data are identical to the original ones, and methods having information losses or "lossy" methods, in which the reconstructed data lose a portion of the original data information.
- lossless methods are preferable and, in some specific applications when all the information must be kept, are the only ones that may be used. Examples of lossless methods are the Huffman encoding method and the Lempel-Ziv-Welch, or LZW, encoding method.
- the Huffman method operates only on static strings, creating a corresponding dictionary for each string, while the LZW method operates on dynamic strings, creating a dynamically updated dictionary. Consequently, both said methods are complex and their computer execution is slow.
- since each encoding method is designed for specific types of data, new encoding requirements keep arising for data having particular characteristics. This makes the methods difficult to use, especially when transmitting composite data, which requires the use of different encoding methods.
- the output string E may further comprise an initial flag bit for indicating whether at least one binary value $B_k$ is not present within the input string S (and, therefore, the specific flags of each individual binary value $B_k$ must be read) or all the binary values $B_k$ are present (and, therefore, the output string E does not comprise specific flags of each individual binary value $B_k$ because all the values must be read).
- the datum related to the presence of at least one word having binary value $B_k$ within the string $S_k$ may be the number $N_k$ of words located within the string $S_k$.
- the sorting rule may assume the $2^n$ binary values (B) as positive binary values and the sorting O fixed during step A may be either the decreasing sorting or the increasing sorting of such positive binary values.
- the method may count the number of occurrences of each binary value $B_k$ within the input binary string S, and the sorting rule may fix the sorting O according to either the decreasing sorting or the increasing sorting of the occurrences of the binary values $B_k$ within the input binary string S, the method further comprising the step: F. juxtaposing to the output string E data related to the sorting O fixed during step A.
- the method is automatically adaptive to the type of file to be compressed, in conformity to the occurrences of the binary values.
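The occurrence-based sorting rule can be illustrated with a minimal Python sketch. It assumes words of n = 8 bits (one byte each) and breaks ties between equally frequent values by their numeric value; the tie-breaking rule and the function name `sorting_by_occurrence` are assumptions made for illustration, not part of the source.

```python
from collections import Counter


def sorting_by_occurrence(data: bytes, decreasing: bool = True) -> list[int]:
    """Return all 2^n = 256 byte values ordered by how often they occur in data.

    Assumption: ties are broken by increasing numeric value, and values that
    never occur are still listed (with an occurrence count of zero).
    """
    counts = Counter(data)  # iterating over bytes yields integers 0..255

    def key(value: int) -> tuple[int, int]:
        count = counts[value]  # missing values count as 0
        return (-count if decreasing else count, value)

    return sorted(range(256), key=key)


order = sorting_by_occurrence(b"abracadabra")
print([chr(v) for v in order[:5]])
```

Because the resulting order depends on the input, a decoder would need it as well, which matches step F above (data related to the sorting O are juxtaposed to the output string E).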
- in step D.3 the elements $d_k$ are juxtaposed to the output string E according to a binary encoded representation.
- the binary encoded representation defines a first positive integer value $g_k$, with $g_k \geq 0$, and a second positive integer value $x_k$.
- $d_k$ may be represented by $(g_k + x_k + 2)$ binary bits according to the formula:
- W the division — , i.e. it is equal to W -In ⁇ ;
- the method may also comprise the step:
- the method may determine the optimum first and second positive integer values $g_k$ and $x_k$ by means of a trial and error process.
- the trial and error process may verify the size of memory needed to represent the array $A_k$ according to the binary encoded representation for all the values of the first and second positive integer values $g_k$ and $x_k$ ranging, respectively, from 0 to G, with $G > 0$, and from 0 to X, with $X > 0$.
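The trial and error process amounts to an exhaustive search over the pairs (g, x). The Python sketch below shows that search with a stand-in cost function. The cost model is an assumption: it follows the two lengths that are legible in the text (g + 1 bits when the distance is below 2^g, g + x + 2 bits for the next case) and extends them with a hypothetical escalation of w leading "1" bits followed by g + w·x payload bits. The names `code_length` and `best_parameters` are illustrative only.

```python
import math


def code_length(d: int, g: int, x: int) -> float:
    """Bits needed to encode one distance d under the ASSUMED escalating code."""
    if d < 2 ** g:
        return g + 1  # flag bit "0" followed by g payload bits
    if x == 0:
        return math.inf  # the payload cannot grow, so d is not representable
    w = 1
    while d >= 2 ** (g + w * x):
        w += 1
    # w bits "1", one bit "0", then g + w*x payload bits (hypothetical for w > 1)
    return w + 1 + g + w * x


def best_parameters(distances: list[int], G: int, X: int) -> tuple[float, int, int]:
    """Try every (g, x) with 0 <= g <= G and 0 <= x <= X and keep the cheapest."""
    best = (math.inf, -1, -1)
    for g in range(G + 1):
        for x in range(X + 1):
            total = sum(code_length(d, g, x) for d in distances)
            if total < best[0]:
                best = (total, g, x)
    return best


print(best_parameters([0, 1, 3, 200], G=8, X=8))
```

The search costs (G + 1)(X + 1) passes over the array, so small bounds G and X keep it cheap; a real implementation would substitute the patent's actual bit-length formula for `code_length`.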
- the method may furthermore comprise the step:
- the method still comprises the step:
- the method is iterated, being applied at each h-th iteration, with $h > 1$, to the output string E obtained by the preceding $(h-1)$-th iteration.
- the numbers $n_{h-1}$ and $n_h$ of bits included in the words into which the corresponding input strings S are subdivided during step C may be different from one another: $n_{h-1} \neq n_h$
- an automated method for decompressing a compressed binary string E of input data into a binary string S of output data characterised in that the compressed binary string E of input data has been obtained by applying to the binary string S the just described automated compression method, and in that it comprises the following steps:
- the decompression method may be iterated, the compressed binary string E of input data having been obtained by applying to the binary string S the iterated automated compression method.
- an electronic apparatus comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated compression method previously illustrated.
- an electronic apparatus comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated decompression method described before.
- an electric, magnetic or electromagnetic signal modulated by a digital signal characterised in that said digital signal comprises at least one compressed binary string E obtained by means of the automated compression method previously illustrated.
- a computer program comprising code means adapted to execute, when running on a computer, the automated compression method previously described. It is further subject matter of this invention a memory medium readable by a computer, storing a program, characterised in that the program is the computer program just described.
- Figure 1 schematically shows a portion of a binary string S of input data to be compressed by means of a preferred embodiment of the automated method according to the invention
- Figure 2 schematically shows the portion of the string S of Figure 1 ;
- Figure 3 schematically shows a flow chart of the trial and error process for determining the first and the second optimum positive integer values $g_k$ and $x_k$ according to the preferred embodiment of the automated method according to the invention
- Figure 4 schematically shows a first portion of the compressed binary string E of output data generated by the preferred embodiment of the automated method according to the invention
- Figure 5 schematically shows a portion of a second string $S_1$ generated during the execution of the preferred embodiment of the automated compression method according to the invention
- Figure 6 schematically shows a second portion of the compressed binary string E of output data generated by the preferred embodiment of the automated method according to the invention
- Figure 7 schematically shows a third portion of the compressed bi ⁇
- Figure 8 schematically shows a flow chart of the preferred embodi-
- word 1 comprises n bits, where in a preferred embodiment of the invention $n = 8$.
- the string S may be padded with a
- each word 1 may assume one binary value B among $2^n$ different binary values.
- the method fixes a sorting O of such $2^n$ values according to a corresponding sorting rule:
- the method scans the string S for determining the position of the words 1' having the binary value $B_0$ which is in the first position within the sorting O, and it stores their reciprocal distances $d_i^0$ in a memory array $A_0$; in the embodiment of Figure 1, such first binary value is the value "11111111".
- each one of the $N_0$ elements of the array $A_0$, in case $N_0 > 0$, is represented according to a suitable encoded binary representation.
- this may be represented in one of three different ways: 1) in the case when $d_i^0 < 2^{g_0}$ [2], $d_i^0$ is represented by $(g_0 + 1)$ binary bits, the first of which is equal to "0" and the following $g_0$ bits form the positive two's complement binary representation having $g_0$ bits of the value of $d_i^0$; in other words, in this case $d_i^0$ is represented according to the formula:
- $d_i^0$ is represented by $g_0 + x_0 + 2$ binary bits, where
- In[x] is the operator returning the integer part of x, in such a way that
- bits are equal to "1", the following bit is equal to "0" and the last $(g_0 + x_0)$ bits are the positive two's complement binary
- a second binary string $S_1$ is constructed, either physically or logically, which is obtained by eliminating from the initial string S all the words 1' having the binary value $B_0$.
- the output string E comprises only one flag bit equal to "0".
- the method scans the string $S_1$ for determining the position of the words having the binary value $B_1$ which is in the second position within the sorting O, and it stores their reciprocal distances $d_i^1$ in a corresponding memory array $A_1$; in the embodiment of the method shown in the Figures, such second binary value is the value "11111110".
- the array $A_1$ contains $N_1$ elements related to the distances $d_i^1$, for $i = 0, 1, 2, \ldots, N_1 - 1$, which locate the positions within the string $S_1$ of the $N_1$ words having binary value $B_1$.
- the embodiment of the method shown in the Figures fixes two positive integer values $g_1$ and $x_1$ which minimise the size of memory needed to represent the array $A_1$ according to the encoded representation illustrated above with reference to the array $A_0$.
- the method executes similar operations on the third string $S_2$, scanning it for determining the position of the words having the binary value $B_2$ which is in the third position within the sorting O, and it stores their reciprocal distances $d_i^2$ in a memory array $A_2$; in the embodiment of the method shown in the Figures, such third binary value is the value "11111101".
- the string $S_2$ comprises $N_2$ words, with $N_2 > 0$, having the binary value $B_2$
- a fourth binary string $S_3$ is constructed, either physically or logically, which is obtained by eliminating from the third string $S_2$ all the words having the binary value $B_2$.
- the method iterates the preceding steps for all the $2^n$ binary values.
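The scan-and-eliminate passes described above can be sketched in Python as follows. The sketch assumes words of n = 8 bits and reads "reciprocal distance" as the number of remaining words skipped since the previous occurrence of the value being located (or since the start of the string for the first occurrence); that reading and the function name `eliminate` are assumptions for illustration. Bit-level encoding of the arrays is left out.

```python
def eliminate(words: list[int], sorting: list[int]) -> tuple[list[list[int]], list[int]]:
    """Run one elimination pass per value in `sorting`.

    Returns the list of distance arrays (one array per value, in sorting
    order) and whatever words are left once every listed value is removed.
    """
    arrays = []
    current = list(words)
    for value in sorting:
        gaps = []       # distances locating the occurrences of `value`
        remaining = []  # the next, shorter string
        run = 0         # words skipped since the previous occurrence
        for word in current:
            if word == value:
                gaps.append(run)
                run = 0
            else:
                remaining.append(word)
                run += 1
        arrays.append(gaps)
        current = remaining
    return arrays, current


# The sorting starts with "11111111" and "11111110", as in the embodiment.
words = list(b"\xff\x01\xff\xfe\x01")
arrays, leftover = eliminate(words, [0xFF, 0xFE, 0x01])
print(arrays, leftover)  # [[0, 1], [1], [0, 0]] []
```

Each pass works on a shorter string than the previous one, so later values are located with smaller distances; an empty array corresponds to a value that does not occur.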
- the method advantageously stores the initial size of the
- D d k is represented only by the bit "0"
- Figure 8 shows a flow chart which schematically summarises the method steps illustrated above.
- the embodiment of the method shown in the Figures may be iterated, that is the output string E may be subjected to the same steps of the compression method, schematically shown in Figure 8, being treated as if it were an input string S so as to obtain a new output string $E_1$.
- the initial size of the input string S is advantageously stored only once at the head of the last output string.
- the string E is subdivided into words comprising the same number n of bits if the new output string $E_1$ which is obtained has smaller size; otherwise the string E is subdivided into words comprising a number $n_1$ of bits that is different from n, preferably n ⁇ n .
- the number $n_h$ of bits of the words into which the input string to be compressed has been subdivided is stored at the output string head.
- the automated decompression method reconstructs the input string S starting from the output string E.
- the output string E is read according to a sorting reversed with respect to the sorting O used for constructing it, carrying out the following steps:
- the decompression method must also be iterated until the input string S is reconstructed.
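Reading the output in the reversed sorting can be sketched in Python as below. The sketch assumes the distance arrays have already been decoded from their bit representation and that each distance counts the remaining words skipped since the previous occurrence of the value being restored; both points, and the function name `rebuild`, are assumptions for illustration.

```python
def rebuild(arrays: list[list[int]], sorting: list[int], leftover: list[int] = ()) -> list[int]:
    """Reinsert the eliminated values, starting from the last value of the sorting."""
    current = list(leftover)
    for value, gaps in zip(reversed(sorting), reversed(arrays)):
        restored = []
        pos = 0
        for gap in gaps:
            # copy the words that lie between two occurrences of `value`
            restored.extend(current[pos:pos + gap])
            pos += gap
            restored.append(value)
        restored.extend(current[pos:])  # words after the last occurrence
        current = restored
    return current


arrays = [[0, 1], [1], [0, 0]]
print(bytes(rebuild(arrays, [0xFF, 0xFE, 0x01])))  # b'\xff\x01\xff\xfe\x01'
```

The last value of the sorting is restored first because its distances refer to the shortest string; each later step then rebuilds the next longer string, until the original input is obtained.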
- an input data string to be compressed may prelimi ⁇
- the receiving apparatus may incrementally reconstruct the original input data string starting from the different consecutive segments, by incrementally juxtaposing the segments obtained from decompression of the compressed segments.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2002368408A AU2002368408A1 (en) | 2002-12-04 | 2002-12-04 | Automated method for lossless data compression and decompression of a binary string |
PCT/IT2002/000762 WO2004051863A1 (fr) | 2002-12-04 | 2002-12-04 | Automated method for lossless data compression and decompression of a binary string |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IT2002/000762 WO2004051863A1 (fr) | 2002-12-04 | 2002-12-04 | Automated method for lossless data compression and decompression of a binary string |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004051863A1 true WO2004051863A1 (fr) | 2004-06-17 |
Family
ID=32448851
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IT2002/000762 WO2004051863A1 (fr) | 2002-12-04 | 2002-12-04 | Automated method for lossless data compression and decompression of a binary string |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2002368408A1 (fr) |
WO (1) | WO2004051863A1 (fr) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007149358A1 (fr) * | 2006-06-19 | 2007-12-27 | Essex Pa, L.L.C. | Data compression |
US7508325B2 (en) | 2006-09-06 | 2009-03-24 | Intellectual Ventures Holding 35 Llc | Matching pursuits subband coding of data |
US7586424B2 (en) | 2006-06-05 | 2009-09-08 | Donald Martin Monro | Data coding using an exponent and a residual |
US7689049B2 (en) | 2006-08-31 | 2010-03-30 | Donald Martin Monro | Matching pursuits coding of data |
US7707213B2 (en) | 2007-02-21 | 2010-04-27 | Donald Martin Monro | Hierarchical update scheme for extremum location |
US7707214B2 (en) | 2007-02-21 | 2010-04-27 | Donald Martin Monro | Hierarchical update scheme for extremum location with indirect addressing |
US7783079B2 (en) | 2006-04-07 | 2010-08-24 | Monro Donald M | Motion assisted data enhancement |
US7786907B2 (en) | 2008-10-06 | 2010-08-31 | Donald Martin Monro | Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems |
US7786903B2 (en) | 2008-10-06 | 2010-08-31 | Donald Martin Monro | Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems |
US7791513B2 (en) | 2008-10-06 | 2010-09-07 | Donald Martin Monro | Adaptive combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems |
US7845571B2 (en) | 2006-06-19 | 2010-12-07 | Monro Donald M | Data compression |
US7864086B2 (en) | 2008-10-06 | 2011-01-04 | Donald Martin Monro | Mode switched adaptive combinatorial coding/decoding for electrical computers and digital data processing systems |
US7974488B2 (en) | 2006-10-05 | 2011-07-05 | Intellectual Ventures Holding 35 Llc | Matching pursuits basis selection |
EP2595076A3 (fr) * | 2011-11-18 | 2013-10-23 | Tata Consultancy Services Limited | Compression of genomic data |
US8674855B2 (en) | 2006-01-13 | 2014-03-18 | Essex Pa, L.L.C. | Identification of text |
US10194175B2 (en) | 2007-02-23 | 2019-01-29 | Xylon Llc | Video coding with embedded motion |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3523247A1 (de) * | 1985-06-28 | 1987-01-02 | Siemens Ag | Einrichtung zur datenreduktion binaerer datenstroeme |
US6388585B1 (en) * | 1998-08-11 | 2002-05-14 | Matsushita Electric Ind Co Ltd | Method for data compression and decompression using decompression instructions |
-
2002
- 2002-12-04 AU AU2002368408A patent/AU2002368408A1/en not_active Abandoned
- 2002-12-04 WO PCT/IT2002/000762 patent/WO2004051863A1/fr not_active Application Discontinuation
Non-Patent Citations (4)
Title |
---|
HOSANG M.: "A character elimination algorithm for lossless data compression", MADE PUPLIC DURING DATA COMPRESSION CONFERENCE 2002, 2 April 2002 (2002-04-02) - 4 April 2002 (2002-04-04), pages 1 - 9, XP002246070, Retrieved from the Internet <URL:UNKNOWN> [retrieved on 20030617] * |
HOSANG M.: "A character elimination algorithm for lossless data compression", PROCEEDINGS DATA COMPRESSION CONFERENCE 2002, 2 April 2002 (2002-04-02) - 4 April 2002 (2002-04-04), Snowbird, UT, USA, pages 457, XP002246068 * |
MOFFAT A ET AL: "SELF-INDEXING INVERTED FILES FOR FAST TEXT RETRIEVAL", ACM TRANSACTIONS ON INFORMATION SYSTEMS, ASSOCIATION FOR COMPUTING MACHINERY, NEW YORK, US, vol. 14, no. 4, 1 October 1996 (1996-10-01), pages 349 - 379, XP000635100, ISSN: 1046-8188 * |
PECHURA M. A., MCINTYRE D. R.: "Data Compression using static Huffman code-decode tables", COMMUNICATIONS OF THE ACM, vol. 28, no. 6, June 1985 (1985-06-01), pages 612 - 616, XP002246069 * |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8674855B2 (en) | 2006-01-13 | 2014-03-18 | Essex Pa, L.L.C. | Identification of text |
US7783079B2 (en) | 2006-04-07 | 2010-08-24 | Monro Donald M | Motion assisted data enhancement |
US7586424B2 (en) | 2006-06-05 | 2009-09-08 | Donald Martin Monro | Data coding using an exponent and a residual |
US7845571B2 (en) | 2006-06-19 | 2010-12-07 | Monro Donald M | Data compression |
US7770091B2 (en) | 2006-06-19 | 2010-08-03 | Monro Donald M | Data compression for use in communication systems |
JP2009542092A (ja) * | 2006-06-19 | 2009-11-26 | エセックス パ エルエルシー | データ圧縮の方法 |
KR101092106B1 (ko) | 2006-06-19 | 2011-12-12 | 에섹스 피에이 엘엘씨 | 데이터 압축 |
WO2007149358A1 (fr) * | 2006-06-19 | 2007-12-27 | Essex Pa, L.L.C. | Data compression |
US8038074B2 (en) | 2006-06-19 | 2011-10-18 | Essex Pa, L.L.C. | Data compression |
US7689049B2 (en) | 2006-08-31 | 2010-03-30 | Donald Martin Monro | Matching pursuits coding of data |
US7508325B2 (en) | 2006-09-06 | 2009-03-24 | Intellectual Ventures Holding 35 Llc | Matching pursuits subband coding of data |
US7974488B2 (en) | 2006-10-05 | 2011-07-05 | Intellectual Ventures Holding 35 Llc | Matching pursuits basis selection |
US7707213B2 (en) | 2007-02-21 | 2010-04-27 | Donald Martin Monro | Hierarchical update scheme for extremum location |
US7707214B2 (en) | 2007-02-21 | 2010-04-27 | Donald Martin Monro | Hierarchical update scheme for extremum location with indirect addressing |
US11622133B2 (en) | 2007-02-23 | 2023-04-04 | Xylon Llc | Video coding with embedded motion |
US10523974B2 (en) | 2007-02-23 | 2019-12-31 | Xylon Llc | Video coding with embedded motion |
US10958944B2 (en) | 2007-02-23 | 2021-03-23 | Xylon Llc | Video coding with embedded motion |
US10194175B2 (en) | 2007-02-23 | 2019-01-29 | Xylon Llc | Video coding with embedded motion |
US7786907B2 (en) | 2008-10-06 | 2010-08-31 | Donald Martin Monro | Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems |
US7864086B2 (en) | 2008-10-06 | 2011-01-04 | Donald Martin Monro | Mode switched adaptive combinatorial coding/decoding for electrical computers and digital data processing systems |
US7791513B2 (en) | 2008-10-06 | 2010-09-07 | Donald Martin Monro | Adaptive combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems |
US7786903B2 (en) | 2008-10-06 | 2010-08-31 | Donald Martin Monro | Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems |
US8972200B2 (en) | 2011-11-18 | 2015-03-03 | Tata Consultancy Services Limited | Compression of genomic data |
EP2595076A3 (fr) * | 2011-11-18 | 2013-10-23 | Tata Consultancy Services Limited | Compression of genomic data |
Also Published As
Publication number | Publication date |
---|---|
AU2002368408A1 (en) | 2004-06-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6100825A (en) | Cluster-based data compression system and method | |
Acharya et al. | JPEG2000 standard for image compression: concepts, algorithms and VLSI architectures | |
US7102552B1 (en) | Data compression with edit-in-place capability for compressed data | |
Salomon et al. | Handbook of data compression | |
JP3337633B2 (ja) | データ圧縮方法及びデータ復元方法並びにデータ圧縮プログラム又はデータ復元プログラムを記録したコンピュータ読み取り可能な記録媒体 | |
US5659631A (en) | Data compression for indexed color image data | |
WO2004051863A1 (fr) | Automated method for lossless data compression and decompression of a binary string | |
JP6025923B2 (ja) | 整数値データのストリームを圧縮するシステム及び方法 | |
US6535642B1 (en) | Approximate string matching system and process for lossless data compression | |
Fitriya et al. | A review of data compression techniques | |
US6304676B1 (en) | Apparatus and method for successively refined competitive compression with redundant decompression | |
US6225922B1 (en) | System and method for compressing data using adaptive field encoding | |
CN1675842B (zh) | 算术编码的方法、设备以及相应解码方法 | |
GB2545305A (en) | Residual entropy compression for cloud-based video applications | |
Djusdek et al. | Adaptive image compression using adaptive Huffman and LZW | |
Goyal et al. | On optimal permutation codes | |
US20080001790A1 (en) | Method and system for enhancing data compression | |
Niemi et al. | Burrows‐Wheeler post‐transformation with effective clustering and interpolative coding | |
Kattan et al. | Evolution of human-competitive lossless compression algorithms with GP-zip2 | |
CN114337680B (zh) | 一种压缩处理方法、装置、存储介质及电子设备 | |
Rincy et al. | Preprocessed text compression method for Malayalam text files | |
Elahresh | Documents Compression based on Content Similarities | |
WO2004051861A1 (fr) | Automated method for compressing binary strings without loss of information, and related automated method for decompressing | |
Mohamed | Wireless Communication Systems: Compression and Decompression Algorithms | |
Das et al. | Design an Algorithm for Data Compression using Pentaoctagesimal SNS |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | ||
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC EPO FORM 1205A DD 08-09-05 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: JP |
|
WWW | Wipo information: withdrawn in national office |
Country of ref document: JP |