WO2004051863A1 - Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression - Google Patents

Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression

Info

Publication number
WO2004051863A1
Authority
WO
WIPO (PCT)
Prior art keywords
string
binary
words
value
word
Prior art date
Application number
PCT/IT2002/000762
Other languages
English (en)
Inventor
Elena Leanza
Original Assignee
Atop Innovation S.P.A.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Atop Innovation S.P.A.
Priority to AU2002368408A (AU2002368408A1, en)
Priority to PCT/IT2002/000762 (WO2004051863A1, fr)
Publication of WO2004051863A1

Classifications

    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Definitions

  • the present invention concerns an automated method for compressing binary strings without information loss (a lossless compression method) and a related automated decompression method, which make it possible to minimise the size of binary strings losslessly in a simple, fast and inexpensive way.
  • the present invention further concerns the instruments necessary to perform the automated methods and the apparatus performing them.
  • compression methods can be classified according to the congruence between decompressed data and original data: methods with no information loss, or "lossless" methods, in which the data reconstructed or decompressed from the compressed data are identical to the original ones, and methods with information loss, or "lossy" methods, in which the reconstructed data lose a portion of the original information.
  • lossless methods are preferable and, in some specific applications when all the information must be kept, are the only ones that may be used. Examples of lossless methods are the Huffman encoding method and the Lempel-Ziv-Welch, or LZW, encoding method.
  • the Huffman method operates only on static strings, creating a corresponding dictionary for each string, while the LZW method operates on dynamic strings, creating a dynamically updated dictionary. Consequently, both said methods are complex and their computer execution is slow.
  • since each encoding method is designed for specific types of data to be encoded, new encoding requirements keep arising for data having particular characteristics. This makes the methods difficult to use, especially when transmitting composite data, which requires several different encoding methods.
  • the output string E may further comprise an initial flag bit indicating whether at least one binary value B_k is not present within the input string S (and, therefore, the specific flags of each individual binary value B_k must be read) or all the binary values B_k are present (and, therefore, the output string E does not comprise specific flags of each individual binary value B_k because all the values must be read).
  • the datum related to the presence of at least one word having binary value B_k within the string S_k may be the number N_k of words located within the string S_k.
  • the sorting rule may assume the 2^n binary values (B) as positive binary values, and the sorting O fixed during step A may be either the decreasing sorting or the increasing sorting of such positive binary values.
  • the method may count the number of occurrences of each binary value B_k within the input binary string S, and the sorting rule may fix the sorting O according to either the decreasing sorting or the increasing sorting of the occurrences of the binary values B_k within the input binary string S, the method further comprising the step: F. juxtaposing to the output string E data related to the sorting O fixed during step A.
  • the method is automatically adaptive to the type of file to be compressed, in conformity to the occurrences of the binary values.
  • in step D.3, the elements d_k are juxtaposed to the output string E according to a binary encoded representation
  • the binary encoded representation defines a first positive integer value g_k, with g_k ≥ 0, and a second positive integer value x_k …
  • d_k may be represented by (g_k + x_k + 2) binary bits according to the formula:
  • W the division — , i.e. it is equal to W -In ⁇ ;
  • the method may also comprise the step:
  • the method may determine the optimum first and second positive integer values g_k and x_k by means of a trial and error process.
  • the trial and error process may verify the size of memory needed to represent the array A_k according to the binary encoded representation for all the values of the first and second positive integer values g_k and x_k ranging, respectively, from 0 to G, with G > 0, and from 0 to X, with X > 0.
  • the method may furthermore comprise the step:
  • the method still comprises the step:
  • the method is iterated, being applied at each h-th iteration following the initial one to the output string E obtained by the preceding (h-1)-th iteration.
  • the numbers n_{h-1} and n_h of bits included in the words into which the corresponding input strings S are subdivided during step C may differ from one another: n_{h-1} ≠ n_h
  • an automated method for decompressing a compressed binary string E of input data into a binary string S of output data, characterised in that the compressed binary string E of input data has been obtained by applying to the binary string S the just described automated compression method, and in that it comprises the following steps:
  • the decompression method may be iterated, the compressed binary string E of input data having been obtained by applying to the binary string S the iterated automated compression method.
  • an electronic apparatus comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated compression method previously illustrated.
  • an electronic apparatus comprising at least one central processing unit and at least one memory unit, characterised in that it executes the automated decompression method described before.
  • an electric, magnetic or electromagnetic signal modulated by a digital signal characterised in that said digital signal comprises at least one compressed binary string E obtained by means of the automated compression method previously illustrated.
  • a computer program comprising code means adapted to execute, when running on a computer, the automated compression method previously described. A further subject matter of this invention is a computer-readable memory medium storing a program, characterised in that the program is the computer program just described.
  • Figure 1 schematically shows a portion of a binary string S of input data to be compressed by means of a preferred embodiment of the automated method according to the invention
  • Figure 2 schematically shows the portion of the string S of Figure 1;
  • Figure 3 schematically shows a flow chart of the trial and error process for determining the first and the second optimum positive integer values g_k and x_k according to the preferred embodiment of the automated method according to the invention
  • Figure 4 schematically shows a first portion of the compressed binary string E of output data generated by the preferred embodiment of the automated method according to the invention
  • Figure 5 schematically shows a portion of a second string S_1 generated during the execution of the preferred embodiment of the automated compression method according to the invention
  • Figure 6 schematically shows a second portion of the compressed binary string E of output data generated by the preferred embodiment of the automated method according to the invention
  • Figure 7 schematically shows a third portion of the compressed binary …
  • Figure 8 schematically shows a flow chart of the preferred embodiment …
  • each word 1 comprises n bits, where, in a preferred embodiment of the invention, n = 8.
  • the string S may be padded with a
  • each word 1 may assume one binary value B among 2^n different binary values
  • the method fixes a sorting O of such 2^n values according to a corresponding sorting rule:
  • the method scans the string S for determining the position of the words 1' having the binary value B_0, which is in the first position within the sorting O, and it stores their reciprocal distances d_i^{B_0} in a memory array A_0; in the embodiment of Figure 1, such first binary value is the value "11111111".
  • each one of the N_0 elements of the array A_0, in case N_0 > 0, is represented according to a suitable encoded binary representation.
  • this may be represented in one of three different ways: 1) in the case when d_i^{B_0} < 2^{g_0} [2], d_i^{B_0} is represented by (g_0 + 1) binary bits, the first of which is equal to "0" and the following g_0 bits form the positive two's complement binary representation, having g_0 bits, of the value of d_i^{B_0}; in other words, in this case d_i^{B_0} is represented according to the formula:
  • d_i^{B_0} is represented by (g_0 + x_0 + 2) binary bits, where
  • In[x] is the operator returning the integer part of x, in such a way that
  • bits are equal to "1", the following bit is equal to "0", and the last (g_0 + x_0) bits are the positive two's complement binary
  • a second binary string S_1 is constructed, either physically or logically, which is obtained by eliminating from the initial string S all the words 1' having the binary value B_0.
  • the output string E comprises only one flag bit equal to "0".
  • the method scans the string S_1 for determining the position of the words having the binary value B_1, which is in the second position within the sorting O, and it stores their reciprocal distances d_i^{B_1} in a corresponding memory array A_1; in the embodiment of the method shown in the Figures, such second binary value is the value "11111110".
  • the array A_1 contains N_1 elements related to the distances d_i^{B_1}, for i = 0, 1, 2, ..., N_1 - 1, which locate the positions within the string S_1 of the N_1 words having binary value B_1.
  • the embodiment of the method shown in the Figures fixes two positive integer values g_1 and x_1 which minimise the size of memory needed to represent the array A_1 according to the encoded representation illustrated above with reference to the array A_0.
  • the method executes similar operations on the third string S_2, scanning it for determining the position of the words having the binary value B_2, which is in the third position within the sorting O, and it stores their reciprocal distances d_i^{B_2} in a memory array A_2; in the embodiment of the method shown in the Figures, such third binary value is the value "11111101".
  • the string S_2 comprises N_2 words, with N_2 > 0, having the binary value B_2
  • a fourth binary string S_3 is constructed, either physically or logically, which is obtained by eliminating from the third string S_2 all the words having the binary value B_2.
  • the method iterates the preceding steps for all the 2^n
  • the method advantageously stores the initial size of the
  • d_k is represented only by the bit "0"
  • Figure 8 shows a flow chart which schematically summarises the method steps illustrated above.
  • the embodiment of the method shown in the Figures may be iterated, that is, the output string E may be subjected to the same steps of the compression method, schematically shown in Figure 8, being treated as if it were an input string S, so as to obtain a new output string E_1.
  • the initial size of the input string S is advantageously stored only once at the head of the last output string.
  • the string E is subdivided in words comprising the same number n of bits, if the new output string E ⁇ which is obtained has smaller size, otherwise the string E is subdivided in words comprising a number n x of bits that is different from n bits, preferably n ⁇ n .
  • the number n_h of bits of the words into which the input string to be compressed has been subdivided is stored at the output string head.
  • the automated decompression method reconstructs the input string S starting from the output string E.
  • the output string E is read according to a sorting reversed with respect to the sorting O used for constructing it, carrying out the following steps:
  • the decompression method must also be iterated until the input string S is reconstructed.
  • an input data string to be compressed may preliminarily …
  • the receiving apparatus may incrementally reconstruct the original input data string starting from the different consecutive segments, by incrementally juxtaposing the segments obtained from decompression of the compressed segments.
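
The trial and error process for choosing g_k and x_k can be illustrated with a minimal Python sketch. It relies on a simplified cost model that is an assumption, not the patent's full encoding: a distance d costs (g + 1) bits when d < 2^g and (g + x + 2) bits when d < 2^(g + x); larger distances are treated as unrepresentable because the text describing the remaining case is incomplete. The function names `encoded_bits` and `best_parameters` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def encoded_bits(d: int, g: int, x: int) -> Optional[int]:
    """Bit cost of one distance under the simplified (assumed) cost model."""
    if d < (1 << g):
        return g + 1          # flag bit "0" followed by g value bits
    if d < (1 << (g + x)):
        return g + x + 2      # two prefix bits followed by (g + x) value bits
    return None               # assumption: not representable in this sketch


def best_parameters(
    distances: Sequence[int], G: int, X: int
) -> Optional[Tuple[int, int, int]]:
    """Try every g in 0..G and x in 0..X; return (g, x, total_bits) of the cheapest pair."""
    best: Optional[Tuple[int, int, int]] = None
    for g in range(G + 1):
        for x in range(X + 1):
            total = 0
            for d in distances:
                bits = encoded_bits(d, g, x)
                if bits is None:      # this (g, x) cannot encode the array
                    total = -1
                    break
                total += bits
            if total >= 0 and (best is None or total < best[2]):
                best = (g, x, total)
    return best
```

The search is exhaustive over (G + 1) * (X + 1) pairs, which is cheap when G and X are small; ties are resolved in favour of the first pair found.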

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention relates to an automated method for lossless compression of a binary string (S) of input data into a compressed binary string (E) of output data. The method subdivides the input string (S) into words (1) comprising n bits and, according to a sorting (O) = {B_0, B_1, B_2, ..., B_{2^n-1}} = {B_k}, k = 0, 1, 2, ..., 2^n - 1, of the 2^n binary values (B) that a binary word may assume, for each value B_k of the sorting (O), for k = 0, 1, 2, ..., (2^n - 2), and assuming that S_0 = S, it locates the words having binary value B_k within the string S_k, stores in an array A_k the distance d_i^{B_k} of each located word from the preceding word having binary value B_k, juxtaposes to the output string (E) the elements d_i^{B_k} stored in the array A_k, and constructs a string S_{k+1} obtained by eliminating the N_k located words from the string S_k. Finally, the method juxtaposes to the output string (E) the number N_{2^n-1} of words of the last string S_{2^n-1}. The invention also relates to an associated automated method for decompressing the compressed binary string (E) of input data into a binary string (S) of output data. Finally, the invention relates to the instruments necessary to perform the automated methods and to the apparatus implementing them.
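
The locate, store and eliminate loop summarised in the abstract can be sketched in Python at the word level, leaving out the bit-level encoding of the distances. Two points are assumptions made for illustration: the "distance" of a located word is taken to be the number of non-matching words since the previous match (or since the start of the string), and the sorting O is passed in as a list containing every possible word value exactly once. All function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def extract_distances(words: Iterable[int], value: int) -> Tuple[List[int], List[int]]:
    """Return (distances, remaining) for one value B_k of the sorting."""
    distances: List[int] = []
    remaining: List[int] = []
    run = 0  # non-matching words seen since the previous match
    for w in words:
        if w == value:
            distances.append(run)
            run = 0
        else:
            remaining.append(w)
            run += 1
    return distances, remaining


def compress(words: Sequence[int], order: Sequence[int]) -> Tuple[List[List[int]], int]:
    """Build one distance array per value except the last; return them with N of the last string."""
    arrays: List[List[int]] = []
    current = list(words)
    for value in order[:-1]:
        distances, current = extract_distances(current, value)
        arrays.append(distances)
    if any(w != order[-1] for w in current):
        raise ValueError("order must list every word value present in the input")
    return arrays, len(current)


def reinsert(remaining: Sequence[int], value: int, distances: Sequence[int]) -> List[int]:
    """Inverse of extract_distances: put `value` back at the recorded positions."""
    out: List[int] = []
    it = iter(remaining)
    for gap in distances:
        for _ in range(gap):
            out.append(next(it))
        out.append(value)
    out.extend(it)  # words after the last occurrence of `value`
    return out


def decompress(arrays: Sequence[Sequence[int]], last_count: int, order: Sequence[int]) -> List[int]:
    """Rebuild S by processing the values in the reverse of the sorting O."""
    current = [order[-1]] * last_count
    for value, distances in zip(reversed(order[:-1]), reversed(arrays)):
        current = reinsert(current, value, distances)
    return current
```

With 2-bit words, `compress([3, 1, 3, 0, 2, 3, 1], [3, 2, 1, 0])` yields one array per value 3, 2 and 1 plus the count of remaining 0 words, and `decompress` applied to that result with the same order restores the original list.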
PCT/IT2002/000762 2002-12-04 2002-12-04 Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression WO2004051863A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2002368408A AU2002368408A1 (en) 2002-12-04 2002-12-04 Automated method for lossless data compression and decompression of a binary string
PCT/IT2002/000762 WO2004051863A1 (fr) 2002-12-04 2002-12-04 Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IT2002/000762 WO2004051863A1 (fr) 2002-12-04 2002-12-04 Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression

Publications (1)

Publication Number Publication Date
WO2004051863A1 true WO2004051863A1 (fr) 2004-06-17

Family

ID=32448851

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IT2002/000762 WO2004051863A1 (fr) 2002-12-04 2002-12-04 Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression

Country Status (2)

Country Link
AU (1) AU2002368408A1 (fr)
WO (1) WO2004051863A1 (fr)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007149358A1 (fr) * 2006-06-19 2007-12-27 Essex Pa, L.L.C. Compression de données
US7508325B2 (en) 2006-09-06 2009-03-24 Intellectual Ventures Holding 35 Llc Matching pursuits subband coding of data
US7586424B2 (en) 2006-06-05 2009-09-08 Donald Martin Monro Data coding using an exponent and a residual
US7689049B2 (en) 2006-08-31 2010-03-30 Donald Martin Monro Matching pursuits coding of data
US7707213B2 (en) 2007-02-21 2010-04-27 Donald Martin Monro Hierarchical update scheme for extremum location
US7707214B2 (en) 2007-02-21 2010-04-27 Donald Martin Monro Hierarchical update scheme for extremum location with indirect addressing
US7783079B2 (en) 2006-04-07 2010-08-24 Monro Donald M Motion assisted data enhancement
US7786907B2 (en) 2008-10-06 2010-08-31 Donald Martin Monro Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems
US7786903B2 (en) 2008-10-06 2010-08-31 Donald Martin Monro Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems
US7791513B2 (en) 2008-10-06 2010-09-07 Donald Martin Monro Adaptive combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems
US7845571B2 (en) 2006-06-19 2010-12-07 Monro Donald M Data compression
US7864086B2 (en) 2008-10-06 2011-01-04 Donald Martin Monro Mode switched adaptive combinatorial coding/decoding for electrical computers and digital data processing systems
US7974488B2 (en) 2006-10-05 2011-07-05 Intellectual Ventures Holding 35 Llc Matching pursuits basis selection
EP2595076A3 (fr) * 2011-11-18 2013-10-23 Tata Consultancy Services Limited Compression de données génomiques
US8674855B2 (en) 2006-01-13 2014-03-18 Essex Pa, L.L.C. Identification of text
US10194175B2 (en) 2007-02-23 2019-01-29 Xylon Llc Video coding with embedded motion

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3523247A1 (de) * 1985-06-28 1987-01-02 Siemens Ag Einrichtung zur datenreduktion binaerer datenstroeme
US6388585B1 (en) * 1998-08-11 2002-05-14 Matsushita Electric Ind Co Ltd Method for data compression and decompression using decompression instructions

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HOSANG M.: "A character elimination algorithm for lossless data compression", MADE PUPLIC DURING DATA COMPRESSION CONFERENCE 2002, 2 April 2002 (2002-04-02) - 4 April 2002 (2002-04-04), pages 1 - 9, XP002246070, Retrieved from the Internet <URL:UNKNOWN> [retrieved on 20030617] *
HOSANG M.: "A character elimination algorithm for lossless data compression", PROCEEDINGS DATA COMPRESSION CONFERENCE 2002, 2 April 2002 (2002-04-02) - 4 April 2002 (2002-04-04), Snowbird, UT, USA, pages 457, XP002246068 *
MOFFAT A ET AL: "SELF-INDEXING INVERTED FILES FOR FAST TEXT RETRIEVAL", ACM TRANSACTIONS ON INFORMATION SYSTEMS, ASSOCIATION FOR COMPUTING MACHINERY, NEW YORK, US, vol. 14, no. 4, 1 October 1996 (1996-10-01), pages 349 - 379, XP000635100, ISSN: 1046-8188 *
PECHURA M. A., MCINTYRE D. R.: "Data Compression using static Huffman code-decode tables", COMMUNICATIONS OF THE ACM, vol. 28, no. 6, June 1985 (1985-06-01), pages 612 - 616, XP002246069 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8674855B2 (en) 2006-01-13 2014-03-18 Essex Pa, L.L.C. Identification of text
US7783079B2 (en) 2006-04-07 2010-08-24 Monro Donald M Motion assisted data enhancement
US7586424B2 (en) 2006-06-05 2009-09-08 Donald Martin Monro Data coding using an exponent and a residual
US7845571B2 (en) 2006-06-19 2010-12-07 Monro Donald M Data compression
US7770091B2 (en) 2006-06-19 2010-08-03 Monro Donald M Data compression for use in communication systems
JP2009542092A (ja) * 2006-06-19 2009-11-26 エセックス パ エルエルシー データ圧縮の方法
KR101092106B1 (ko) 2006-06-19 2011-12-12 에섹스 피에이 엘엘씨 데이터 압축
WO2007149358A1 (fr) * 2006-06-19 2007-12-27 Essex Pa, L.L.C. Compression de données
US8038074B2 (en) 2006-06-19 2011-10-18 Essex Pa, L.L.C. Data compression
US7689049B2 (en) 2006-08-31 2010-03-30 Donald Martin Monro Matching pursuits coding of data
US7508325B2 (en) 2006-09-06 2009-03-24 Intellectual Ventures Holding 35 Llc Matching pursuits subband coding of data
US7974488B2 (en) 2006-10-05 2011-07-05 Intellectual Ventures Holding 35 Llc Matching pursuits basis selection
US7707213B2 (en) 2007-02-21 2010-04-27 Donald Martin Monro Hierarchical update scheme for extremum location
US7707214B2 (en) 2007-02-21 2010-04-27 Donald Martin Monro Hierarchical update scheme for extremum location with indirect addressing
US11622133B2 (en) 2007-02-23 2023-04-04 Xylon Llc Video coding with embedded motion
US10523974B2 (en) 2007-02-23 2019-12-31 Xylon Llc Video coding with embedded motion
US10958944B2 (en) 2007-02-23 2021-03-23 Xylon Llc Video coding with embedded motion
US10194175B2 (en) 2007-02-23 2019-01-29 Xylon Llc Video coding with embedded motion
US7786907B2 (en) 2008-10-06 2010-08-31 Donald Martin Monro Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems
US7864086B2 (en) 2008-10-06 2011-01-04 Donald Martin Monro Mode switched adaptive combinatorial coding/decoding for electrical computers and digital data processing systems
US7791513B2 (en) 2008-10-06 2010-09-07 Donald Martin Monro Adaptive combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems
US7786903B2 (en) 2008-10-06 2010-08-31 Donald Martin Monro Combinatorial coding/decoding with specified occurrences for electrical computers and digital data processing systems
US8972200B2 (en) 2011-11-18 2015-03-03 Tata Consultancy Services Limited Compression of genomic data
EP2595076A3 (fr) * 2011-11-18 2013-10-23 Tata Consultancy Services Limited Compression de données génomiques

Also Published As

Publication number Publication date
AU2002368408A1 (en) 2004-06-23

Similar Documents

Publication Publication Date Title
US6100825A (en) Cluster-based data compression system and method
Acharya et al. JPEG2000 standard for image compression: concepts, algorithms and VLSI architectures
US7102552B1 (en) Data compression with edit-in-place capability for compressed data
Salomon et al. Handbook of data compression
JP3337633B2 (ja) データ圧縮方法及びデータ復元方法並びにデータ圧縮プログラム又はデータ復元プログラムを記録したコンピュータ読み取り可能な記録媒体
US5659631A (en) Data compression for indexed color image data
WO2004051863A1 (fr) Procede automatise de compression de suite d'octets binaires sans perte d'information et procede automatise associe de decompression
JP6025923B2 (ja) 整数値データのストリームを圧縮するシステム及び方法
US6535642B1 (en) Approximate string matching system and process for lossless data compression
Fitriya et al. A review of data compression techniques
US6304676B1 (en) Apparatus and method for successively refined competitive compression with redundant decompression
US6225922B1 (en) System and method for compressing data using adaptive field encoding
CN1675842B (zh) 算术编码的方法、设备以及相应解码方法
GB2545305A (en) Residual entropy compression for cloud-based video applications
Djusdek et al. Adaptive image compression using adaptive Huffman and LZW
Goyal et al. On optimal permutation codes
US20080001790A1 (en) Method and system for enhancing data compression
Niemi et al. Burrows‐Wheeler post‐transformation with effective clustering and interpolative coding
Kattan et al. Evolution of human-competitive lossless compression algorithms with GP-zip2
CN114337680B (zh) 一种压缩处理方法、装置、存储介质及电子设备
Rincy et al. Preprocessed text compression method for Malayalam text files
Elahresh Documents Compression based on Content Similarities
WO2004051861A1 (fr) Procede automatise permettant de compresser des chaines binaires sans perte d'informations, et procede automatise associe permettant de decompresser
Mohamed Wireless Communication Systems: Compression and Decompression Algorithms
Das et al. Design an Algorithm for Data Compression using Pentaoctagesimal SNS

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC EPO FORM 1205A DD 08-09-05

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP