CN109981110A - Method of lossy compression with point-by-point relative error bounds - Google Patents

Publication number
CN109981110A
CN109981110A CN201910164475.7A CN201910164475A CN109981110A CN 109981110 A CN109981110 A CN 109981110A CN 201910164475 A CN201910164475 A CN 201910164475A CN 109981110 A CN109981110 A CN 109981110A
Authority
CN
China
Prior art keywords
point
relative error
quantizing factor
factor
compression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910164475.7A
Other languages
Chinese (zh)
Other versions
CN109981110B (en)
Inventor
夏文
邹翔宇
王轩
张伟哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201910164475.7A
Publication of CN109981110A
Application granted
Publication of CN109981110B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention provides a method of lossy compression with point-by-point relative error bounds, comprising the following steps: A, table construction: building lookup tables according to the error requirement and the range of the quantization factor; B, obtaining the quantization factors; C, Huffman coding: compressing the quantization factor sequence generated in step B with Huffman coding; D, lossless compression: compressing the Huffman codes and the Huffman tree generated in step C with a lossless compression method. The beneficial effects of the invention are that the time-consuming logarithmic transformation in lossy compression with point-by-point relative error bounds is avoided and the quantization factor values are obtained by table lookup, which significantly speeds up lossy compression with point-by-point relative error bounds.

Description

Method of lossy compression with point-by-point relative error bounds
Technical field
The present invention relates to lossy compression methods, and in particular to a method of lossy compression with point-by-point relative error bounds.
Background
The data generated by scientific simulations in high-performance computing (HPC) environments are enormous, which can cause a serious I/O bottleneck at run time and impose a heavy storage burden for post-analysis. Unlike traditional data reduction schemes (such as data deduplication or lossless compression), lossy compression can significantly reduce data size while meeting the user's requirements on error control. To adapt automatically to the precision requirements within a data set, lossy compression with point-by-point relative error bounds (that is, the compression error depends on the data value) is widely used in many scientific applications.
The original lossy compression with point-by-point relative error bounds must apply a logarithmic transformation to all data during compression. Logarithms are generally computed on a computer by series expansion, which is computationally expensive and relatively time-consuming. This step converts every data value to its logarithm, so its cost grows with the data size, and it accounts for a relatively large share of the algorithm's total running time. As a result, lossy compression with point-by-point relative error bounds is complex and time-consuming.
Therefore, how to accelerate lossy compression with point-by-point relative error bounds is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
To solve the problems in the prior art, the present invention provides a method of lossy compression with point-by-point relative error bounds.
The method of lossy compression with point-by-point relative error bounds provided by the present invention comprises the following steps:
A, table construction: building lookup tables according to the error requirement and the range of the quantization factor;
B, obtaining the quantization factors;
C, Huffman coding: compressing the quantization factor sequence generated in step B with Huffman coding;
D, lossless compression: compressing the Huffman codes and the Huffman tree generated in step C with a lossless compression method.
As a further improvement of the present invention, in step B, the ratio $R = X_i / X'_i$ of the actual value $X_i$ to the predicted value $X'_i$ is calculated, and the quantization factor is then looked up by R in the table generated in step A.
As a further improvement of the present invention, step A includes the following sub-steps:
A1, traversing the domain of the quantization factor, calculating the coverage range of each quantization factor, and generating table T1, which maps a quantization factor to its coverage range;
A2, calculating the size of table T2 according to the error requirement, then calculating the value of each entry of table T2 in turn from table T1 and filling in table T2, which maps the ratio R to the quantization factor M.
As a further improvement of the present invention, in step A1, the value range $P_k$ corresponding to each quantization factor $M_k$ is calculated to generate table T1.
As a further improvement of the present invention, in step A2, the value ranges corresponding to adjacent quantization factors are made to overlap, and the size of the overlap is smaller than an entry of table T2.
The beneficial effects of the present invention are that, with the above scheme, the time-consuming logarithmic transformation in lossy compression with point-by-point relative error bounds is avoided and the quantization factor values are obtained by table lookup, which significantly speeds up lossy compression with point-by-point relative error bounds.
Description of the drawings
Fig. 1 is a flow chart of the method of lossy compression with point-by-point relative error bounds of the present invention.
Fig. 2 is a flow chart of step A of the method of lossy compression with point-by-point relative error bounds of the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and specific embodiments.
As shown in Fig. 1, a method of lossy compression with point-by-point relative error bounds comprises the following steps:
A, table construction: building lookup tables according to the error requirement provided by the user and the range of the quantization factor, for use in the subsequent steps;
B, obtaining the quantization factors: calculating the ratio $R = X_i / X'_i$ of the actual value $X_i$ to the predicted value $X'_i$, then looking up the quantization factor by R in the table generated in step A;
C, Huffman coding: compressing the quantization factor sequence generated in step B with Huffman coding;
D, lossless compression: compressing the Huffman codes and the Huffman tree generated in step C with a conventional lossless compression method such as gzip or zstd.
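Steps C and D can be illustrated with a minimal sketch. Everything named here (`huffman_codes`, `encode_factors`, `decode_bits`) is hypothetical, the code table is serialized with `repr` in place of an actual Huffman tree, and Python's standard `gzip` module stands in for the conventional lossless compressor mentioned in step D. It is a sketch under those assumptions, not the patented implementation.

```python
import gzip
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Return {symbol: bit string} for a sequence (classic Huffman coding)."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode_factors(factors):
    """Step C (Huffman) followed by step D (gzip over code table + bits)."""
    codes = huffman_codes(factors)
    bits = "".join(codes[m] for m in factors)
    padded = bits + "0" * (-len(bits) % 8)  # pad to a whole number of bytes
    payload = int(padded, 2).to_bytes(len(padded) // 8, "big") if padded else b""
    table = repr(sorted(codes.items())).encode()  # assumed serialization
    return gzip.compress(table + b"\n" + payload), codes, len(bits)

def decode_bits(bits, codes):
    """Invert the prefix code to recover the quantization factors."""
    inverse = {c: s for s, c in codes.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return out
```

Because quantization factors cluster around a few values when prediction is good, the most frequent factor receives the shortest code, and the general-purpose compressor then removes whatever redundancy remains in the packed bit stream.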
As shown in Fig. 2, step A includes the following sub-steps:
A1, traversing the domain of the quantization factor, calculating the coverage range of each quantization factor, and generating table T1, which maps a quantization factor to its coverage range;
A2, calculating the size of table T2 according to the error requirement, then calculating the value of each entry of table T2 in turn from table T1 and filling in table T2, which maps the ratio R to the quantization factor M.
The traditional logarithm-based processing converts all data to logarithms, $\{X_i\} \rightarrow \{\log(X_i)\}$, and denotes $\{\log(X_i)\}$ as $\{Y_i\}$. The quantization factor is then calculated from the actual value $Y_i$ and the predicted value $Y'_i$, and the quantization factor is recorded.
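For comparison, the logarithm-based route can be sketched as follows. The quantization formula is not reproduced in this text, so the sketch assumes a common linear-scaling form, M = round((Y − Y') / (2·eb)) with eb = log(1 + ε); the function names are hypothetical as well.

```python
import math

def log_domain_quantize(x, x_pred, eps):
    """Quantize one positive value via the logarithm-based route.

    Assumption (not stated in the source): the quantization factor is
    round((Y - Y') / (2 * eb)), where eb = log(1 + eps) is the absolute
    error bound in the log domain.
    """
    eb = math.log1p(eps)
    y = math.log(x)            # the costly per-value logarithm
    y_pred = math.log(x_pred)
    return round((y - y_pred) / (2.0 * eb))

def log_domain_reconstruct(x_pred, m, eps):
    """Rebuild the value from the prediction and the quantization factor."""
    eb = math.log1p(eps)
    return math.exp(math.log(x_pred) + 2.0 * eb * m)
```

Under this assumption the reconstructed logarithm differs from the true one by at most eb, so the reconstructed value stays within a relative error of ε. The cost is one logarithm per data value, which is the step the table-based method sets out to remove.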
It follows from the quantization formula that each quantization factor in fact corresponds to a value range of $Y_i - Y'_i$: as long as $Y_i - Y'_i$ lies within this range, the same quantization factor is produced. Step A1 calculates the value range $P_k$ corresponding to each quantization factor $M_k$ and generates table T1.
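A minimal sketch of table T1 follows, under the assumption (not taken from the source) that factor k covers the half-open range [(2k − 1)·eb, (2k + 1)·eb) of the log-domain difference, with eb = log(1 + ε). `build_t1` and `as_ratio_range` are hypothetical names.

```python
import math

def build_t1(eps, max_m):
    """Map each quantization factor k to the range of Y - Y' that yields k.

    Assumption (not from the source): factor k covers
    [(2k - 1) * eb, (2k + 1) * eb) with eb = log(1 + eps).
    """
    eb = math.log1p(eps)
    return {k: ((2 * k - 1) * eb, (2 * k + 1) * eb)
            for k in range(-max_m, max_m + 1)}

def as_ratio_range(diff_range):
    """Express a range of Y - Y' as the equivalent range of X / X'."""
    lo, hi = diff_range
    return math.exp(lo), math.exp(hi)
```

The table is built once per error bound, so the logarithm and exponential calls here are a fixed setup cost rather than a per-value cost.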
Because $Y_i = \log(X_i)$, the difference $Y_i - Y'_i$ can be rewritten in terms of the ratio $X_i / X'_i$; the resulting relation includes a term $\delta$, where $0 < \delta < 1$. Table T2 is built according to the precision requirement (that is, the error requirement), so that M can be obtained from the ratio $X_i / X'_i$. To prevent an entry from straddling two value ranges, the spacing between adjacent quantization factors is fine-tuned so that the value ranges corresponding to adjacent quantization factors overlap to some extent, with the size of the overlap kept smaller than the size represented by a T2 entry; in this way each entry is certain to fall entirely within some value range, which avoids the problem. Finally, table T2 is traversed and its entries are filled in turn according to table T1.
Step B then looks up table T2 with the calculated ratio $R = X_i / X'_i$ to obtain the required quantization factor.
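The table construction and the lookup of step B can be sketched together. The sketch rests on several assumptions that are not in the source: reconstruction is `x_pred * step**m`, the parameter `shrink` narrows the spacing between adjacent factors so that their ranges overlap, T2 is a uniform grid over the ratio R, and a ratio outside the grid returns `None`. Rather than relying on a particular relation between overlap size and entry size, it checks directly that each T2 cell lies inside one range. All names are hypothetical.

```python
def build_tables(eps, max_m, cells, shrink=0.9):
    """Build T1 (factor -> ratio range) and T2 (ratio cell -> factor)."""
    base = 1.0 + eps
    step = base ** (2.0 * shrink)  # shrink < 1 makes adjacent ranges overlap
    t1 = {k: (step ** k / base, step ** k * base)
          for k in range(-max_m, max_m + 1)}
    r_lo, r_hi = t1[-max_m][0], t1[max_m][1]
    width = (r_hi - r_lo) / cells
    t2 = []
    for j in range(cells):
        c_lo = r_lo + j * width
        c_hi = r_hi if j == cells - 1 else r_lo + (j + 1) * width
        # Pick a factor whose range contains the whole cell, if any.
        m = next((k for k, (lo, hi) in t1.items()
                  if lo <= c_lo and c_hi <= hi), None)
        t2.append(m)
    return t1, t2, r_lo, width, step

def quantize(x, x_pred, t2, r_lo, width):
    """Step B: look up the factor from R = x / x_pred, with no logarithm.

    Assumes x_pred is nonzero and has the same sign as x.
    """
    r = x / x_pred
    if r < r_lo:
        return None
    j = int((r - r_lo) / width)
    return t2[j] if j < len(t2) else None

def reconstruct(x_pred, m, step):
    """Rebuild the value from the prediction and the quantization factor."""
    return x_pred * step ** m
```

With `eps = 0.01`, `max_m = 8`, and `cells = 256`, every cell of this grid falls inside a single range, so each lookup costs one division, one subtraction, and one array access.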
In the original technical solution, calculating logarithms is relatively time-consuming; its cost grows with the data size and always accounts for a relatively large share of the total running time. The method of lossy compression with point-by-point relative error bounds provided by the present invention bypasses the logarithm calculation: it builds tables and looks up the ratio of the actual value to the predicted value to obtain the quantization factor directly. With the main principle kept the same as in the original method, the time-consuming logarithm step is eliminated and the whole algorithm is accelerated.
The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to this description. Those of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the inventive concept, all of which shall be regarded as falling within the protection scope of the invention.

Claims (5)

1. A method of lossy compression with point-by-point relative error bounds, characterized by comprising the following steps:
A, table construction: building lookup tables according to the error requirement and the range of the quantization factor;
B, obtaining the quantization factors;
C, Huffman coding: compressing the quantization factor sequence generated in step B with Huffman coding;
D, lossless compression: compressing the Huffman codes and the Huffman tree generated in step C with a lossless compression method.
2. The method of lossy compression with point-by-point relative error bounds according to claim 1, characterized in that: in step B, the ratio $R = X_i / X'_i$ of the actual value $X_i$ to the predicted value $X'_i$ is calculated, and the quantization factor is then looked up by R in the table generated in step A.
3. The method of lossy compression with point-by-point relative error bounds according to claim 2, characterized in that step A includes the following sub-steps:
A1, traversing the domain of the quantization factor, calculating the coverage range of each quantization factor, and generating table T1, which maps a quantization factor to its coverage range;
A2, calculating the size of table T2 according to the error requirement, then calculating the value of each entry of table T2 in turn from table T1 and filling in table T2, which maps the ratio R to the quantization factor M.
4. The method of lossy compression with point-by-point relative error bounds according to claim 3, characterized in that: in step A1, the value range $P_k$ corresponding to each quantization factor $M_k$ is calculated to generate table T1.
5. The method of lossy compression with point-by-point relative error bounds according to claim 4, characterized in that: in step A2, the value ranges corresponding to adjacent quantization factors are made to overlap, and the size of the overlap is smaller than an entry of table T2.
CN201910164475.7A 2019-03-05 2019-03-05 Method of lossy compression with point-by-point relative error bounds Active CN109981110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910164475.7A CN109981110B (en) 2019-03-05 2019-03-05 Method of lossy compression with point-by-point relative error bounds

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910164475.7A CN109981110B (en) 2019-03-05 2019-03-05 Method of lossy compression with point-by-point relative error bounds

Publications (2)

Publication Number Publication Date
CN109981110A (en) 2019-07-05
CN109981110B (en) 2023-03-24

Family

ID=67077958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910164475.7A Active CN109981110B (en) 2019-03-05 2019-03-05 Method of lossy compression with point-by-point relative error bounds

Country Status (1)

Country Link
CN (1) CN109981110B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5724453A (en) * 1995-07-10 1998-03-03 Wisconsin Alumni Research Foundation Image compression system and method having optimized quantization tables
US6049630A (en) * 1996-03-19 2000-04-11 America Online, Inc. Data compression using adaptive bit allocation and hybrid lossless entropy encoding
US20040044521A1 (en) * 2002-09-04 2004-03-04 Microsoft Corporation Unified lossy and lossless audio compression
US20080193028A1 (en) * 2007-02-13 2008-08-14 Yin-Chun Blue Lan Method of high quality digital image compression
US20080285866A1 (en) * 2007-05-16 2008-11-20 Takashi Ishikawa Apparatus and method for image data compression
WO2010030256A1 (en) * 2008-09-12 2010-03-18 Tovaristvo Z Obmezenou Vidpovidalnistu 'smail' Alias-free method of image coding and decoding (2 variants)
US20160127746A1 (en) * 2012-05-04 2016-05-05 Environmental Systems Research Institute, Inc. Limited error raster compression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
冷星星 等: "高压缩低损耗图像编码算法研究", 《成都信息工程学院学报》 *

Also Published As

Publication number Publication date
CN109981110B (en) 2023-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant