CN109995373B - Mixed packing compression method for integer arrays - Google Patents

Mixed packing compression method for integer arrays

Publication number
CN109995373B
Authority
CN
China
Prior art keywords
integer, array, bit, bits, storage
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810004038.4A
Other languages
Chinese (zh)
Other versions
CN109995373A (en)
Inventor
武鹏程
方艳
高佳玲
蔡建兵
张波
孙荣卫
左兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Abup Intelligent Technology Co ltd
Original Assignee
Shanghai Abup Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-01-03
Filing date
2018-01-03
Publication date
2023-08-15
Application filed by Shanghai Abup Intelligent Technology Co ltd
Priority to CN201810004038.4A
Publication of CN109995373A
Application granted
Publication of CN109995373B

Classifications

    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3002: Conversion to or from differential modulation
    • H03M7/3044: Conversion to or from differential modulation with several bits only, i.e. the difference between successive samples being coded by more than one bit, e.g. differential pulse code modulation [DPCM]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

A mixed packing compression method for integer arrays, in the field of data compression, comprises the following steps: obtaining an integer array to be compressed, in which all integers have the same storage length; converting a signed integer array by taking the absolute value of each original integer, shifting it left by one bit, and recording the sign in the lowest bit as 0 or 1; traversing the whole array of integer absolute values to obtain the median; packing the integer array with each of two different base storage bit numbers to form two packed integer arrays; compressing the two packed integer arrays with a preset compression algorithm; comparing the results and selecting the smaller compressed file; and recording the base storage bit number used for packing in the selected compressed file. The beneficial effects are that the method can reduce the size of the compressed file by about 10%, which helps improve the transmission success rate of mobile terminal upgrade packages and saves data traffic costs.

Description

Mixed packing compression method for integer arrays
Technical Field
The invention relates to the field of data compression, in particular to a mixed packing compression method of an integer array.
Background
Data compression refers to techniques that, without losing useful information, reduce the volume of data, or reorganize the data according to some algorithm to reduce its redundancy, so as to save storage space and improve the efficiency of data transmission, storage, and processing. At present, data compression is generally divided into two types, lossless compression and lossy compression. Both, however, compress the file data directly, so the result occupies a large storage space, compression and decompression are slow, file transmission is slow, and data traffic costs are high.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a mixed packing compression method for integer arrays that can reduce the size of compressed files by about 10%, which helps improve the transmission success rate of mobile terminal upgrade packages and saves data traffic costs.
To achieve the above purpose, the invention adopts the following technical scheme, which comprises the following steps:
step one: obtaining an integer array to be compressed, wherein all integers in the array have the same storage length;
step two: converting the signed integer array by taking the absolute value of each original integer, shifting it left by one bit, and recording the sign in the lowest bit as 0 or 1; if the array is unsigned, this step can be omitted;
step three: traversing the whole array of integer absolute values to obtain the median, which is used to heuristically search for the base storage bit number for integer packing;
step four: packing the integer array with each of two different base storage bit numbers to form two packed integer arrays;
step five: compressing each of the two packed integer arrays with a preset compression algorithm;
step six: comparing the sizes of the compressed data files and selecting the smaller one;
step seven: recording the base storage bit number used for packing in the selected compressed file.
The method converts a signed integer array by shifting each original integer (its absolute value) left by 1 bit and using the lowest bit to represent the sign, which makes the integers convenient to pack without affecting calculation.
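The sign conversion can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `encode_sign` and `decode_sign` are hypothetical, and Python's unbounded integers hide the fact that a fixed-width implementation needs one spare bit for the shift.

```python
def encode_sign(x: int) -> int:
    """Map a signed integer to a non-negative one.

    The absolute value is shifted left by one bit and the lowest bit
    records the sign: 0 for a non-negative input, 1 for a negative one.
    """
    return (abs(x) << 1) | (1 if x < 0 else 0)


def decode_sign(u: int) -> int:
    """Invert encode_sign: drop the sign bit, then reapply the sign."""
    magnitude = u >> 1
    return -magnitude if u & 1 else magnitude
```

Small magnitudes stay small after the conversion regardless of sign, which is what lets the later packing step use short units for them.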
The method uses the median of the array of integer absolute values to heuristically search for the base storage bit number used for packing.
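The text does not spell out the search rule, so the sketch below assumes one: pick the two recommended widths (four, eight, sixteen bits) that bracket the bit length of the median. The function name `candidate_widths` and the bracketing rule are assumptions for illustration only; the rule was chosen because it reproduces the one example given in the description (a nine-bit median leading to eight and sixteen bits).

```python
import statistics

# Widths the description reports as packing well in experiments.
CANDIDATE_WIDTHS = (4, 8, 16)


def candidate_widths(values):
    """Return two base storage bit numbers to try (assumed heuristic).

    `values` must be a non-empty sequence of non-negative integers.
    """
    bits = max(1, statistics.median_low(values).bit_length())
    below = [w for w in CANDIDATE_WIDTHS if w <= bits]
    above = [w for w in CANDIDATE_WIDTHS if w > bits]
    if below and above:
        # Median sits between two candidates: try both neighbours.
        return (below[-1], above[0])
    # Median is below or above every candidate: take the nearest two.
    return tuple(above[:2]) if above else tuple(below[-2:])
```

Because both candidates are later compressed and compared, a poor guess here only costs compression time, not output size.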
The method packs the integer array with a base storage bit number smaller than the integer storage length, which increases the effective utilization of the integer storage space and reduces the space occupied by invalid values.
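A rough count shows where the saving comes from. Assume the layout described in step 105, in which each unit of b bits carries b - 1 data bits plus one flag bit; the symbols n (bit length of the value) and b (base storage bit number) are introduced here for illustration.

```latex
\[
  \mathrm{units}(n, b) = \max\!\left(1, \left\lceil \frac{n}{b-1} \right\rceil\right),
  \qquad
  \mathrm{bits}(n, b) = b \cdot \mathrm{units}(n, b)
\]
% Example for a 9-bit value such as binary 110111100 (n = 9):
%   b = 4:  ceil(9/3)  = 3 units -> 12 bits
%   b = 8:  ceil(9/7)  = 2 units -> 16 bits
%   b = 16: ceil(9/15) = 1 unit  -> 16 bits
```

If the original storage length were, say, 32 bits (an assumed figure), each of these packings would occupy less space than the unpacked integer, at the price of one flag bit per unit.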
The working principle of the invention is as follows. Referring to fig. 1, the specific implementation flow of the invention includes the following steps:
step 101, obtaining an integer array to be compressed, in which every integer has the same storage length;
step 102, if the integer array is signed, converting it by shifting each original integer left by 1 bit and representing the sign with the lowest bit (0 for a positive number, 1 for a negative number); this step may be omitted if the integer array is unsigned;
step 103, traversing the array of integer absolute values to obtain the median;
step 104, heuristically searching for the base storage bit number for packing according to the median. Experiments for the invention found that packing with four, eight, and sixteen bits works well, so four, eight, or sixteen bits are recommended for integer packing. Example: if the median is binary 110111100, the base storage bit numbers for packing are eight bits and sixteen bits;
step 105, packing the integers using the base storage bit number. A packing example is shown in fig. 2. When eight bits are used as the base storage bit number, the lower seven bits hold the data and the highest bit records whether the data is complete: 1 means incomplete, 0 means complete. When the integer is binary 110111100, the first eight-bit unit stores the low seven bits 0111100 with the highest bit set to 1, and a second eight-bit unit stores the remaining 11 with the highest bit set to 0. When the algorithm reads a unit beginning with 1, it automatically splices on the next unit, until it reads a unit beginning with 0. When the integer is binary 110, its value is stored in the seven data bits and the highest bit is 0;
step 106, compressing the two packed integer array files with a predetermined compression algorithm, so that less storage space is spent on invalid values;
step 107, comparing the sizes of the two packed and compressed files, selecting the smaller one, and recording the base storage bit number used for packing in that file for use in decompression.
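Steps 105 to 107 can be sketched end to end as follows, taking an array of non-negative integers as input (that is, after the sign conversion of step 102). Several things here are assumptions rather than details from the description: `zlib` stands in for the unspecified "predetermined compression algorithm", the five-byte header (one byte for the base storage bit number, four for the element count) is a hypothetical layout, and all function names are illustrative.

```python
import struct
import zlib


def pack(values, width):
    """Split each integer into width-bit units, low bits first.

    Each unit carries width-1 data bits; its highest bit is 1 while more
    units of the same integer follow and 0 on the last unit.
    """
    payload = width - 1
    mask = (1 << payload) - 1
    units = []
    for v in values:
        while True:
            low, v = v & mask, v >> payload
            units.append((1 << payload) | low if v else low)
            if not v:
                break
    return units


def units_to_bytes(units, width):
    """Concatenate the units into bytes, zero-padding the final byte."""
    acc = 0
    for u in units:
        acc = (acc << width) | u
    nbits = len(units) * width
    pad = -nbits % 8
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack(raw, width, count):
    """Read `count` integers back, splicing units until a flag bit of 0."""
    payload = width - 1
    acc = int.from_bytes(raw, "big")
    pos = len(raw) * 8
    values, cur, shift = [], 0, 0
    while len(values) < count:
        pos -= width
        unit = (acc >> pos) & ((1 << width) - 1)
        cur |= (unit & ((1 << payload) - 1)) << shift
        shift += payload
        if not unit >> payload:  # flag 0: this integer is complete
            values.append(cur)
            cur, shift = 0, 0
    return values


def compress_mixed(values, widths=(8, 16)):
    """Pack with each width, compress both, keep the smaller result."""
    candidates = []
    for w in widths:
        blob = zlib.compress(units_to_bytes(pack(values, w), w))
        candidates.append((len(blob), w, blob))
    _, width, blob = min(candidates)
    # Assumed header: base storage bit number, then element count.
    return struct.pack(">BI", width, len(values)) + blob


def decompress_mixed(data):
    """Read the header, decompress, and unpack with the recorded width."""
    width, count = struct.unpack(">BI", data[:5])
    return unpack(zlib.decompress(data[5:]), width, count)
```

With eight-bit units, `pack([0b110111100], 8)` yields the two units `0b10111100` and `0b00000011`, matching the example in step 105. The element count is stored because padding bits at the end of the stream would otherwise be indistinguishable from a trailing zero when the width is four bits.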
After adopting the above technical scheme, the beneficial effects of the invention are that the method can reduce the size of the compressed file by about 10%, which helps improve the transmission success rate of mobile terminal upgrade packages and saves data traffic costs.
Drawings
To illustrate the embodiments of the invention and the technical solutions of the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a diagram of an embodiment of the present invention.
Detailed Description
Referring to figs. 1-2, the technical scheme adopted in this specific embodiment is the same as that set out in the Disclosure of Invention above: steps one to seven, implemented through the flow of steps 101 to 107 shown in fig. 1, with the packing example shown in fig. 2.
The foregoing is merely illustrative of the present invention and not restrictive, and other modifications and equivalents thereof may occur to those skilled in the art without departing from the spirit and scope of the present invention.

Claims (4)

1. A mixed packing compression method for an integer array, characterized in that it comprises the following steps:
step one: obtaining an integer array to be compressed, wherein all integers in the array have the same storage length;
step two: converting the signed integer array by taking the absolute value of each original integer, shifting it left by one bit, and recording the sign in the lowest bit as 0 or 1; if the array is unsigned, this step can be omitted;
step three: traversing the whole array of integer absolute values to obtain the median, which is used to heuristically search for the base storage bit number for integer packing, wherein the base storage bit number is the number of bits on which the division and packing of the integers in the integer array is based, and the base storage bit number comprises one or more of four bits, seven bits, eight bits, and sixteen bits;
step four: packing the integers by the base storage bit number, wherein, when an integer in the integer array is divided and packed according to the base storage bit number, the highest bit of each unit of that many bits records the completeness of the data and the bits other than the highest bit store the data, so that when the packed integer array is read, the completeness of the data is judged from the highest bit and the complete data is guaranteed to be read; packing the integer array with each of two different base storage bit numbers to form two packed integer arrays;
step five: compressing each of the two packed integer arrays with a preset compression algorithm;
step six: comparing the sizes of the compressed data files and selecting the smaller one;
step seven: recording the base storage bit number used for packing in the selected compressed file.
2. The mixed packing compression method for an integer array according to claim 1, wherein the signed integer array is converted by shifting each original integer left by 1 bit and representing the sign with the lowest bit.
3. The mixed packing compression method for an integer array according to claim 1, wherein the median of the array of integer absolute values is used to heuristically search for the base storage bit number used for packing.
4. The mixed packing compression method for an integer array according to claim 1, wherein the integer array is packed with a base storage bit number that is smaller than the integer storage length.
CN201810004038.4A 2018-01-03 2018-01-03 Mixed packing compression method for integer arrays Active CN109995373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810004038.4A CN109995373B (en) 2018-01-03 2018-01-03 Mixed packing compression method for integer arrays

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810004038.4A CN109995373B (en) 2018-01-03 2018-01-03 Mixed packing compression method for integer arrays

Publications (2)

Publication Number Publication Date
CN109995373A (en) 2019-07-09
CN109995373B (en) 2023-08-15

Family

ID=67128968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810004038.4A Active CN109995373B (en) 2018-01-03 2018-01-03 Mixed packing compression method for integer arrays

Country Status (1)

Country Link
CN (1) CN109995373B (en)

Also Published As

Publication number Publication date
CN109995373A (en) 2019-07-09

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant