CN106681659A - Data compression method and device - Google Patents
Data compression method and device
- Publication number
- CN106681659A (application number CN201611167099.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- compression
- module
- disk
- compressed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0608—Saving storage space on storage systems
-
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0656—Data buffering arrangements
-
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0674—Disk device
- G06F3/0676—Magnetic disk device
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention relates to the technical field of data processing and discloses a data compression method comprising the following steps: first, writing data into a compression cache; second, reading the data from the compression cache by a compression engine in fixed-capacity units; and finally, compressing the read data by the compression engine. The invention further discloses a data compression device comprising a data writing module, a data reading module and a data compression module. The data writing module is used for writing the data into the compression cache; the data reading module is used for the compression engine to read the data from the compression cache in fixed-capacity units; and the data compression module is used for the compression engine to compress the read data. The method addresses the problem in the prior art that writing compressed data to disk produces many disk fragments, which occupy a large amount of disk space and waste it.
Description
Technical field
The present invention relates to the technical field of data processing, and more particularly to a data compression method and device.
Background art
Modern society produces massive amounts of data every day, and much of that data is duplicated. Analyzing this data sensibly and saving it to disk is a very large task. Storing such a huge volume of data requires a great many disks, which greatly increases costs for an enterprise, especially an Internet company. Data therefore needs to be compressed before it is saved, which can greatly save disk space and improve disk space utilization.
Many compression products exist at present, but most of them take their input in fixed-size blocks. After compression these blocks no longer have a fixed size, so the data is difficult to store on disk in a uniform format; many disk fragments result, and a great deal of disk space is wasted. In addition, these products do not compress in real time: data is first stored on disk, then read back from the disk sequentially and compressed. The drawback of this approach is that the data compressed together is grouped by disk position rather than by time, even though I/O within the same time window is more strongly correlated.
Summary of the invention
The object of the present invention is to provide a data compression method and device that overcome a defect of the prior art, in which writing compressed data to disk produces many disk fragments that occupy a large amount of disk space and waste it.
To achieve this object, the present invention adopts the following technical solution:
A data compression method, comprising the following steps:
Data is written to a compression cache;
A compression engine reads data from the compression cache in fixed-capacity units;
The compression engine compresses the data thus read.
Preferably, the compression engine reads data from the compression cache according to a time window and in fixed-capacity units.
Preferably, before data is written to the compression cache, the method further includes: setting the values of the time window and the fixed capacity.
Preferably, data within the same time window is recorded by metadata.
Preferably, after the compression engine reads data from the compression cache in fixed-capacity units, the method further includes: judging whether the amount of data read has reached the preset fixed capacity value; if so, the compression engine compresses the data read; if not, reading continues.
Preferably, after data is written to the compression cache, the method further includes: returning a write-back success response to the host.
Preferably, after the compression engine compresses the data thus read, the method further includes: writing the compressed data to disk.
The present invention also provides a data compression device, including:
A data writing module, for writing data to the compression cache;
A data reading module, connected to the data writing module and to the data compression module respectively, for the compression engine to read data from the compression cache in fixed-capacity units;
A data compression module, for the compression engine to compress the data thus read.
Preferably, the compression engine reads data from the compression cache according to a time window and in fixed-capacity units.
Preferably, the device further includes: an information sending module, for returning a write-back success response to the host.
Preferably, the device further includes: a judging module, for judging whether the amount of data has reached the preset fixed capacity value.
Preferably, the device further includes: a disk writing module, for writing the compressed data to disk.
Preferably, the device further includes: a parameter setting module, for setting the values of the time window and the fixed capacity.
Compared with the prior art, the present invention has the following advantages:
1. At present, most data compression writes data in differing capacities, and the written data still does not have a uniform capacity after compression. Such data is therefore difficult to store on disk in a uniform format, which increases the gaps between data and produces many disk fragments. In the present invention, the compression engine reads the data written to the compression cache in fixed-capacity units and compresses it, and the compressed data is written to disk sequentially in fixed-capacity units. The data thus has a uniform format on disk, which avoids gaps between compressed data, reduces disk fragments and improves disk space utilization.
2. Data is stored on disk at random positions, yet data within the same time window tends to be strongly correlated. In the present invention, the compression engine reads data from the compression cache by time window and compresses the correlated data of the same time window together, which improves read-ahead accuracy for the compressed data and in turn improves system performance.
3. In the prior art, data is first written to disk, then read back from disk and compressed, and the compressed data is finally written to disk. In the present invention, data is first written to the compression cache, the compression engine reads and compresses it, and the compressed data is then written to disk. Unlike the prior art, the present invention does not write the data to disk first; compression is complete before the data is written to disk. This achieves real-time compression and greatly improves data performance and disk utilization.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of a data compression device of the present invention.
Fig. 2 is a flow diagram of one example of a data compression method of the present invention.
Fig. 3 is a flow diagram of a specific example of Fig. 2.
Fig. 4 is another structural schematic diagram of a data compression device of the present invention.
Fig. 5 is a flow diagram of another example of a data compression method of the present invention.
Detailed description
For ease of understanding, some terms used in the present invention are explained below:
Time window: a specific task is completed within a limited time range, and this limited time range is called a time window.
Metadata: information about the organization of data, data fields and their relationships. In short, metadata is data about data; it includes all the information needed to interact with another module.
Sliding-block algorithm: a data partitioning method that divides a data file into smaller data blocks. A sliding window moves toward the end of the file one byte at a time, starting from the beginning of the data file; whenever the sliding window matches a preset value, a block is produced. The length of each data block falls within a specified range.
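The sliding-block definition above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `sliding_window_chunks`, the rolling byte sum used as the window value, the `mask` test standing in for the "preset value", and the `min_size`/`max_size` bounds are all assumptions chosen for illustration. Production systems typically use a stronger rolling hash.

```python
def sliding_window_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                          min_size: int = 64, max_size: int = 1024) -> list:
    """Split data at positions where a rolling sum over the last `window`
    bytes matches a preset bit pattern (hypothetical matching rule)."""
    chunks = []
    start = 0      # index where the current block begins
    rolling = 0    # sum of the bytes currently inside the sliding window
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            rolling -= data[i - window]   # drop the byte that left the window
        length = i - start + 1
        # A boundary is declared when the window value matches the preset
        # pattern, but only once the block is at least min_size long.
        at_boundary = length >= min_size and (rolling & mask) == mask
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])       # trailing partial block
    return chunks
```

Every block except possibly the last has a length between `min_size` and `max_size`, which corresponds to the statement that block length falls within a specified range.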
The specific embodiments of the present invention are described in further detail below with reference to the drawings and examples:
Embodiment one: A data compression device of the present invention is shown in Fig. 1. It includes a data writing module 12, a data reading module 14, a data compression module 16, a parameter setting module 11, a judging module 15, an information sending module 13 and a disk writing module 17. The parameter setting module 11 is connected in sequence to the data writing module 12, the information sending module 13, the data reading module 14, the judging module 15, the data compression module 16 and the disk writing module 17.
The parameter setting module 11 presets the values of the time window and the fixed capacity. The data writing module 12 writes data to the compression cache. The information sending module 13 returns a write-back success response to the host. The data reading module 14 is used for the compression engine to read data from the compression cache in fixed-capacity units, or according to the time window and in fixed-capacity units. The judging module 15 judges whether the amount of data has reached the preset fixed capacity value; if so, the compression engine compresses the data read; if not, reading continues. The data compression module 16 is used for the compression engine to compress the fixed-capacity data read. The disk writing module 17 writes the compressed data to disk.
Embodiment two: A data compression method of the present invention is shown in Fig. 2 and comprises the following steps:
Step S201: the values of the time window and the fixed capacity are set in advance in a system file.
Step S202: data is written to the compression cache.
Step S203: a write-back success response is returned to the host, achieving online real-time data compression.
Step S204: the compression engine reads data from the compression cache according to the time window and in fixed-capacity units. Data within the same time window is more strongly correlated, so reading by time window improves read-ahead accuracy for the compressed data and in turn improves system performance. Data within the same time window is recorded by metadata, which shortens the time the compression engine takes to read data from the compression cache.
Step S205: it is judged whether the amount of data has reached the preset fixed capacity value; if so, go to step S206; if not, go to step S204.
Step S206: the compression engine compresses the fixed-capacity data read.
Step S207: the compressed data is written to disk space sequentially in fixed-capacity units, which avoids gaps between data, thereby reducing disk fragments and improving disk space utilization.
In the present invention, the compression engine reads the data written to the compression cache in fixed-capacity units, and the compressed data is written to disk space sequentially in fixed-capacity units. The data thus has a uniform format on disk, which avoids gaps between compressed data, reduces disk fragments and saves disk storage space. The present invention uses compression based on time windows rather than on position, which improves read-ahead accuracy for the compressed data and speeds up reads. The present invention completes compression before the data is written to disk, achieving real-time compression.
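Steps S202 to S207 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the names `CompressionCache`, `run_engine`, `WINDOW_MS` and `BLOCK_SIZE` are hypothetical, a plain dictionary stands in for the metadata that records which writes belong to which time window, a Python list stands in for the disk, and `zlib` stands in for whatever compressor the engine uses.

```python
import zlib
from collections import defaultdict

WINDOW_MS = 10       # time window length (assumed value)
BLOCK_SIZE = 4096    # fixed capacity read by the engine (assumed value)

class CompressionCache:
    def __init__(self):
        self.entries = []                  # payloads in arrival order
        self.windows = defaultdict(list)   # metadata: window index -> entry indexes

    def write(self, timestamp_ms: int, payload: bytes) -> str:
        # S202: store the data; record its time window in the metadata.
        self.windows[timestamp_ms // WINDOW_MS].append(len(self.entries))
        self.entries.append(payload)
        return "write-back success"        # S203: acknowledge the host immediately

    def read_window(self, window: int) -> bytes:
        # The metadata lets the engine fetch one window's data directly.
        return b"".join(self.entries[i] for i in self.windows.get(window, []))

def run_engine(cache: CompressionCache, disk: list) -> bytes:
    """Read window by window, compress every full block, return the remainder."""
    pending = b""
    for window in sorted(cache.windows):
        pending += cache.read_window(window)          # S204
        while len(pending) >= BLOCK_SIZE:             # S205: capacity reached?
            block, pending = pending[:BLOCK_SIZE], pending[BLOCK_SIZE:]
            disk.append(zlib.compress(block))         # S206 and S207
    return pending   # not yet a full block; the engine would keep reading
```

The host receives its acknowledgement in `write`, before any compression happens, which is the sense in which compression is taken off the write path in this sketch.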
A further detailed explanation is given below with a specific example. As shown in Fig. 3, this example comprises the following steps:
Step S301: the time window is set to 10 ms and the fixed capacity to 4K in advance in a system file. A suitable time window should be chosen: the longer the time window, the higher the compression ratio, but the lower the disk write performance. The time window is typically 10 ms.
Step S302: data is written to the compression cache.
Step S303: a write-back success response is returned to the host.
Step S304: the data in the compression cache is formed chronologically into data blocks of 4K each (variable-length input, fixed-length output), and the compression engine reads one data block from the compression cache every 10 ms.
Step S305: using the sliding-block algorithm, it is judged whether the capacity of each data block has reached 4K; if so, go to step S306; if not, go to step S304.
Step S306: the compression engine compresses the data in at least one 4K data block, compressing each 4K data block into a 3.6K data block at a compression ratio of 90%.
Step S307: the data in the 3.6K data blocks is written to disk.
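The figures in step S306 imply that "compression ratio" here means compressed size divided by original size: 90% of 4K is 3.6K. A short Python calculation makes the arithmetic explicit. The helper name `compressed_size` and the count of 1000 blocks are assumptions for illustration only.

```python
def compressed_size(block_bytes: int, ratio_percent: int) -> int:
    """Size after compression when the output is ratio_percent of the input."""
    return block_bytes * ratio_percent // 100

BLOCK_BYTES = 4 * 1024                              # one 4K block
per_block = compressed_size(BLOCK_BYTES, 90)        # 3686 bytes, about 3.6K
saved_per_block = BLOCK_BYTES - per_block           # 410 bytes saved per block

blocks = 1000                                       # hypothetical block count
total_saved = blocks * saved_per_block              # 410000 bytes saved overall
```

Under this reading, a lower percentage means better compression, so the "higher compression ratio" mentioned for longer time windows in step S301 presumably refers to stronger compression rather than to a larger percentage.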
Embodiment three: Another data compression device of the present invention is shown in Fig. 4. It includes a data writing module 41, a data reading module 42 and a data compression module 43; the data reading module 42 is connected to the data writing module 41 and to the data compression module 43 respectively.
The data writing module 41 writes data to the compression cache. The data reading module 42 is used for the compression engine to read data from the compression cache in fixed-capacity units. The data compression module 43 is used for the compression engine to compress the data thus read.
Embodiment four: Another data compression method of the present invention is shown in Fig. 5 and comprises the following steps:
S501: data is written to the compression cache.
S502: the compression engine reads data from the compression cache in fixed-capacity units.
S503: the compression engine compresses the data read.
In the present invention, the data written to the compression cache is of non-fixed capacity. The compression engine reads and compresses the data in fixed-capacity units, and the compressed data is then written to disk in fixed-capacity form. This data has a uniform format, which reduces disk fragments and improves disk space utilization.
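The contrast between variable-length writes and fixed-capacity reads in S501 to S503 can be sketched with a simple byte buffer. This is a minimal Python sketch with hypothetical names (`FixedCapacityCache`, `compress_available`); the 4096-byte capacity and the use of `zlib` are assumptions, since the patent does not name a compressor.

```python
import zlib

class FixedCapacityCache:
    """Accepts writes of any length; hands out blocks of exactly `capacity` bytes."""

    def __init__(self, capacity: int = 4096):
        self.capacity = capacity
        self.buffer = bytearray()

    def write(self, data: bytes) -> None:
        # S501: incoming data may have any size.
        self.buffer += data

    def read_block(self):
        # S502: return one full block, or None if not enough data has arrived.
        if len(self.buffer) < self.capacity:
            return None
        block = bytes(self.buffer[:self.capacity])
        del self.buffer[:self.capacity]
        return block

def compress_available(cache: FixedCapacityCache) -> list:
    # S503: compress every full block currently held in the cache.
    compressed = []
    while (block := cache.read_block()) is not None:
        compressed.append(zlib.compress(block))
    return compressed
```

For example, writes of 100, 5000 and 3092 bytes add up to 8192 bytes, so `compress_available` returns two compressed blocks and leaves the buffer empty.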
The above describes only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A data compression method, characterized by comprising the following steps:
Data is written to a compression cache;
A compression engine reads data from the compression cache in fixed-capacity units;
The compression engine compresses the data thus read.
2. The data compression method according to claim 1, characterized in that the compression engine reads data from the compression cache according to a time window and in fixed-capacity units.
3. The data compression method according to claim 2, characterized in that, before data is written to the compression cache, the method further includes: setting the values of the time window and the fixed capacity.
4. The data compression method according to claim 2 or 3, characterized in that data within the same time window is recorded by metadata.
5. The data compression method according to claim 3, characterized in that, after the compression engine reads data from the compression cache in fixed-capacity units, the method further includes: judging whether the amount of data has reached the preset fixed capacity value; if so, the compression engine compresses the data read; if not, reading continues.
6. The data compression method according to claim 1, characterized in that, after data is written to the compression cache, the method further includes: returning a write-back success response to the host.
7. The data compression method according to claim 1, characterized in that, after the compression engine compresses the data thus read, the method further includes: writing the compressed data to disk.
8. A data compression device, characterized by including:
A data writing module, for writing data to a compression cache;
A data reading module, connected to the data writing module and to the data compression module respectively, for a compression engine to read data from the compression cache in fixed-capacity units;
A data compression module, for the compression engine to compress the data thus read.
9. The data compression device according to claim 8, characterized in that the compression engine reads data from the compression cache according to a time window and in fixed-capacity units;
Preferably, the device further includes: a parameter setting module, for setting the values of the time window and the fixed capacity.
10. The data compression device according to claim 8, characterized by further including:
An information sending module, for returning a write-back success response to the host;
Preferably, the device further includes: a judging module, for judging whether the amount of data has reached the preset fixed capacity value;
Preferably, the device further includes: a disk writing module, for writing the compressed data to disk.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611167099.XA CN106681659A (en) | 2016-12-16 | 2016-12-16 | Data compression method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611167099.XA CN106681659A (en) | 2016-12-16 | 2016-12-16 | Data compression method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106681659A true CN106681659A (en) | 2017-05-17 |
Family
ID=58870998
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611167099.XA Pending CN106681659A (en) | 2016-12-16 | 2016-12-16 | Data compression method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106681659A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247562A (en) * | 2017-06-30 | 2017-10-13 | 郑州云海信息技术有限公司 | A kind of compression optimization method and its device |
CN107392838A (en) * | 2017-07-27 | 2017-11-24 | 郑州云海信息技术有限公司 | WebP compression parallel acceleration methods and device based on OpenCL |
CN107947799A (en) * | 2017-11-28 | 2018-04-20 | 郑州云海信息技术有限公司 | A kind of data compression method and apparatus |
CN111124259A (en) * | 2018-10-31 | 2020-05-08 | 深信服科技股份有限公司 | Data compression method and system based on full flash memory array |
CN113760192A (en) * | 2021-08-31 | 2021-12-07 | 荣耀终端有限公司 | Data reading method, data reading apparatus, storage medium, and program product |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6542644B1 (en) * | 1996-09-02 | 2003-04-01 | Fujitsu Limited | Statistical data compression/decompression method |
CN102611454A (en) * | 2012-01-29 | 2012-07-25 | 上海锅炉厂有限公司 | Dynamic lossless compressing method for real-time historical data |
CN103136109A (en) * | 2013-02-07 | 2013-06-05 | 中国科学院苏州纳米技术与纳米仿生研究所 | Writing-in and reading method of solid-state memory system flash translation layer (FTL) with compression function |
CN105808151A (en) * | 2014-12-29 | 2016-07-27 | 华为技术有限公司 | Solid-state disk storage device and data access method of solid-state disk storage device |
-
2016
- 2016-12-16 CN CN201611167099.XA patent/CN106681659A/en active Pending
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247562A (en) * | 2017-06-30 | 2017-10-13 | 郑州云海信息技术有限公司 | A kind of compression optimization method and its device |
CN107247562B (en) * | 2017-06-30 | 2020-03-06 | 郑州云海信息技术有限公司 | Compression optimization method and device |
CN107392838A (en) * | 2017-07-27 | 2017-11-24 | 郑州云海信息技术有限公司 | WebP compression parallel acceleration methods and device based on OpenCL |
CN107392838B (en) * | 2017-07-27 | 2020-11-27 | 苏州浪潮智能科技有限公司 | WebP compression parallel acceleration method and device based on OpenCL |
CN107947799A (en) * | 2017-11-28 | 2018-04-20 | 郑州云海信息技术有限公司 | A kind of data compression method and apparatus |
CN107947799B (en) * | 2017-11-28 | 2021-06-29 | 郑州云海信息技术有限公司 | Data compression method and device |
CN111124259A (en) * | 2018-10-31 | 2020-05-08 | 深信服科技股份有限公司 | Data compression method and system based on full flash memory array |
CN113760192A (en) * | 2021-08-31 | 2021-12-07 | 荣耀终端有限公司 | Data reading method, data reading apparatus, storage medium, and program product |
CN113760192B (en) * | 2021-08-31 | 2022-09-02 | 荣耀终端有限公司 | Data reading method, data reading apparatus, storage medium, and program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106681659A (en) | Data compression method and device | |
CN102609360B (en) | Data processing method, data processing device and data processing system | |
US20130124796A1 (en) | Storage method and apparatus which are based on data content identification | |
CN104750571B (en) | Method for error correction, memory device and controller of memory device | |
CN101916227B (en) | RLDRAM SIO storage access control method and device | |
US9411519B2 (en) | Implementing enhanced performance flash memory devices | |
US11010056B2 (en) | Data operating method, device, and system | |
CN103559027A (en) | Design method of separate-storage type key-value storage system | |
US20180300250A1 (en) | Method and apparatus for storing data | |
CN106648955A (en) | Compression method and relevant device | |
CN107391544A (en) | Processing method, device, equipment and the computer storage media of column data storage | |
WO2023000536A1 (en) | Data processing method and system, device, and medium | |
CN107577614B (en) | Data writing method and memory system | |
WO2023197507A1 (en) | Video data processing method, system, and apparatus, and computer readable storage medium | |
US9619400B2 (en) | Efficient management of computer memory using memory page associations and memory compression | |
CN104239231B (en) | A kind of method and device for accelerating L2 cache preheating | |
US20140258247A1 (en) | Electronic apparatus for data access and data access method therefor | |
CN105068875A (en) | Intelligence data processing method and apparatus | |
CN107423425A (en) | A kind of data quick storage and querying method to K/V forms | |
CN102722456B (en) | Flash memory device and data writing method thereof | |
CN108170376A (en) | The method and system that storage card is read and write | |
CN115826882B (en) | Storage method, device, equipment and storage medium | |
CN102360381B (en) | Device and method for performing lossless compression on embedded program | |
CN100578467C (en) | Caching device based on universal serial bus | |
CN105335296A (en) | Data processing method, apparatus and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170517 |