CN102427369B - Real-time holographic lossless compression method for productive time sequence data - Google Patents

Real-time holographic lossless compression method for productive time sequence data

Publication number
CN102427369B
Authority
CN
China
Prior art keywords
data
quality
difference
time
time tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110317894.3A
Other languages
Chinese (zh)
Other versions
CN102427369A (en)
Inventor
周伊琳
陈炯聪
黄缙华
孙建伟
陈扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Guangdong Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Guangdong Power Grid Co Ltd
Priority to CN201110317894.3A
Publication of CN102427369A
Application granted
Publication of CN102427369B
Legal status: Active

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a real-time holographic lossless compression method for productive time sequence data. The method independently compresses the three fields of each of N productive time sequence records, namely the time label, the data value and the data quality, forming time label compressed data, data value compressed data and data quality compressed data respectively, and then combines the three into one complete compressed data. The method compresses productive time sequence data and files from any industry efficiently and losslessly, and meets the urgent needs for transmission, distribution, computation and storage of time sequence data in industries with enormous volumes of production data, such as basic industry, electric power, telecommunications, chemical engineering and steel.

Description

Holographic real-time lossless compression method for production time-series data
Technical field
The present invention relates to a data compression method, specifically a holographic real-time lossless compression method for production time-series data.
Background art
At present, everything from computer hardware and software to industrial control technology is developing rapidly. In computing, multi-core processors and multi-node high-speed physical memory have become mature, stable foundations for concurrent processing. In industrial control, and in the power industry especially, control and applications keep becoming more fine-grained, which has raised the demands placed on production time-series data to a new level: sampling has reached 100 frames per second, and time-series data used for operational analysis is usually required to stay in online storage for five years or more. Because each data point is recorded with high precision, many years of operation accumulate a very large volume of real-time data. Simply storing this data on storage devices consumes a large number of devices and the machine rooms needed to house them. The data must also be stored safely and reliably and remain available for retrieval and access at any time in later production work. Modern production control places very high demands on both data scale and response speed, which is a major challenge for operators and forces them to spend heavily on production and operations. In particular, no existing data compression and storage technique can be applied directly to production time-series data in these fields.
Traditionally, production time-series data has been handled with one of two strategies. (1) File-level compression, similar to WinZip. This mode does not suit production time-series data well: compression is not real-time, the computation during compression is heavy, and retrieval requires decompressing the whole file. It is also not tailored to production real-time data, so the compression ratio is low. (2) Swinging-door lossy compression, which applies a filtering rule and discards the data points that change little, achieving compression through filtering. Swinging-door lossy compression works as follows:
The swinging door compression algorithm (SDT) is a linear-trend compression algorithm. In essence it replaces a run of consecutive data points with a straight line defined by a start point and an end point. The algorithm records the time-interval length, the start point and the end point of each segment; note that the end point of one segment is the start point of the next. The principle is simple. For a data point a there is one point above it and one point below it, each at distance E from a; these two points act as the pivots of two "doors". When only the first data point is present, both doors are closed. As more points arrive, the doors gradually open. The width of each door can stretch, and within one time interval a door that has opened cannot close again. As long as the two doors have not become parallel, in other words as long as the sum of the two interior angles is less than 180°, the "swinging door" operation continues. In the figure, the first time period runs from a to e, and the result is that the straight line from a to e replaces the data points (a, b, c, d, e). The second time interval starts at e with both doors closed; the doors then open gradually, and the procedure continues as in the previous segment.
Although the swinging door compression algorithm (SDT) has good real-time behaviour, it discards part of the data during compression and therefore cannot meet the requirement that production time-series data be compressed losslessly.
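The door test described above can be illustrated with a minimal Python sketch. It uses a common slope-based formulation of swinging door trending, which is an assumption here and not taken from the patent: the "doors becoming parallel" condition is checked by comparing the steepest slope seen from the upper pivot with the shallowest slope seen from the lower pivot. The function name `swinging_door` and the tolerance parameter `e` are hypothetical.

```python
def swinging_door(points, e):
    """Lossy SDT sketch: keep only the segment end points of (t, v) samples.

    points must have strictly increasing t; e > 0 is the door half-width E.
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    t0, v0 = points[0]                      # start point of the current segment
    s_up, s_low = float("-inf"), float("inf")
    prev = points[0]
    for t, v in points[1:]:
        up = (v - (v0 + e)) / (t - t0)      # slope seen from the upper pivot
        low = (v - (v0 - e)) / (t - t0)     # slope seen from the lower pivot
        new_up, new_low = max(s_up, up), min(s_low, low)
        if new_up > new_low:
            # Doors have opened past parallel: close the segment at the
            # previous sample and start a new segment from it.
            kept.append(prev)
            t0, v0 = prev
            s_up = (v - (v0 + e)) / (t - t0)
            s_low = (v - (v0 - e)) / (t - t0)
        else:
            s_up, s_low = new_up, new_low
        prev = (t, v)
    kept.append(points[-1])
    return kept
```

Samples that fall inside a segment are dropped and can only be approximated by interpolation afterwards, which is why this family of methods is lossy.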
Summary of the invention
The object of the present invention is to provide a holographic real-time lossless compression method for production time-series data. The method compresses production time-series data and files from any industry efficiently and losslessly, and meets the urgent needs for time-series data transmission, distribution, computation and storage in industries with huge volumes of production data, such as basic industry, electric power, telecommunications, chemicals and steel.
This object is achieved by the following technical measures: a holographic real-time lossless compression method for production time-series data, characterized in that the three fields of each of N production time-series records, numbered 1 to N, namely the time tag, the data value and the data quality, are compressed independently to form time tag compressed data, data value compressed data and data quality compressed data respectively; the three parts are then merged into one complete compressed data.
The time tags are compressed as follows:
1a) Record the first time tag in the time tag compressed data; calculate the difference between the first two time tags as the predicted time tag difference, and record it in the time tag compressed data;
1b) Starting from the third time tag, calculate in turn the difference between the current time tag and the previous time tag, and compare it with the predicted time tag difference: if the two are equal, the current time tag is a regular time tag and is not processed further; otherwise the current time tag is an irregular time tag, and the tag and its sequence number are recorded in the time tag compressed data;
1c) Repeat step 1b) until all N time tags have been processed.
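Steps 1a) to 1c) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: time tags are plain integers, the compressed form is kept as Python objects instead of a byte layout, and the function names are hypothetical. At least two time tags are assumed.

```python
def compress_time_tags(tags):
    """Return (first tag, predicted difference, {sequence number: irregular tag}, N)."""
    first = tags[0]
    delta = tags[1] - tags[0]              # predicted time tag difference (step 1a)
    irregular = {}
    for i in range(2, len(tags)):          # third tag onwards (step 1b)
        if tags[i] - tags[i - 1] != delta:
            irregular[i + 1] = tags[i]     # sequence numbers are 1-based
    return first, delta, irregular, len(tags)


def decompress_time_tags(first, delta, irregular, n):
    """Rebuild all N tags: regular tags are the previous tag plus the prediction."""
    tags = [first, first + delta]
    for seq in range(3, n + 1):
        tags.append(irregular.get(seq, tags[-1] + delta))
    return tags
```

Because every regular tag is reproduced exactly from its predecessor and every irregular tag is stored verbatim, the round trip is lossless.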
The data values are compressed as follows:
2a) Record the first data value in the data value compressed data;
2b) Calculate the differences between adjacent values among the first K+1 data values, giving K predicted data value differences, denoted ΔV0, ..., ΔVK-1, and record them in the data value compressed data; the predicted data value differences are numbered 0 to K-1;
2c) Starting from data value K+2, record a fixed-width compressed data header for each data value, and calculate the difference between the current data value and the previous data value. If the current difference equals one of the predicted data value differences, the header of this data value is recorded as 0, followed by the sequence number of the matching predicted difference. Otherwise, find the predicted difference ΔVj closest to the current difference, XOR the sum of the previous data value and ΔVj with the current data value, record in the header the number n of consecutive identical bits counted from the most significant bit of the result, then record the sequence number j of the closest predicted difference and the low 32-n bits of the current data value;
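The XOR step in 2c) can be sketched for a single 32-bit float sample. This is a minimal sketch under assumptions that go beyond the text: values are IEEE 754 single-precision, the prediction "previous value plus ΔVj" is rounded to single precision before the XOR, and a regular sample is represented by n = 32 (all bits identical). The helper names are hypothetical, and the bit-stream packing of the three fields is left out.

```python
import struct


def f32_bits(x):
    """Bit pattern of x rounded to IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_f32(b):
    """Single-precision value with bit pattern b."""
    return struct.unpack(">f", struct.pack(">I", b))[0]


def encode_value(prev, cur, deltas):
    """Return (n, j, low): header count, prediction index, low 32-n bits of cur."""
    diff = cur - prev
    j = min(range(len(deltas)), key=lambda i: abs(deltas[i] - diff))
    predicted = f32_bits(prev + deltas[j])
    xor = predicted ^ f32_bits(cur)
    n = 32 - xor.bit_length()              # identical leading bits
    low = f32_bits(cur) & ((1 << (32 - n)) - 1)
    return n, j, low


def decode_value(prev, deltas, n, j, low):
    """Take the top n bits from the prediction and the rest from the stored bits."""
    predicted = f32_bits(prev + deltas[j])
    mask = (1 << (32 - n)) - 1
    return bits_f32((predicted & ~mask & 0xFFFFFFFF) | low)
```

The decoder recomputes the same prediction from the previously decoded value, so only the bits in which the prediction is wrong have to be stored.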
The data quality is compressed as follows:
3a) Record the first data quality in the data quality compressed data;
3b) Calculate the differences between adjacent values among the first K+1 data qualities as the predicted data quality differences, and record these K predicted differences in the final data quality compressed data;
3c) Starting from data quality K+2, calculate the difference between the current data quality and the previous data quality. If the current difference equals some Δi among the first K differences, the current data quality is a regular data quality, and the sequence number i of that predicted difference is recorded in temporary compressed data A; otherwise the current data quality is an irregular data quality, and its sequence number and the current difference are recorded in temporary compressed data B;
3d) Repeat step 3c) while counting the irregular data qualities; after all N data qualities have been processed, append the count of irregular data qualities, temporary compressed data A and temporary compressed data B, in that order, to the end of the data quality compressed data.
The time tag compression process also compresses the sequence numbers of the time tags. Specifically, while step 1b) is repeated, the cumulative count of irregular time tags is recorded, and from this count the total storage needed for the irregular time tag sequence numbers accumulated so far is calculated. If that total exceeds N bits, the sequence numbers of all N time tags are expressed as an N-bit bit field, the result is recorded in the time tag compressed data, and the irregular time tag sequence numbers already recorded there are deleted.
The N-bit bit field expresses the sequence numbers of all N time tags as follows: each bit position of the N-bit binary value corresponds to one time tag; 0 means the corresponding time tag is regular, and 1 means it is irregular.
The time tag compression process further compresses the irregular time tags themselves, specifically:
1i) While step 1b) is repeated over all N time tags, find the maximum and minimum of the irregular time tags together with their sequence numbers, and record them in the time tag compressed data;
1ii) Calculate the difference T between the maximum and the minimum of the irregular time tags, forming the continuous integer interval [0, T];
1iii) Starting from the first irregular time tag recorded in the time tag compressed data, record in the time tag compressed data the difference between each irregular time tag and the minimum, which is its position in the interval [0, T], together with its sequence number; at the same time delete the irregular time tags and sequence numbers previously recorded in the time tag compressed data.
The data quality compression process also compresses the sequence numbers of the irregular data qualities in temporary compressed data B. Specifically, if the storage needed for the accumulated sequence numbers of irregular data qualities exceeds N bits, the sequence numbers of all N data qualities are expressed as an N-bit bit field, and the sequence numbers of the irregular data qualities recorded in temporary compressed data B are deleted.
The N-bit bit field expresses the sequence numbers of all N data qualities as follows: each bit position of the N-bit binary value corresponds to one data quality; 0 means the corresponding data quality is regular, and 1 means it is irregular.
The data quality compression process also includes a second compression pass over temporary compressed data B, specifically:
3i) While step 3c) is repeated over all N data qualities, find the maximum and minimum of the data quality differences together with the sequence numbers of the corresponding data qualities, and record them in the data quality compressed data;
3ii) Calculate the difference L between the maximum and the minimum of the data quality differences, forming the continuous integer interval [0, L];
3iii) Starting from the first irregular data quality difference recorded in temporary compressed data B, calculate the difference T between the current data quality difference and the minimum data quality difference, and replace the irregular data quality difference originally recorded in temporary compressed data B with T.
K is 3 or 5.
The sequence number of the current time tag, data value or data quality is recorded as a 2-byte short. The time tag is integer data that increases by a constant step or increases irregularly, stored either as a 4-byte long or as a 4-byte long plus an additional 2 bytes of millisecond precision.
The data value is of type single-precision 32-bit float or double-precision 64-bit double.
The data quality is a 4-byte or 8-byte integer holding the status flag value of the current production time-series data.
The time tag, data value and data quality compression processes each include a global optimization step: if the compressed data is greater than or equal to the original data in size, the original data is recorded directly.
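The global optimization rule can be sketched as a small wrapper around any one of the three streams. The one-byte marker that tells the decoder which form was stored is an assumption made for this illustration; the text does not say how the choice is signalled. The names `finalize`, `restore`, `RAW` and `PACKED` are hypothetical.

```python
RAW, PACKED = 0, 1   # hypothetical one-byte markers, not from the source


def finalize(raw: bytes, packed: bytes) -> bytes:
    """Keep the compressed form only if it is strictly smaller than the original."""
    if len(packed) >= len(raw):
        return bytes([RAW]) + raw
    return bytes([PACKED]) + packed


def restore(block: bytes, decompress):
    """Undo finalize(); decompress is the stream-specific decoder."""
    body = block[1:]
    return body if block[0] == RAW else decompress(body)
```

Storing the original on a tie costs nothing in size and lets the decoder skip decompression for that block.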
Compared with the prior art, the present invention has the following advantages:
1. The method compresses production time-series data and files from any industry efficiently and losslessly, and meets the urgent needs for time-series data transmission, distribution, computation and storage in industries with huge volumes of production data, such as basic industry, electric power, telecommunications, chemicals and steel. The method can compress or decompress any data that has the real-time characteristics of typical production time-series data, is strongly targeted, and causes no loss of precision in the production data;
2. The method maps integer differences onto a continuous integer interval, which further improves the compression ratio.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the holographic real-time lossless compression method for production time-series data of the present invention;
Fig. 2 is a flow chart of the time tag compression process in the method shown in Fig. 1;
Fig. 3 is a flow chart of the data value compression process in the method shown in Fig. 1;
Fig. 4 is a flow chart of the data quality compression process in the method shown in Fig. 1.
Embodiment
In real production time-series data the time tags are all increasing values. A fixed sampling interval is used, and the sampled data is the original data to be compressed, so the original data is also in increasing order. The sampling period is generally fixed, and each acquisition yields the same amount of original data. The procedure below is generally applied to every 1000 original records. Because adjacent original records differ very little, in most cases the differences fall into a continuous interval with a very small span; the present invention therefore uses the compression procedure of the following embodiment.
As shown in Fig. 1, the method compresses the three parts of the production time-series data, namely the time tag, the data value and the data quality, separately, forming time tag compressed data, data value compressed data and data quality compressed data respectively.
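The overall flow of Fig. 1 amounts to splitting each batch of records into three per-field sequences, compressing each one independently, and joining the three results. A minimal Python sketch of the split and the join follows. The 4-byte length prefix used to join the streams is a hypothetical container layout chosen for illustration; the text does not specify how the three parts are merged. All function names are assumptions.

```python
import struct


def split_fields(records):
    """Split (time_tag, value, quality) records into three per-field lists."""
    tags, values, qualities = zip(*records)
    return list(tags), list(values), list(qualities)


def merge_streams(tag_bytes, value_bytes, quality_bytes):
    """Join three compressed streams, each prefixed by its 4-byte length."""
    out = b""
    for part in (tag_bytes, value_bytes, quality_bytes):
        out += struct.pack(">I", len(part)) + part
    return out


def split_streams(blob):
    """Inverse of merge_streams(): recover the three compressed streams."""
    parts, pos = [], 0
    for _ in range(3):
        (size,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        parts.append(blob[pos:pos + size])
        pos += size
    return parts
```

Keeping the three fields apart lets each one use a predictor suited to how that field behaves, which is the point of compressing them independently.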
As shown in the flow chart of Fig. 2, the time tags are compressed as follows:
1a) Record the first time tag in the time tag compressed data; calculate the difference between the first two time tags as the predicted time tag difference, and record it in the time tag compressed data:
The N production time-series records contain N time tags in order, numbered 1 to N. Calculate the difference between the first and second time tags, which is the predicted time tag difference Δt1, and record the first time tag and Δt1 in the time tag compressed data;
1b) Starting from the third time tag, calculate in turn the difference Δti between the current time tag and the previous time tag, and compare Δti with the predicted time tag difference Δt1: if the two are equal, the current time tag is a regular time tag and is not processed further; otherwise the current time tag is an irregular time tag, and the tag and its sequence number are recorded in the time tag compressed data;
1c) Repeat step 1b) until all N time tags have been processed. N is the amount of original data. Its value depends on the application; a larger N usually compresses better, but in real-time compression it cannot grow without limit and is generally 1000. For these 1000 original records, suppose all were sampled with a constant period. Each time tag is 6 bytes, consisting of 4 bytes of seconds and 2 bytes of milliseconds, so the 1000 original time tags occupy 6*1000 = 6000 bytes. Under the rules above only the following need to be stored: 6 bytes for the first record, 4 bytes for the difference between the second and first records, and 2 bytes for the count of irregular data, 12 bytes in total. The compression ratio is therefore 6000/12 = 500.
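The arithmetic above can be generalized to batches that contain some irregular time tags. The sketch below assumes that each irregular time tag costs its 6-byte value plus a 2-byte sequence number before any further compression; the field sizes come from the text, while the function name and the idea of combining them into one formula are illustrative assumptions.

```python
def time_tag_ratio(n=1000, irregular=0, tag_bytes=6, delta_bytes=4,
                   count_bytes=2, seq_bytes=2):
    """Original size divided by compressed size for one batch of time tags."""
    raw = n * tag_bytes
    packed = (tag_bytes + delta_bytes + count_bytes
              + irregular * (tag_bytes + seq_bytes))
    return raw / packed
```

With the defaults, `time_tag_ratio()` returns 500.0, matching the figure in the text; every irregular time tag adds 8 bytes to the denominator, so the ratio falls quickly as irregular tags appear.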
The time tag compression process also compresses the sequence numbers of the time tags. Specifically, if the storage needed for the accumulated irregular time tag sequence numbers exceeds N bits, the sequence numbers of all N time tags are expressed as an N-bit bit field: each bit position of the N-bit binary value corresponds to one time tag, with 0 meaning the time tag is regular and 1 meaning it is irregular. In practice, the N-bit bit field is recorded at positions 11 to 20 of the time tag compressed data, and the irregular time tag sequence numbers already recorded in the time tag compressed data are deleted. For example, with N = 1000, storing any one sequence number takes 2 bytes; if the count of irregular time tags (EC) is 63 or more, at least 63*2*8 = 1008 bits are needed, so a 1000-bit bit field is used instead to store the sequence numbers (positions) of the irregular time tags.
A bit field is a data structure in the C language: the binary bits of a byte are divided into several distinct fields, and the number of bits in each field is declared. Each field has a name, and the program operates on the field by that name. In this way several different objects can be represented within the bits of a single byte.
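The choice between a list of 2-byte sequence numbers and an N-bit bit field can be sketched in Python, using a `bytearray` in place of a C bit field. The most-significant-bit-first ordering within each byte and the function names are assumptions made for this illustration.

```python
import struct


def encode_irregular_positions(seqs, n):
    """Store 1-based sequence numbers as a list of shorts or as an N-bit field."""
    list_bits = len(seqs) * 2 * 8           # 2 bytes per sequence number
    if list_bits > n:
        field = bytearray((n + 7) // 8)
        for s in seqs:
            field[(s - 1) // 8] |= 0x80 >> ((s - 1) % 8)   # 1 = irregular
        return "bitfield", bytes(field)
    return "list", b"".join(struct.pack(">H", s) for s in seqs)


def is_irregular(field, seq):
    """Test the bit for sequence number seq in an encoded bit field."""
    return bool(field[(seq - 1) // 8] & (0x80 >> ((seq - 1) % 8)))
```

With N = 1000, 62 irregular entries still fit the list form (992 bits), while 63 entries need 1008 bits and switch to the 125-byte bit field, which matches the threshold given above.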
Further, the time tag compression process also compresses the irregular time tags themselves, specifically:
1d) While step 1b) is repeated over all N time tags, find the maximum and minimum of the irregular time tags together with their sequence numbers, and record them in the time tag compressed data; from these, the maximum number of bytes needed to store the difference between the maximum and the minimum can be calculated (1 byte stores the range 0-255, 2 bytes store the range 0-65535, and so on);
1e) Calculate the difference T between the maximum and the minimum of the irregular time tags, forming the continuous integer interval [0, T];
1f) Starting from the first irregular time tag recorded in the time tag compressed data, record in the time tag compressed data the difference between each irregular time tag and the minimum, which is its position in the interval [0, T], together with its sequence number; at the same time delete the irregular time tags and sequence numbers previously recorded in the time tag compressed data.
The main benefit of compressing irregular time tags this way is as follows. Suppose a large number of irregular time tags occur in the current data; storing each one takes 6 bytes. With the process above, if for example the maximum irregular time tag is 145 and the minimum is 126, the difference T between them is 19, which forms the continuous integer interval [0, 19]. For any irregular time tag it is then enough to record its difference from the minimum, and this difference always lies in that interval. Take an original value of 130, which needs 4 bytes as original data: now only its position in the interval, 130-126 = 4, needs to be stored, and the value 4 can be expressed in 3 bits. (Covering every value in [0, 19] with a fixed width would take 5 bits.) For this value the compression ratio is (4*8)/3, roughly 10; after allowing for other internal overhead, the compression ratio for irregular time tags stays above 9, which is considerable.
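The interval mapping can be sketched in a few lines of Python. The sketch assumes a fixed bit width for every offset, derived from T, which is one possible encoding and not necessarily the one intended by the text; the function name is hypothetical.

```python
def map_to_interval(values):
    """Map integers onto [0, T] and report a fixed bit width that covers T."""
    lo, hi = min(values), max(values)
    t = hi - lo
    width = max(1, t.bit_length())          # bits needed for any offset in [0, T]
    offsets = [v - lo for v in values]
    return lo, t, width, offsets
```

For the values 126, 130 and 145 this gives the minimum 126, T = 19, a width of 5 bits and the offsets 0, 4 and 19; adding the minimum back to each offset restores the original values exactly.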
As shown in the flow chart of Fig. 3, the data values are compressed as follows:
2a) Record the first data value in the data value compressed data;
2b) Calculate the differences between adjacent values among the first K+1 data values, giving K predicted data value differences, and record them in the data value compressed data;
2c) Starting from data value K+2, record a fixed-width compressed data header for each data value, and calculate the difference between the current data value and the previous data value. If the current difference equals one of the predicted data value differences, the header of this data value is recorded as 0, followed by the sequence number of the matching predicted difference. Otherwise, find the predicted difference ΔVj closest to the current difference, XOR the sum of the previous data value and ΔVj with the current data value, record in the header the number n of consecutive identical bits counted from the most significant bit of the result, then record the sequence number j of the closest predicted difference and the low 32-n bits of the current data value;
Because the original data value is a 32-bit floating-point number, the XOR result has 32 bits in total, so at most 32 bits can be identical, and 6 bits suffice to record the count of identical bits. Therefore, for regular and irregular data alike, a fixed 6-bit compressed data header is recorded first, holding the number n of consecutive identical bits counted from the most significant bit of the XOR result. For regular data all 32 bits of the XOR result are 0, so the 6-bit header holds 32, followed by 2 bits recording j (for K = 3; when K = 5, 3 bits are needed for the sequence number j). For irregular data, the fixed header is followed by the sequence number j of the closest predicted difference and then the low 32-n bits of the data value.
The low 32-n bits of the current data value are written to the data value compressed data as a bit stream. All consecutive floating-point data values end up packed together as a bit stream, and the whole process is carried out with efficient bit operations.
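To make the count n concrete, the following Python sketch measures how many leading bits two neighbouring single-precision samples share. The sample values 1129.32 and 1129.51 are used for illustration, and the helper names are hypothetical; no prediction step is applied here, only the raw XOR of the two bit patterns.

```python
import struct


def f32_bits(x):
    """Bit pattern of x rounded to IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def leading_identical_bits(a, b):
    """Number of bits, counted from the most significant, in which a and b agree."""
    return 32 - (f32_bits(a) ^ f32_bits(b)).bit_length()


n = leading_identical_bits(1129.32, 1129.51)
print(n, 32 - n)  # 19 13
```

The two samples share the sign, the exponent and the top ten fraction bits, so only the low 13 bits would have to be written to the bit stream.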
Because the data value is a 32-bit floating-point number, n is at most 32, so storing n takes only 6 bits; the sequence number j is less than 3 (for K = 3), so storing j takes only 2 bits. In addition, owing to the nature of time-series data, the data values of two adjacent records are very similar. For example, take two consecutive samples 1129.32 and 1129.51, each 32 bits: after a bitwise XOR, the high 19 bits of the two samples turn out to be identical and the difference lies in the low 13 bits. The present invention adds difference prediction from the first K samples (K = 3 or 5). Adding a predicted difference to the previous data value gives a result even more similar to the current data value, so the number of differing bits after the XOR is at most 13, and storing the current data value takes at most 2+6+13 = 21 bits. This compression method is very efficient on step changes, sawtooth waveforms and periodically repeating data.
As shown in the flow chart of Fig. 4, the data quality is compressed as follows:
3a) Record the first data quality in the data quality compressed data;
3b) Calculate the differences between adjacent values among the first K+1 data qualities as the predicted data quality differences, and record these K predicted differences in the final data quality compressed data. K is generally 3 or 5. In practice the data quality of production time-series data changes rarely, so the first K differences generally give good compression, and a Δi sequence number needs only 2 bits (3 bits when K = 5); this has worked very well in practice;
3c) Starting from data quality K+2, calculate the difference between the current data quality and the previous data quality. If the current difference equals some Δi among the first K differences, the current data quality is a regular data quality, and the sequence number i of that predicted difference is recorded in temporary compressed data A; otherwise the current data quality is an irregular data quality, and its sequence number and the current difference are recorded in temporary compressed data B;
3d) Repeat step 3c) while counting the irregular data qualities; after all N data qualities have been processed, append the count of irregular data qualities, temporary compressed data A and temporary compressed data B, in that order, to the end of the data quality compressed data.
The sequence number of each predicted difference is recorded in 2 bits, and step 3c) records the count of irregular data qualities. The data quality compressed data is therefore laid out as follows: after the first data quality record come the count of irregular data qualities, then (N minus the count of irregular data qualities) consecutive 2-bit "predicted difference sequence numbers" for the regular data, then the "sequence number + difference" records of the irregular data. In short, the "predicted difference sequence numbers" of all regular data are stored together, the "sequence number + difference" records of all irregular data follow immediately, and the "count of irregular data qualities" is enough to tell the two apart. In the implementation, the information for regular and irregular data is first recorded in temporary variables, and only after all N records have been processed is it copied together into the final compressed data.
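Steps 3a) to 3d) and the two temporary buffers can be sketched in Python as follows. This is a minimal illustration under assumptions: data qualities are plain integers, buffers A and B are kept as Python lists instead of packed bits, the batch holds at least K+1 records, and the function names are hypothetical.

```python
def compress_quality(q, k=3):
    """Return (first, predicted differences, irregular count, buffer A, buffer B)."""
    first = q[0]
    preds = [q[i + 1] - q[i] for i in range(k)]      # K predicted differences
    a, b = [], []
    for i in range(k + 1, len(q)):                   # data quality K+2 onwards
        d = q[i] - q[i - 1]
        if d in preds:
            a.append(preds.index(d))                 # regular: index of the match
        else:
            b.append((i + 1, d))                     # irregular: (1-based seq, diff)
    return first, preds, len(b), a, b


def decompress_quality(first, preds, a, b, n):
    """Rebuild all N data qualities from the predicted differences and buffers."""
    q = [first]
    for d in preds:
        q.append(q[-1] + d)
    irregular = dict(b)
    regular = iter(a)
    for seq in range(len(preds) + 2, n + 1):
        d = irregular[seq] if seq in irregular else preds[next(regular)]
        q.append(q[-1] + d)
    return q
```

Because the irregular records carry their own sequence numbers, the decoder knows which positions to fill from buffer B and takes every other position from buffer A in order.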
The data quality compression process also compresses the sequence numbers of the irregular data qualities in temporary compressed data B. Specifically, if the storage needed for the accumulated sequence numbers of irregular data qualities exceeds N bits, the sequence numbers of all N data qualities are expressed as an N-bit bit field, and the sequence numbers of the irregular data qualities recorded in temporary compressed data B are deleted. The N-bit bit field expresses the sequence numbers of all N data qualities as follows: each bit position of the N-bit binary value corresponds to one data quality; 0 means the corresponding data quality is regular, and 1 means it is irregular.
Further, the compression process for the data quality also includes a secondary compression process for temporary compressed data B, specifically:
3e) While repeating step 3c) to process all N data qualities, find the maximum and minimum of the data quality differences, together with the sequence numbers of the data qualities corresponding to the maximum and the minimum, and record them in the data quality compressed data;
3f) Calculate the difference L between the maximum and the minimum of the data quality differences, forming the continuous integer interval [0, L];
3g) Starting from the first irregular data quality difference recorded in temporary compressed data B, calculate the difference T between the current data quality difference and the minimum of the data quality differences, and replace the irregular data quality difference originally recorded in temporary compressed data B with T.
In all compression steps, the sequence numbers of the time tags, data values and data qualities are recorded as 2-byte shorts.
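The secondary compression in steps 3e) to 3g) amounts to re-basing each irregular difference on the minimum so that every stored value falls in [0, L]. The sketch below is illustrative only: the function name is hypothetical, the minimum and maximum are taken over the differences held in temporary compressed data B (an assumption, since the text does not say which differences are scanned), and the sequence numbers of the maximum and minimum are left out for brevity.

```python
def rebase_differences(temp_b):
    """Re-base irregular differences on their minimum (steps 3e-3g).

    temp_b -- list of (sequence number, difference) records
    Returns (minimum, L, re-based records, bits needed per re-based value).
    """
    if not temp_b:
        return 0, 0, [], 0
    diffs = [d for _, d in temp_b]
    lo, hi = min(diffs), max(diffs)               # 3e) extremes of the differences
    span = hi - lo                                # 3f) L, giving the interval [0, L]
    rebased = [(seq, d - lo) for seq, d in temp_b]  # 3g) T = difference - minimum
    return lo, span, rebased, span.bit_length()
```

A decoder recovers each original difference as T plus the recorded minimum, so the step stays lossless while the stored values become non-negative and need only as many bits as L does.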
The time tag is an arithmetically increasing or randomly increasing 4-byte long, or a 4-byte long with an additional 2 bytes of millisecond-precision integer data.
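For time tags of this kind, steps 1a) to 1c) of the time tag compression can be sketched as follows. The function names and the tuple layout are assumptions made for illustration, time tags are treated as plain integers, and at least two time tags are assumed to be present.

```python
def compress_time_tags(tags):
    """Keep only the first tag, the predicted step and the irregular tags."""
    first = tags[0]                        # 1a) first time tag
    predicted = tags[1] - tags[0]          # 1a) predicted time tag difference
    irregular = []
    for i in range(2, len(tags)):          # 1b) from the third time tag on
        if tags[i] - tags[i - 1] != predicted:
            irregular.append((i + 1, tags[i]))   # 1-based sequence number + tag
    return first, predicted, irregular


def decompress_time_tags(first, predicted, irregular, n):
    """Rebuild n time tags; regular tags are the previous tag plus the step."""
    known = dict(irregular)
    tags = [first, first + predicted]
    for i in range(2, n):
        tags.append(known.get(i + 1, tags[-1] + predicted))
    return tags
```

For a strictly periodic sample stream, nothing beyond the first tag and the step is stored, which is why arithmetically increasing time tags compress so well under this scheme.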
The data value is an engineering value stored as an IEEE 754 floating-point number, either single precision (32-bit float) or double precision (64-bit double).
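For a 32-bit float, the per-value step of the data value compression (step 2c) can be sketched as below. This is a hedged illustration rather than the patented bit layout: the helper names are hypothetical, the result is returned as a tuple instead of a packed bit stream, and a regular value is marked by `None` in the third slot because the text does not say how a header of 0 is told apart from an irregular value with n = 0.

```python
import struct


def f32_bits(x):
    """Bit pattern of x rounded to an IEEE 754 single-precision float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_f32(b):
    """Inverse of f32_bits."""
    return struct.unpack(">f", struct.pack(">I", b))[0]


def encode_value(prev, cur, predicted_diffs):
    """Return (header, index of the prediction difference, low bits or None)."""
    diff = cur - prev
    if diff in predicted_diffs:                       # regular: header 0 + index
        return 0, predicted_diffs.index(diff), None
    j = min(range(len(predicted_diffs)),
            key=lambda i: abs(predicted_diffs[i] - diff))   # closest difference
    xor = f32_bits(prev + predicted_diffs[j]) ^ f32_bits(cur)
    n = 32 - xor.bit_length()                         # matching leading bits
    low = f32_bits(cur) & ((1 << (32 - n)) - 1)       # low 32-n bits of cur
    return n, j, low


def decode_value(prev, header, j, low, predicted_diffs):
    """Rebuild the value from the prediction and the stored low bits."""
    guess = prev + predicted_diffs[j]
    if low is None:
        return guess
    mask = (1 << (32 - header)) - 1
    return bits_f32((f32_bits(guess) & ~mask & 0xFFFFFFFF) | low)
```

The closer the prediction is to the actual value, the more leading bits agree and the fewer low bits have to be stored, while the decoder still reconstructs the value exactly.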
The data quality is a 4-byte or 8-byte integer holding the status identifier value of the current productive time sequence data, used to indicate, for example, out-of-range conditions, alarms and other user-defined meanings.
The time tag compression process, the data value compression process and the data quality compression process all include a global optimization step, specifically: if the compressed data is greater than or equal to the original data in size, the original data is recorded directly. When the compressed data is the same size as the original data, the original data is stored in order to improve decompression efficiency.
A compression test was carried out on productive time sequence data using this method. Test environment: CPU: Intel(R) Core(TM) i7-2620M CPU @ 2.7 GHz; RAM: 4 GB. Test result: compressing 1 GB of production data in memory took 1,512 ms, which greatly reduces compression time compared with existing compression methods. Because the method is lossless, all features of the data are retained during compression and the production data suffers no loss of precision. Applied in practice to high-speed power grid PMU data, the results show that the compression ratio of this method is more than 3 times that of existing lossy compression, and that its compression and decompression performance is more than 2 times that of other implementations.
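The global optimization rule can be sketched in a few lines. The function name and the leading flag value are assumptions: the text states only that the original data is recorded when compression does not make it smaller, not how the decoder is told which form was stored.

```python
RAW, COMPRESSED = 0, 1   # hypothetical flag values, not taken from the text


def choose_payload(raw: bytes, compressed: bytes):
    """Store the original data unless compression actually made it smaller."""
    if len(compressed) >= len(raw):
        return RAW, raw          # equal size also stays raw: no decode work needed
    return COMPRESSED, compressed
```

Applying the rule separately to the time tag, data value and data quality parts means a block with, say, noisy values but perfectly periodic time tags still benefits from compressing the time tags.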
In practical application, the compression method of the present invention has been adopted in the domestic high-speed real-time database system PTimeDB. For high-speed power grid PMU data, its compression ratio is more than 3 times that of lossy compression, and its compression and decompression performance is more than 2 times that of other implementations.
Embodiments of the present invention are not limited to the above. On the premise of the basic technical idea of the present invention described above, modifications, substitutions or changes in various other forms made to the content of the present invention according to ordinary technical knowledge and customary means in the art all fall within the scope of protection of the present invention.

Claims (10)

1. A real-time holographic lossless compression method for productive time sequence data, characterized in that: for N items of productive time sequence data with sequence numbers 1 to N, the three fields of each item, namely the time tag, the data value and the data quality, are compressed independently to form time tag compressed data, data value compressed data and data quality compressed data respectively; the three parts of compressed data are then merged into one complete compressed data; wherein the compression process for the time tags is:
1a) recording the first time tag in the time tag compressed data, calculating the difference between the first two time tags as the predicted time tag difference, and recording it in the time tag compressed data;
1b) starting from the third time tag, successively calculating the time tag difference between the current time tag and its previous time tag, and comparing the current time tag difference with the predicted time tag difference: if the two are equal, the current time tag is a regular time tag and is not processed; otherwise, the current time tag is an irregular time tag, and the current irregular time tag and its sequence number are recorded in the time tag compressed data;
1c) repeating step 1b) until all N time tags have been processed; the compression process for the data values is:
2a) recording the first data value in the data value compressed data;
2b) calculating the differences between each pair of adjacent values among the first K+1 data values, obtaining K predicted data value differences in total, and recording them in the data value compressed data;
2c) starting from the (K+2)-th data value, recording a 6-bit compressed data header for each data value, and calculating the data value difference between the current data value and its previous data value: if the current data value difference equals one of the predicted data value differences, the compressed data header of this data value is set to 0, and the sequence number of the corresponding predicted data value difference is then recorded; otherwise, finding the predicted difference ΔVj closest to the current data value difference, XORing the sum of the previous data value and ΔVj with the current data value, recording in the compressed data header the number n of consecutive matching bits counted from the most significant bit of the operation result, and then recording the sequence number j of the closest predicted data value difference and the low 32-n bits of the current data value; the compression process for the data quality is:
3a) recording the first data quality in the data quality compressed data;
3b) calculating the differences between each pair of adjacent values among the first K+1 data qualities as the predicted data quality differences, and then recording the K predicted data quality differences in the final data quality compressed data;
3c) starting from the (K+2)-th data quality, calculating the data quality difference between the current data quality and its previous data quality: if the current difference equals some Δi among the preceding K differences, the current data quality is a regular data quality, and the sequence number i of this predicted data quality difference is recorded in temporary compressed data A; otherwise, the current data quality is an irregular data quality, and the sequence number of the current data quality and the current data quality difference are recorded in temporary compressed data B;
3d) repeating step 3c) while recording the number of irregular data qualities, and, after all N data qualities have been processed, appending the number of irregular data qualities, temporary compressed data A and temporary compressed data B, in that order, to the end of the data quality compressed data;
said N is 1000, and said K is 3 or 5.
2. The real-time holographic lossless compression method for productive time sequence data according to claim 1, characterized in that: the time tag compression process also includes a compression process for the sequence numbers of the time tags, specifically: while repeating step 1b), the cumulative number of irregular time tags is recorded, and the total storage needed for the currently accumulated irregular time tag sequence numbers is calculated from this number; if the total exceeds N bits, an N-bit bit field is used to express the sequence numbers of all N time tags, the result is recorded in the time tag compressed data, and the irregular time tag sequence numbers already recorded in the time tag compressed data are deleted.
3. The real-time holographic lossless compression method for productive time sequence data according to claim 2, characterized in that: the N-bit bit field expresses the sequence numbers of all N time tags as follows: each bit position in the N-bit binary value corresponds to one time tag; 0 means that the corresponding time tag is a regular time tag, and 1 means that the corresponding time tag is an irregular time tag.
4. The real-time holographic lossless compression method for productive time sequence data according to claim 1 or 2, characterized in that: the compression process for the time tags also includes a further compression process for the irregular time tags, specifically:
1i) while repeating step 1b) to process all N time tags, finding the maximum and minimum of the irregular time tags and the sequence numbers of the maximum and minimum, and recording them in the time tag compressed data;
1ii) calculating the difference T between the maximum and minimum of the irregular time tags, forming the continuous integer interval [0, T];
1iii) in the recorded time tag compressed data, starting from the first recorded irregular time tag, recording in the time tag compressed data the position within the interval [0, T] of the difference between the current irregular time tag and the minimum, together with the sequence number of the current irregular time tag; and at the same time deleting the irregular time tags and their sequence numbers already recorded in the time tag compressed data.
5. The real-time holographic lossless compression method for productive time sequence data according to claim 1, characterized in that: the data quality compression process also includes a compression process for the sequence numbers of the irregular data qualities in temporary compressed data B, specifically: if the storage needed for the accumulated sequence numbers of the irregular data qualities exceeds N bits, an N-bit bit field is used to express the sequence numbers of all N data qualities, and the sequence numbers of the irregular data qualities already recorded in temporary compressed data B are deleted.
6. The real-time holographic lossless compression method for productive time sequence data according to claim 5, characterized in that: the N-bit bit field expresses the sequence numbers of all N data qualities as follows: each bit position in the N-bit binary value corresponds to one data quality; 0 means that the corresponding data quality is a regular data quality, and 1 means that the corresponding data quality is an irregular data quality.
7. The real-time holographic lossless compression method for productive time sequence data according to claim 1 or 5, characterized in that: the compression process for the data quality also includes a secondary compression process for temporary compressed data B, specifically:
3i) while repeating step 3c) to process all N data qualities, finding the maximum and minimum of the data quality differences, together with the sequence numbers of the data qualities corresponding to the maximum and the minimum, and recording them in the data quality compressed data;
3ii) calculating the difference L between the maximum and the minimum of the data quality differences, forming the continuous integer interval [0, L];
3iii) starting from the first irregular data quality difference recorded in temporary compressed data B, calculating the difference T between the current data quality difference and the minimum of the data quality differences, and replacing the irregular data quality difference originally recorded in temporary compressed data B with T.
8. The real-time holographic lossless compression method for productive time sequence data according to claim 1, characterized in that: the sequence number of the current time tag, data value or data quality is recorded as a 2-byte short.
9. The real-time holographic lossless compression method for productive time sequence data according to claim 1, characterized in that: the time tag is an arithmetically increasing or randomly increasing 4-byte long, or a 4-byte long with an additional 2 bytes of millisecond-precision integer data; the data value is single-precision or double-precision data; the data quality is a 4-byte or 8-byte integer representing the status identifier value of the current productive time sequence data.
10. The real-time holographic lossless compression method for productive time sequence data according to claim 1, characterized in that: the time tag compression process, the data value compression process and the data quality compression process all include a global optimization step, specifically: if the compressed data is greater than or equal to the original data in size, the original data is recorded directly.
CN201110317894.3A 2011-10-19 2011-10-19 Real-time holographic lossless compression method for productive time sequence data Active CN102427369B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110317894.3A CN102427369B (en) 2011-10-19 2011-10-19 Real-time holographic lossless compression method for productive time sequence data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110317894.3A CN102427369B (en) 2011-10-19 2011-10-19 Real-time holographic lossless compression method for productive time sequence data

Publications (2)

Publication Number Publication Date
CN102427369A CN102427369A (en) 2012-04-25
CN102427369B true CN102427369B (en) 2014-01-01

Family

ID=45961318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110317894.3A Active CN102427369B (en) 2011-10-19 2011-10-19 Real-time holographic lossless compression method for productive time sequence data

Country Status (1)

Country Link
CN (1) CN102427369B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102932001B (en) * 2012-11-08 2015-07-29 大连民族学院 Motion capture data compression, decompression method
CN104519525B (en) * 2013-09-30 2018-02-06 日月光半导体制造股份有限公司 Compress the dispensing device and reception device and its sending method and method of reseptance of package
CN104484476B (en) * 2014-12-31 2019-04-12 中国石油天然气股份有限公司 A kind of pumping-unit workdone graphic data compression storage method and device
CN104734726B (en) * 2015-04-01 2017-08-25 东方电子股份有限公司 A kind of time series data line compression method for supporting to edit
CN106055275A (en) * 2016-05-24 2016-10-26 深圳市敢为软件技术有限公司 Data compression recording method and apparatus
CN106372181B (en) * 2016-08-31 2019-08-06 东北大学 A kind of big data compression method based on industrial process
CN106549672B (en) * 2016-10-31 2019-07-12 合肥移顺信息技术有限公司 A kind of three axis data compression methods of acceleration transducer
CN108153483B (en) * 2016-12-06 2021-04-20 南京南瑞继保电气有限公司 Time sequence data compression method based on attribute grouping
CN106877506B (en) * 2017-03-23 2019-06-07 佛山电力设计院有限公司 A kind of transport protocol compression method of the out-of-limit monitoring data of distribution network voltage
CN108981990B (en) * 2018-07-25 2020-10-09 中国石油天然气股份有限公司 Indicator
CN109246086A (en) * 2018-08-16 2019-01-18 上海海压特智能科技有限公司 The transfer approach of director data packet
CN111064471B (en) * 2018-10-16 2023-04-11 阿里巴巴集团控股有限公司 Data processing method and device and electronic equipment
CN109684328B (en) * 2018-12-11 2020-06-16 中国北方车辆研究所 High-dimensional time sequence data compression storage method
CN111966648B (en) * 2020-07-29 2023-09-08 国机智能科技有限公司 Industrial data processing method and electronic equipment
CN112702340B (en) * 2020-12-23 2023-12-19 深圳供电局有限公司 Historical message compression method and system, computing equipment and storage medium thereof
CN113242041A (en) * 2021-03-10 2021-08-10 湖南大学 Data hybrid compression method and system thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923569A (en) * 2010-07-09 2010-12-22 南京朗坤软件有限公司 Storage method of structure type data of real-time database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1374413A2 (en) * 2001-03-29 2004-01-02 Koninklijke Philips Electronics N.V. Reduced data stream for transmitting a signal

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Xu Hui (徐慧). Research on data compression algorithms in real-time databases (实时数据库中数据压缩算法的研究). 《中国优秀硕士论文电子期刊网》, 2006-06-16, main text pp. 1-72 *
Huang Wenjun (黄文君) et al. Research on the application of data compression technology in real-time databases (数据压缩技术在实时数据库中的应用研究). 《仪器仪表学报》, 2006-06-30, vol. 27, no. 6, pp. 911-913, 959 *

Also Published As

Publication number Publication date
CN102427369A (en) 2012-04-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 510080 Dongfeng East Road, Dongfeng, Guangdong, Guangzhou, Zhejiang Province, No. 8

Patentee after: Electric Power Research Institute of Guangdong Power Grid Co.,Ltd.

Address before: Guangzhou City, Guangdong province Yuexiu District 510080 Dongfeng East Road, No. 8 building water Kong Guangdong

Patentee before: ELECTRIC POWER RESEARCH INSTITUTE OF GUANGDONG POWER GRID Corp.
