CN104734726A - Time series data online compression method supporting editing - Google Patents

Info

Publication number
CN104734726A
Authority
CN
China
Prior art keywords
data
sequence number
time series
quality
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510149751.4A
Other languages
Chinese (zh)
Other versions
CN104734726B (en)
Inventor
王兴信
孔海斌
王传起
李吉勇
刘仲尧
唐军沛
夏寨芳
刘春庆
谭军光
周志辉
任永伟
谭凯
吴海勇
张伟
刘晶敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongfang Electronics Co Ltd
Original Assignee
Dongfang Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongfang Electronics Co Ltd
Priority to CN201510149751.4A
Publication of CN104734726A
Application granted
Publication of CN104734726B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a time series data online compression method that supports editing. With a data block as the basic storage unit, time series data are decomposed into time series value groups and quality groups, which are compressed and stored at the head and the tail of the data block respectively until the data block is full. When the quality of the time series data is stable, a digital time series datum occupies only 4 bytes of storage space; with further compression of the floating-point value, an analog time series datum occupies only 6 bytes. The data volume is therefore significantly reduced and system performance is improved. Because each original datum corresponds to one stored time series value group, the compressed data can still be edited and modified quickly.

Description

A time series data online compression method supporting editing
Technical field
The present invention relates to an online compressed storage method for time series data, and in particular to an online compressed storage method for continuous measurement data in industrial automation.
Background art
In recent years, with rapid economic growth, computer technology has developed quickly and is widely used in industrial automation. As the control and management capabilities and the scale of industrial automation keep growing, large volumes of data collected over a wider range and at a higher sampling density must be processed quickly. Because these data are recorded in chronological order, they are generally called time series data. To speed up access to time series data, the following approaches are mainly used at present:
The first is to use a traditional compression algorithm such as zip or LZW. These algorithms can store a large amount of time series data in less storage space, but because of the way they work they cannot meet real-time requirements.
The second is to use a lossy online compression algorithm, such as the dead-band compression algorithm or the STD compression algorithm. These algorithms can meet the requirements on real-time performance and storage space, but because the compression is lossy, meeting those requirements often costs considerable data precision, so the precision requirements of the relevant field cannot be met.
There are also methods that use a special structure to store only the time series data that have changed. Under certain conditions (for example, when the sampling interval is fixed and the data values fluctuate very little) these methods can meet the requirements on real-time performance and storage space, but outside those conditions the results are often unsatisfactory. More importantly, if stored time series data need to be modified later, such methods are difficult to apply, because the datum to be modified may be one that was never stored because it had not changed.
Summary of the invention
The present invention aims to provide a time series data online compression method supporting editing, which not only compresses and stores time series data quickly and efficiently, but also supports modifying the time series data after storage.
The technical solution of the present invention is as follows:
A time series data online compression method supporting editing, characterized in that: with a data block as the basic storage unit, time series data are decomposed into time series value groups and quality groups, which are compressed and stored at the head and the tail of the data block respectively until the data block is full; the time series value group comprises the ID, time, millisecond and value of the time series data; the quality group comprises the sequence number and the quality of the time series data.
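The head-and-tail layout described above can be pictured with a minimal Python sketch. It is only an illustration under assumptions: the class name `DataBlock`, the method names and the 4096-byte default block size are hypothetical and do not come from the patent, which does not state a block size.

```python
class DataBlock:
    """Fixed-size block: value groups grow from the head, quality groups from the tail."""

    def __init__(self, size: int = 4096):  # block size is an assumption
        self.buf = bytearray(size)
        self.head = 0      # next free byte at the head (value groups)
        self.tail = size   # first used byte at the tail (quality groups)

    def free(self) -> int:
        """Bytes still available between the head region and the tail region."""
        return self.tail - self.head

    def put_value_group(self, group: bytes) -> bool:
        """Append a value group at the head; False means the block is full."""
        if len(group) > self.free():
            return False
        self.buf[self.head:self.head + len(group)] = group
        self.head += len(group)
        return True

    def put_quality_group(self, group: bytes) -> bool:
        """Prepend a quality group at the tail; False means the block is full."""
        if len(group) > self.free():
            return False
        self.tail -= len(group)
        self.buf[self.tail:self.tail + len(group)] = group
        return True
```

When either method returns `False`, the caller would create a new data block, which corresponds to the "data block is full" condition in the text.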
Specifically, the time series data are compressed online according to the following steps:
1) Create a data block: for the time series data to be stored, each ID is stored in its own data block; if no data block exists or the data block is full, a new data block is created;
2) When the sequence number of the stored data in the data block is 1, the ID, time, millisecond and value of the time series data are stored at the head of the data block as a time series value group, and the sequence number and quality of the time series data are stored at the tail of the data block as a quality group; the quality group occupies 6 bytes, of which the sequence number occupies 2 bytes and the quality occupies 4 bytes;
3) When the sequence number of the stored data in the data block is greater than 1, first determine whether the remaining space of the data block can hold the data of the corresponding sequence number; if the remaining space is insufficient, perform step 1);
4) If the stored data type is digital, calculate the time difference △t1 between the data of the current sequence number and the data of the previous sequence number, and divide the millisecond value by 4 to obtain s1; the s1 of the previous sequence number is recorded as s0, and the value of the previous sequence number is recorded as v0. If △t1 > 65535, then 65535, s0 and v0 form a time series value group, which is stored after the value of the previous sequence number, and step 3) is performed again. If △t1 <= 65535, then △t1, s1 and the value form a time series value group, which is stored after the value of the previous sequence number. In the time series value group, △t1 occupies 2 bytes, s1 occupies 1 byte and the value occupies 1 byte; a digital value is 0 or 1. Compare the quality of the current sequence number with the quality of the previous datum: if the quality differs, the sequence number and the quality form a quality group, which is stored before the quality group of the previous sequence number; if the quality is the same, the quality of this sequence number is not stored;
If the stored data type is analog, calculate the time difference △t1 between the data of the current sequence number and the data of the previous sequence number, and the value difference △v between the data of the current sequence number and the data with sequence number 1; divide the millisecond value by 4 to obtain s1; the s1 of the previous sequence number is recorded as s0, and the value difference of the previous sequence number is recorded as △v0. If △t1 > 65535, then 65535, s0 and △v0 form a time series value group, which is stored after the value of the previous sequence number, and step 3) is performed again. If △t1 <= 65535, then △t1, s1 and △v form a time series value group, which is stored after the value of the previous sequence number. In the time series value group, △t1 occupies 2 bytes, s1 occupies 1 byte and △v occupies 3 bytes. The floating-point number △v is stored in 3 bytes as follows: a) the sign of △v is denoted s, takes the value 0 or 1, and is stored in the highest bit, occupying 1 bit; b) the exponent of △v is calculated as e and stored after s, occupying 7 bits; c) the leading 17 bits of △v are taken as the mantissa m, whose highest bit is implicit, and stored after e, occupying 16 bits. Compare the quality of the current sequence number with the quality of the previous datum: if the quality differs, the sequence number and the quality form a quality group, which is stored before the quality group of the previous sequence number; if the quality is the same, the quality of this sequence number is not stored;
5) Repeat the above steps until all data to be stored have been stored.
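The digital branch of step 4) can be sketched in Python as follows. This is a hedged illustration rather than the patented implementation: the function name `encode_digital`, the big-endian byte order, and the treatment of the time field as whole seconds are assumptions. The text only says that a filler group (65535, s0, v0) is stored when the time difference exceeds 65535 and that step 3) is then repeated; subtracting 65535 and looping is one possible reading of that.

```python
import struct

MAX_DT = 65535  # largest time difference that fits in the 2-byte field


def encode_digital(prev, cur):
    """Return the 4-byte value groups needed to append `cur` after `prev`.

    Each sample is (time, millisecond, value) with value 0 or 1.
    """
    t0, ms0, v0 = prev
    t1, ms1, v1 = cur
    groups = []
    dt = t1 - t0
    s0 = ms0 // 4
    while dt > MAX_DT:
        # Filler group (65535, s0, v0); consuming 65535 per filler is an assumption.
        groups.append(struct.pack(">HBB", MAX_DT, s0, v0))
        dt -= MAX_DT
    # Regular group: 2-byte time difference, 1-byte millisecond/4, 1-byte value.
    groups.append(struct.pack(">HBB", dt, ms1 // 4, v1))
    return groups
```

For a small gap the function yields a single 4-byte group; a gap larger than 65535 yields one or more filler groups followed by the regular group.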
Specifically, the time series data are edited online according to the following method:
1) When time series data are modified, the ID and the time are used as the conditions, and the time series data value at the corresponding time point is modified;
2) According to the index information, locate the data block and the storage sequence number n of the datum to be modified, and find the stored time series value group (△tn, sn, vn) of the datum to be modified;
3) If the data type to be modified is digital, the new value V is written directly into the time series value group, which is stored as (△tn, sn, V);
If the data type to be modified is analog, the value difference between the new value V and the data with sequence number 1 is recorded as △vn, and the time series value group (△tn, sn, △vn) is stored.
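Because every digital value group has the same fixed size, an edit can overwrite one byte in place once the sequence number is known. The sketch below illustrates this under assumptions: `FIRST_GROUP_SIZE` is a hypothetical size for the uncompressed first value group (the patent does not give one), the byte order is assumed to be big-endian, and the block is assumed to contain no filler groups, so that sequence number n maps directly to an offset.

```python
import struct

FIRST_GROUP_SIZE = 16  # hypothetical size of the first (uncompressed) value group
GROUP_SIZE = 4         # time difference: 2 bytes, s: 1 byte, value: 1 byte


def edit_digital(block: bytearray, n: int, new_value: int) -> None:
    """Overwrite the value of sequence number n (n >= 2) in place."""
    if n < 2:
        raise ValueError("this sketch only edits compressed groups (n >= 2)")
    if new_value not in (0, 1):
        raise ValueError("a digital value is 0 or 1")
    offset = FIRST_GROUP_SIZE + (n - 2) * GROUP_SIZE
    dt, s, _old = struct.unpack_from(">HBB", block, offset)
    # Time difference and millisecond field are kept; only the value changes.
    struct.pack_into(">HBB", block, offset, dt, s, new_value)
```

Neighbouring groups are untouched, which is why storing one value group per original datum keeps edits cheap.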
The beneficial effects of the present invention are:
1. The present invention compresses and stores time series data by decomposing them into time series value groups and quality groups. When the quality of the time series data is stable, a digital time series datum occupies only 4 bytes of storage space; with further compression of the floating-point value, an analog time series datum occupies only 6 bytes. The data volume is therefore significantly reduced and system performance is improved.
2. Because every original datum is stored as its own time series value group, the compressed data can be edited and modified quickly.
3. The present invention stores time series data in less storage space with a minimum time resolution of 4 milliseconds, which is sufficient for high-density time series data storage (such as WAMS).
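The 4-millisecond resolution follows from storing the millisecond value divided by 4 in a single byte. A small Python sketch makes the arithmetic concrete; the function names are illustrative, and the use of integer (floor) division is an assumption, since the text only says "divided by 4".

```python
def quantize_ms(ms: int) -> int:
    """Map a millisecond value 0..999 to the one-byte field s (0..249)."""
    if not 0 <= ms <= 999:
        raise ValueError("millisecond value must be in 0..999")
    return ms // 4


def restore_ms(s: int) -> int:
    """Recover the millisecond value, rounded down to a multiple of 4."""
    return s * 4
```

With floor division the restored value is never more than 3 ms below the original, and the largest stored field is 249, which fits in one byte.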
Brief description of the drawings
Fig. 1 is a schematic diagram of the online compressed storage of time series data according to the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to the drawing and specific embodiments.
The present invention takes a data block as the basic storage unit. Based on the characteristics of time series data (temporal ordering and relatively stable data quality), time series data are decomposed into time series value groups and quality groups, which are compressed and stored at the head and the tail of the data block respectively until the data block is full.
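The benefit of separating quality from values comes from storing a quality group only when the quality changes. The following Python sketch illustrates that idea with the 6-byte group layout given in the text (2-byte sequence number, 4-byte quality). The function names, the big-endian byte order and the list-based container are assumptions made for the illustration.

```python
import struct


def quality_groups(qualities):
    """Return 6-byte (sequence number, quality) groups, one per quality change."""
    groups = []
    last = None
    for seq, q in enumerate(qualities, start=1):
        if seq == 1 or q != last:
            groups.append(struct.pack(">HI", seq, q))
        last = q
    return groups


def quality_at(groups, seq):
    """Quality of sequence number `seq`: the latest group starting at or before it."""
    result = None
    for g in groups:
        start, q = struct.unpack(">HI", g)
        if start <= seq:
            result = q
    return result
```

For six samples with qualities `[0, 0, 0, 8, 8, 0]` only three groups are produced, so stable quality costs almost no storage.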
Referring to Fig. 1, the main steps of the time series data online compression method are as follows:
1. Create a data block: for the time series data to be stored, each ID is stored in its own data block; if no data block exists or the data block is full, a new data block is created.
2. When the sequence number of the stored data in the data block is 1 (the first datum stored), the ID, time, millisecond and value of the time series data are stored at the beginning of the data block, and (sequence number, quality) forms a quality group that is stored at the tail of the data block; the quality group occupies 6 bytes.
3. When the sequence number of the stored data in the data block is greater than 1, first determine whether the remaining space of the data block can hold the data of the corresponding sequence number; if the space is insufficient, perform step 1.
4. If the stored data type is digital, calculate the time difference △t1 between the data of the current sequence number and the data of the previous sequence number, and divide the millisecond value by 4 to obtain s1; the s1 of the previous sequence number is recorded as s0, and the value of the previous sequence number is recorded as v0. If △t1 > 65535, then (65535, s0, v0) forms a time series value group, which is stored after the value of the previous sequence number, and step 3 is performed again. If △t1 <= 65535, then (△t1, s1, value) forms a time series value group, which is stored after the value of the previous sequence number. In the time series value group, △t1 occupies 2 bytes, s1 occupies 1 byte and the value occupies 1 byte (a digital value is 0 or 1), so the time series value group of one datum occupies 4 bytes of storage space. Compare the quality of the current sequence number with the quality of the previous datum: if they differ, form a quality group (sequence number, quality) and store it before the quality group of the previous sequence number; if the quality is the same, the quality of this sequence number is not stored.
5. If the stored data type is analog, calculate the time difference △t1 between the data of the current sequence number and the data of the previous sequence number, and the value difference △v between the data of the current sequence number and the data with sequence number 1; divide the millisecond value by 4 to obtain s1; the s1 of the previous sequence number is recorded as s0, and the value difference of the previous sequence number is recorded as △v0. If △t1 > 65535, then (65535, s0, △v0) forms a time series value group, which is stored after the value of the previous sequence number, and step 3 is performed again. If △t1 <= 65535, then (△t1, s1, △v) forms a time series value group, which is stored after the value of the previous sequence number. In the time series value group, △t1 occupies 2 bytes, s1 occupies 1 byte and △v occupies 3 bytes, so the time series value group of one datum occupies 6 bytes of storage space. The floating-point number △v is stored in 3 bytes as follows:
1) The sign of △v is denoted s; s takes the value 0 (negative) or 1 (positive) and is stored in the highest bit, occupying 1 bit.
2) The exponent of △v is calculated as e and stored after s, occupying 7 bits.
3) The leading 17 bits of △v are taken as the mantissa m, whose highest bit is implicit, and stored after e, occupying 16 bits.
The range of floating-point numbers that can be represented with this storage method is $\pm 131071 \times 10^{127}$.
Compare the quality of the current sequence number with the quality of the previous datum: if they differ, form a quality group (sequence number, quality) and store it before the quality group of the previous sequence number; if the quality is the same, the quality of this sequence number is not stored.
6. Repeat the above steps until all data to be stored have been stored.
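The 3-byte storage of △v in step 5 can be read as a 24-bit layout: 1 sign bit, 7 exponent bits and 16 stored mantissa bits (17 significant bits with an implicit leading bit), since 1 + 7 + 16 = 24 bits. The Python sketch below packs a value into that layout. Several details are assumptions and not taken from the patent: the exponent is treated as a base-2 exponent with a bias of 64, an exponent field of 0 is reserved to encode zero, and the bytes are big-endian. With a base-2 exponent the representable range differs from the range stated in the text, so this is only one possible realisation of the bit layout.

```python
import math

EXP_BIAS = 64  # assumed bias; the source does not specify one


def pack_delta(dv: float) -> bytes:
    """Pack dv into 3 bytes: 1 sign bit, 7 exponent bits, 16 mantissa bits."""
    if dv == 0.0:
        return b"\x00\x00\x00"            # assumed encoding for zero
    sign = 0 if dv < 0 else 1             # as in the text: 0 = negative, 1 = positive
    frac, exp = math.frexp(abs(dv))       # abs(dv) == frac * 2**exp, 0.5 <= frac < 1
    mant17 = int(frac * (1 << 17))        # keep 17 significant bits; the top bit is always 1
    biased = exp + EXP_BIAS
    if not 1 <= biased <= 127:
        raise OverflowError("exponent does not fit in 7 bits")
    stored = mant17 & 0xFFFF              # drop the implicit leading bit
    word = (sign << 23) | (biased << 16) | stored
    return word.to_bytes(3, "big")


def unpack_delta(raw: bytes) -> float:
    """Inverse of pack_delta, up to the truncation of the mantissa."""
    word = int.from_bytes(raw, "big")
    biased = (word >> 16) & 0x7F
    if biased == 0:
        return 0.0
    sign = 1.0 if (word >> 23) & 1 else -1.0
    mant17 = (1 << 16) | (word & 0xFFFF)  # restore the implicit leading bit
    return sign * math.ldexp(mant17, biased - EXP_BIAS - 17)
```

Values such as 1.5 or -0.25 survive a round trip exactly; other values are truncated to 17 significant bits, so the relative error stays below 2^-16.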
The main method for online editing of time series data is as follows:
1. When time series data are modified, the ID and the time are generally used as the conditions, and the time series data value at the corresponding time point is modified.
2. According to the index information, locate the data block and the storage sequence number n of the datum to be modified, and find the stored time series value group (△tn, sn, vn) of the datum to be modified.
3. If the data type to be modified is digital, the new value V is written directly into the time series value group, which is stored as (△tn, sn, V).
4. If the data type to be modified is analog, the value difference between the new value V and the data with sequence number 1 is recorded as △vn, and the time series value group (△tn, sn, △vn) is stored.
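For analog data the stored difference is taken against the datum with sequence number 1 rather than against the previous datum, so changing one value does not disturb any other group. The Python sketch below shows this on already-decoded groups. The list-of-tuples representation and the function names are illustrative assumptions, and the 3-byte encoding of the difference is left out for brevity.

```python
def edit_analog(groups, first_value, n, new_value):
    """Replace the value of sequence number n (n >= 2).

    groups[i] holds (dt, s, dv) for sequence number i + 2,
    where dv is the difference to first_value.
    """
    if n < 2:
        raise ValueError("this sketch only edits compressed groups (n >= 2)")
    dt, s, _old_dv = groups[n - 2]
    groups[n - 2] = (dt, s, new_value - first_value)


def value_at(groups, first_value, n):
    """Reconstruct the value of sequence number n."""
    return first_value if n == 1 else first_value + groups[n - 2][2]
```

Editing sequence number 3 changes only `groups[1]`; the value reconstructed for sequence number 2 stays the same.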

Claims (3)

1. A time series data online compression method supporting editing, characterized in that: with a data block as the basic storage unit, time series data are decomposed into time series value groups and quality groups, which are compressed and stored at the head and the tail of the data block respectively until the data block is full; the time series value group comprises the ID, time, millisecond and value of the time series data; the quality group comprises the sequence number and the quality of the time series data.
2. The time series data online compression method supporting editing according to claim 1, characterized in that the time series data are compressed online according to the following steps:
1) Create a data block: for the time series data to be stored, each ID is stored in its own data block; if no data block exists or the data block is full, a new data block is created;
2) When the sequence number of the stored data in the data block is 1, the ID, time, millisecond and value of the time series data are stored at the head of the data block as a time series value group, and the sequence number and quality of the time series data are stored at the tail of the data block as a quality group; the quality group occupies 6 bytes, of which the sequence number occupies 2 bytes and the quality occupies 4 bytes;
3) When the sequence number of the stored data in the data block is greater than 1, first determine whether the remaining space of the data block can hold the data of the corresponding sequence number; if the remaining space is insufficient, perform step 1);
4) If the stored data type is digital, calculate the time difference △t1 between the data of the current sequence number and the data of the previous sequence number, and divide the millisecond value by 4 to obtain s1; the s1 of the previous sequence number is recorded as s0, and the value of the previous sequence number is recorded as v0. If △t1 > 65535, then 65535, s0 and v0 form a time series value group, which is stored after the value of the previous sequence number, and step 3) is performed again. If △t1 <= 65535, then △t1, s1 and the value form a time series value group, which is stored after the value of the previous sequence number. In the time series value group, △t1 occupies 2 bytes, s1 occupies 1 byte and the value occupies 1 byte; a digital value is 0 or 1. Compare the quality of the current sequence number with the quality of the previous datum: if the quality differs, the sequence number and the quality form a quality group, which is stored before the quality group of the previous sequence number; if the quality is the same, the quality of this sequence number is not stored;
If the stored data type is analog, calculate the time difference △t1 between the data of the current sequence number and the data of the previous sequence number, and the value difference △v between the data of the current sequence number and the data with sequence number 1; divide the millisecond value by 4 to obtain s1; the s1 of the previous sequence number is recorded as s0, and the value difference of the previous sequence number is recorded as △v0. If △t1 > 65535, then 65535, s0 and △v0 form a time series value group, which is stored after the value of the previous sequence number, and step 3) is performed again. If △t1 <= 65535, then △t1, s1 and △v form a time series value group, which is stored after the value of the previous sequence number. In the time series value group, △t1 occupies 2 bytes, s1 occupies 1 byte and △v occupies 3 bytes. The floating-point number △v is stored in 3 bytes as follows: a) the sign of △v is denoted s, takes the value 0 or 1, and is stored in the highest bit, occupying 1 bit; b) the exponent of △v is calculated as e and stored after s, occupying 7 bits; c) the leading 17 bits of △v are taken as the mantissa m, whose highest bit is implicit, and stored after e, occupying 16 bits. Compare the quality of the current sequence number with the quality of the previous datum: if the quality differs, the sequence number and the quality form a quality group, which is stored before the quality group of the previous sequence number; if the quality is the same, the quality of this sequence number is not stored;
5) Repeat the above steps until all data to be stored have been stored.
3. The time series data online compression method supporting editing according to claim 1 or 2, characterized in that the time series data are edited online according to the following method:
1) When time series data are modified, the ID and the time are used as the conditions, and the time series data value at the corresponding time point is modified;
2) According to the index information, locate the data block and the storage sequence number n of the datum to be modified, and find the stored time series value group (△tn, sn, vn) of the datum to be modified;
3) If the data type to be modified is digital, the new value V is written directly into the time series value group, which is stored as (△tn, sn, V);
If the data type to be modified is analog, the value difference between the new value V and the data with sequence number 1 is recorded as △vn, and the time series value group (△tn, sn, △vn) is stored.
CN201510149751.4A 2015-04-01 2015-04-01 Time series data online compression method supporting editing Expired - Fee Related CN104734726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510149751.4A CN104734726B (en) 2015-04-01 2015-04-01 Time series data online compression method supporting editing

Publications (2)

Publication Number Publication Date
CN104734726A true CN104734726A (en) 2015-06-24
CN104734726B CN104734726B (en) 2017-08-25

Family

ID=53458220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510149751.4A Expired - Fee Related CN104734726B (en) 2015-04-01 2015-04-01 Time series data online compression method supporting editing

Country Status (1)

Country Link
CN (1) CN104734726B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923569A (en) * 2010-07-09 2010-12-22 南京朗坤软件有限公司 Storage method of structure type data of real-time database
CN102427369A (en) * 2011-10-19 2012-04-25 广东电网公司电力科学研究院 Real-time holographic lossless compression method for productive time sequence data
CN102904580A (en) * 2012-10-23 2013-01-30 湖南大唐先一科技有限公司 X-BIT compressed encoding algorithm

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108153483A (en) * 2016-12-06 2018-06-12 南京南瑞继保电气有限公司 A kind of time series data compression method based on attribute grouping
CN108153483B (en) * 2016-12-06 2021-04-20 南京南瑞继保电气有限公司 Time sequence data compression method based on attribute grouping
CN108776704A (en) * 2018-06-12 2018-11-09 东方电子股份有限公司 A kind of time series data indexing means based on regression analysis
CN108776704B (en) * 2018-06-12 2021-05-11 东方电子股份有限公司 Time sequence data indexing method based on regression analysis

Also Published As

Publication number Publication date
CN104734726B (en) 2017-08-25

Similar Documents

Publication Publication Date Title
CN102222071B (en) Method, device and system for data synchronous processing
US11232073B2 (en) Method and apparatus for file compaction in key-value store system
CN104615594A (en) Data updating method and device
CN111309976B (en) GraphX data caching method for convergence graph application
WO2021077741A1 (en) Gene data query method, system and device, and storage medium
CN112597345B (en) Automatic acquisition and matching method for laboratory data
CN101840430A (en) Intelligent card database multi-list operation method and device
CN108153483A (en) A kind of time series data compression method based on attribute grouping
CN115438114B (en) Storage format conversion method, system, device, electronic equipment and storage medium
CN103136244A (en) Parallel data mining method and system based on cloud computing platform
CN104734726A (en) Time series data online compression method supporting editing
CN106201778A (en) Information processing method and storage device
CN107291746B (en) Method and equipment for storing and reading data
CN108182198A (en) Store the control device and read method of Dynamic matrix control device operation data
CN112434085B (en) Roaring Bitmap-based user data statistical method
CN103117748B (en) The method and system in a kind of BWT implementation method, suffix sorted
CN104731716A (en) Data storage method
CN112397148B (en) Sequence comparison method, sequence correction method and device thereof
CN111026736A (en) Data blood margin management method and device and data blood margin analysis method and device
CN107301019B (en) Garbage recycling method combining reference time chart and container bit table
CN102637204B (en) Method for querying texts based on mutual index structure
CN111143182B (en) Analysis method, device and storage medium for process behavior
CN107544090B (en) Seismic data analyzing and storing method based on MapReduce
CN108776704B (en) Time sequence data indexing method based on regression analysis
CN110399372B (en) Method for compressing and decompressing ROWID corresponding relation data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
  Denomination of invention: A time series data online compression method supporting editing
  Effective date of registration: 20211213
  Granted publication date: 20170825
  Pledgee: Yantai financing guarantee Group Co.,Ltd.
  Pledgor: DONGFANG ELECTRONICS Co.,Ltd.
  Registration number: Y2021980014783
PC01 Cancellation of the registration of the contract for pledge of patent right
  Date of cancellation: 20220725
  Granted publication date: 20170825
  Pledgee: Yantai financing guarantee Group Co.,Ltd.
  Pledgor: DONGFANG ELECTRONICS Co.,Ltd.
  Registration number: Y2021980014783
CF01 Termination of patent right due to non-payment of annual fee
  Granted publication date: 20170825