Summary of the invention
The object of the invention is to preprocess the data to be compressed so as to further improve the compression ratio and compression efficiency, without losing the original precision within the accuracy range required of WAMS data. The invention is implemented as follows:
A method for processing data to be compressed in a power system wide-area measurement system (WAMS), in which the data to be compressed undergo floating-point-to-fixed-point conversion and then incremental (delta) processing of the fixed-point data, yielding an incremental fixed-point data sequence that is compressed by a lossless compression algorithm and stored; on decompression, increment restoration followed by conversion-coefficient restoration recovers the original floating-point values. The method is characterized in that it comprises the following steps:
(1) a conversion coefficient is set according to the WAMS measurement precision requirement: the conversion coefficient is a constant that converts values between the floating-point data to be compressed and integer fixed-point numbers; multiplying a floating-point value by the conversion coefficient and taking the integer part gives the integer fixed-point number, and dividing the integer fixed-point number by the conversion coefficient gives the floating-point number back;
(2) each floating-point number to be compressed in the measurement data of the power system wide-area measurement system is multiplied by its own conversion coefficient; the integer part is kept as the significant digits and the fractional part is discarded;
(3) the resulting integers are arranged in time-stamp order to form an integer fixed-point data sequence;
(4) starting from the second number of the integer fixed-point data sequence, the difference between each number and the previous number is taken in turn, giving a difference sequence;
(5) the first number of the integer fixed-point data sequence and the difference sequence together form the incremental fixed-point data sequence;
(6) the incremental fixed-point data sequence is fed to a dictionary compression algorithm, which completes the compression;
(7) the conversion coefficient and the data frame returned by the compression algorithm are combined into the result data to be stored;
(8) the storage address is managed through the index formed by the data ID and the data time stamp, and the result data are written to a file;
(9) after storage, decompression proceeds as follows:
The storage address is located through the index formed by the data ID and the data time stamp, and the stored result data are extracted and placed in the data processing buffer;
The contents of the data processing buffer are split into the conversion coefficient and the data frame returned by the compression algorithm, and the data frame is decompressed with the decompression algorithm of the lossless compression, giving the incremental fixed-point data sequence;
The second number in the incremental fixed-point data sequence is added to the first number to give the integer fixed-point value of the second number; the third number is added to the integer fixed-point value of the second number to give the integer fixed-point value of the third number; and so on, until all incremental fixed-point data have been converted back to the original integer fixed-point numbers;
Each integer fixed-point number is divided by the conversion coefficient and converted to a floating-point number, giving usable data equivalent to the original.
The conversion coefficient forms part of the result data written to the data file.
The WAMS dynamic data compression processing method proposed by the invention improves the compression ratio and compression efficiency of stored power system dynamic data. It also provides a sparsification method for drawing dynamic data curves, which speeds up curve drawing and analysis.
Embodiment
After calculation and conversion, floating-point numbers carry calculation errors and random residual values in their trailing decimal places. Two floating-point numbers are therefore rarely exactly equal, and in numerical computation they are usually treated as equal when their difference is smaller than some very small value. This property is referred to here as the poor similarity of floating-point numbers, and it has a large impact on lossless compression algorithms. Taking dictionary-type compression algorithms such as LZW as an example, when data similarity is poor the number of statistical operations and loop iterations during compression grows many times over, which lowers both the compression ratio and the compression efficiency.
The WAMS data acquisition stage specifies a maximum precision for the data. Frequency precision, for example, is ±0.001 Hz, so the fourth decimal place of a frequency value lies outside the required precision range and is not trustworthy. Discarding the fourth and later decimal places does not affect the usable precision of WAMS data and greatly improves data similarity.
The invention therefore converts the sampled floating-point data to integer fixed-point numbers by means of a conversion coefficient, producing a fixed-point data sequence and effectively raising the similarity of the data. The conversion coefficient is a constant that converts values between the floating-point data to be compressed and integer fixed-point numbers: multiplying a floating-point value by the conversion coefficient and taking the integer part gives the integer fixed-point number, and dividing the integer fixed-point number by the conversion coefficient gives the floating-point number back. The value of the conversion coefficient is the key factor in whether floating-point precision is preserved across the conversion. In the invention the conversion coefficient is the reciprocal of the WAMS measurement precision index; for example, if the frequency precision index is 0.001 Hz, the conversion coefficient should be 1000.
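The conversion just described can be illustrated with a minimal Python sketch. The function names `to_fixed` and `to_float` are hypothetical and do not come from the original text. The sketch also makes one assumption that departs from the wording above: it rounds to the nearest integer rather than simply discarding the fractional part, so that binary representation error in a product such as 50.001 × 1000 cannot cost one count.

```python
def to_fixed(values, coeff):
    """Scale each float by the conversion coefficient and keep an integer.

    Assumption: round-to-nearest is used here instead of plain truncation,
    to stay robust against binary floating-point representation error.
    """
    return [round(v * coeff) for v in values]


def to_float(fixed, coeff):
    """Divide each integer fixed-point number by the conversion coefficient."""
    return [n / coeff for n in fixed]


precision = 0.001                 # frequency precision index, in Hz
coeff = round(1 / precision)      # reciprocal of the precision index: 1000

samples = [50.0012, 50.0021, 50.0029, 50.0041]
fixed = to_fixed(samples, coeff)
restored = to_float(fixed, coeff)

print(fixed)
print(restored)
```

With rounding, each restored value differs from its sample by at most half of the precision index, which stays inside the ±0.001 Hz range quoted above.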
Besides the precision requirement, WAMS data have another characteristic: under steady-state conditions, the increments between consecutive samples tend to be similar. Most data in power system operation are steady-state data, so the increments between adjacent data points are highly similar. The invention applies incremental processing to the fixed-point data sequence to obtain an incremental fixed-point data sequence, which further improves data similarity.
Once both fixed-point conversion and incremental processing have been applied, the resulting incremental fixed-point data sequence is compressed with a dictionary-type compression algorithm such as LZW. Because data similarity is higher, the compression ratio and compression efficiency improve considerably.
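The effect of incremental processing can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: `delta_encode` and `delta_decode` are hypothetical names, the test signal is synthetic, and `zlib` (DEFLATE) from the Python standard library stands in for the LZW compressor named in the text, because the standard library has no LZW codec.

```python
import math
import random
import struct
import zlib


def delta_encode(fixed):
    """First value, followed by the difference of each value from its predecessor."""
    if not fixed:
        return []
    return [fixed[0]] + [b - a for a, b in zip(fixed, fixed[1:])]


def delta_decode(deltas):
    """Running sum of the incremental sequence restores the fixed-point values."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values


# Synthetic near-steady frequency signal: one minute at 50 frames per second.
rng = random.Random(0)
coeff = 1000
samples = [50.0 + 0.01 * math.sin(i / 200) + rng.uniform(-1e-5, 1e-5)
           for i in range(3000)]

fixed = [round(v * coeff) for v in samples]   # rounding is an assumption
deltas = delta_encode(fixed)

raw_bytes = struct.pack(f"<{len(samples)}d", *samples)     # 8-byte doubles
delta_bytes = struct.pack(f"<{len(deltas)}i", *deltas)     # 4-byte integers

raw_size = len(zlib.compress(raw_bytes))
delta_size = len(zlib.compress(delta_bytes))
print(raw_size, delta_size)
```

The raw doubles carry random low-order bits that a dictionary coder cannot match, whereas the incremental sequence consists mostly of repeated small integers, which is the similarity the text relies on.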
Fig. 1 is a schematic diagram of the dynamic data storage and retrieval process. As shown in Fig. 1, the data to be compressed in the power system wide-area measurement system (WAMS) are preprocessed as follows:
(1) the raw data are obtained;
(2) the raw data are converted to fixed point with the conversion coefficient: each floating-point number to be compressed is multiplied by its own conversion coefficient, the integer part is kept as the significant digits and the fractional part is discarded; the resulting integers are arranged in time-stamp order to form an integer fixed-point data sequence;
(3) the fixed-point data sequence undergoes incremental processing: starting from the second number of the integer fixed-point data sequence, the difference between each number and the previous number is taken in turn, giving a difference sequence; the first number of the integer fixed-point data sequence and the difference sequence together form the incremental fixed-point data sequence;
(4) the incremental fixed-point data sequence is compressed with the LZW compression algorithm;
(5) the data ID and the time stamp TIMESTAMP form the index, and the data conversion coefficient COEFF and the compressed data DATA are assembled into a frame;
(6) the storage address is managed through the index formed by the data ID and the data time stamp, and the result data are written to a file.
The dynamic data storage process is a one-way pipeline that performs, in order, fixed-point and incremental preprocessing, data compression, data organization and data storage. This simple and efficient processing supports high-speed, high-precision, high-density dynamic data storage. When the data are used, the steps run serially in the order data query, data parsing, data decompression, incremental data restoration, floating-point data restoration and data application.
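The two serial pipelines can be sketched end to end as below. The record layout (a 4-byte unsigned coefficient followed by the compressed payload) and the names `build_record` and `read_record` are assumptions made for illustration only; the original text does not specify a byte layout. As before, `zlib` stands in for LZW and rounding is used in place of truncation.

```python
import struct
import zlib


def build_record(samples, coeff):
    """Storage direction: fixed point, increments, compression, framing."""
    fixed = [round(v * coeff) for v in samples]            # assumption: rounding
    deltas = fixed[:1] + [b - a for a, b in zip(fixed, fixed[1:])]
    payload = zlib.compress(struct.pack(f"<{len(deltas)}i", *deltas))
    # Hypothetical frame: COEFF as a 4-byte unsigned integer, then DATA.
    return struct.pack("<I", coeff) + payload


def read_record(record):
    """Retrieval direction: parse, decompress, restore increments and floats."""
    (coeff,) = struct.unpack_from("<I", record, 0)
    raw = zlib.decompress(record[4:])
    deltas = struct.unpack(f"<{len(raw) // 4}i", raw)
    values, total = [], 0
    for d in deltas:
        total += d                      # increment restoration
        values.append(total / coeff)    # conversion-coefficient restoration
    return values


samples = [50.0012, 50.0021, 50.0024, 50.0031]
record = build_record(samples, 1000)
restored = read_record(record)
print(restored)
```

Carrying the coefficient inside the record means the reader needs no outside knowledge of the measurement type to restore the floating-point values.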
Power system data are dynamically continuous: they change continuously over time, and the changes between adjacent points tend to be similar. In WAMS, data sampling has a limited precision; the maximum frequency precision, for example, is 0.001 Hz. In numerical representation and processing, however, a floating-point number also carries residual values smaller than 0.001 Hz, produced by factors such as calculation error and conversion error, beyond the stated precision. When the floating-point number is represented in bytes, these residuals make the trailing bits differ. This affects dictionary compression algorithms of the LZW type: it affects the dictionary size and the number of statistical operations, and therefore the compression ratio and compression efficiency. At the same time, the change values between adjacent points of continuous data are close to one another, so once the change values are taken the similarity of the data sequence improves further, to a level the original data sequence cannot reach. Eliminating the useless residual values and exploiting the change values between adjacent points is therefore very valuable for dictionary compression algorithms of the LZW type. Building on this observation, the invention processes the data to be compressed as follows:
(1) the conversion coefficient corresponding to the raw data is looked up;
(2) each floating-point number is multiplied by the conversion coefficient; the integer part is kept as the significant digits and the fractional part is discarded;
(3) the resulting integers are arranged in time-stamp order to form an integer fixed-point data sequence;
(4) starting from the second number of the integer fixed-point data sequence, the difference between each number and the previous number is taken in turn, giving a difference sequence;
(5) the first number of the integer fixed-point data sequence and the difference sequence together form the incremental fixed-point data sequence;
(6) the incremental fixed-point data sequence is fed to a dictionary compression algorithm, which completes the compression;
(7) the conversion coefficient and the data frame returned by the compression algorithm are combined into the result data to be stored;
(8) the storage address is managed through the index formed by the data ID and the data time stamp, and the result data are written to a file.
The invention also discloses a sparsification method for drawing high-density time-series data curves in a power system wide-area measurement system (WAMS). High density is one of the characteristics of WAMS data curves. At an upload rate of 50 frames per second, a one-minute curve has 3000 points; practical applications often use curves spanning tens of minutes or even several hours, so the number of points to draw reaches several hundred thousand or more. Drawing every point would make curve drawing very inefficient. In actual curve analysis, moreover, the main interest in a high-density curve is its trend, and only the key curve features need to be visible. Sparsifying the curve data is therefore both permissible and necessary.
There are two ways to sparsify data: sampling at equal intervals, and sampling at unequal intervals chosen by an algorithm. Equal-interval sampling is simple to compute and makes it easy to handle the time stamps of time-series data, but the key features of a curve do not occur at equal intervals. The invention therefore uses unequal-interval sampling for sparsification. Besides the key-point features, the time stamps of WAMS data curves must also be taken into account, so the sparsification algorithm must not only produce the key-point data sequence but also make it easy to recover the key-point time stamps. The swinging door algorithm is a slope-based method. Within a preset dead band it linearizes the data and treats a data point that falls outside the dead band as a turning point of the linearization; from the turning point the data sequence takes on a different slope and the next linear segment begins. The turning points are the key points of the curve, which is exactly what curve sparsification requires. Based on the swinging door algorithm, the invention therefore implements a sparsification method that finds the key-point data sequence and uses the number of compressed points between key points to process the time stamp information, yielding a key-point data sequence with time stamps.
After the high-density curve data have been sparsified, curve analysis may require selecting a particular change point for detailed examination. If the local curve under detailed examination still used the sparse data, the fine analysis could easily be affected. Therefore, when the number of selected points is below a certain (configurable) value, the curve data are no longer sparsified and the original curve is drawn directly.
Fig. 2 is a flow chart of high-density curve sparsification. WAMS high-density time-series data are large in volume and carry a time stamp requirement. Generating curves for analysis imposes the following requirements:
1) curves must be drawn quickly;
2) a high-density curve is examined for its key variation features, which must not be lost;
3) fine analysis must preserve the features of the original data.
As shown in Fig. 2, the data compression process is as follows:
(1) the sparsification thresholds, namely the point-count limit and the compression precision threshold, are set according to the needs of the curve analysis;
(2) when the number of points in the curve data sequence exceeds the point-count limit, the sequence is compressed with the swinging door algorithm according to the compression precision threshold; when the limit is not exceeded, the data are not compressed; the compression result comprises the curve key-point data sequence and the sequence of the numbers of compressed points between key points;
(3) time stamps are reassigned to the key-point data sequence from the continuous time stamp information of the sparsified curve and the sequence of the numbers of compressed points between key points, each time stamp being the corresponding time stamp of the original data;
(4) the sparse curve is drawn from the resulting time-stamped key-point data sequence;
(5) when a region of the sparsified curve is selected for zoomed display and its number of points is below the sparsification threshold, the original curve before compression is used directly for the selected region; otherwise the compressed, sparsified curve is used.
For curves whose number of points exceeds a certain scale, no curve drawing method can solve the speed problem by itself, so the curve data points must be sparsified by a dedicated method. The invention uses the swinging door algorithm, a fast and efficient lossy compression algorithm. Within the required precision range it linearizes the curve and retains the key variation features, producing a curve of turning points, which makes it a good choice for curve sparsification. Taking into account both the requirement for accurate time stamps on WAMS data and the characteristics of this algorithm, the invention improves on the standard swinging door algorithm. The improved algorithm has the following characteristics:
1) it outputs the floating-point data sequence of the turning points;
2) it outputs an integer data sequence giving the number of points compressed between turning points;
3) from these two data sequences and the time stamps of the original data, it recovers the accurate time stamp of every point in the floating-point data sequence.
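A swinging door pass that returns the two sequences listed above might look like the following sketch. It assumes uniformly spaced samples, so that the sample index can stand in for time, and it assumes one particular convention for the integer sequence: entry j holds the number of points dropped between key point j-1 and key point j, with 0 for the first key point. The function name and this convention are hypothetical; the original text does not give the details of the improvement.

```python
def swinging_door(values, tolerance):
    """Return (key_values, skipped) for uniformly sampled data.

    key_values: float values of the retained turning points.
    skipped:    number of dropped points before each key point (assumed convention).
    """
    n = len(values)
    if n == 0:
        return [], []
    keys, skipped = [values[0]], [0]
    anchor = 0                    # index of the last retained point
    up_max = float("-inf")        # steepest slope seen from the upper pivot
    low_min = float("inf")        # shallowest slope seen from the lower pivot
    i = 1
    while i < n:
        dt = i - anchor
        up_max = max(up_max, (values[i] - (values[anchor] + tolerance)) / dt)
        low_min = min(low_min, (values[i] - (values[anchor] - tolerance)) / dt)
        if up_max > low_min:
            # The doors have opened past parallel: the previous point is a
            # turning point. Retain it and restart the doors from there.
            prev = i - 1
            keys.append(values[prev])
            skipped.append(prev - anchor - 1)
            anchor = prev
            up_max, low_min = float("-inf"), float("inf")
            continue              # re-test point i against the new anchor
        i += 1
    if anchor != n - 1:           # always retain the final sample
        keys.append(values[-1])
        skipped.append(n - 1 - anchor - 1)
    return keys, skipped


ramp_then_flat = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0]
key_values, skipped = swinging_door(ramp_then_flat, tolerance=0.1)
print(key_values, skipped)
```

In this example the ramp and the plateau are each reduced to their end points, and the counts record how many samples were dropped inside each segment.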
The compression step takes the curve data sequence to be sparsified as input and returns one float-type data sequence and one int-type data sequence, which are put to use once the time stamps have been processed. The flow is as follows:
(1) the curve data sequence to be sparsified and the precision threshold of the swinging door compression algorithm are obtained;
(2) the curve data sequence is compressed according to the required precision, giving a compression result that comprises the curve key-point data sequence and the sequence of the numbers of compressed points between key points;
(3) time stamps are reassigned to the key-point data sequence from the continuous time stamp information of the sparsified curve and the sequence of the numbers of compressed points between key points, each time stamp being the corresponding time stamp of the original data;
(4) the sparse curve is drawn from the resulting time-stamped key-point data sequence.
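Step (3) can be sketched as below, assuming the original samples are uniformly spaced (20 ms apart at 50 frames per second) and that the int sequence stores, for each key point, the number of points dropped since the previous key point. The function name `restore_timestamps`, the millisecond unit and the input values are illustrative assumptions.

```python
def restore_timestamps(skipped, start_ms, step_ms):
    """Rebuild the original time stamp of every key point.

    Each key point lies (dropped points + 1) samples after the previous one.
    """
    stamps, index = [], 0
    for position, gap in enumerate(skipped):
        if position > 0:
            index += gap + 1
        stamps.append(start_ms + index * step_ms)
    return stamps


# Example compression result: key-point values and dropped-point counts.
key_values = [0.0, 4.0, 4.0]
skipped = [0, 3, 3]

stamps = restore_timestamps(skipped, start_ms=0, step_ms=20)
curve = list(zip(stamps, key_values))   # time-stamped key points, ready to draw
print(curve)
```

Only the start time and the sampling interval of the original curve are needed, so the sparse result does not have to carry a time stamp per point.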
In use, not every curve needs to be sparsified. A point-count threshold for sparsification is therefore defined: when the number of points exceeds the threshold, the curve is sparsified; otherwise the original curve is used directly.
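The threshold rule, including the zoomed-region case, reduces to a single comparison, as in the sketch below. The names are hypothetical, and `every_tenth` is only a placeholder so that the example runs by itself; the text uses the swinging door algorithm as the sparsifier.

```python
def select_points(values, max_points, sparsify):
    """Pick the data to draw for a whole curve or for a zoomed region."""
    if len(values) <= max_points:
        return list(values)       # few enough points: draw the original curve
    return sparsify(values)       # too many points: draw the sparse curve


def every_tenth(values):
    # Placeholder sparsifier for this demo only (equal-interval sampling).
    return list(values[::10])


curve = [float(i % 50) for i in range(3000)]    # one minute at 50 frames/s

full_view = select_points(curve, 1000, every_tenth)
zoomed = select_points(curve[100:400], 1000, every_tenth)
print(len(full_view), len(zoomed))
```

Applying the same function to a selected slice of the original data gives the behaviour described for zoomed display: a small region is drawn from the uncompressed points.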
Fig. 3 is a schematic diagram of the data storage structure after the data have been preprocessed and compressed by the LZW algorithm. WAMS data have three characteristics: simple information content, high speed and massive volume. These place high demands on storage efficiency, so the invention adopts a file storage method based on the B+ tree algorithm, with the following main features:
1) the B+ tree algorithm is used for data storage and access;
2) records are stored in key order, and a key may be any data structure;
3) data queries, insertions and deletions run at essentially constant speed.
Referring to Fig. 3, KEY is a data structure. For WAMS dynamic data it needs at least two pieces of information, ID and TIMESTAMP. In the implementation, the structure of KEY is as follows:
where iDataID is the data index ID value and iMinute is the data time stamp TIMESTAMP.
Experience shows that compressing dynamic data once per minute works best. For compressed data, each minute is therefore indexed by one KEY, data are stored and queried by the two fields iDataID and iMinute, and storage and query use the B+ tree algorithm.
In the data storage structure, COEFF holds the data conversion coefficient; when the data are read, it is used to reverse the numerical conversion and restore the fixed-point data to floating-point data. DATA is the data packet produced by compression, and it can be used only after being decompressed by the compression algorithm.
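The per-minute keyed access can be sketched as follows. The field names `iDataID` and `iMinute` come from the text; their integer types, the `MinuteIndex` class and its methods are assumptions. A sorted list searched with `bisect` stands in for the B+ tree, since it offers the same ordered-key lookups and range scans in a few lines, without the on-disk structure.

```python
import bisect
from typing import NamedTuple


class Key(NamedTuple):
    iDataID: int    # data index ID value
    iMinute: int    # data time stamp, one key per minute


class MinuteIndex:
    """Ordered key index; a stand-in for the B+ tree file store."""

    def __init__(self):
        self._keys = []       # kept sorted by (iDataID, iMinute)
        self._records = {}    # Key -> stored result data (COEFF + DATA)

    def put(self, key, record):
        if key not in self._records:
            bisect.insort(self._keys, key)
        self._records[key] = record

    def get(self, key):
        return self._records.get(key)

    def scan(self, data_id, first_minute, last_minute):
        """All records of one data ID within an inclusive minute range."""
        lo = bisect.bisect_left(self._keys, Key(data_id, first_minute))
        hi = bisect.bisect_right(self._keys, Key(data_id, last_minute))
        return [(k, self._records[k]) for k in self._keys[lo:hi]]


index = MinuteIndex()
for minute in (102, 100, 101):
    index.put(Key(7, minute), b"record-7-%d" % minute)
index.put(Key(8, 100), b"record-8-100")

hits = index.scan(7, 101, 102)
print([k.iMinute for k, _ in hits])
```

Ordering by data ID first and minute second keeps all minutes of one measurement adjacent, so a time-range query for a curve becomes one contiguous scan.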
The above is a detailed description of embodiments of the invention. Although the exemplary embodiments shown and described are presented as the most preferred, it should be understood that various changes and modifications may be made without departing from the scope of the disclosure as defined by the following claims.